- AFE (Analog Front End): pre-processes and digitizes the analog signal.
- Automatic gain control (via a variable gain amplifier) on Luma or Chroma
- For luma, AGC may operate based on the depth of the horizontal sync pulse, or a fixed manual gain, etc.
- For chroma, AGC may operate based on the depth of the horizontal sync pulse on the luma channel, etc.
- Color killer: there is usually a threshold. If the color carrier of the incoming video signal is below the threshold, the decoder switches off color processing.
- Clamping circuit: restores the proper DC level.
- Video characteristic adjustment
- Color saturation
- CTI (color transient improvement): to enhance the color bandwidth
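The color killer described above can be sketched as a simple threshold test. This is only an illustration: the threshold value, its units, and the function names are hypothetical, and a real decoder exposes this through device-specific registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold in arbitrary amplitude units (assumption). */
#define COLOR_KILLER_THRESHOLD 16u

/* Returns true when color processing should stay enabled, i.e. the
 * measured color carrier amplitude is not below the threshold. */
bool color_enabled(uint32_t carrier_amplitude)
{
    return carrier_amplitude >= COLOR_KILLER_THRESHOLD;
}

/* Passes a chroma sample through, or forces it to zero (monochrome
 * output) when the color killer is active. */
int16_t process_chroma(int16_t chroma_sample, uint32_t carrier_amplitude)
{
    return color_enabled(carrier_amplitude) ? chroma_sample : 0;
}
```

With these assumed values, `process_chroma(100, 5)` returns 0 because the carrier is below the threshold, while `process_chroma(100, 20)` returns the sample unchanged.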
8-bit 2’s complement form:
If the register has 16 levels (0 ~ 15), positive user values can be mapped with: reg_val = (usr_val * 15) / 128 + 1.
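The mapping above, written out as a small C function. The function name is hypothetical, and rejecting inputs outside 0 ~ 127 is an assumption, since the formula in these notes only covers positive values.

```c
/* Maps a positive user value (0..127) to a 16-level register value
 * using integer arithmetic. The result lies in 1..15. Inputs outside
 * 0..127 are not covered by the formula, so -1 is returned for them
 * (assumption made for this sketch). */
int map_to_16_levels(int usr_val)
{
    if (usr_val < 0 || usr_val > 127)
        return -1;
    return (usr_val * 15) / 128 + 1;
}
```

Because of the integer division, 0 maps to 1, 64 maps to 8, and 127 maps to 15.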
Brightness -128 ~ 127, usually set in the register in (8-bit) 2’s complement form: 0x01 is 1, 0x7F (127) is the brightest, 0x80 (-128) is the darkest, and 0xFF is -1.
Contrast (luminance contrast gain) -128 ~ 127. 100 (64h) means a gain of 1. Increasing the contrast by “1” adjusts the gain by 1%.
Saturation(chroma gain, for U and V) -128 ~ 127
Hue +36 to –36. Only effective on NTSC systems. A positive value gives a greenish tone and a negative value gives a purplish tone.
Sharpness (for luma) 0 ~ 15. 15 provides the strongest sharpness.
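The brightness and contrast encodings listed above can be sketched in C as follows. The function names are hypothetical, and clamping out-of-range brightness values is an assumption. The contrast helper simply applies the "100 means a gain of 1, one step is 1%" rule and says nothing about how a particular decoder treats negative contrast values.

```c
#include <stdint.h>

/* Encodes a brightness value (-128..127) as an 8-bit 2's complement
 * register value. Out-of-range inputs are clamped (assumption). */
uint8_t brightness_to_reg(int value)
{
    if (value < -128) value = -128;
    if (value > 127)  value = 127;
    return (uint8_t)value;   /* conversion to unsigned is modulo 256 */
}

/* Decodes an 8-bit 2's complement register value back to -128..127. */
int reg_to_brightness(uint8_t reg)
{
    return (reg & 0x80) ? (int)reg - 256 : (int)reg;
}

/* Contrast value 100 (64h) corresponds to a gain of 1.0, and each
 * step changes the gain by 1%. */
double contrast_gain(int contrast)
{
    return contrast / 100.0;
}
```

For example, `brightness_to_reg(-1)` yields 0xFF and `reg_to_brightness(0x80)` yields -128, matching the values listed above.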
Original problem: under high CPU load, some captured frames have artifacts.
As the log “out of order field received” is printed every 1s100ms, a field is evidently being missed or dropped somewhere.
In general, this capture driver works this way:
- buffer handling.
- The driver starts capturing into a buffer when a field with the specified field order (top or bottom) arrives. It then switches to a new buffer after two fields have been captured into one buffer.
- If the “field order” field indicates an unexpected value, the capture driver tears down the capture interface, sets the frame flag of the current buffer to “error”, and then sets up a new buffer for future capturing.
- The content of the “error” frame might be an old (good) frame (if the unexpected field order is detected right after switching to this buffer), or the previous (good) field combined/weaved with a fairly old field (if the unexpected field order occurs after a field has been stored in the buffer). In the former case, the video has an out-of-sequence issue during playback; in the latter case, the individual frame has artifacts.
- IRQ handling
- The capture thread is “receive blocked”, waiting for a frame-complete event (pulse).
- After a frame is complete, the ISR handler disables the capture interface and delivers this pulse to the thread, which unblocks the capture thread.
- After the capture thread gets scheduled to run, it sets up a new buffer for capturing and enables the capture interface (practically, the ISR is re-armed).
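The buffer handling described above can be modelled as a small state machine. This is a simplified sketch, not the driver's code: the type and function names are hypothetical, and it assumes the top field is expected first and the bottom field second.

```c
enum field { FIELD_TOP, FIELD_BOTTOM };

struct capture_stats {
    int fields_in_buffer; /* 0 or 1 fields stored in the current buffer */
    int good_frames;      /* buffers completed with two expected fields */
    int stale_frames;     /* "error" flagged before any field was stored:
                             the buffer still holds an old (good) frame */
    int weaved_frames;    /* "error" flagged after one field was stored:
                             a new field weaved with an old one */
};

/* Processes one incoming field, assuming top-then-bottom order. */
void on_field(struct capture_stats *s, enum field f)
{
    enum field expected = (s->fields_in_buffer == 0) ? FIELD_TOP
                                                     : FIELD_BOTTOM;
    if (f != expected) {
        /* Unexpected field order: flag the current buffer as "error"
         * and start over with a new buffer. */
        if (s->fields_in_buffer == 0)
            s->stale_frames++;
        else
            s->weaved_frames++;
        s->fields_in_buffer = 0;
        return;
    }
    if (s->fields_in_buffer == 0) {
        s->fields_in_buffer = 1;      /* first field stored */
    } else {
        s->good_frames++;             /* second field completes the frame */
        s->fields_in_buffer = 0;      /* switch to a new buffer */
    }
}
```

Feeding the sequence top, bottom, bottom, top, bottom (one top field missed) into this model produces two good frames and one stale "error" frame, which corresponds to the out-of-sequence case. The sequence top, top, bottom produces one weaved "error" frame followed by one stale one.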
It is impossible that an interrupt is not raised because of high CPU load, and it is nearly impossible that the ISR handler is not entered (as the ISR handler carries a higher priority than any thread). However, it is possible that the capture ISR is re-armed too late, which leads to the missing field.
The kernel trace confirmed this assumption. According to the kernel trace:
In the normal case,
- the time between two NTSC fields (FE of the first one, FS of the second one) is 560us ~ 1ms;
- the time between ISR returning an event and the event being passed to the thread is 5 ~ 10us;
- once it gets a chance to run, the capture thread spends 60us doing its work (buffer handling, re-arming interrupts)
In the failing case,
- either the kernel takes too long (600us ~ 800us) to deliver the sigevent/pulse to the thread;
- or the pulse is delivered immediately and the thread becomes ready soon, but the kernel waits about 600 ~ 800us before scheduling it to run.
- In either case, by the time the capture thread re-enables the capture interface, it has missed the FS signal of a field, and it cannot start capturing until the FS signal of the next field arrives.
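The timing figures above can be checked with a little arithmetic. The sketch below uses a hypothetical helper name and assumes that the delivery or scheduling delay and the thread's own work simply add up, ignoring the time spent in the ISR itself.

```c
#include <stdbool.h>

/* Returns true if the capture interface is re-enabled before the FS of
 * the next field. All values are in microseconds.
 *   delay_us       : time until the capture thread actually runs
 *   thread_work_us : time the thread needs to re-arm the capture
 *   field_gap_us   : time between FE of one field and FS of the next */
bool rearmed_in_time(unsigned delay_us, unsigned thread_work_us,
                     unsigned field_gap_us)
{
    return delay_us + thread_work_us < field_gap_us;
}
```

In the normal case, `rearmed_in_time(10, 60, 560)` is true: 70us fits comfortably in even the shortest gap. With a 600us delay, `rearmed_in_time(600, 60, 560)` is false, so the field is missed whenever the gap is toward the short end of the 560us ~ 1ms range.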
After analyzing the kernel trace with a kernel expert, we found two reasons for this long delay.
- The pulse was initialized with the priority “SIGEV_PULSE_PRIO_INHERIT”, which indicates the pulse inherits the priority from the process, and the driver “wants the thread that receives the pulse to run at the initial priority of the process”(per QNX document).
- the pulse inherits a priority “10”.
- A kernel thread with priority 10 is running on the same CPU when the ISR handler returns a pulse.
- The kernel sees a priority 10 pulse with SIGEV_PULSE_PRIO_INHERIT defined, and knows a priority 10 thread is running. Therefore, it doesn’t deliver the pulse to the thread until it knows the current thread is going to be blocked (the kernel is lazy until it has to do something?).
This could be fixed by setting an explicit priority number to the pulse (the same as the thread priority).
- Adaptive Partitioning is used (note that APS isn’t a strictly priority-based scheduler). At some points, the partition that hw_capture_thread belongs to was out of budget, so hw_capture_thread waited too long to be scheduled to run.
- The Partition Summary view shows that partition 4, to which the capture thread belongs, has a 3% budget but consumed 90% of the CPU. This is unreasonable.
- Therefore, the system designer should reconsider which threads belong to which partition, partition budgets, etc. They can consider moving the thread into a different partition, changing the partition budget, or marking the thread as critical, depending on what they are trying to accomplish with APS.
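The first cause (pulse priority) can be illustrated with a toy model of priority-based preemption. This is not QNX code: `MODEL_PRIO_INHERIT` is a stand-in defined only for this sketch, the function names are hypothetical, and the rule "the receiver preempts only if its priority is strictly higher than the running thread's" is a simplification of what the kernel does. The explicit priority 21 used in the note below is an arbitrary example value.

```c
#include <stdbool.h>

/* Stand-in marker for "inherit the process priority" in this model
 * only; it is not meant to reproduce the QNX constant. */
#define MODEL_PRIO_INHERIT (-1)

/* Priority at which the pulse is delivered in this model. */
int effective_pulse_priority(int pulse_prio, int process_prio)
{
    return (pulse_prio == MODEL_PRIO_INHERIT) ? process_prio : pulse_prio;
}

/* Simplified rule: the receiving thread preempts the currently running
 * thread only if the pulse priority is strictly higher. */
bool receiver_preempts(int pulse_prio, int process_prio, int running_prio)
{
    return effective_pulse_priority(pulse_prio, process_prio) > running_prio;
}
```

In this model, an inherited priority of 10 against a running priority 10 thread gives `receiver_preempts(MODEL_PRIO_INHERIT, 10, 10) == false`, which matches the delayed delivery seen in the trace. An explicit, higher pulse priority gives `receiver_preempts(21, 10, 10) == true`.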
Usually, the ISR handler only does some register reads/writes and whatever else is mandatory. If there is more to do, it schedules a thread to do the actual work: the ISR handler returns a pointer to a const struct sigevent. The kernel then looks at the structure and delivers the event to the destination thread.