Original problem: under high CPU load, some captured frames have artifacts.
Because the log message “out of order field received” is printed every 1s100ms, a field is evidently being missed or dropped somewhere.
In general, this capture driver works this way:
- Buffer handling
- The driver starts capturing into a buffer when a field with the specified field order (top or bottom) arrives. It switches to a new buffer after two fields have been captured into one buffer.
- If the “field order” field holds an unexpected value, the capture driver tears down the capture interface, sets the frame flag of the current buffer to “error”, and then sets up a new buffer for future capturing.
- The content of the “error” frame is either an old (good) frame (if the unexpected field order is detected right after switching to this buffer), or the previous (good) field combined/weaved with a fairly old field (if the unexpected field order occurs after a field has been stored in the buffer). In the former case, the video plays out of sequence; in the latter case, the individual frame has artifacts.
- IRQ handling
- The capture thread is “receive blocked”, waiting for a frame-complete event (pulse).
- After a frame is complete, the ISR disables the capture interface and delivers this pulse to the thread, which unblocks the capture thread.
- Once the capture thread is scheduled to run, it sets up a new buffer for capturing and enables the capture interface (in practice, this re-arms the ISR).
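The buffer-handling rules above can be modelled as a small state machine. The following is a minimal sketch in plain C, not the driver's actual code: the names `capture_buf`, `on_field`, and `buf_reset` are hypothetical, and it assumes the driver is configured to expect the top field first.

```c
#include <stdbool.h>

/* Hypothetical model of the per-buffer field logic; not the real driver. */
enum field { FIELD_TOP, FIELD_BOTTOM };
enum buf_result { BUF_IN_PROGRESS, BUF_COMPLETE, BUF_ERROR };

struct capture_buf {
    int  fields_stored;   /* 0, 1 or 2 fields weaved into this buffer */
    bool error;           /* set when an out-of-order field is seen   */
};

/* Field expected next, assuming top-field-first order (an assumption). */
static enum field expected_field(const struct capture_buf *b)
{
    return (b->fields_stored == 0) ? FIELD_TOP : FIELD_BOTTOM;
}

/* Feed one incoming field into the current buffer. */
static enum buf_result on_field(struct capture_buf *b, enum field f)
{
    if (f != expected_field(b)) {
        /* Out-of-order field: flag the buffer. The caller would tear down
         * the capture interface and switch to a fresh buffer. */
        b->error = true;
        return BUF_ERROR;
    }
    b->fields_stored++;
    return (b->fields_stored == 2) ? BUF_COMPLETE : BUF_IN_PROGRESS;
}

/* Reset the state when switching to a new buffer. */
static void buf_reset(struct capture_buf *b)
{
    b->fields_stored = 0;
    b->error = false;
}
```

In this model, the value of `fields_stored` at the moment of the error distinguishes the two outcomes described above: 0 means the buffer still holds an old frame, 1 means a fresh field is weaved with a stale one.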
High CPU load cannot prevent an interrupt from being raised, and it is nearly impossible for it to prevent the ISR from being entered (the ISR has a higher priority than any thread). However, it is possible that the capture interface is re-armed too late, which causes a field to be missed.
The kernel trace confirmed this assumption. According to the trace:
In the normal case,
- the time between two NTSC fields (FE of the first one to FS of the second one) is 560 us ~ 1 ms;
- the time between the ISR returning an event and the event being passed to the thread is 5 ~ 10 us;
- once it gets a chance to run, the capture thread spends 60 us doing its work (buffer handling, re-arming interrupts).
In the failure case, one of two things happens:
- either the kernel takes too long to deliver the sigevent/pulse to the thread (only after 600 ~ 800 us);
- or the pulse is delivered immediately and the thread becomes ready soon afterwards, but the kernel waits about 600 ~ 800 us before scheduling it to run.
- In either case, by the time the capture thread re-enables the capture interface, it has missed the FS signal of a field, and it cannot start capturing until the FS signal of the next field arrives.
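The trace figures above explain the failure with simple arithmetic: the capture interface must be re-armed within the gap between FE of one field and FS of the next. A rough sketch in C follows; the function name is illustrative, and checking against the shortest observed gap (560 us) is a worst-case assumption.

```c
#include <stdbool.h>

/* All times in microseconds, taken from the kernel trace figures above. */
#define FIELD_GAP_MIN_US   560  /* shortest FE-to-FS gap between fields   */
#define THREAD_WORK_US      60  /* buffer handling + re-arming interrupts */

/* Returns true if the interface is re-armed before the next field's FS.
 * delay_us is the time from the ISR returning the event until the capture
 * thread actually starts running (delivery plus scheduling latency). */
static bool rearmed_in_time(unsigned delay_us)
{
    return delay_us + THREAD_WORK_US < FIELD_GAP_MIN_US;
}
```

With the normal 5 ~ 10 us latency, the re-arm completes after roughly 70 us, well inside the gap. With a 600 ~ 800 us delay it cannot fit into a 560 us gap, so the FS of the next field is missed.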
After analyzing the kernel trace with a kernel expert, we found two causes of this long delay.
- The pulse was initialized with the priority “SIGEV_PULSE_PRIO_INHERIT”, which means the pulse inherits its priority from the process, and the driver “wants the thread that receives the pulse to run at the initial priority of the process” (per the QNX documentation).
- The pulse inherited priority 10.
- A kernel thread with priority 10 was running on the same CPU when the ISR returned the pulse.
- The kernel sees a priority 10 pulse with SIGEV_PULSE_PRIO_INHERIT set and knows that a priority 10 thread is running. Therefore, it does not deliver the pulse to the thread until the current thread is about to block (the kernel is lazy until it has to do something?).
This can be fixed by giving the pulse an explicit priority (the same as the thread priority).
- Adaptive Partitioning is used (note that APS is not a strictly priority-based scheduler). At some points, the partition that hw_capture_thread belongs to was out of budget, so hw_capture_thread waited too long to be scheduled.
- The Partition Summary view shows that partition 4, to which the capture thread belongs, has a 3% budget but consumed 90% of the CPU. This is unreasonable.
- Therefore, the system designer should reconsider which threads belong to which partition, the partition budgets, and so on. Options include moving the thread into a different partition, changing the partition budget, or marking the thread as critical, depending on what they are trying to accomplish with APS.
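For the first cause (the inherited pulse priority), a simplified model shows why an explicit priority helps. This is plain C, not QNX code. It assumes the usual priority-preemptive rule that a newly ready thread preempts the running one only when its priority is strictly higher, and it ignores APS budgets entirely. `PRIO_INHERIT` is a stand-in for `SIGEV_PULSE_PRIO_INHERIT`, and both function names are hypothetical. On QNX, the pulse priority is the third argument of the `SIGEV_PULSE_INIT()` macro.

```c
#include <stdbool.h>

#define PRIO_INHERIT (-1)   /* stand-in for SIGEV_PULSE_PRIO_INHERIT */

/* Priority at which the receiving thread runs when the pulse arrives.
 * With PRIO_INHERIT, the initial priority of the process is used. */
static int effective_pulse_prio(int pulse_prio, int process_prio)
{
    return (pulse_prio == PRIO_INHERIT) ? process_prio : pulse_prio;
}

/* Simplified scheduling rule (an assumption of this model): the newly
 * ready thread preempts only if its priority is strictly higher. */
static bool preempts_running(int ready_prio, int running_prio)
{
    return ready_prio > running_prio;
}
```

With an inherited priority of 10 and a priority 10 thread already running, `preempts_running(10, 10)` is false, which matches the delay seen in the trace. Any explicit pulse priority above 10 makes it true; the source does not state the capture thread's actual priority, so the value to use is left open here.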
Usually, an ISR only performs some register reads/writes and whatever else is mandatory. If there is more to do, it schedules a thread to do the actual work: the ISR returns a pointer to a const struct sigevent, and the kernel then inspects the structure and delivers the event to the destination thread.
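This hand-off can be sketched as follows. It is a portable model rather than QNX code: `struct event_model`, `struct fake_hw`, and `dispatch_irq()` are hypothetical stand-ins for `struct sigevent`, the device registers, and the kernel's delivery step.

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical stand-ins; a real driver uses struct sigevent and MMIO. */
struct event_model { int code; };
struct fake_hw { bool frame_done; bool capture_enabled; };

static const struct event_model frame_event = { 1 };
static int pulses_delivered;   /* counts events handed to the thread */

/* ISR: touch only the registers that must be handled right now, then
 * return an event for the thread, or NULL if there is nothing to do. */
static const struct event_model *capture_isr(struct fake_hw *hw)
{
    if (!hw->frame_done)
        return NULL;
    hw->frame_done = false;        /* acknowledge the interrupt        */
    hw->capture_enabled = false;   /* the thread re-enables it later   */
    return &frame_event;
}

/* Models the kernel: run the ISR and deliver whatever it returns. */
static void dispatch_irq(struct fake_hw *hw)
{
    const struct event_model *ev = capture_isr(hw);
    if (ev != NULL)
        pulses_delivered++;        /* would unblock the capture thread */
}
```

Keeping the ISR this small means the time-critical part is only the register access; everything else, including re-enabling capture, depends on how quickly the thread gets to run.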