Video timing parameters

Note that back porch and front porch refer to positions relative to the SYNC pulse, not to the active data.

Therefore, in the image, the upper or left margin results from the back porch, not the front porch.

Below are a few example timing parameters.

{
        .hpixels = 800, .hfp= 137, .hsw= 4, .hbp= 44,
        .vlines  = 480, .vfp=  64, .vsw= 2, .vbp= 76,
}
Derived values for a 720x480 mode:

horDisplayPeriod = 720
horPulseStart = 816
horPulseEnd = 856
horTotal = 880
vertDisplayPeriod = 480
vertPulseStart = 512
vertPulseEnd = 515
vertTotal = 525
Another 720x480 mode, with the porch/width value that produces each derived value:

horDisplayPeriod = 720
horPulseStart = 800 (horFrontPorch = 80)
horPulseEnd = 840 (horSyncPulseWidth = 40)
horTotal = 864 (horBackPorch = 24)
vertDisplayPeriod = 480
vertPulseStart = 520 (vertFrontPorch = 40)
vertPulseEnd = 523 (vertSyncPulseWidth = 3)
vertTotal = 533 (vertBackPorch = 10)

Notes:

(1) The “total” usually includes syncPulseWidth. For instance, horTotal = horDisplayPeriod + horFrontPorch + horSyncPulseWidth + horBackPorch. However, syncPulseWidth is usually left out when calculating the raw image size. The main reason: the HSYNC pulse is carried on the HSYNC line, not on the data/pixel lines.

(2) Position 0 is located at the start of the active/display image. Using the last horizontal timing parameters as an example:

840         (864)=0                          720            800       840
 |–backporch–|———— active pixel data ————————|–frontporch––|– hsync ––|
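As a sanity check, the derived values can be computed from the porch values with a few additions. Below is a small self-contained sketch (plain C; the struct mirrors the first example above):

#include <stdio.h>

struct timing {
    int hpixels, hfp, hsw, hbp;  /* active pixels, front porch, sync width, back porch */
    int vlines,  vfp, vsw, vbp;
};

int main(void)
{
    struct timing t = {
        .hpixels = 800, .hfp = 137, .hsw = 4, .hbp = 44,
        .vlines  = 480, .vfp =  64, .vsw = 2, .vbp = 76,
    };

    /* position 0 is the start of active data; front porch, sync pulse
     * and back porch follow it, in that order */
    printf("horPulseStart  = %d\n", t.hpixels + t.hfp);
    printf("horPulseEnd    = %d\n", t.hpixels + t.hfp + t.hsw);
    printf("horTotal       = %d\n", t.hpixels + t.hfp + t.hsw + t.hbp);
    printf("vertPulseStart = %d\n", t.vlines + t.vfp);
    printf("vertPulseEnd   = %d\n", t.vlines + t.vfp + t.vsw);
    printf("vertTotal      = %d\n", t.vlines + t.vfp + t.vsw + t.vbp);
    return 0;
}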

snprintf() and strncpy()

snprintf() is used to write formatted output to a character array, up to a given maximum number of characters.

strncpy() copies a string, up to a maximum length.

Although both allow a programmer to specify a maximum number of characters, e.g. n, there are some differences.

snprintf() copies at most n - 1 characters into the output buffer and then appends a terminating null character (a character with all its bits set to zero). Therefore, the string in the output buffer is properly terminated.

strncpy() copies no more than n characters into the destination string, and it doesn’t guarantee the termination of the destination string. Details are as below:

– If the source string is shorter than n characters, null characters are appended to the destination string, until n characters in total have been written.

– If the source string is longer than n characters, only n characters are copied, and no terminating null is added.

With this in mind, it’s better to terminate the string explicitly:

        strncpy (buffer, name, sizeof (buffer));

        buffer[sizeof (buffer) - 1] = '\0';

If the buffer was allocated using calloc(), or has been initialized to all zeros, an alternative is:

   strncpy (buffer, name, sizeof (buffer) - 1);
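A minimal demo of the difference (the 8-byte buffer size is arbitrary):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[8];

    /* snprintf() writes at most sizeof(buf) - 1 characters and always
     * appends the terminating null. */
    snprintf(buf, sizeof(buf), "%s", "hello, world");
    printf("snprintf: \"%s\"\n", buf);   /* prints "hello, " */

    /* strncpy() fills the whole buffer and adds no null when the source
     * is too long, so we terminate explicitly. */
    strncpy(buf, "hello, world", sizeof(buf));
    buf[sizeof(buf) - 1] = '\0';
    printf("strncpy:  \"%s\"\n", buf);   /* prints "hello, " */

    return 0;
}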

Linux GPIO(2) – implement and access expanded GPIO

drivers/gpio/gpio-pca953x.c is a driver for the tca9539 and similar TI I2C I/O expanders.

In the function pca953x_probe(),

  • set up the list of callback functions (direction_output(), etc.) and other properties, such as base and ngpio.
  • register the gpio chip via gpiochip_add_data().
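A minimal sketch of these two steps (hypothetical names, not the actual pca953x code; a real driver also fills in direction_input(), get(), set(), etc.):

#include <linux/gpio/driver.h>
#include <linux/i2c.h>

static int my_expander_direction_output(struct gpio_chip *gc,
                                        unsigned int offset, int value)
{
    /* program the expander's direction/output registers over I2C here */
    return 0;
}

/* called from the driver's probe function */
static int my_expander_setup(struct i2c_client *client)
{
    struct gpio_chip *gc;

    gc = devm_kzalloc(&client->dev, sizeof(*gc), GFP_KERNEL);
    if (!gc)
        return -ENOMEM;

    gc->label = "my-expander";
    gc->parent = &client->dev;
    gc->base = -1;                /* let gpiolib allocate the base */
    gc->ngpio = 16;               /* tca9539 has 16 GPIO lines */
    gc->direction_output = my_expander_direction_output;

    /* register with gpiolib; the second argument is optional driver data */
    return gpiochip_add_data(gc, NULL);
}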

In the probe function of the max9286 driver,

  • read the device tree property pwdn-gpios = <&tca9539 2 0>:

sensor->pwdn_gpio = devm_gpiod_get_optional(&client->dev, "pwdn", GPIOD_IN);

  • power on max9286: gpiod_direction_output(sensor->pwdn_gpio, 1);

Note:

driver/gpio/gpiolib.c is the core of GPIO implementation.

  • A new GPIO chip registers itself by calling the function gpiochip_add_data();
  • GPIO calls from users, such as gpiod_get() and gpiod_direction_output(), go through here first; then the chip-registered callback functions are invoked.
  • devm_gpiod_xx() functions are resource-managed versions of the gpiod_xx() functions.

For more details, refer to General Purpose Input/Output (GPIO) — The Linux Kernel documentation

I2C(2) — QNX implementation

There is a resource manager interface for applications to access an I2C master. The resource manager layer registers a device name (usually /dev/i2c0).

Interface between applications and the resmgr: devctl() commands, such as

  • DCMD_I2C_SEND executes a master send transaction and returns when the transaction is complete.
  • DCMD_I2C_RECV executes a master receive transaction and returns when the transaction is complete.
  • DCMD_I2C_SENDRECV executes a send followed by a receive. This sequence is typically used to read a slave device’s register value. When multiple applications access the same slave device, this sequence must be executed atomically to prevent register reads from being interrupted.
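To use these commands, the application first opens the device node registered by the resmgr (a minimal sketch; the device name depends on how the resmgr was started):

#include <fcntl.h>
#include <devctl.h>
#include <hw/i2c.h>

int fd = open("/dev/i2c0", O_RDWR);
if (fd < 0) {
    /* handle the error */
}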

A typical one-byte write function in an application:

struct send_recv
{
    i2c_send_t hdr;   /* message header */
    uint8_t buf[2];   /* register address + data byte */
} i2c_data;

int error = EOK;

i2c_data.buf[0] = reg_addr;                 /* register to write */
i2c_data.buf[1] = data;                     /* value to write */
i2c_data.hdr.len = 2;                       /* payload length in bytes */
i2c_data.hdr.slave.addr = slave_addr;
i2c_data.hdr.slave.fmt = I2C_ADDRFMT_7BIT;
i2c_data.hdr.stop = 1;                      /* issue a STOP when done */

error = devctl(fd, DCMD_I2C_SEND, &i2c_data, sizeof(i2c_data), NULL);

A typical one-byte read function in an application:

struct send_recv
{
    i2c_sendrecv_t hdr;   /* message header */
    uint8_t buf[2];       /* out: register address; in: data read back */
} i2c_data;

int error = EOK;

i2c_data.buf[0] = reg_addr;                 /* register to read */
i2c_data.hdr.send_len = 1;                  /* send one byte (the register address) */
i2c_data.hdr.recv_len = 1;                  /* then receive one byte */
i2c_data.hdr.slave.addr = dev_addr;
i2c_data.hdr.slave.fmt = I2C_ADDRFMT_7BIT;
i2c_data.hdr.stop = 1;

error = devctl(fd, DCMD_I2C_SENDRECV, &i2c_data, sizeof(i2c_data), NULL);
/* after the call, the received byte is in i2c_data.buf[0] */

Interface between resource manager and the hardware driver: a list of functions.

  • upon receiving DCMD_I2C_SEND, the resmgr calls the driver’s .send() function.
  • upon receiving DCMD_I2C_SENDRECV, the resmgr calls .send() first, then .recv().

The I2C hardware driver implements this list of functions. The implementation is hardware dependent. The description below is an example for a specific I2C controller.

  • .send function
    • send the slave address to the bus via a few out32() calls.
    • wait for ACK (could be an interrupt).
    • send the register address (buf[0]) to the bus via out32() calls.
    • wait for ACK (could be an interrupt).
    • send each data byte (buf[1], buf[2], …) to the bus via out32() calls.
    • wait for ACK (could be an interrupt).
  • .recv function
    • send the slave address to the bus via a few out32() calls.
    • wait for ACK (could be an interrupt).
    • initiate a READ via out32() for each byte requested.
    • wait for the READ to complete.

More about the implementation

  • The resmgr is single-threaded, and each DCMD is handled synchronously (it does not return until the requested transaction is complete). As a result,
    • when multiple applications access different I2C slaves at the same time, there is no issue — the DCMD commands are handled one by one, without being interrupted by others.
    • when multiple applications access the same I2C slave at the same time, although each individual command completes without interruption, there can be race conditions. This needs to be avoided.
  • The existing resmgr and I2C drivers only work in a single-threaded resmgr environment. If multiple threads handle DCMD commands, it will be a total mess.
    • DCMD_I2C_SEND/RECV leads to a send or receive transaction. Although the transaction at the physical level can be considered “atomic”, the implementation in the resmgr and hardware driver is not atomic at all: there are quite a few lines of code and several out32() calls, and, most critically, there is no mutex or other protection in either the resmgr or the driver.
  • If there are multiple masters, a separate instance of the resource manager should be run for each master.
    • the option ‘c’ appears intended for this purpose: it specifies the number of threads, and once specified, each thread starts a resource manager.
    • in this case, although arbitration is supported at the physical layer, some synchronization is needed between the different masters/resmgrs.

Linux GPIO(1) – device tree

There are 2 types of GPIOs: system GPIOs, and expanded GPIOs behind an I2C I/O expander. Although they use different hardware components, and hence the low-level drivers are different, Linux provides the same interfaces to access them.

A popular way to map GPIOs to devices and functions is via the device tree. Below is an example of expanded GPIO in the device tree. There are 2 devices attached to the I2C1 bus: tca9539 and max9286.

i2c1: i2c@400a0000 {
	tca9539: gpio@74 {
		compatible = "ti,tca9539";
		reg = <0x74>;
		gpio-controller;
		#gpio-cells = <2>;
	};

	max9286: max9286@48 {
		compatible = "max,max9286";
		reg = <0x48>;
		pwdn-gpios = <&tca9539 2 0>;
	};
};

max9286@48 is a node, named using the convention node-name@unit-address. The node-name describes the general class of device; the unit-address must match the first address specified in the reg property of the node, if the node has a reg property.

The compatible property is used for device driver selection. The recommended format is “manufacturer,model”.

pwdn-gpios is a property that describes a GPIO pin. The name follows the convention <function>-gpios. Note that <function>-gpio is accepted as well; however, it has been deprecated and is mainly kept for compatibility reasons.

In <&tca9539 2 0>, tca9539 is the GPIO chip, 2 is the GPIO pin, and 0 indicates GPIO_ACTIVE_HIGH.

In the max9286 driver,

static const struct of_device_id max9286_dt_ids[] = {
	{.compatible = "max,max9286"},
	{ /* sentinel */ }
};
static struct i2c_driver max9286_i2c_driver = {
	.driver = {
		   .name = "max9286",
		   .of_match_table = max9286_dt_ids,
		   },
	.id_table = max9286_id,
	.probe = max9286_probe,
	.remove = max9286_remove,
};

The probe function is called when the device matches an entry in the of_match_table (via the compatible string) or in the id_table (via the device name).

For the implementation of the expanded GPIO in Linux, see the post Linux GPIO(2) – implement and access expanded GPIO.

Deinterlace

Background

Interlacing was first introduced in analog television systems, where a full frame is transmitted as 2 interlaced fields, one containing the odd lines and the other containing the even lines.

Since the analog television system was widely used for quite a long time, lots of other media devices use it as a standard to generate video, e.g. DVDs and cameras.

CRT-based displays were able to display interlaced video correctly due to their completely analog nature. However, these days displays are going digital and getting bigger, and displaying interlaced video directly on them leads to noticeable visual defects. This is why the deinterlacing process is needed.

There is one thing worth mentioning: the fields are captured/filmed at a rate twice the nominal frame rate. This means that two consecutive fields actually contain images taken at different points in time. Keeping this in mind helps in understanding the artifacts of WEAVE deinterlacing, which we will mention later.

Available methods

1. WEAVE

As the name indicates, this method combines two consecutive fields by storing one field in the odd lines and the other in the even lines. The resulting frame rate is usually halved.

This is ideal for static images, or when there are barely any changes between fields. When there is movement between fields, the deinterlaced frames will show jaggies or combing artifacts.

Note: By weaving a field with the one before it to produce one frame, then weaving the same field with the one after it, a frame rate equal to the field rate can be achieved (NTSC 60 fps, PAL 50 fps). This usually requires some extra DMA transfers.
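A minimal sketch of WEAVE for an 8-bit grayscale stream (assuming the top field carries the even frame lines, counting from 0):

#include <stdint.h>
#include <string.h>

void weave(const uint8_t *top_field, const uint8_t *bottom_field,
           uint8_t *frame, int width, int frame_height)
{
    for (int y = 0; y < frame_height / 2; y++) {
        /* top field supplies the even lines, bottom field the odd lines */
        memcpy(frame + (2 * y)     * width, top_field    + y * width, width);
        memcpy(frame + (2 * y + 1) * width, bottom_field + y * width, width);
    }
}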

2. BOB (or line doubler)

This method takes the lines of each field and doubles them. A newly added line can be a direct copy of the original line, or an average of the two adjacent lines. This method keeps the original frame rate.

It prevents combing artifacts and maintains smooth motion, but can cause a noticeable reduction in picture quality from the loss of vertical resolution. If the original input contains stationary objects, they can appear to bob up and down as the odd and even lines alternate.
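A minimal sketch of BOB using plain line duplication (8-bit grayscale assumed; an averaging variant would interpolate between adjacent field lines instead):

#include <stdint.h>
#include <string.h>

void bob(const uint8_t *field, uint8_t *frame,
         int width, int frame_height, int is_top_field)
{
    for (int y = 0; y < frame_height / 2; y++) {
        const uint8_t *line = field + y * width;
        /* place the field line at its original position ... */
        memcpy(frame + (2 * y + (is_top_field ? 0 : 1)) * width, line, width);
        /* ... and duplicate it into the missing neighbor line */
        memcpy(frame + (2 * y + (is_top_field ? 1 : 0)) * width, line, width);
    }
}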

3. Motion Adaptive deinterlace

It tries to predict the direction and the amount of image motion between consecutive fields. The concept is similar to the block motion compensation used in video compression standards (MPEG-4, etc.).

It is usually done by a dedicated hardware module, which takes 3 fields as input and outputs a deinterlaced frame. Usually the resulting frame rate is identical to that of the original stream.

This achieves the best quality when there is movement between fields.

One use case of fixed-point number — image scaling

One of the image processing procedures is scaling — resizing the original picture to be bigger or smaller. If this is done by a hardware block, the software usually needs to tell the hardware about the resizing details, in one of two ways:

  1. Provide the original picture size and the scaled size only. This is easy for the software, while the hardware needs to figure out by itself which filter (if there are a few) it needs to use in order to achieve the goal.
  2. Provide more information than just the original size and the scaled size. For instance, provide the scaling ratios for each filter that might be involved. In this case, the software usually needs to provide a fixed-point scaling ratio value, which will be stored in a register.

The background of this scenario

  • The hardware performs only down-scaling, up to a total scaling ratio of 16 (decimation by 8 combined with bilinear by 2).
  • There is a decimation filter available, which can scale by a ratio of 2, 4, or 8.
  • A bilinear filter can handle a maximum scaling ratio of 2.

The hardware provides 2 register fields per direction (horizontal, vertical) for the software to set the scaling ratio.

1) DEC_RATIO, a 2-bit field which controls the pre-decimation filter.

00b – Pre-decimation filter is disabled.
01b – Decimate by 2
10b – Decimate by 4
11b – Decimate by 8

2) SCALE_FACTOR, a 14-bit fixed-point field with 12 fractional bits. This scale factor is provided to the bilinear filter.

How does it work

Assume the original size is 1280 and we need to scale it down to 256 or 284.

(1) For 256

The scaling ratio is 1280/256 = 5.

We set DEC_RATIO to decimate by 4 (field value 10b).

We set SCALE_FACTOR to 5/4 = 1.25.

(2) For 284

The scaling ratio is 1280/284 = 4.50704…

We set DEC_RATIO to decimate by 4 (field value 10b).

We set SCALE_FACTOR to 4.50704/4 = 1.12676.

How do we program it

uint32_t width;

uint32_t scaling_width;

(1) DEC_RATIO

uint32_t dec_ratio;

dec_ratio = width / scaling_width; // integer division; we also need to validate that the ratio is < 16

Then we need to find a way to map dec_ratio to the DEC_RATIO field. This can be done with some math, or simply with a lookup array, since there are not many values involved.

static const uint8_t dec_array[] = {0, 0, 1, 1, 2, 2, 2, 2, 3};

if (dec_ratio > 8) {
    dec_ratio = 8;
}

Then write dec_array[dec_ratio] to the DEC_RATIO field.

(2) SCALE_FACTOR

(2.1) If the hardware doesn’t have a specific requirement about rounding, we simply calculate scale_factor as below.

uint32_t scale_factor;

scale_factor = (width * (1 << 12)) / (scaling_width * (1 << dec_array[dec_ratio]));

// Note: the calculation above avoids floating-point math, so it is faster than
// the calculation below, and the result is the same.

scale_factor = (uint32_t)((float) width / (scaling_width * (1 << dec_array[dec_ratio])) * (1 << 12));

(2.2) If the hardware wants the number rounded up to the nearest higher integer, slightly more complicated math is needed. We can either simply use the ceil() function, like

float scale_factor;

scale_factor = (float) width / (scaling_width * (1 << dec_array[dec_ratio])) * (1 << 12);

scale_factor = ceil(scale_factor);

then set (uint32_t)scale_factor to the SCALE_FACTOR field.

Or we can use fixed-point math to achieve the same result, with floating-point math avoided, as below.

#define NUM_EXTRA_BITS 4

uint32_t scale_factor;

scale_factor = (width * (1 << (12 + NUM_EXTRA_BITS))) / (scaling_width * (1 << dec_array[dec_ratio]));

scale_factor = (scale_factor + (1 << NUM_EXTRA_BITS) - 1) >> NUM_EXTRA_BITS;
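
Putting the steps together, here is a self-contained sketch for the 1280-to-284 example (writing the actual register fields is omitted):

#include <stdint.h>
#include <stdio.h>

static const uint8_t dec_array[] = {0, 0, 1, 1, 2, 2, 2, 2, 3};

int main(void)
{
    uint32_t width = 1280, scaling_width = 284;

    uint32_t dec_ratio = width / scaling_width;   /* 4 */
    if (dec_ratio > 8)
        dec_ratio = 8;
    uint32_t dec_field = dec_array[dec_ratio];    /* 2, i.e. decimate by 4 */

    uint32_t scale_factor =
        (width * (1u << 12)) / (scaling_width * (1u << dec_field));

    /* prints: DEC_RATIO = 2, SCALE_FACTOR = 4615 (1.12671) */
    printf("DEC_RATIO = %u, SCALE_FACTOR = %u (%.5f)\n",
           dec_field, scale_factor, scale_factor / 4096.0);
    return 0;
}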


Conversion between fixed-point and floating-point

Let’s assume we have a 16-bit fixed-point number with 12 fractional bits, and a float number.

uint16_t fixed_x;

float floating_y;  // note that the float type uses 32-bit storage; double uses 64 bits.

To convert from fixed-point to floating-point, we divide the fixed-point number by 2^fractional_bits:

floating_y = (float) fixed_x / (float) (1 << 12);

Note that we need explicit type casting.

To convert from floating-point to fixed-point, we multiply the floating-point number by 2^fractional_bits and convert the result to an integer:

fixed_x = (uint16_t) (floating_y * (1 << 12));

Note: the formula above rounds downwards — the fractional part is truncated. If a round-up is required, we need a bit more complicated math.
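For instance (a sketch; positive values assumed), adding 0.5 before the cast gives round-to-nearest, while ceilf() gives round-up:

#include <math.h>   /* for ceilf() */

/* round to nearest */
fixed_x = (uint16_t) (floating_y * (1 << 12) + 0.5f);

/* round up */
fixed_x = (uint16_t) ceilf(floating_y * (1 << 12));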

Here is an example.

I have a float number 1.126760 and need to convert it to a fixed-point number, in order to store it in a register field which has 14 bits in total, with 12 fractional bits.

1.126760 * 4096 = 4615.20896

Round it down to 4615

The binary representation of 4615 is: 0001 0010 0000 0111. Therefore, after we write 4615 to the register, the binary content of that register field will be 01 0010 0000 0111.

Note that in this example,

(1) Since there are only 2 integral bits, we can’t store 4 (or a bigger value) into this register field.

(2) Since a 12-bit binary number is equivalent to about a 4-digit decimal number — remember 2^12, or (1 << 12), is 4096 — 12 fractional bits can preserve about 4 decimal fractional digits, from the precision perspective.

Image format and transparency

There are many existing image formats. Here we focus on JPG, GIF, PNG and BMP.

A quick explanation about “color depth”: color depth, bit depth, or pixel depth refers to the number of bits per pixel used to represent a specific color.

For 8-bit color depth, the total number of colors is 2^8 = 256.

For 24-bit color depth (usually 8-bit red, 8-bit green, 8-bit blue), the total number of colors is 2^24.

For 32-bit color depth (8-bit red, 8-bit green, 8-bit blue, 8-bit alpha channel), the total number of colors is still 2^24, with the extra 8 bits representing the transparency of a color.

1. JPEG (Joint Photographic Experts Group)

JPEG doesn’t support transparency — no alpha channel.

A couple of side notes:

(1) JPEG uses lossy compression, which means encoding always causes a loss in quality.

(2) JPEG 2000 was developed by the same committee as JPEG. However, they are totally different formats. JPEG 2000 provides both lossy and lossless compression, and supports transparency.

2. GIF (Graphics Interchange Format)

GIF supports 1-color, or 1-bit, transparency. While this sounds confusing, it is easier to understand once we know how GIF stores transparency information.

There is a block called the Graphics Control Extension (starting with 0x21 0xF9) which describes the chunk of image data that follows it. Inside the Graphics Control Extension, there is one bit called the “Transparent Color Flag”, and one byte which holds the “transparent color index”.

The 1-bit Transparent Color Flag tells the decoder whether there is transparency. Hence a color is either fully opaque or fully transparent; there are no values in between.

The 1-byte transparent color index tells the decoder which color the transparency will be applied to.

Note that there can be several sub-blocks of image data in one file, so several colors can be transparent, as long as they are located in different sub-blocks.
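As a sketch, a decoder could pick the transparency information out of a Graphics Control Extension like this (gce_transparent_index is a hypothetical helper; block layout: 0x21, 0xF9, block size 0x04, packed fields, 2-byte delay, transparent color index, block terminator):

#include <stdint.h>

/* returns the transparent color index, or -1 if none; gce points at the 0x21 byte */
int gce_transparent_index(const uint8_t *gce)
{
    if (gce[0] != 0x21 || gce[1] != 0xF9)
        return -1;               /* not a graphics control extension */
    if ((gce[3] & 0x01) == 0)    /* bit 0 of the packed byte: Transparent Color Flag */
        return -1;               /* this sub-block has no transparency */
    return gce[6];               /* the transparent color index */
}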

A few more things to know about GIF:

(1) GIF is palette-based, which means the colors used in an image have their RGB values defined in a palette table that can hold up to 256 entries. GIF uses 8-bit color depth, which means it can use 256 colors at most.

(2) GIF uses lossless compression.

(3) GIF file format

  • header: GIF87a or GIF89a
  • logical screen descriptor, which specifies the size of the image and whether a global color table is present.
  • global color table (optional).
  • sub-blocks
    • graphics control extension (starting with 0x21 0xF9)
    • image descriptor (starting with 0x2C)
    • local color table (optional?)
    • image data

Reference: http://giflib.sourceforge.net/whatsinagif/animation_and_transparency.html

3. PNG (Portable Network Graphics)

PNG supports full transparency, and PNG compression is lossless.

4. BMP (Bitmap Image file)

BMP was defined by Microsoft. It can use various color depths, and optionally data compression, an alpha channel, and color profiles.

In the BMP header,

(1) Offset 1Ch tells the number of bits per pixel; it can be 1, 2, 4, 8, 16, 24, or 32.

(2) Offset 1Eh tells the compression method being used. Note that this field indicates more than just the compression method. Popular values for this field:

  • BI_RGB (0), means no compression.
  • BI_BITFIELDS, which basically indicates RGBA; no pixel array compression is used.
  • BI_JPEG
  • BI_PNG

(3) If BI_BITFIELDS is used, offsets 36h ~ 42h indicate the bit masks for each channel (R, G, B, A).

(4) Offsets 0x12 and 0x16 indicate the width and height of the picture, in pixels. Note that both values are signed numbers.

A few tricks

(1) If you specify the RGBA format (32-bit color depth, BI_BITFIELDS) but don’t specify the bit mask for the alpha channel, the decoder will discard the alpha values — the picture will be fully opaque (all the alpha values in the decoded image will show as 255).

(2) The pixel data in a BMP file runs from left to right and from bottom to top, by default. As a result, if you write the height of the image at offset 0x16 and copy your (top-down) raw image into a BMP file, you will see an upside-down image when you open it with an image viewer. Supplying a negative height value at offset 0x16 will get it corrected!
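A tiny sketch of trick (2) (bmp_set_topdown_height is a hypothetical helper; assumes a little-endian host and a BITMAPINFOHEADER-style header):

#include <stdint.h>
#include <string.h>

/* store a negative height at offset 0x16 so top-down raw data shows upright */
void bmp_set_topdown_height(uint8_t *bmp, int32_t height)
{
    int32_t neg_height = -height;
    memcpy(bmp + 0x16, &neg_height, sizeof(neg_height));
}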

Alpha channel

The alpha channel is a color component of the RGBA color model. While the R (red), G (green) and B (blue) components represent the color of a pixel, the alpha channel represents the degree of transparency (or opacity) of that color. It is used to determine how a pixel is rendered when blended with another.

When a color (the source) is blended with another color (the background), e.g. when an image is overlaid onto another image, the alpha value of the source color is used to determine the resulting color. If the alpha value is fully opaque, the source color overwrites the background color; if fully transparent, the source color is invisible, allowing the background color to show through. If the value is in between, the result is a weighted mix of the two, which creates a translucent effect.
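This blend can be expressed per channel (a minimal sketch with 8-bit channels and straight alpha; blend_channel is a hypothetical helper):

#include <stdint.h>

/* result = src * alpha + dst * (1 - alpha), with alpha in [0, 255] */
static inline uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t) ((src * alpha + dst * (255 - alpha)) / 255);
}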

The alpha channel is used primarily in alpha blending and alpha compositing.

see: https://www.techopedia.com/definition/1945/alpha-channel