Applications of Linear Algebra: A Video Screen is One Type of Matrix

A video screen is made up of tiny individual pixels (picture elements), each of which displays a specific color. On a computer monitor, the resolution of the screen is usually some size like 1024 pixels wide by 768 pixels high, or perhaps 800x600 or 640x480. On a television monitor (and in most conventional video images), the resolution of the screen is approximately 640x480, and computers typically treat it as exactly that. Notice that in all of these cases the so-called aspect ratio of width to height is 4:3. In the wider DV format, the aspect ratio is 3:2, and the image is generally 720x480 pixels. High-Definition Television (HDTV) specifies yet another aspect ratio, 16:9.
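The aspect ratios quoted above can be checked by reducing each width:height pair to lowest terms. The following is a minimal sketch; the helper name `aspect_ratio` is made up for illustration and is not part of any video library.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms (hypothetical helper)."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


# Pixel dimensions mentioned in the text above.
for width, height in [(1024, 768), (800, 600), (640, 480), (720, 480)]:
    w, h = aspect_ratio(width, height)
    print(f"{width}x{height} -> {w}:{h}")
```

The first three sizes reduce to 4:3 and the DV size reduces to 3:2.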

Relative sizes of different common pixel dimensions

A single frame of standard video (i.e., a single video image at a given moment) is composed of 640x480 = 307,200 pixels. Each pixel displays a color. In order to represent the color of each pixel numerically, with enough variety to satisfy our eyes, we need a very large range of different possible color values.

There are many different ways to represent colors digitally. A standard way to describe the color of each pixel in computers is to break the color down into its three different color components (red, green, and blue, a.k.a. RGB) and an additional transparency/opacity component (known as the alpha channel). Most computer programs therefore store the color of a single pixel as four separate numbers, representing the alpha, red, green, and blue components (or channels). This four-channel color representation scheme is commonly called ARGB or RGBA, depending upon how the pixels are arranged in memory.
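To make the difference between the two channel orders concrete, here is a rough sketch that packs the four channels into a single 32-bit integer. It assumes 8 bits per channel and the common convention of putting the first-named channel in the highest byte; the function names are illustrative only, and real programs may lay pixels out differently.

```python
def pack_argb(a, r, g, b):
    """Pack four 8-bit channels as 0xAARRGGBB (assumed layout)."""
    return (a << 24) | (r << 16) | (g << 8) | b


def pack_rgba(r, g, b, a):
    """Pack four 8-bit channels as 0xRRGGBBAA (assumed layout)."""
    return (r << 24) | (g << 16) | (b << 8) | a


def unpack_argb(value):
    """Split a 0xAARRGGBB integer back into (alpha, red, green, blue)."""
    return (
        (value >> 24) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 8) & 0xFF,
        value & 0xFF,
    )


# The same fully opaque red pixel in both channel orders.
opaque_red_argb = pack_argb(255, 255, 0, 0)
opaque_red_rgba = pack_rgba(255, 0, 0, 255)
print(f"{opaque_red_argb:08X}")  # FFFF0000
print(f"{opaque_red_rgba:08X}")  # FF0000FF
```

The four numbers are identical in both cases; only their position in memory changes.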

Jitter is no exception in this regard. In order for each cell of a matrix to represent one color pixel, each cell actually has to hold four numerical values (alpha, red, green, and blue), not just one. So, a matrix that stores the data for a frame of video will actually contain four values in each cell.

Each cell of a matrix may contain more than one number.

A frame of video is thus represented in Jitter as a two-dimensional matrix, with each cell representing a pixel of the frame, and each cell containing four values representing alpha, red, green, and blue on a scale from 0 to 255. In order to keep this concept of multiple-numbers-per-cell (which is essential for digital video) separate from the concept of dimensions in a matrix, Jitter introduces the idea of planes.
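As a plain-Python illustration of this idea (not Jitter's own data structure), a frame can be modeled as a grid of rows and columns in which every cell is a small list of four values. The names `make_frame`, `WIDTH`, `HEIGHT`, and `PLANES` are assumptions made for this sketch.

```python
WIDTH, HEIGHT, PLANES = 640, 480, 4


def make_frame(width, height, fill=(255, 0, 0, 0)):
    """Return a height x width grid whose cells are [alpha, red, green, blue].

    Each cell is a fresh list, so changing one cell never affects another.
    """
    return [[list(fill) for _ in range(width)] for _ in range(height)]


frame = make_frame(WIDTH, HEIGHT)      # opaque black frame
frame[10][20] = [255, 255, 128, 0]     # row 10, column 20 becomes opaque orange

cell_count = WIDTH * HEIGHT            # one cell per pixel
value_count = cell_count * PLANES      # four numbers per cell
print(cell_count)   # 307200
print(value_count)  # 1228800
```

With one byte per value, a single 640x480 frame therefore occupies 1,228,800 bytes.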

What is a Plane?

When allocating memory for the numbers in a matrix, Jitter needs to know the extent of each dimension—for example, 320x240—and also the number of values to be held in each cell. In order to keep track of the different values in a cell, Jitter uses the idea of each one existing on a separate plane. Each of the values in a cell exists on a particular plane, so we can think of a video frame as being a two-dimensional matrix of four interleaved planes of data.
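One way to picture "interleaved planes" is a flat buffer in which the four values of a cell sit next to each other, followed by the four values of the next cell. The sketch below assumes one byte per value and no padding between rows; it illustrates the concept and is not a description of Jitter's actual memory layout.

```python
WIDTH, HEIGHT, PLANECOUNT = 320, 240, 4

# One byte per value, zero-initialized.
buffer = bytearray(WIDTH * HEIGHT * PLANECOUNT)


def offset(x, y, plane):
    """Index of one value: cells are stored row by row, planes interleaved."""
    return (y * WIDTH + x) * PLANECOUNT + plane


# Set the plane-1 value of the cell at column 5, row 2.
buffer[offset(5, 2, 1)] = 200

# Read back all four values of that cell as one contiguous slice.
cell_start = offset(5, 2, 0)
cell = list(buffer[cell_start:cell_start + PLANECOUNT])
print(cell)  # [0, 200, 0, 0]
```

Knowing only the dimension extents and the number of planes is enough to size the buffer and to locate any value in it.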

The values in each cell of this matrix can be thought of as existing on four virtual planes.

Using this conceptual framework, we can treat each plane (and thus each channel of the color information) individually when we need to. For example, if we want to increase the redness of an image, we can simply increase all the values in the red plane of the matrix, and leave the others unchanged.
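The redness example can be sketched in plain Python by adding a constant to a single plane and leaving the others alone. The helper `add_to_plane` is hypothetical; the sketch assumes values on a 0 to 255 scale, with the red channel at index 1 of each cell, and clamps results so they stay in range.

```python
RED_PLANE = 1  # assumed position of red within each [alpha, red, green, blue] cell


def add_to_plane(frame, plane, amount):
    """Return a copy of frame with amount added to one plane, clamped to 0..255."""
    return [
        [
            [
                min(255, max(0, value + amount)) if index == plane else value
                for index, value in enumerate(cell)
            ]
            for cell in row
        ]
        for row in frame
    ]


# A tiny 1x2 frame: each cell is [alpha, red, green, blue].
frame = [
    [[255, 100, 50, 25], [255, 240, 10, 10]],
]
redder = add_to_plane(frame, RED_PLANE, 40)
print(redder)  # [[[255, 140, 50, 25], [255, 255, 10, 10]]]
```

The second pixel's red value stops at 255 rather than wrapping around, and alpha, green, and blue are unchanged in both pixels.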

The normal case for representing video in Jitter is to have a 2D matrix with four planes of data—alpha, red, green, and blue. The planes are numbered from 0 to 3, so the alpha channel is in plane 0, and the RGB channels are in planes 1, 2, and 3.
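Using that numbering, pulling one channel out of a four-plane frame gives an ordinary single-valued 2D matrix. The following sketch uses plain nested lists and a hypothetical `extract_plane` helper; it only illustrates the plane numbering and is not a Jitter object.

```python
# Plane numbers as described above: alpha is 0, then red, green, blue.
ALPHA, RED, GREEN, BLUE = range(4)


def extract_plane(frame, plane):
    """Return a 2D matrix holding only the chosen plane of each cell."""
    return [[cell[plane] for cell in row] for row in frame]


# A tiny 2x2 frame: each cell is [alpha, red, green, blue].
frame = [
    [[255, 10, 20, 30], [255, 40, 50, 60]],
    [[128, 70, 80, 90], [0, 100, 110, 120]],
]
print(extract_plane(frame, ALPHA))  # [[255, 255], [128, 0]]
print(extract_plane(frame, GREEN))  # [[20, 50], [80, 110]]
```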

Reference: http://www.cycling74.com/docs/max5/tutorials/jit-tut/jitterwhatisamatrix.html


Wednesday, April 27, 2011
