
The MPEG Evolution

PCQ Bureau

MPEG is everywhere: on your VCDs, DVDs and the Internet. This journey, which started in 1991 with the release of MPEG-1, has been a long and exciting one, and promises a lot more in the future.


The Moving Picture Experts Group (MPEG) is the organization behind all the work done on the MPEG standards. Its first work, titled ‘Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s’, formed the basis of what is now known as the MPEG-1 standard. After the success of MPEG-1, the group has worked to produce even better and more efficient successors.

Behind the scenes



To understand the motivation behind all the work, consider NTSC-quality digital video as an example: 352 by 240 pixels at 30 frames/sec and 24-bit pixel depth. Without any compression, it needs more than 60 Mbps of bandwidth to transport all its data, which, by any standards, is enormous. A more practical approach is to compress this data so that it can be transported over much lower bandwidth, yet be broadcast in real time. Thus emerged the MPEG-1 standard, which uses a mere 1.5 Mbps of bandwidth to broadcast live audio/video, and can be accommodated on CD-ROMs to create Video CDs. The audio compression techniques in MPEG-1 follow three schemes, termed Layers 1, 2 and 3. MPEG-1 Audio Layer 3 has been the most widely adopted, and is today more commonly known as MP3.
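To see where these numbers come from, here is the raw arithmetic as a quick sketch in Python (the resolution and frame rate are those quoted above):

```python
# Raw bandwidth of uncompressed 352x240 video at 30 frames/sec, 24 bits/pixel
width, height = 352, 240
frames_per_sec = 30
bits_per_pixel = 24

raw_bps = width * height * frames_per_sec * bits_per_pixel
print(f"Uncompressed: {raw_bps / 1e6:.1f} Mbps")          # about 60.8 Mbps

mpeg1_bps = 1.5e6                                         # MPEG-1 target bit rate
print(f"Compression needed: {raw_bps / mpeg1_bps:.0f}x")  # roughly 41x
```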

After MPEG-1, work started on an even better standard offering higher quality and bit rates. The work was finally released as the MPEG-2 standard in 1994. It is similar to MPEG-1 as far as encoding goes, but supports higher bit rates and thus higher (read broadcast) quality video. It is, therefore, used in DVDs and digital television broadcasts. What naturally followed from the success of MPEG-2 was an effort to bring out the next big standard, called MPEG-3. The objective of this standard was to cover HDTV, which would need bit rates of the order of 20 to 40 Mbps. It was, however, later discovered that MPEG-2 could be tweaked to fulfil the HDTV objective, so work on the MPEG-3 standard was abandoned.


The work didn’t end there. The next big standard, MPEG-4, also called ‘Coding of Audio-visual objects’, was standardized in 1998. It differs from the earlier versions in that it enables coding of individual objects. It is no longer necessary to think of an image as a series of rectangular blocks; the blocks can be of any arbitrary shape, so each block can represent an individual real-life object, such as a person or a ball, which could not be accurately described by a rectangle. This makes recording changes to that particular object much simpler, as the encoder is restricted to the object itself and not its surroundings, which might be the case with rectangular blocks. One of the major aims of the MPEG-4 standard is to deliver high-quality digital content using as little bandwidth as possible. If it delivers what it promises, it will solve the speed-versus-quality trade-off faced by digital content consumers today. Products incorporating MPEG-4 are starting to roll out, and the recently released QuickTime 6 leads the pack. We can expect MPEG-4 to really take off by early next year.

As the amount of digital audio/video grows, so do the difficulties in archiving, searching and retrieving the required information. Searching audio or video is not as easy as searching text, and little progress had been made in this regard until MPEG-7 was conceived in 1997. Formally called ‘Multimedia Content Description Interface’, it does not describe any new coding/compression techniques; instead it defines a standard way to store information about digital content so as to make it searchable. MPEG-7 can be thought of as a way to store meta-information, ie, information about information. It is, therefore, designed to complement MPEG-4 and its predecessors, not replace them. Work on this standard is still continuing, and it will be some time before we see products implementing it.

The big picture



Work on MPEG-21, or ‘Multimedia Framework’, started in 2000 to define a big picture of the whole multimedia environment. It aims to describe a multimedia framework where interoperability is the key: the consumer can use the content without worrying about media formats, codecs and the like. It is a very ambitious attempt and divides the multimedia world into four categories: ‘Users’ (anybody on the network) accessing ‘Digital Items’ (the content itself) and executing on them ‘Actions’ that generate other digital items as part of a ‘Transaction’. MPEG-21 also aims to address the issue of content protection and licensing by implementing techniques that uniquely identify any digital content globally. Work on this standard is still in its infancy, and it will be quite a while before it fulfils its exciting promises.


Kunal Dua

How MPEG works

The compression techniques used in the MPEG-1 standard can broadly be classified into two categories: Intra Frame and Non-Intra Frame. The Intra Frame coding techniques restrict themselves to compressing information contained within a particular frame. Non-Intra Frame techniques, on the other hand, also take into account information from adjacent frames during compression. Intra Frame coding starts with Video Filtering, which involves transformation of the standard RGB (Red, Green, Blue) signals to the YCrCb format.
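As a rough illustration, the RGB-to-YCrCb conversion can be written as follows. This is a minimal sketch using the common ITU-R BT.601 coefficients; actual encoders use carefully scaled integer variants of these formulas.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert 8-bit RGB values to Y, Cr, Cb (ITU-R BT.601 coefficients).
    A simplified, full-range sketch; real codecs use scaled integer math."""
    y  =  0.299 * r + 0.587 * g + 0.114 * b          # luminance
    cr =  0.500 * r - 0.419 * g - 0.081 * b + 128    # red-difference chroma
    cb = -0.169 * r - 0.331 * g + 0.500 * b + 128    # blue-difference chroma
    return y, cr, cb

print(rgb_to_ycrcb(255, 0, 0))   # pure red: strong Cr, modest Y
```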


Studies have shown that the human eye is more sensitive to changes in the luminance (Y) component than in the chrominance (CrCb) components. So compression is achieved by discarding some of the information stored in the CrCb components. This is called downsampling, and is carried out by averaging out the pixel values in the chrominance components in such a way that a single value is shared by multiple pixels.
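A minimal sketch of this averaging, assuming 2x2 blocks of a chrominance channel share one value (the 4:2:0-style subsampling MPEG-1 uses):

```python
import numpy as np

def downsample_chroma(channel):
    """Average each 2x2 block of a chrominance channel into a single value,
    so four pixels share one chroma sample. A minimal sketch."""
    h, w = channel.shape
    return channel.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

cb = np.arange(16, dtype=float).reshape(4, 4)
print(downsample_chroma(cb))   # 4x4 channel reduced to 2x2 shared values
```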

The next step is the DCT (Discrete Cosine Transform). In general, adjacent pixels in a frame tend to be similar; pixels representing a wall, for instance, will generally be of the same colour. So each image is broken up into blocks of 8 by 8 pixels, a Fourier-related transform is applied to each of these blocks, and the resultant values are obtained. These values are then divided by corresponding values from a quantization matrix. The goal of this exercise is to reduce as many values as possible to zero, within the boundaries of the prescribed bit-rate and video-quality parameters, to achieve maximum compression using Huffman encoding and run-length encoding. (The details of all these techniques would be more at home in a mathematical journal and are thus skipped.)
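The following sketch shows the idea on a single 8x8 block. The flat quantization matrix here is purely illustrative; the real MPEG-1 matrices weight the divisors perceptually and scale them with the target bit rate.

```python
import numpy as np
from scipy.fft import dctn

# An 8x8 block with a gentle left-to-right brightness ramp (a lit wall, say)
block = np.tile(np.arange(8) * 4.0, (8, 1)) + 128.0

coeffs = dctn(block - 128.0, norm='ortho')   # 2-D Discrete Cosine Transform

quant = np.full((8, 8), 16.0)    # illustrative flat matrix; MPEG-1's real
                                 # matrices are perceptually weighted
quantized = np.round(coeffs / quant).astype(int)

# Only a handful of the 64 coefficients survive quantization; the zeros
# are what run-length and Huffman encoding then compress away.
print(np.count_nonzero(quantized), "non-zero coefficients out of 64")
```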

Non-Intra Frame coding techniques are based on the knowledge that most frames are similar to the ones preceding and succeeding them. This means that most frames can be transmitted as differences from their neighbours, which in turn means that a lot less information has to be transferred.

The first frame is (obviously) transferred as it is. This type of frame is self-sufficient and is called an I (Intra) frame. Each subsequent frame can be another I-frame (no relation to the preceding frame, in case the changes are too many and starting afresh would be better), a P (Predicted) frame, which depends on the preceding frame, or a B (Bi-directional) frame, which depends on both the preceding and the succeeding frames. Frames are divided into rectangular blocks, and the differences in each of these blocks are calculated and transmitted depending on the type of the frame (I, P or B).
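A minimal sketch of the P-frame idea, assuming plain block differencing (real encoders also perform motion compensation, which is skipped here):

```python
import numpy as np

def p_frame_residuals(prev_frame, curr_frame, block=8):
    """Collect per-block differences from the preceding frame; unchanged
    blocks cost nothing to transmit. A sketch without motion compensation."""
    residuals = {}
    h, w = curr_frame.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            diff = curr_frame[y:y+block, x:x+block] - prev_frame[y:y+block, x:x+block]
            if np.any(diff):
                residuals[(y, x)] = diff
    return residuals

prev = np.zeros((16, 16))
curr = prev.copy()
curr[0:8, 0:8] = 5.0                       # only the top-left block changed
print(len(p_frame_residuals(prev, curr)), "of 4 blocks need transmitting")
```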
