H.263 omvic 5.3 Beta Testing

(Last updated June 12, 2003)


The Open Mash consortium at U.C. Berkeley has completed the first version of an H.263 video encoder and decoder and integrated it into Open Mash Vic (omvic). This web page describes how you can download a beta version of this application and run it, and it provides more details on the design and implementation of the software.

Downloading and Running the Code

Binary distributions of the beta are available for testing on Linux and Win32 platforms:

Linux Red Hat 7.3 Distribution (linux-omvic-5.3b1.tar.gz)
Windows Distribution (win32-omvic-5_3b1.zip)

Installation instructions:

  1. Unpack the archive using the appropriate tool.
  2. Change (cd) to the unpacked directory.
  3. Run the vic executable using the usual command line arguments.

The following command will join a test session sourced at Berkeley that contains two streams: a video camera showing a desktop worker and a PPM image of a screen capture.

% vic 233.2.3.4/22334

Description of H.263 CODEC

The H.263 codec in omvic supports INTRA frames and conditional-replenishment INTER frames. The coding heuristics are very crude: the encoder codes an INTRA frame every n seconds (default 5 seconds) and codes INTER frames in between. Each block in an INTER frame is coded either as a skip-block (i.e., no change detected) or as an i-block (i.e., the block is replenished because it changed). Currently, the encoder does not use motion vectors (MVs), nor does it adjust the quantization scale based on bitrate constraints. The current decoder displays MV blocks as pink blocks. (As an aside, this gives you some idea of how the MV encoding is being performed in the stream.)
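
The frame and block decisions described above can be sketched roughly as follows. This is a minimal Python illustration, not the omvic encoder: the function names, the sum-of-absolute-differences change test, and the threshold value are all assumptions chosen for the example.

```python
INTRA_INTERVAL_S = 5.0   # default from the text: one INTRA frame every 5 seconds
CHANGE_THRESHOLD = 48    # hypothetical per-block change threshold (assumption)


def frame_type(now_s, last_intra_s, interval_s=INTRA_INTERVAL_S):
    """Return 'INTRA' when the refresh interval has elapsed, else 'INTER'."""
    if last_intra_s is None or now_s - last_intra_s >= interval_s:
        return "INTRA"
    return "INTER"


def classify_blocks(prev_blocks, cur_blocks, threshold=CHANGE_THRESHOLD):
    """Label each block of an INTER frame as 'skip' or 'i-block'.

    Blocks are flat lists of luma samples. Change detection here is a
    plain sum of absolute differences, which is an assumed heuristic.
    """
    labels = []
    for prev, cur in zip(prev_blocks, cur_blocks):
        sad = sum(abs(a - b) for a, b in zip(prev, cur))
        labels.append("i-block" if sad > threshold else "skip")
    return labels
```

A block that has not changed costs almost nothing to code as a skip-block, which is why conditional replenishment works well for static scenes and poorly for material with a lot of motion.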

The codec supports large and custom-sized images. It supports the sizes specified in the standard, for example QCIF, CIF, 4CIF (704x576), and 16CIF (1408x1152). It also supports custom-sized images, meaning any size up to 1280x1024 whose width and height are both divisible by 16 pixels. Support for large images required changes to several video device abstractions, including Test (video-test.cc), Video-for-Linux (video-v4l.cc), and Video-for-Windows (video-win32.cc), so that the devices could grab and return large images. A significant change was made to the Test device: it now returns the image at the actual size stored in the file rather than scaling it to CIF or 4CIF. This might break some existing applications if the image you are transmitting is a non-standard size.
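
The size rules above can be expressed as a small check. The following Python sketch is only an illustration of the stated constraints; the function name and return values are hypothetical and do not correspond to anything in the omvic sources.

```python
# Standard H.263 picture sizes (width, height) in pixels.
STANDARD_SIZES = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}
MAX_CUSTOM = (1280, 1024)  # custom-size limit stated in the text


def classify_size(width, height):
    """Return the standard format name, 'custom', or None if unsupported."""
    for name, dims in STANDARD_SIZES.items():
        if dims == (width, height):
            return name
    fits = 0 < width <= MAX_CUSTOM[0] and 0 < height <= MAX_CUSTOM[1]
    aligned = width % 16 == 0 and height % 16 == 0
    return "custom" if fits and aligned else None
```

Under these rules a 640x480 capture counts as a custom size, while a 100x100 image is rejected because neither dimension is a multiple of 16.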

We also had to make fairly significant changes to the Video-for-Windows device abstraction. We modified the code so that the device supports image capture larger than CIF and additional devices (e.g., firewire cameras). We tested it on the following hardware: an ADS Pyro Firewire Camera, a Hauppauge WIN-TV PCI capture board, a WINNOV Videum 1000 Plus PCI capture board, and a Logitech USB Web Camera. The software will grab QCIF, CIF, and 4CIF non-square-pixel images and 640x480 and 320x240 square-pixel NTSC images. The firewire camera captures only full-sized images, at a maximum of 15 frames per second, so it can only be used with the Motion JPEG and H.263 codecs. Eventually, someone needs to write a full-blown Windows Driver Model (WDM) device abstraction for Open Mash so that arbitrary-sized images can be supported.

One optional coding extension was added to the encoder: the modified quantization extension (Annex T in the standard), implemented to improve coding of high-quality images. At q-scales between 1 and 5 (1 is best), the first few AC terms in coded blocks sometimes overflow the space available in the coded bitstream. The bitstream allows 8 bits for quantized AC terms, but at small q-scales the magnitude of a quantized term can be as high as 2040. The modQ extension uses an escape code to allow arbitrary precision in AC terms, so image quality can be maintained at very low q-scales. Without it, the full-sized TV test pattern that we use for testing showed visible ringing at high-contrast edges (e.g., black-to-white); the modQ extension eliminated this problem.
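
To see why small q-scales overflow, consider a simplified uniform quantizer. The Python sketch below assumes level = |coefficient| // (2 * q), a level field limited to +/-127, and a midpoint reconstruction; these formulas are simplifications for illustration, not the exact rules of the standard or the omvic code. It also treats 2040 as the magnitude of the input coefficient, which is an assumption about the figure cited above.

```python
MAX_LEVEL_8BIT = 127  # assumed largest magnitude that fits the 8-bit level field


def quantize(coef, q):
    """Simplified uniform quantizer: level = |coef| // (2*q), sign preserved."""
    level = abs(coef) // (2 * q)
    return level if coef >= 0 else -level


def clip_level(level, limit=MAX_LEVEL_8BIT):
    """Clamp a level to what the fixed-width field can carry."""
    return max(-limit, min(limit, level))


def reconstruct(level, q):
    """Simplified inverse quantizer: midpoint of the quantization bin."""
    if level == 0:
        return 0
    mag = q * (2 * abs(level) + 1)
    return mag if level > 0 else -mag


# Show how much of a large coefficient survives clipping at each q-scale.
for q in (1, 2, 5, 8):
    level = quantize(2040, q)
    clipped = clip_level(level)
    print(q, level, clipped, reconstruct(clipped, q))
```

Under these assumptions a coefficient of 2040 needs a level of 1020 at q = 1, but only 127 fits, so the clipped term reconstructs to 255. At q = 8 the level is exactly 127 and nothing is lost. An escape code that carries a wider level removes the clipping at small q-scales.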

The modQ extension is turned off by default. A new command line argument is provided to turn on selected coding heuristics. The following argument turns on this extension:

% vic -codec h263:modQ 233.2.3.4/22334

This extension is the only one implemented at the current time, but this command line syntax can be used to specify other optional extensions for other codecs.

We have run some interoperability tests of this code with existing H.263 applications. We attempted to play streams produced by omvic using the Apple QuickTime player, but this did not work. After investigating the problem, we discovered that the RTP payload format we use is not the format used by H.263+ encoders and decoders. We currently use the original H.263 payload format described in RFC 2190. The payload format was changed in 1998 to accommodate features in H.263+; the newer format is called "h263-1998" and is described in RFC 2429. We will implement this encapsulation in the near future.
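
For reference, an RFC 2429 payload begins with a 16-bit header laid out as RR (5 bits), P (1 bit), V (1 bit), PLEN (6 bits), and PEBIT (3 bits). The following Python sketch unpacks those fields. The function name and the dictionary it returns are illustrative only and are not part of omvic.

```python
import struct


def parse_h263_1998_header(payload: bytes) -> dict:
    """Unpack the 2-byte RFC 2429 payload header into its bit fields."""
    if len(payload) < 2:
        raise ValueError("payload shorter than the 2-byte header")
    (word,) = struct.unpack("!H", payload[:2])  # network byte order
    return {
        "RR": (word >> 11) & 0x1F,    # reserved bits
        "P": (word >> 10) & 0x1,      # set at a picture/GOB/slice start
        "V": (word >> 9) & 0x1,       # video redundancy coding byte present
        "PLEN": (word >> 3) & 0x3F,   # length of the extra picture header
        "PEBIT": word & 0x7,          # bits to ignore in that extra header
    }
```

For example, a packet that starts a picture and carries no extras has the header bytes 0x04 0x00, which parses to P = 1 with all other fields zero. A receiver that expects the RFC 2190 layout will misinterpret these bytes, which is consistent with the failure described above.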

Interoperability Tests

We ran interoperability tests with uclvic 1.1.3, which supports both H.263 and H.263+ codecs. uclvic can play H.263 streams produced by omvic as long as the streams use the non-square-pixel image sizes specified by the standard: SQCIF (128x96), QCIF (176x144), CIF (352x288), 4CIF (704x576), and 16CIF (1408x1152). uclvic cannot handle custom source formats, that is, any format other than these. This causes a problem with some Win32 capture devices (e.g., USB or 1394 webcams) because the capture code in omvic cannot yet scale their images to the accepted sizes.

While uclvic can play streams produced by omvic, errors occur when we try to play a stream produced by uclvic. This is a known problem: we have not yet debugged the code that decodes MVs. MV bitstream representations are actually quite complicated (the standard uses differential coding of MVs from macroblock to macroblock), so it may take some effort to get all of these working correctly.
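
As a rough illustration of the differential coding: H.263 predicts each macroblock's MV from neighboring macroblocks (the component-wise median of three candidates) and codes only the difference. The Python sketch below assumes the three candidate vectors are already known and ignores the picture- and GOB-border rules, half-pixel units, and range wrapping, so it is a simplification rather than a working decoder. All function names are hypothetical.

```python
def median3(a, b, c):
    """Median of three integers."""
    return sorted((a, b, c))[1]


def predict_mv(left, above, above_right):
    """Component-wise median predictor from three neighboring MVs."""
    return (
        median3(left[0], above[0], above_right[0]),
        median3(left[1], above[1], above_right[1]),
    )


def encode_mvd(mv, left, above, above_right):
    """Encoder side: the MV difference that would be written to the stream."""
    px, py = predict_mv(left, above, above_right)
    return (mv[0] - px, mv[1] - py)


def decode_mv(mvd, left, above, above_right):
    """Decoder side: rebuild the MV from the coded difference."""
    px, py = predict_mv(left, above, above_right)
    return (mvd[0] + px, mvd[1] + py)
```

Because each decoded MV feeds the predictor for later macroblocks, a single error in the predictor or in the border handling corrupts every vector that follows, which is part of what makes this code hard to debug.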

What Needs to be Done?

Here is the current TODO list for this code. First, we have to merge this code back into the main Open Mash software. It is currently being developed on a branch named "h263_branch" in the CVS archive. Other developers are welcome to check out this code, but if you wait a couple of days it will be integrated back into the main trunk of the Open Mash CVS repository.

Second, we must implement the h263-1998 RTP payload format so that omvic will interoperate with H.323-compatible videoconferencing systems and the commercial video streaming playback systems (e.g., Apple QuickTime, Microsoft Windows Media, and RealNetworks RealPlayer).

Third, the current software encodes one frame into a memory buffer and then creates RTP packets and sends them. Because the codec can support high-quality large images the buffer for the coded bitstream must be very large (currently around 1 MB). We need to modify the codec to produce the bitstream into an RTP packet and send it when it is full.
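
One way to picture this change is a packetizer that accumulates coded units and emits a packet as soon as the next unit would not fit, instead of buffering the whole frame. The sketch below is a hypothetical Python illustration: the class name, the 1400-byte payload limit, and the send callback are assumptions, and real RTP packetization would also add payload headers and respect GOB or macroblock boundaries.

```python
class StreamingPacketizer:
    """Collect coded units and send a packet whenever the buffer fills."""

    def __init__(self, send, max_payload=1400):
        self.send = send                # callback that transmits one payload
        self.max_payload = max_payload  # assumed per-packet payload budget
        self.buf = bytearray()

    def write(self, unit: bytes):
        """Append one coded unit, flushing first if it would overflow."""
        if len(unit) > self.max_payload:
            raise ValueError("coded unit larger than one packet")
        if len(self.buf) + len(unit) > self.max_payload:
            self.flush()
        self.buf += unit

    def flush(self):
        """Send whatever is buffered (called at end of frame as well)."""
        if self.buf:
            self.send(bytes(self.buf))
            self.buf.clear()
```

With this structure the encoder needs only one packet's worth of buffer rather than a buffer sized for the largest possible coded frame.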

Fourth, we want to implement <0,0> MVs (00MV) in the encoder. The bitrate for 4CIF images at 20+ frames per second when coding TV material, which has a lot of motion, ranges up to multiple megabits per second (e.g., 1-5 Mb/s). This problem is exacerbated by our crude coding heuristics. 00MVs should reduce this bitrate dramatically. Of course, we will have to add code to confirm that all packets of the prior frame were received so we do not add noise to the images.

Lastly, we want to extend the decoder to support the coding heuristics used in more complicated H.263+/H.263++/MPEG-4 streams. We need to capture some examples from common commercial codecs and see what sort of features they are using. We want to add these options to our decoder so we can interoperate with streams produced by them.

History

This codec was developed because many users and researchers want the Mbone tools to support improved quality codecs and full-sized images at acceptable bitrates. The original codecs supported by vic included Motion JPEG and H.261. Motion JPEG supports full-sized images but requires too much computational power to encode/decode the images and too much bandwidth to transmit high frame rate sequences. H.261 has a better trade-off for processing and bandwidth, but images can only be CIF-sized (352x288) or smaller.

The UCL folks added variants of the Telenor H.263 code to uclvic, but the encoder was too slow and did not support full-sized images on most platforms. The Open Mash consortium added a rudimentary H.263 encoder in 2001 that essentially coded all frames as INTRA frames, which mimics Motion JPEG. We began work on a new high-performance implementation of an H.263 decoder in 2002. This decoder was completed and the encoder was improved this spring. This beta release is the culmination of these efforts.

Many people contributed to this code, including Paul Huang, Lloyd Lim, Larry Rowe, and Andrew Swan. If you have questions or comments, please send them to the Open Mash Developers Mailing List.

