Title:
Continuous Media Middleware Consortium

Author(s):
Lawrence A. Rowe

Affiliation:
University of California, Berkeley
Computer Science Division - EECS

Presentation:
Proposal submitted to NSF March 1999



Introduction
An important goal of next generation high-performance networks such as vBNS or Abilene is to experiment with high-bandwidth distributed collaboration and multimedia applications. Example applications are distance learning [Laws96], remote equipment operation, telepresence and virtual environments, and interactive media authoring and playback. To accomplish this goal, the research community needs an open portable toolkit and a collection of applications for streaming media (e.g., audio, video, and animation). We propose to establish a consortium at the University of California at Berkeley to support such a toolkit.

The toolkit will provide support for continuous media capture, transport, and display and for building a variety of distributed applications based on IETF standards (e.g., IP-Multicast, RTP, SAP, SIP, RTSP, etc.). In addition, a suite of applications will be supplied for N-way conferencing based on the Internet Mbone tools. The toolkit and applications must run on the computer platforms used by members of the community (e.g., PC, Macintosh, and Unix systems) so different researchers can share code and build on the work of others rather than having to re-invent yet again the underlying infrastructure required to complete their work.

Models for the consortium are the support groups that developed around the Berkeley Unix, GNU, and Linux communities. In each case, a large group of individuals developed the system but a support group took responsibility for integrating code developed by others, fixing bugs, setting standards, maintaining the source code, developing necessary utilities and applications, and generally acting as a focus for the work by the community. This style of cooperation is referred to as an Open Source community [Raymond98].

The toolkit will be based on the Mash Toolkit, which is an outgrowth of the Internet Mbone tools (e.g., sdr, vat, vic, etc.) [McCanne97a]. It will allow the research community to solve problems that industry is unlikely to address because it is focused on mass-market technologies. The imperatives of the marketplace constrain commercial systems to low-bandwidth, unicast streaming media because bandwidth on the commercial Internet is too expensive and multicast protocols are poorly supported. These constraints are incompatible with technologies appropriate for high-bandwidth and heterogeneous network multicast applications.

The consortium being proposed will be the open source support and development organization for the toolkit and distributed collaboration applications. The following sections describe this proposal in more detail. Section 2 describes several problems not being addressed by industry. Research enabled by this proposal directly addresses these problems. Section 3 describes three sample applications that illustrate the features and capabilities required by the toolkit and applications. Section 4 describes the technical approach we propose to follow, and Section 5 presents a development plan. Finally, Section 6 presents a management and operations plan.


Research Problems Not Being Addressed
This section discusses problems not adequately being addressed by industry because it is focused on mass-market applications. Three examples are:

  1. large-scale intelligent systems
  2. distributed collaboration and control
  3. layered multicast and adaptive coding

The remainder of this section discusses each example in more detail.

Large-scale Intelligent Systems
Protocols and applications are needed to support the deployment of massive numbers of intelligent sensors and computing devices connected by wired and wireless networks. These sensors will include many different types of media including audio, video, images, and data (e.g., motion detection, temperatures, stock prices, etc.). Some information sources will be traditional data (e.g., news broadcasts) and some sources will be non-traditional data (e.g., an active badge sensor signaling that a particular individual is nearby). Intelligent control applications are needed that will take advantage of this raw data. For example, a remote person monitoring a building or facility needs a meaningful display from a collection of sensors and video sources without requiring the person to switch sources manually, change bandwidth allocations, and manually move cameras. Another example is an intelligent building in which the heating and lighting system is dynamically adjusted based on the time of day, the date, the number of people in a room or space, the availability of excess heating/cooling capacity, and inferences about the current activities in a room. For example, work patterns in a university building might be deduced by the presence of people and schedules (e.g., class schedules). A classroom might be heated (cooled) in anticipation of the next class, the number of people enrolled, and the current temperature in the room. Instructors typically configure the audio/video equipment, lights, and teaching aids differently. A smart building can automatically configure the room based on the explicit or implicit commands or specifications.

Another example of a research idea in large-scale intelligent systems is to query a large heterogeneous database about past events. Suppose video cameras and position sensors record where you are and what you are doing while at work. Several months later you might ask "what did I do last January 12th?" Or you might ask who borrowed the video camera during the month of May. While this type of data collection and analysis raises important privacy issues, many opportunities to improve our lives are also possible.

These research projects require a supported and shared computing environment. A middleware toolkit that supports streaming media capture, processing, storage, and transport including triggered events is needed to accelerate this research.

Distributed Collaboration and Control
New application- and system-level protocols are needed for distributed collaboration, remote equipment operation, and multi-user virtual environments. Different collaboration styles are being developed as computer-based conferencing is deployed [Agarwal98a, Handley95, Lennox99, Schooler96]. These styles range from highly interactive small group meetings to large moderated lecture broadcasts. Television broadcasters have developed styles for large lectures (e.g., talk shows) and town meetings. But, these styles assume a passive viewer who sees only one video stream. Radically different interaction styles should be explored that exploit the flexibility of computer-based conferencing. For example, a computer-based conference can incorporate both synchronous and asynchronous communication and multiple conversations.

Lecture and meeting capture are important problems being investigated by many research groups [Brotherton98, Davis99, Gabbe94, Ginsberg95, Minneman95, Smith98]. The idea is to capture and analyze live events and to produce rich multimedia content automatically from that material. For example, several groups have developed lecture capture tools that produce titles with random positioning through slide indexes and other search criteria (e.g., keyword indexes, synchronized student notes, etc.). An important element of these systems is that they require minimal labor to produce the title. Research is needed to assess these titles and to develop better material. For example, tools are needed to deduce lecture structures, automate the production of abstracts and summaries, and speed up the playback and browsing of the material.

Other research problems being addressed are related to large-scale reliable multicast and application-specific control protocols used in collaboration and telepresence applications [Raman98, Hodes99, Agarwal98b].

Layered Multicast and Adaptive Coding
Many researchers have developed scalable CODECs that trade off communication bandwidth for media quality (e.g., image size, fidelity, etc.) [Haskell98]. Layered multicast was proposed as an approach to source/channel coding that will seamlessly support heterogeneous receivers by delivering to each receiver the best quality possible given the network bandwidth available to it [McCanne96, McCanne97b]. Many opportunities exist to apply application-specific coding and transmission heuristics to improve stream quality. These heuristics may be derived from one stream or multiple streams. For example, session bandwidth can be allocated to streams that participants are watching [Amir97]. Another example is a lecture broadcast composed of two streams: speaker and presentation material. When the presentation material is static, bandwidth can be allocated to the speaker, but when the presentation material changes (e.g., moving to the next slide or playing a video) bandwidth can be allocated to the presentation stream. Protocols and heuristics to support these systems are needed.
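The lecture broadcast heuristic above can be illustrated with a minimal sketch. It is not taken from Mash or from [Amir97]; the function name, the per-stream floor, and the 75/25 split are assumptions chosen only to make the idea concrete.

```python
def allocate_bandwidth(session_kbps, slides_changing,
                       floor_kbps=64, active_share=0.75):
    """Split a session budget between the speaker and presentation streams.

    Hypothetical policy: each stream keeps a small floor so it never stalls,
    and the remaining bandwidth favors whichever stream is currently active.
    """
    if session_kbps < 2 * floor_kbps:
        # Too little bandwidth to honor both floors: split it evenly.
        half = session_kbps / 2
        return {"speaker": half, "presentation": half}

    spare = session_kbps - 2 * floor_kbps
    active = "presentation" if slides_changing else "speaker"
    idle = "speaker" if slides_changing else "presentation"
    return {
        active: floor_kbps + active_share * spare,
        idle: floor_kbps + (1 - active_share) * spare,
    }
```

With a 1000 Kbits/sec session and static slides, this policy gives most of the budget to the speaker; when a slide changes, the shares swap. A real protocol would also have to detect the change and signal the new allocation to the senders.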

Research remains on how best to scale a media stream from very low bandwidths (e.g., under 10 Kbits/sec) to high bandwidths (e.g., 20 Mbits/sec). Researchers are still experimenting with the appropriate number of layers and how they relate. For example, some researchers report that ten to twenty layers may be required to cover the wide bandwidth variations likely to occur in future networks. A flexible, open source toolkit will allow practical testing of the addressing and control required for layered multicast [Swan98] and new coding heuristics and algorithms.
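To make the layer count concrete, the following sketch assumes, purely for illustration, that each additional layer doubles the cumulative stream rate. The doubling factor and the function names are assumptions, not a scheme proposed in the cited work.

```python
def layer_rates(min_kbps=10.0, max_kbps=20000.0, growth=2.0):
    """Cumulative rates when each added layer multiplies the total by `growth`."""
    rates = []
    rate = min_kbps
    while rate < max_kbps:
        rates.append(rate)
        rate *= growth
    rates.append(max_kbps)  # the top layer is capped at the maximum rate
    return rates


def layers_to_subscribe(rates, available_kbps):
    """Count the layers whose cumulative rate fits the receiver's bandwidth."""
    return sum(1 for r in rates if r <= available_kbps)
```

Under this assumption, covering 10 Kbits/sec to 20 Mbits/sec takes twelve layers, which falls within the ten to twenty layers mentioned above, and a receiver with 1.5 Mbits/sec available would subscribe to the first eight. A smaller growth factor gives finer quality steps at the cost of more layers and more multicast groups to manage.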



Prototypical Applications
This section describes prototypical distributed collaboration applications that illustrate the features and capabilities the toolkit must support. The applications are live broadcasting, lecture capture, and interactive media authoring.

Distance learning is an important example of distributed collaboration. Current generation distance learning systems that use audio and video communication are severely limited by the underlying implementation technology. These systems use a combination of technologies including television, ITU standards-based video conferencing (e.g., H.320 or H.323), and commercial webcasting (e.g., Cisco IP/TV, Microsoft NetShow, or Real Networks RealSystem/G2). These products limit interaction, the number of concurrent streams, and media stream quality. Moreover, television-based solutions are expensive because of the cost of equipment and operation.

The distance learning approach being pursued at Berkeley and elsewhere is based on developing a scalable, N-way interactive collaboration system using Internet Mbone technology. Tools are being developed to reduce the cost of producing lecture broadcasts and conferences (e.g., a broadcast manager which allows one person to produce multiple programs [Wu99]) and to improve broadcast quality (e.g., a software-only parallel video effects system [Mayer-Patel98, Mayer-Patel99] and various floor control tools [Malpani97, Schooler96, Sisalem98]). Programs include multiple simultaneous streams (e.g., speaker, audience, presentation material, etc.) broadcast at varying quality and participation by remote speakers and audience members. In addition, all program streams are recorded for on-demand replay [Almeroth98, Holfelder95, Schuett98].

Researchers elsewhere have modified earlier versions of the Mbone tools to simplify the user interface, implement inter-stream synchronization, and integrate better CODECs (e.g., H.263 [Stuhlmuller98] and MPEG [Adamson99]). Unfortunately, these improvements are not integrated into a common system so they are not widely available. This problem illustrates the need for an open source shared toolkit. Some group has to integrate the changes, port them to different platforms, document the code so other researchers can modify and extend it, and continue support for the core objects and applications.

The second prototypical application is authoring interactive media. Interactive media, in contrast to non-interactive media, allows the user to alter the playout sequence. Accordingly, interactive media playback cannot depend on simple buffering for continuous quality playback. Smooth playout and interaction require careful design of the interface behavior, informed choice of appropriate interface controls, and implementations that employ branch-prediction. Authoring interactive media is qualitatively harder than authoring non-interactive media. A century of filmmaking experience and many centuries of theater provide guidance for the design and metaphors of non-interactive media. They constrain the implementation to a linear story. Interactive media authors can draw upon experience with games, but they have the much greater challenge of supporting non-linear exploration of a story space. Evidence of the complexity of authoring interactive media is found in the relative scarcity of innovative applications. Most commercial interactive media is limited to basic slide shows and movies. The few compelling examples of interactivity are in the realm of entertainment and virtual reality. Important research problems remain in the design of high-level story representations, specification languages for non-linear media experiences, and synchronization languages and implementations.

The WWW Consortium has developed a synchronization standard, named SMIL [W3C98], which uses simple parallel and sequential constructs for authoring multimedia titles. A better solution is a system built on temporal constraints that can be dynamically changed by user interaction [Bailey98, Buchanan93]. Other users are unlikely to experiment with these research systems because they were implemented in toolkits that are not widely supported or used.
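The limitation of purely parallel and sequential composition can be seen in a small sketch. It assumes every media item has a fixed duration known at authoring time and that a parallel group ends when its longest child ends; it models neither the full SMIL language nor the constraint systems cited above, and the tuple representation is hypothetical.

```python
def duration(node):
    """Total playout time, in seconds, of a static seq/par composition.

    A node is ("media", name, seconds), ("seq", [children]),
    or ("par", [children]).
    """
    kind = node[0]
    if kind == "media":
        return node[2]
    child_durations = [duration(child) for child in node[1]]
    if kind == "seq":
        return sum(child_durations)              # children play one after another
    if kind == "par":
        return max(child_durations, default=0)   # children play together
    raise ValueError(f"unknown node kind: {kind!r}")


lecture = ("seq", [
    ("media", "intro", 5),
    ("par", [("media", "speaker", 60), ("media", "slides", 45)]),
])
```

Here `duration(lecture)` is 65 seconds, and that value is fixed once the title is authored. If the viewer pauses the slides or follows a link, the schedule has no way to adapt, which is the motivation for temporal constraints that are re-solved in response to user interaction.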

These two prototypical applications, namely distributed collaboration and interactive media authoring, interact. A researcher may develop a multimedia title with people at geographically dispersed locations. The authors might work on the project using a combination of video conferencing and shared playback tools. Moreover, an author may want to incorporate material captured during a live event into the title or present the title later in another live conference. These operations sound easy, but the software required to play a multiple stream title into a unicast or multicast session with a shared remote control interface is difficult to get correct. Lastly, editing material is a common activity. One problem with the current Internet Mbone tools, which already support N-way conferencing and archive/playback, is the absence of good editing tools that allow one to create a derived work from captured material or to analyze stored material using an off-line processing algorithm.

While some excellent tools and applications are available in the commercial marketplace (e.g., non-linear digital video editors like Adobe Premiere and Avid Composer, H.323 desktop video conferencing systems like Microsoft NetMeeting or Intel ProShare, and archive/playback systems like Real Networks G2 and Microsoft NetShow), the underlying media representations, transport protocols, and applications do not interoperate. Moreover, they are often closed proprietary systems that focus on low-bitrate unicast streaming media. The community needs widely available supported applications to experiment with high-bandwidth collaboration.

In some cases, interfaces can be created between the representations and applications used by commercial applications. For example, as discussed below, we propose to build gateways between the Internet Mbone media representations and tools and Real Networks G2 and Microsoft NetShow applications and to build the plug-ins required so that Adobe Premiere can be used to edit stored Mbone conferences. In other cases, other members of the research community should be encouraged to write new tools. An example is a human factors analysis tool that captures multiple video streams and synchronized notes taken by an experimenter. The tool should simplify the time and activity analysis common to human factors experiments.



Technical Approach
This section describes different approaches we considered before selecting the approach discussed in the next section. First and foremost, we want to build on an existing software base rather than starting from scratch. In other words, we want to pick an existing standard or prototype toolkit and use it as the foundation for the toolkit to be supported and extended. A significant advantage of using an existing toolkit or standard is that we will support an existing research community. The idea is to support directions being pursued rather than build yet another toolkit.

We considered many alternative toolkits including:

  1. Continuous Media Toolkit [Mayer-Patel97]
  2. DAVE [Mines94]
  3. Java Media Framework [Sun99]
  4. IMA Multimedia Systems Services [Koegel-Buford94]
  5. ISO Presentation Environment for Multimedia Objects (PREMO) [Duke98]
  6. Mash [McCanne97a]
  7. Microsoft Windows Media (Active Movie) [Microsoft99]
  8. Rapport [Ahuja88]
  9. SCOOT [Craighill94]
  10. VueStation [Lindblad96]

We believe the only practical choices are the Java Media Framework (JMF) and the Mash toolkit. The other choices are either proprietary or not widely used. JMF is a continuous media API to Java designed by companies developing the Java language and system. JMF 1.0, developed by Sun, Intel, and SGI, supports streaming media playback. JMF 2.0, developed by Sun and IBM, supports media capture and broadcasting. JMF 1.0 implementations are available for Sparc and Windows platforms. JMF 2.0 implementations are being developed.

Mash is a comprehensive toolkit for multimedia communication and collaboration over the Internet using IP multicast. The toolkit is an outgrowth of the Internet Mbone tools developed to support streaming audio and video applications. Mash supports live media broadcasting, N-way conferencing, and session capture and replay. Mash is a split system. High performance routines are coded in C/C++. Routines that are less performance critical are coded in Tcl/Tk including user interface controls and application-specific control protocols. Mash implementations exist for Unix and Windows (e.g., WNT and W95/W98).

JMF and Mash each have advantages and disadvantages.

The major advantage of JMF is that it is written in a widely praised next generation object-oriented programming language designed for distributed Internet applications. In particular, it is designed for portability and supports threads.

The disadvantages of JMF are poor performance and the absence of required services and applications (e.g., N-way audio/video conferencing and multicast services). Java compilers and interpreters have improved dramatically over the past several years, and they are likely to continue to improve in the future. However, the perceived quality of continuous media applications, which are inherently real-time applications, is sensitive to the performance of media compression and decompression algorithms. The Java interpreted environments and, in particular, the use of garbage collection are incompatible with high-quality streaming media applications [Tsay98]. While using custom-designed hardware (e.g., compression and decompression boards) can solve some problems, this approach limits portability because these boards are rarely available on all platforms. For example, try calling a vendor of an MPEG2 compression or decompression board and attempting to convince them to deliver a driver for FreeBSD.

Mash has the following advantages:

  1. it is based on IETF standards
  2. it is widely used in the research community
  3. it is efficient for real-time applications
  4. it contains many required services and facilities

The major problems with Mash are the portability of a system built using C/C++ and Tcl/Tk, and the absence of documentation and support. The Consortium can solve the documentation and support problem. Consequently, Mash is a good choice except for the issue of portability.

Each system has its problems. Consider each system with respect to portability, code maturity, and the existing community. First, consider portability. The only way to get acceptable real-time performance with JMF or Mash is to use a split system. In other words, code the high-performance routines and inner loops in a low-level language (e.g., assembly language or C/C++) and interface to platform-specific instructions (e.g., Intel MMX). As a practical matter both systems today run only on selected environments. JMF runs on Sparc/Solaris and Windows. Mash runs on Unix and Windows. Neither system runs on a Macintosh. Consequently, porting will be required to run either system on different platforms. So the question is which will be easier? Java supporters argue that applications are easier to port because the language is the same on every platform. Unfortunately, different vendors deliver compilers and run-time systems that have the same version and feature incompatibilities found in conventional languages. Creating efficient media handling code is hard. Because both systems must use a split architecture, they both will require comparable time to port. Consequently, the choice of language is unlikely to significantly impact the work required.

Second, consider code maturity and community. Mash is based on applications built in the early 1990's that are widely used in the Internet Mbone community. These applications support N-way conferencing and IETF standards (e.g., RTP, SAP, etc.). Moreover, a sizable research community already uses these tools to build novel applications and experiment with technologies to support scalable collaboration. In contrast, a corporate committee defined JMF. It is not widely used for distributed collaboration and streaming media research. And, other services and abstractions that already exist in Mash (e.g., announce/listen protocols, scoped address discovery protocol [Kermode98], reliable multicast protocols [Bradner98, Obraczka98, Floyd95], scalable naming and announcement protocol [Raman98], etc.) will require considerable effort to replicate in Java. Selecting JMF will require core object and infrastructure development whereas selecting Mash will require code maintenance to improve portability and integration of code developed by the research community.

Taking these issues into consideration, we believe Mash is the best toolkit with which to start because it is high-performance, widely used by the research community, and is IETF standards-based. It solves problems already being addressed by researchers today.



Development Plan
This section describes the initial development plan for the consortium and a proposed approach to supporting the community.

Overview
We will establish a consortium that will:

  1. integrate code developed here and elsewhere into a tested release
  2. provide support, documentation, and training
  3. develop core facilities required to support the research community

Two versions of the code will be maintained - a stable supported version and an unstable experimental version. A public source management server will be maintained that will allow any member of the community to check-out/check-in source and binary code. The primary objective is to support the research community.

Newsgroups and a web site will be used to communicate. In addition, we will use what we develop in our day-to-day activities. The consortium will hold bi-weekly Mbone conferences that will include training seminars on using and developing Mash applications, project planning and status reports, and research presentations by participants from the community. We will simulcast high and low quality versions of these events as currently done by the Berkeley MIG Seminar [Amir95, Rowe99] until scalable delivery services are deployed. Several projects discussed below relate to producing high quality audio and video media streams and supporting scalable delivery services.

An advisory committee will be formed to direct the activities of the consortium and provide guidance to project management. The management and operations plan is described in more detail in the next section.

The consortium will produce a stable tested release every four to six months. This release will include both source code and binary application code for all supported platforms. The test procedures and scripts will be published for others to use. The following platforms will be supported at Berkeley:

  1. PC Windows (WindowsNT and W95/W98)
  2. PC Unix (FreeBSD and Linux)
  3. Sun Sparc (Solaris)
  4. Macintosh

If an organization or individual wants to support another platform or operating system, their changes will be incorporated into the test and release procedures performed every four to six months. It should not matter where a platform is supported.

On-line and paper documentation will be developed that describes how to use the tools and abstractions supplied to support application development. In addition, a series of training seminars will be produced for on-demand replay. These seminars will use the lecture archive described in the next paragraph.

A storage system and procedures for a lecture archive will be developed that can be used to store high quality lectures recorded at member institutions. High quality means both exceptional speakers and production values. The lectures will use multiple, high-bandwidth streams (e.g., presentation material, speaker(s), and audience members). High quality production values will include multiple camera sources, video effects (e.g., titling, compositing, etc.), and linking to additional material. We will experiment with novel interfaces to the stored lectures (e.g., links to introductory material, indexes to various segments of the talk and questions, and links to topic-specific bulletin boards and chat rooms). Titles will be developed at Berkeley and elsewhere to determine cost models and production procedures for producing the lecture archive. Automated lecture capture tools mentioned above will be used to reduce the cost and effort required to produce the titles.



Project Plans and Release Schedules
This section describes several projects to be completed and a proposed schedule of deliverables. The schedule below shows developments during the first two years. We anticipate that other events and opportunities will dictate changes in the actual schedule. The advisory board will approve all plans and modifications.

The following list identifies several development projects.

  1. Establish source library procedures, develop build and distribution scripts, and fix known bugs.
  2. Produce documentation on using the applications and toolkit. Documents will include: 1) "Mash Consortium Engineering Organization and Procedures," 2) "Introduction to the Mash Toolkit," 3) "Developing Mash Applications," and 4) "Mash Core Objects."
  3. Port Mash to the Macintosh platform. This project will be done in two phases. Phase one will be a shallow port to get a usable system out quickly. And, phase two will integrate the toolkit into the native platform support for media processing.
  4. Integrate additional media CODECs (e.g., H.263 and MPEG video, MP3, G.722, and G.728 audio, etc.). This work will involve integrating code developed elsewhere and adding the RTP fragment and defragment objects [Adamson99, Finlayson99, Schulzrinne95, Stuhlmuller98, Tan98].
  5. Develop, present, and archive a series of Mash training seminars. Examples include: 1) "Introduction to the Mash Consortium," 2) "Building a Simple Mash Application," 3) "Using the Internet Mbone Tools," 4) "OTcl/C++ API and Defining Core Objects," and 5) "Advanced Mash Application Development." Other seminars will be created as required by the community.
  6. Continue development of the Collaborator tool, which is an integrated interface for producing and participating in Mbone conferences [Romer98]. The Collaborator plug-in architecture should provide more flexible interface and interaction controls and support user layout preferences (e.g., windows and controls displayed). Media processing (e.g., audio equalization and synchronization) can also be improved.
  7. An archiving system is needed that: 1) supports recording and playback of layered multicast sessions, 2) provides management and operations support (e.g., access control and load management), and 3) allows stored material to be edited. Several archiving systems have been developed in the Mbone research community (e.g., MARS [Schuett98], IMJ [Almeroth98], and Mbone VCR [Holfelder95]). One of these solutions should be enhanced and documented for production use.
  8. Develop and release a suite of tools for producing live conferences. These tools will include session announcement, broadcast management and control (e.g., selecting streams to be transmitted and bandwidth allocations), archiving/playback management, and floor control.
  9. Establish procedures and a storage system for the high-quality lecture archive.
  10. Develop a gateway server that will interface Internet Mbone conferences to Microsoft NetShow and Real Networks broadcast servers.
  11. Complete support for layered addressing and CODECs [McCanne96, McCanne97b]. Preliminary work on a new session announcement tool, named nsdr, was reported recently [Swan98]. Nsdr and the layered CODECs must be tested and deployed for actual use.
  12. Design and implement a command stream RTP payload format and objects required to use it. A command stream, such as TclStream [Herlocker95], allows commands to be executed at times synchronized with other media streams. They can be used to implement timed user interactions and device controls.
  13. Establish a real-time media effect processing service. Many researchers require audio and video processing either during live broadcasts or when authoring a title off-line. By establishing a media processing service, users can experiment with media effects and develop new services. The model for this system is the parallel video effects system under development [Mayer-Patel99] at Berkeley and the Resolution Independent Video Language [Swartz95] and Dali Multimedia Software Library [Ooi99] at Cornell.
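The command stream idea in project 12 can be sketched as a queue of commands stamped with media time and released as playback advances. This is only an illustration of the synchronization behavior; it is not the TclStream design or an RTP payload format, and the class and method names are hypothetical.

```python
import heapq


class CommandStream:
    """Holds timestamped commands and releases them in media-time order."""

    def __init__(self):
        self._pending = []  # min-heap of (media_time, sequence, command)
        self._seq = 0       # tie-breaker: equal timestamps keep arrival order

    def add(self, media_time, command):
        """Queue a command to run when playback reaches media_time (seconds)."""
        heapq.heappush(self._pending, (media_time, self._seq, command))
        self._seq += 1

    def due(self, playback_time):
        """Remove and return every command scheduled at or before playback_time."""
        ready = []
        while self._pending and self._pending[0][0] <= playback_time:
            ready.append(heapq.heappop(self._pending)[2])
        return ready
```

A player would call `due()` with the current media clock each time it renders a frame or audio block and execute whatever is returned, for example advancing a slide or moving a camera. Commands may arrive out of order, as packets can on a network, because the heap restores media-time order.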

The goal of projects 6-8 is to create a turnkey package that can be widely deployed to all researchers and users for high quality interactive conferences. A collection of tools already exists, but they need to be hardened for production use, documented for use by naive users and busy researchers, and packaged for easy deployment. An important aspect of this initiative is to provide on-going technical support to acquire and install the appropriate audio/video equipment and devices and to assist research groups that want to experiment with distributed collaboration.

We expect it will take two months to establish the consortium (i.e., hire personnel, acquire and install equipment, set up development environments, establish binary and source management policies, set up the web site, etc.). After that start-up period, a supported system will be released every six months. The following schedule shows the proposed work plan for the first three releases. The delivery time is relative to the beginning of the grant.

Release 1 (8 Months)

Release 2 (14 Months)

Release 3 (20 Months)

This development plan is only a proposal. The advisory board will meet before development actually begins to review the plan and suggest changes in priority and project plans. Moreover, some deliverables will likely be accelerated if researchers elsewhere contribute to the development.



Management and Operations Plan
The consortium headquarters will be located at the University of California at Berkeley. The organization will have three components:

  1. management
  2. advisory committee
  3. research community

Management will include an Executive Manager and a Project Manager. The Executive Manager will be responsible for policy, budget, and scheduling, and he will chair the advisory committee. The Project Manager will be the overall technical leader responsible for coordinating technical working groups, managing consortium staff, and meeting schedules. The Executive Manager will be the grant PI Professor Lawrence A. Rowe. The Project Manager will be a Senior Programmer to be hired.

The advisory committee will be composed of 6-8 researchers recommended by members of the research community and approved by the Consortium Executive Manager. The advisory committee will meet every 6-12 months to review progress and, if appropriate, revise Consortium plans and resource allocations. It is essential that the activities of the Consortium staff serve the needs of the research community. The advisory committee will have binding authority over the activities of the Consortium.

The research community will include any person interested in participating in the Consortium. Members of the community will be encouraged to participate by submitting code and applications, posting messages to the Consortium newsgroup, and participating in the bi-weekly on-line Mbone conferences. We will also experiment with a distributed work tool in the form of a continuous Mbone conference that any member can join while working. The idea is to experiment with technology to improve group productivity when the group is widely dispersed geographically.

As appropriate, workshops and other activities will be scheduled in conjunction with conferences and meetings attended by Consortium participants.



References

[Adamson99] W.A. Adamson, "MPEG1 Video Support for vic," http://www.citi.umich.edu/u/andros/, 1999.

[Agarwal98a] D.A. Agarwal, S.R. Sachs, and W.E. Johnston, "The Reality of Collaboratories," Computer Physics Communications, vol. 110, issue 1-3 (coverdate May 1998), pages 134-141.

[Agarwal98b] D.A. Agarwal, et.al., Remote Camera and Videoswitcher Control Software Library, Lawrence Berkeley National Laboratory, http://www-itg.lbl.gov/mbone/devserv/, Nov 1998.

[Ahuja88] S.R. Ahuja, J.R. Ensor and D.N. Horn, "The Rapport Multimedia Conferencing System", Proc. Conf. on Office Information Systems, Palo Alto CA, pp. 1-8, Mar 1988.

[Almeroth98] K. Almeroth and M. Ammar, "The Interactive Multimedia Jukebox (IMJ): A New Paradigm for the On-Demand Delivery of Audio/Video", Proc. Seventh International World Wide Web Conference (WWW7), Brisbane, Australia, Apr 1998.

[Amir95] E. Amir, S. McCanne, and H. Zhang, "An Application Level Video Gateway," Proc. ACM Multimedia 95, San Francisco CA, Nov 1995.

[Amir97] E. Amir, S. McCanne, and R. Katz, "Receiver-driven Bandwidth Adaptation for Light-weight Session," Proc. ACM Multimedia 97, Seattle WA, Nov 1997.

[Bailey98] B. Bailey et.al., "Nsync - A Toolkit for Building Interactive Multimedia Presentations," Proc. ACM Multimedia 98, pp. 257-266, Bristol UK, Sep 1998.

[Brotherton98] J.A. Brotherton, J.R. Bhalodia, and G.D. Abowd, "Automated Capture, Integration, and Visualization of Multiple Media Streams," Proc of IEEE Multimedia '98, July, 1998.

[Bradner98] S. Bradner, A. Mankin, A. Romanow, and V. Paxson, "IETF criteria for evaluating reliable multicast transport and application protocols," Internet Draft, Internet Engineering Task Force, May 1998. Work in progress.

[Buchanan93] M.C. Buchanan and P.T. Zellweger, "Automatically Generating Consistent Schedules for Multimedia Applications," Multimedia Systems, Vol. 1, No. 2, pp. 55-67, 1993.

[Craighill94] E. Craighill, et.al., "SCOOT: An object-oriented toolkit for multimedia collaboration," Proc. ACM Multimedia 94, pp. 41-49, San Francisco CA, Oct 1994.

[Davis99] R. C. Davis, J. A. Landay, V. Chen, J. Huang, R. B. Lee, F. Li, J. Lin, C. B. Morrey III, B. Schleimer, M. N. Price, and B. N. Schilit, "NotePals: Lightweight Note Sharing by the Group, for the Group," to appear in Human Factors in Computing Systems: CHI 99 Conference Proceedings, Pittsburgh, PA, May 1999.

[Duke98] D.J. Duke and I. Herman, "A Standard for Multimedia Middleware," Proc. ACM Multimedia 98, Bristol UK, pp. 381-390, Sep 1998.

[Finlayson99] R. Finlayson, "liveCaster: Multicast your data throughout the Internet!," http://www.live.com/liveCaster/, Jan 1998.

[Floyd95] S. Floyd, V. Jacobson, C. Liu, S. McCanne, and L. Zhang, "A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing," Proceedings ACM SIGCOMM 95, pp. 342-356, Aug 1995.

[Gabbe94] J. Gabbe, A. Ginsberg, and B. Robinson, "Towards Intelligent Recognition of Multimedia Episodes in Real-Time Applications." Proc. ACM Multimedia 94, San Francisco CA, Oct 1994.

[Ginsberg95] A. Ginsberg and S. Ahuja, "Automating Envisionment of Virtual Meeting Room Histories," Proc. ACM Multimedia 95, San Francisco CA, Nov 1995.

[Handley95] M. Handley and I. Wakeman, "CCCP: Conference Control Channel Protocol - A Scalable Base for Building Conference Control Applications," Proc. SIGCOMM 95, Aug 1995.

[Haskell98] B.G. Haskell, et.al., "Image and Video Coding – Emerging Standards and Beyond," IEEE Trans. On Circuits and Systems for Video Technology, Vol. 6, pp. 814-837, Nov 1997.

[Hodes99] T. Hodes, et.al., "Shared Remote Control of a Video Conferencing Application: Motivation, Design, and Implementation," Multimedia Computing and Networking 1999, Proc. IS&T/SPIE Symposium on Electronic Imaging: Science & Technology, pp. 17-28, San Jose, CA, Jan 1999.

[Holfeder95] W. Holfeder, "MBone VCR - Video Conference Recording on the Mbone," http://www.icsi.berkeley.edu/mbone-vcr/, 1995.

[Herlocker95] J.L. Herlocker and J.A. Konstan, "Commands as Media: Design and Implementation of a Command Stream," Proceedings ACM Multimedia 95, San Francisco CA, Nov 1995.

[Kermode98] R. Kermode, "Scoped address discovery protocol (SADP)," Internet Draft, Internet Engineering Task Force, Nov 1998. Work in progress.

[Koegel-Buford94] J.F. Koegel-Buford, "Middleware System Services Architecture," chapter in Multimedia Systems J.F. Koegel-Buford (Editor), Addison-Wesley, 1994.

[Laws96] R. Laws, "Distance Learning's Explosion on the Internet," Journal of Computing in Higher Education, Vol. 7, No. 2, pp. 48-64, Spring 1996.

[Lennox99] J. Lennox, H. Schulzrinne, and T.F. La Porta, "Implementing Intelligent Network Services with the Session Initiation Protocol," Tech-Report Number CUCS-002-99, http://www.cs.columbia.edu/~lennox/cucs-002-99.pdf.

[Lindblad96] C.J. Lindblad and D.L. Tennenhouse, "The VuSystem: A Programming System for Compute-Intensive Multimedia," IEEE Journal of Selected Areas in Communication, vol. 14, no. 7, Sep 1996.

[Malpani97] R. Malpani and L.A. Rowe, "Floor Control for Large-Scale Mbone Seminars," Proc. ACM Multimedia 97, Seattle WA, Nov 1997, pp 155-163.

[Mayer-Patel97] K. Mayer-Patel and L.A. Rowe, "Design and Performance of the Berkeley Continuous Media Toolkit," in Multimedia Computing and Networking 1997, Proc. IS&T/SPIE Symposium on Electronic Imaging: Science & Technology, pp 194-206 San Jose CA, Jan 1997.

[Mayer-Patel98] K. Mayer-Patel and L.A. Rowe, "Exploiting Temporal Parallelism for Software-only Video Effects Processing," Proc. ACM Multimedia 98, Bristol UK, Sep 1998.

[Mayer-Patel99] K. Mayer-Patel and L.A. Rowe, "Exploiting Spatial Parallelism for Software-only Video Effects Processing," Multimedia Computing and Networking 1999, Proc. IS&T/SPIE Symposium on Electronic Imaging: Science & Technology, pp. 252-263, San Jose CA, Jan 1999.

[McCanne96] S. McCanne, V. Jacobson, and M. Vetterli, "Receiver-driven Layered Multicast," Proc. ACM SIGCOMM 96, Stanford CA, Aug 1996, pp. 117-130.

[McCanne97a] S. McCanne, et.al., "Toward a Common Infrastructure for Multimedia-Networking Middleware," Proc. NOSSDAV 97, St. Louis MO, May 1997.

[McCanne97b] S. McCanne, M. Vetterli, and V. Jacobson, "Low-complexity Video Coding for Receiver-driven Layered Multicast," IEEE Journal on Selected Areas in Communications, vol. 16, no. 6, pp. 983-1001, August 1997.

[Microsoft99] "Microsoft Windows Media Technology," http://www.microsoft.com/windows/windowsmedia/, February 1999.

[Mines94] R.F. Mines, J.A. Friesen, and C.L. Yang, "DAVE: A plug and play model for distributed multimedia application development," Proc. ACM Multimedia 94, pp. 59-66, San Francisco CA, Oct 1994.

[Minneman95] S. Minneman, et.al., "A Confederation of Tools for Capturing and Accessing Collaborative Activity," Proc. ACM Multimedia 95, pp. 523-534, San Francisco CA, Nov 1995.

[Obraczka98] K. Obraczka, "Multicast Transport Mechanisms: A Survey and Taxonomy," IEEE Communications Magazine, Vol. 36, No. 1, Jan 1998.

[Ooi99] W.-T. Ooi, et.al., "The Dali Multimedia Software Library," Multimedia Computing and Networking 1999, Proc. IS&T/SPIE Symposium on Electronic Imaging: Science & Technology, pp.264-275, San Jose, CA, Jan 1999.

[Raman98] S. Raman and S. McCanne, "Scalable Data Naming for Application Level Framing in Reliable Multicast," Proc. ACM Multimedia 98, pp. 391-400, Bristol UK, Sep 1998.

[Raymond98] E.S. Raymond, "The Cathedral and the Bazaar," http://www.linuxresources.com/Eric/cathedral-paper.html, Jan 1998.

[Romer98] C. Romer and T. Wong, "Flexible GUI's with Mash," Third Mash Retreat, http://www-mash.cs.berkeley.edu/mash/pubs/retreat/win98/, Jan 1998.

[Rowe99] L.A. Rowe, "Berkeley Multimedia, Interfaces, and Graphics Seminar," weekly scheduled Internet Mbone broadcast, http://bmrc.berkeley.edu/mig/, Mar 1999.

[Schooler96] E. Schooler, "Conferencing and Collaborative Computing," ACM Multimedia Systems Journal, Vol. 4, pp. 210-225, 1996.

[Schuett98] A. Schuett, S. Raman, Y. Chawathe, S. McCanne and R. Katz, "A Soft-state Protocol for Accessing Multimedia Archives," Proc. NOSSDAV 98, Cambridge UK, July 1998.

[Schulzrinne95] H.G. Schulzrinne, "NEtwork VOice Terminal," ftp://gaia.cs.umass.edu/pub/hgschulz/nevot, 1995.

[Sisalem98] D. Sisalem and H. Schulzrinne, "The Multimedia Internet Terminal," Journal on Telecommunication Systems, Vol. 9, No. 3, pages 423-444, 1998.

[Smith98] B.C. Smith, Personal communication, Sep 1998.

[Stuhlmuller98] K. Stuhlmuller and M. Meissner, "ScalVico: Scalable Video CODEC," http://www.nt.e-technik.uni-erlangen.de/Projekte/dfn/Welcome.html, Dec 1998.

[Sun99] "Java Media Framework API," http://www.javasoft.com/products/java-media/jmf/, March 1999.

[Swan98] A. Swan, S. McCanne and L.A. Rowe, "Layered Transmission and Caching for the Multicast Session Directory Service," Proc. ACM Multimedia 98, pp. 119-128, Bristol UK, Sep 1998.

[Swartz95] J. Swartz and B.C. Smith, "A Resolution Independent Video Language," Proc. ACM Multimedia 95, pp. 179-188, San Francisco CA, Nov 1995.

[Tan98] W. Tan and A. Zakhor, "Internet Video using Error Resilient Scalable Compression and Cooperative Transport Protocol," Proc. ICIP, Vol. 3, pp 458-462, Oct 1998.

[Tsay98] J. Tsay, "MPEG-4 Audio Decoder in Java," Class Project Report, Computer Science Division - EECS, U.C. Berkeley, http://www.cs.berkeley.edu/~ctsay/, December 1998.

[W3C98] WWW Consortium, "Synchronized Multimedia Integration Language (SMIL) 1.0 Specification," W3C Recommendation, REC-smil-19980615, Jun 1998.

[Wu99] D. Wu, A. Swan, and L.A. Rowe, "An Internet Mbone Broadcast Management System," Multimedia Computing and Networking 1999, Proc. IS&T/SPIE Symposium on Electronic Imaging: Science & Technology, San Jose, CA, Jan 1999.