Mash Decoding Process
---------------------

This document describes the Mash decoding architecture.


- Introduction

In order to receive and display an incoming RTP stream you need to do
four main tasks:

    1. Get every packet
    2. When all the packets for a frame have arrived, join them into
       an encoded frame
    3. Decode the encoded frame into a raw frame
    4. Render the raw frame on the monitor

The first three tasks are done by a "VideoAgent" object
(mash/tcl/net/agent-video.tcl). The fourth one is done by any of the
"Renderer" objects (mash/render/renderer.h).


- Architecture

How it works (main view of objects):

- VideoWidget
  |
  |- inherits from TkWindow
  |
  |- target_ = new Renderer/XXX handler

- VideoAgent (mash/tcl/net/agent-video.tcl)
  |
  |- inherits from RTPAgent (mash/tcl/net/agent-rtp.tcl)
  |
  |- inherits from MediaAgent (mash/tcl/net/agent-rtp.tcl)
  |  |
  |  |- inherits from SourceManager (mash/rtp/source.h,cc)
  |     |
  |     |- hashtable_ (Source* table), which represents all the
  |        Sources (tuple {ip address, source id}) that have joined
  |        the session
  |        |
  |        |- Source objects, aka Source/RTP in tcl (mash/rtp/source.h,cc)
  |           |
  |           |- handler_ = PacketModule*, which is a pointer to the
  |           |  decoder object, XXXDecoder. You can get a handler
  |           |  to the XXXDecoder object by calling
  |           |  "$source handler"
  |           |
  |           |- XXXDecoder object (mash/codec/decoder-xxx.cc)
  |              |
  |              |- inherits from Decoder
  |              |
  |              |- has variable engines_ (Renderer*), linked list of
  |                 all the renderers that take their input from this
  |                 decoder
  |
  |- network_ = new NetworkManager (tcl/net/network.tcl)
  |  |
  |  |- net_($nchan_) = new NetworkLayer (tcl/net/network.tcl)
  |     |  (nchan_ = number of layers)
  |     |
  |     |- dn_ = new Network # data network
  |     |  |
  |     |  |- tcl name for IPNetwork (mash/net/net-ip.cc)
  |     |
  |     |- cn_ = new Network # control network
  |
  |- session_ = Session/RTP/Video
     |
     |- tcl name for VideoSession (mash/rtp/session-rtp.h,cc)
     |
     |- inherits from RTP_Session (mash/rtp/session-rtp.h,cc)
     |
     |- inherits from RTP_Transmitter (mash/rtp/transmitter-rtp.h,cc)


- Architecture Description

1. To build a basic RTP receiver you just need a VideoAgent object. If
   you want to display the stream in a window on your screen you also
   need a TkWindow-derived object that registers itself as an Observer
   in the VideoAgent.

2. A session is represented by a Session/RTP/Video object, which is
   created during the initialization of the VideoAgent. As in the
   encoding case, this is the object that implements packet I/O.

3. A source in a session represents any of the applications that have
   joined the session. It is represented by a Source/RTP object.
   Beware: an application doesn't need to be transmitting data to be a
   source. One of the Source/RTP objects (in fact the first one that is
   created) represents our own application as a source. All sources are
   linked in a hash table that VideoAgent inherits from SourceManager.


- Decoding Description

a. Object Initialization (joining the session)

The first step is therefore to create a new VideoAgent object with an IP
address (possibly a multicast (m/c) one) as the second parameter.
VideoAgent::init{} chains (via "next") to RTPAgent::init{}, which first
of all chains down to MediaAgent::init{} and then to
SourceManager::SourceManager(). Second, RTPAgent::init{} calls
VideoAgent::create_session{}, which creates the Session/RTP/Video object
that will represent the session for us. Third, it calls
RTPAgent::reset{}, which calls RTPAgent::mk_local_source{}, which calls
"$self create-local".

The method create-local exists in neither RTPAgent nor MediaAgent, so
SourceManager::command() is called with "create-local" as its first
parameter. This calls SourceManager::init() (yeah, the name overloading
is definitely confusing), which does two things. First, it calls
SourceManager::create_source() with our own address; this crosses back
into the tcl world by calling MediaAgent::create-source{}, which creates
a "new Source/RTP" object that will represent our own source. Second, it
calls SourceManager::enter() with the new source handler, which adds the
new source to the SourceManager::hashtable_ list of sources.

Now we have joined the session, and all the world knows we're listening
to the address passed as a parameter to VideoAgent.


b. Learning about other people joining the session

When a remote box joins the session, our box's SourceManager::lookup()
is called with the address and srcid of the remote box and a handler to
our Session/RTP/Video object (mash/rtp/session-rtp.h,cc).
SourceManager::lookup() calls SourceManager::create_source(), which
calls up to MediaAgent::create-source{}. That method creates a new
Source/RTP object (mash/rtp/source.h,cc) representing the box we have
just learned about, adds it to its list of sources (the sources_
instvar) and returns the handler to SourceManager::lookup().
SourceManager::lookup() then calls up to Source/RTP::notify_observers{}
so that all observers can register the new Source/RTP object.
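The lookup path just described (find the source, otherwise create it,
enter it into the table and tell the observers) can be sketched in C++
as follows. This is only an illustration under stated assumptions, not
the Mash source: the std::map table, the Key alias, the on_new_source
callback (standing in for the tcl create-source{} and
notify_observers{} upcalls), the order of enter() and the notification,
and every signature shown are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for Source/RTP: one participant in the session.
struct Source {
    std::string addr;     // IP address of the participant
    std::uint32_t srcid;  // RTP source id
};

// Hypothetical stand-in for SourceManager. std::map replaces the real
// hash table only to keep the sketch short.
class SourceManager {
public:
    using Key = std::pair<std::string, std::uint32_t>;

    // Invoked once per newly created source (assumed observer hook).
    std::function<void(const Source&)> on_new_source;

    // Return the source for {addr, srcid}; create and register it the
    // first time this pair is seen.
    Source* lookup(const std::string& addr, std::uint32_t srcid) {
        auto it = table_.find(Key{addr, srcid});
        if (it != table_.end()) return it->second.get();
        Source* s = enter(create_source(addr, srcid));
        if (on_new_source) on_new_source(*s);
        return s;
    }

    std::size_t size() const { return table_.size(); }

private:
    // Build the object that represents a participant.
    std::unique_ptr<Source> create_source(const std::string& addr,
                                          std::uint32_t srcid) {
        return std::unique_ptr<Source>(new Source{addr, srcid});
    }

    // Add the source to the table and hand back a non-owning pointer.
    Source* enter(std::unique_ptr<Source> s) {
        assert(s != nullptr);
        Key k{s->addr, s->srcid};
        Source* raw = s.get();
        table_[k] = std::move(s);
        return raw;
    }

    std::map<Key, std::unique_ptr<Source>> table_;
};
```

Calling lookup() twice with the same {address, srcid} pair returns the
same Source and fires the callback only once; a different srcid on the
same address yields a second entry.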
As always, after SourceManager::create_source() is called,
SourceManager::enter() is also called. It adds a handler to the
just-created Source/RTP object to the list of sources in the
SourceManager object, SourceManager::hashtab_. The SourceManager uses
this table to keep track of all the sources in the m/c session.


c. Receiving streams

c.1. Getting the packets

The object that implements packet I/O is the Session/RTP/Video one,
which inherits from RTP_Session. Any packet arriving from outside for
our session is passed to RTP_Session::recv(), which in turn passes it to
RTP_Session::demux(). demux() gets the Source/RTP object that represents
the packet's source and checks whether it has a handler attached. A
Source handler may be any PacketModule object; normally it is a decoder
(an XXXDecoder object) for the video format used by the remote source.

If RTP_Session::demux() finds the handler, it dispatches the packet to
it. If the remote source representation doesn't yet have an object
attached to it as a handler (h), RTP_Session::demux() calls
Source/RTP::activate(), which in turn calls up to
VideoAgent::activate{}. This creates a decoder for the data using
VideoAgent::create_decoder and attaches it to the Source/RTP object by
calling Source/RTP::data-handler{}, which is dispatched to
Source/RTP::command() as "data-handler". For example, if the remote box
A is sending an H261 stream, the handler of the Source/RTP object that
represents A is set to a new Module/VideoDecoder/H261 object, which is
the tcl name for H261Decoder (mash/codec/decoder-h261.cc).

c.2. Assembling RTP packets into an encoded frame and decoding it

The dispatch described above, "handler->recv(packet)", is a call to
XXXDecoder::recv(). A decoder has to be able to assemble packets. Once
it decides it has all the packets that make up a frame, the XXXDecoder
decodes the frame by calling XXXDecoder::decode() and then passes the
result to Decoder::render_frame().
This method checks all the renderers that want the frame and sends it to
each of them by calling Renderer::recv().

c.3. The Observer model

How does the decoder know which renderers it has to send the frame to?
The Decoder class, and by inheritance any XXXDecoder object, maintains a
linked list of Renderer objects called engines_. engines_ lists all the
renderers the decoder has to send every frame to. The list starts out
empty and is managed with Decoder::attach() and Decoder::detach(). If a
Decoder receives a packet while its engines_ list is empty, it doesn't
send the frame anywhere.

The way to add a renderer to a decoder is to create the renderer and
then call "$decoder attach $renderer". This calls "XXXDecoder command
(attach)", and XXXDecoder sends it down to "Decoder command (attach)",
which does the attachment.

c.4. The Vic case

In Vic, the attachment of renderers is done by
VideoWidget::attach-decoder. Let's see the full Vic process.

The main object Vic creates is a VicApplication object
(mash/tcl/vic/application-vic.tcl). VicApplication::main{} calls
VicApplication::init_ui{}, which creates a VicUI object, the main UI
object (mash/tcl/vic/ui-main.tcl). VicUI::main{} calls
VicUI::layout_gui{}, which creates a new ActiveSourceManager object
(mash/tcl/vic/ui-activesourcemgr.tcl).

Any active Source/RTP object is managed by an ActiveSource object. The
ActiveSourceManager is in charge of firing up ActiveSource objects when
they are needed. During ActiveSourceManager::init{}, the
ActiveSourceManager object attaches itself to the VideoAgent as an
observer ("$videoAgent_ attach $self"), so every event the VideoAgent
has registered as observable is sent to the ActiveSourceManager.

When one of the Source/RTP objects starts transmitting,
ActiveSourceManager::activate{} is called. This proc calls
ActiveSourceManager::really_activate{}, which creates a new ActiveSource
object.
The ActiveSource object creates a VideoWidget object called thumbnail_, and then calls ActiveSource::attach-thumbnail{}, which calls VideoWidget::attach-decoder{}. This proc creates a Renderer object called target_ and finally calls "$decoder attach $target_". Now your new VideoWidget object (called thumbnail_) starts getting decoded frames from the XXXDecoder output.
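
The receive path of sections c.1 to c.3 can be summarized in a short C++
sketch. It is a simplified illustration under stated assumptions, not
the Mash code: the Packet layout, the use of the RTP marker bit to flag
the last packet of a frame, the string-based placeholder decode(), and
all signatures are hypothetical; only the names Decoder, Renderer,
engines_, attach, detach, recv, render_frame and demux are taken from
the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <vector>

// Hypothetical packet: payload plus a marker bit that this sketch
// assumes is set on the last packet of a frame.
struct Packet {
    std::string payload;
    bool marker;
};

// Stand-in for Renderer: just remembers the raw frames it is given.
struct Renderer {
    std::vector<std::string> frames;
    void recv(const std::string& raw_frame) { frames.push_back(raw_frame); }
};

// Stand-in for Decoder / XXXDecoder.
class Decoder {
public:
    void attach(Renderer* r) { assert(r != nullptr); engines_.push_back(r); }
    void detach(Renderer* r) { engines_.remove(r); }

    // Accumulate packets; when the frame is complete, decode it and
    // hand the raw frame to every attached renderer.
    void recv(const Packet& p) {
        pending_ += p.payload;
        if (!p.marker) return;  // frame not complete yet
        std::string raw = decode(pending_);
        pending_.clear();
        render_frame(raw);
    }

private:
    // Placeholder "decoding"; a real decoder would produce pixels.
    static std::string decode(const std::string& encoded) {
        return "raw(" + encoded + ")";
    }

    // Sends the frame nowhere when engines_ is empty.
    void render_frame(const std::string& raw) {
        for (Renderer* r : engines_) r->recv(raw);
    }

    std::list<Renderer*> engines_;  // renderers fed by this decoder
    std::string pending_;           // packets of the frame being assembled
};

// Stand-in for Source/RTP and its handler_.
struct Source {
    std::unique_ptr<Decoder> handler;
};

// Stand-in for RTP_Session::demux(): create the decoder lazily on the
// first packet from a source (the activate step), then dispatch.
void demux(Source& src, const Packet& p) {
    if (!src.handler) src.handler.reset(new Decoder());
    src.handler->recv(p);
}
```

In this sketch, frames completed before any Renderer is attached are
dropped, which mirrors the empty engines_ case; once
src.handler->attach(&renderer) has been called, each completed frame
reaches renderer.recv() exactly once until detach() is called.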