Streaming Protocols
In the previous section, we looked at the infrastructure of the Internet, starting with the physical layer and working our way up to the application layer, where web protocols such as HTTP and video streaming protocols exist. We spent a good deal of time differentiating the two major transmission protocols: TCP and UDP. HTTP, the Web protocol, is based on TCP and is optimized for retrieving files. It has commands for getting files, checking the date and size of files, posting data from a web form, and getting portions of files.
HTTP does not, however, have any concept of real-time transfer in it. HTTP takes as long as it takes. And in HTTP, the client and server take turns talking; no bi-directional chatter is allowed.
As we discussed earlier, UDP (not TCP) is the preferred transmission protocol for real-time streaming because it is not troubled by (or even aware of) dropped packets. UDP can send packets at a constant rate, regardless of network congestion or the application's ability to receive them. Now we must consider the features of a streaming media protocol built on UDP. Such protocols have to perform a series of tasks:
Setup. Providing start, stop, fast-forward, rewind, and track skip commands.
Transport. Providing a means to deliver multiple streams of media and (possibly) detecting missing packets.
Synchronization. Providing a means to sync up different media streams into a shared time-base in real time, and re-sequencing out-of-order packets.
Quality monitoring. Providing a means to report conditions such as packet loss and client playback quality back to the server.
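To make the transport and synchronization tasks concrete, here is a minimal sketch of a receive buffer that re-sequences out-of-order packets and records gaps. The class name `ReorderBuffer`, its methods, and the plain integer sequence numbers are illustrative assumptions, not part of any particular protocol; a real implementation would also handle sequence-number wraparound and timeouts, which this sketch ignores.

```python
import heapq


class ReorderBuffer:
    """Hypothetical buffer that releases packets in sequence order."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # next sequence number we expect to release
        self.heap = []             # min-heap of (seq, payload) awaiting release
        self.missing = []          # sequence numbers we gave up waiting for

    def push(self, seq, payload):
        """Store an arriving packet; packets older than next_seq are dropped."""
        if seq >= self.next_seq:
            heapq.heappush(self.heap, (seq, payload))

    def pop_ready(self):
        """Return the payloads that are next in sequence, in order."""
        ready = []
        while self.heap and self.heap[0][0] <= self.next_seq:
            seq, payload = heapq.heappop(self.heap)
            if seq == self.next_seq:  # anything else is a duplicate; discard it
                ready.append(payload)
                self.next_seq += 1
        return ready

    def skip_gap(self):
        """Give up on the packets missing before the oldest buffered one."""
        if self.heap:
            oldest = self.heap[0][0]
            self.missing.extend(range(self.next_seq, oldest))
            self.next_seq = oldest
```

A player would call `push` for every arriving packet, `pop_ready` to feed the decoder, and `skip_gap` once it decides a missing packet is not going to arrive in time to be played.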
Real-Time Transport Protocol (RTP)
The Internet Engineering Task Force (IETF; see Chapter 8, "Internet Video Standards") has standardized a set of protocols for video delivery. The Real-time Transport Protocol (RTP) provides all the transport and synchronization features listed in the previous section. RTP is spoken between a media server and a media player application. RTP provides the actual data transfer; for example, the audio and video come down from the media server as two different streams over RTP. RTP usually runs over UDP, but it can run over TCP as well, and it can even run over other, non-Internet transport systems. It takes care of packet timing; it doesn't actually ensure real-time delivery, but it wraps the different frames of audio and video with enough timing information so they can be synchronized in real time on the receiving end. RTP is also the standard way to deliver media over UDP on multicast networks.
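The timing information RTP wraps around each frame lives in a small fixed header. The sketch below packs and parses the 12-byte fixed header as laid out in the RTP specification (RFC 3550): version, marker bit, payload type, sequence number, timestamp, and a source identifier (SSRC). The function names are hypothetical, and the sketch assumes no padding, header extension, or contributing-source list.

```python
import struct

# 12-byte fixed RTP header in network byte order:
# flags byte, marker/payload-type byte, 16-bit sequence, 32-bit timestamp, 32-bit SSRC
RTP_HEADER = struct.Struct("!BBHII")


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build a fixed RTP header (version 2, no padding/extension/CSRCs)."""
    byte0 = 2 << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return RTP_HEADER.pack(
        byte0, byte1, seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF
    )


def parse_rtp_header(packet):
    """Read the fixed header fields from the start of an RTP packet."""
    byte0, byte1, seq, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,        # lets the receiver reorder and detect loss
        "timestamp": timestamp, # lets the receiver schedule and sync playback
        "ssrc": ssrc,           # identifies which stream the packet belongs to
    }
```

The sequence number supports loss detection and re-sequencing, while the timestamp is what lets a player line up separate audio and video streams on a shared time-base.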
Another protocol in the RTP specification, RTCP (RTP Control Protocol), couples with RTP to provide a control channel that's useful for quality monitoring. Servers send RTCP packets down to all the clients; clients send RTCP packets back periodically (for example, every 5 seconds) to let the server know the quality of the stream they receive. The server then might throttle down the quality of the stream, if needed.
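As a rough illustration of the quality feedback a client can report, the sketch below computes the loss fraction for one reporting interval as an 8-bit fixed-point value (0 to 255), modeled on the loss calculation described in the RTP specification. The function names and the 5 percent throttling threshold are assumptions made for this example, not values taken from any server.

```python
def fraction_lost(expected_prior, received_prior, expected_now, received_now):
    """Loss over the last interval as an 8-bit fixed-point fraction (0-255).

    The "expected" counts come from the range of sequence numbers seen so far;
    the "received" counts are the packets that actually arrived.
    """
    expected_interval = expected_now - expected_prior
    received_interval = received_now - received_prior
    lost_interval = expected_interval - received_interval
    if expected_interval == 0 or lost_interval <= 0:
        return 0  # nothing expected, or duplicates made the loss negative
    return (lost_interval << 8) // expected_interval


def should_throttle(fraction, threshold=0.05):
    """Hypothetical server-side policy: throttle when loss exceeds threshold."""
    return fraction / 256 > threshold
```

For instance, if 100 packets were expected in the interval and 75 arrived, `fraction_lost(0, 0, 100, 75)` yields 64, which represents 25 percent loss on the 0 to 255 scale.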
In late 1996, the Real-Time Streaming Protocol (RTSP) was introduced to provide the setup features for video delivery. RTSP essentially provides the VCR controls (play, stop, fast-forward, and rewind) for a streaming media server. The protocol is modeled somewhat after HTTP because it was intended to be as good for streaming media as HTTP had been for web pages. RTSP can work in conjunction with RTP: RTSP sets up the connection, and then RTP is used to deliver the data. RealNetworks and Netscape both worked on the specification of this protocol. RealNetworks then switched to RTSP for its transport setup, deprecating its earlier PNM (Progressive Networks Media) and PNA (Progressive Networks Audio) transport protocols.
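Because RTSP is modeled on HTTP, its requests are plain text lines that look much like HTTP requests. The sketch below formats one; the helper name `build_rtsp_request`, the `example.com` URL, and the session value are placeholders for illustration, and a real client would first obtain its session identifier from the server's reply to a SETUP request.

```python
def build_rtsp_request(method, url, cseq, session=None, extra_headers=None):
    """Format an RTSP/1.0 request as bytes ready to send over the control connection."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session is not None:
        lines.append(f"Session: {session}")
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Header lines end with CRLF; a blank line terminates the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Hypothetical "play from the start" command for an already set-up session.
play = build_rtsp_request(
    "PLAY", "rtsp://example.com/clip", 3,
    session="12345", extra_headers={"Range": "npt=0-"},
)
```

The `CSeq` header numbers each request so replies can be matched to it, and the `Range` header is how a seek to a particular point in the clip is expressed.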
Microsoft Media Server Protocol (MMS)
In the late 1990s, Microsoft created its own set of protocols for media delivery. Although it already used RTP in its NetMeeting conferencing application, Microsoft had not implemented RTSP in any products.
Microsoft created the MMS (Microsoft Media Server) protocol, which integrated most of the features of RTP, RTCP, and RTSP but removed some of the pedantic features of RTP. To reach the broadest possible audience, Microsoft designed the protocol with several different versions, each able to cross a more restricted kind of network:
MMSU goes over UDP for the most efficient delivery.
MMST goes over TCP for networks that do not permit UDP traffic.
MMS over HTTP (sometimes called MMSH) carries the MMS protocol inside HTTP for networks whose firewalls allow only HTTP traffic.
Falling back from the most efficient transport to those that can cross more restricted networks, until the audio or video starts working, is a common approach, as shown in Figure 5-21.
Figure 5-21 Microsoft's "falling back" system.
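The fallback order described above (UDP first, then TCP, then HTTP) can be sketched as a simple loop. The function `connect_with_fallback` and the connector callables passed to it are hypothetical stand-ins; this is not how any particular player implements it, only an outline of the idea.

```python
def connect_with_fallback(url, transports):
    """Try each (name, connect) pair in order and return the first that works.

    `transports` is an ordered list such as
    [("MMSU", connect_udp), ("MMST", connect_tcp), ("HTTP", connect_http)],
    where each connect function is a caller-supplied (hypothetical) callable
    that returns a stream object or raises OSError on failure.
    """
    errors = {}
    for name, connect in transports:
        try:
            return name, connect(url)
        except OSError as exc:
            errors[name] = exc  # remember why this transport failed, then move on
    raise ConnectionError(f"all transports failed: {sorted(errors)}")
```

Ordering the list from most to least efficient means a viewer behind a strict firewall still gets a stream, at the cost of a short delay while the earlier attempts fail.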
The MMS protocol provides setup, transport, synchronization, and quality monitoring features, and has additional capabilities for transmitting digital rights management (DRM) information and requesting licenses from the server.
Delightfully, Microsoft also supports the more standard RTSP/RTP protocols. RealNetworks, Apple, and Microsoft have all implemented streaming media systems that use the RTP and RTSP specifications, and each of their media servers and players can use RTP as a protocol for media transport.
RealNetworks was the first to have a full RTSP/RTP system (circa 1998) in its RealMedia G2 product. Apple adopted RTSP/RTP for the open-source Darwin Streaming Server in 2000 for delivery of QuickTime v4.0. Finally, Microsoft implemented RTSP/RTP support in Windows Media version 9 in late 2002, and is heading in the direction of fully standardizing on RTSP/RTP as well.
Shoutcast/Icecast Protocol (ICY)
The Shoutcast/Icecast streaming protocols began in 1998 as a simple hack to stream MP3 radio stations. A company called Nullsoft (now part of AOL) created the Shoutcast server, which uses a slightly customized version of HTTP (called the ICY protocol, with a URL like icy://http://www.masteringinternetvideo.com:8200) and can send or receive streamed MP3 or pretty much any streamable audio or video codec.
The first version of the protocol was so simple that it consisted of merely MP3s shoved one after another. Later versions added support for sending track names, titles, and more along with the songs, so that client players could display them. Finally, very stable video streaming features were added.
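As an illustration of how track titles can ride along with the audio, the sketch below follows the widely used ICY metadata convention: the server announces an interval in an `icy-metaint` response header, and after every interval of audio bytes it inserts one length byte (in units of 16 bytes) followed by a NUL-padded text block such as `StreamTitle='...';`. The function name is hypothetical, and the sketch assumes a buffered binary file-like object whose `read(n)` returns fewer than `n` bytes only at end of stream.

```python
def split_icy_stream(stream, metaint):
    """Yield (audio_bytes, metadata_text) pairs from an ICY-style byte stream.

    `metaint` is the number of audio bytes between metadata blocks.
    The metadata text is empty when the server sent a zero-length block.
    """
    while True:
        audio = stream.read(metaint)
        if len(audio) < metaint:
            if audio:
                yield audio, ""  # trailing partial chunk at end of stream
            return
        length_byte = stream.read(1)
        if not length_byte:
            yield audio, ""      # stream ended right after an audio chunk
            return
        meta_len = length_byte[0] * 16  # length is counted in 16-byte units
        raw = stream.read(meta_len)
        text = raw.rstrip(b"\x00").decode("utf-8", errors="replace")
        yield audio, text
```

A player would pass each audio chunk to its MP3 decoder and update the displayed title whenever the metadata text is non-empty.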
An explosion of different players that can play these streams, as well as a variety of services that can reflect the streams, resulted in the rapid improvement and de facto standardization of the Shoutcast protocol. In fact, there are more players for Shoutcast MP3 streams than for any other kind of stream. Every streaming MP3 player on the market (including the big three media players as well as Apple's iTunes) plays Shoutcast audio streams. Looked at in this light, Shoutcast is in some ways the most cross-platform, interoperable protocol for streaming audio. Currently, however, video playback is limited to WinAmp and other NSV (Nullsoft Video) players.