The most complete, focused guide to MPEG-4, the breakthrough standard for interactive multimedia.
MPEG-4 represents a breakthrough in multimedia, delivering not just outstanding compression but also a fully interactive user experience. In The MPEG-4 Book, two leaders of the MPEG-4 standards community offer a comprehensive, targeted guide to the MPEG-4 standard and its use in cutting-edge applications. Fernando Pereira and Touradj Ebrahimi, together with a unique collection of key MPEG experts, demonstrate how MPEG-4 addresses tomorrow's multimedia applications more successfully than any previous standard. They review every element of the standard to offer you a book that covers:
The authors also walk through the MPEG-4 Systems Reference Software, offering powerful real-world insights for every product developer, software professional, engineer, and researcher involved with MPEG-4 and state-of-the-art multimedia delivery.
Part of the new IMSC Press Series from the Integrated Multimedia System Center at the University of Southern California, a federally funded center specializing in cutting-edge multimedia research.
Foreword.
Preface.
Abbreviations.
1. Context, Objectives, and Process.
MPEG-4 Objectives. Formal Standardization Process. MPEG Modus Operandi. MPEG-4 Standard Organization. MPEG-4 Schedule. MPEG-4 Industry Forum. Summary. References.
Design Goals. An End-to-End Walkthrough. Terminal Architecture. MPEG-4 Tools. MPEG-4 and Other Multimedia Standards. MPEG-4 Applications. Summary. References.
Object Descriptors: Entry Points to MPEG-4 Content. Semantic Description and Access Management. Timing Model and Synchronization of Streams. Summary. References.
Basics of BIFS. Basic BIFS Features by Example. Advanced BIFS Features. A Peek Ahead on BIFS. Profiles. All BIFS Nodes. Summary. References.
MPEG-J Architecture. MPEG-J APIs. Application Scenarios. Reference Software. Summary. References.
Objectives. Cross-Standard Interoperability. XMT Two-Tier Architecture. XMT-Ω Format. XMT-A Format. Summary. References.
Delivery Framework. FlexMux Tool. MPEG-4 File Format. Transporting MPEG-4 over MPEG-2. Transporting MPEG-4 over IP. Summary. References.
General Overview. Coding of Rectangular Video Objects. Coding of Arbitrarily Shaped Video Objects. Scalable Video Coding. Special Video Coding Tools. Visual Texture Coding. Summary. References.
SNHC Overview. Face and Body Animation. 2D Mesh Coding. 3D Mesh Coding. View-Dependent Scalability. Profiles and Levels. Summary. Acknowledgments. References.
Introduction to Speech Coding. Overview of MPEG-4 Speech Coders. MPEG-4 CELP Coding. MPEG-4 HVXC Coding. Error Robustness. Summary. References.
Introduction to Time/Frequency Audio Coding. MPEG-2 Advanced Audio Coding. MPEG-4 Additions to AAC. MPEG-4 Scalable Audio Coding. Introduction to Parametric Audio Coding. MPEG-4 HILN Parametric Audio Coding. Summary. Acknowledgments. References.
Synthetic-Natural Hybrid Coding of Audio. Structured Audio Coding. Text-to-Speech Interface. Audio Composition. Summary. References.
Profiling and Conformance: Goals and Principles. Profiling Policy and Version Management. Overview of Profiles in MPEG-4. Summary. Acknowledgments. References.
Reference Software Modules. Systems Reference Software. MPEG-4 Player Architecture. Scene Graph. PROTOs. Synchronization. Object Descriptors. Plug-Ins. 2D Compositor. 3D Compositor. Summary. References.
General Aspects. Test Methods. Error-Resilience Test. Content-Based Coding Test. Coding Efficiency for Low and Medium Bit-Rate Test. Advanced Real-Time Simple Profile Test. Summary. References.
General Aspects. Test Methods. Narrowband Digital Audio Broadcasting Test. Audio on the Internet Test. Speech Communication Test. Version 2 Coding Efficiency Test. Version 2 Error-Robustness Test. Summary. References.
Video Buffering Verifier Mechanism. Definition of Levels for Video Profiles. Definition of Levels for Synthetic Profiles. Definition of Levels for Synthetic and Natural Hybrid Profiles. References.
Complexity Units. Definition of Levels for Audio Profiles. References.
Simple 2D Profile. Simple 2D + Text Profile. Core 2D Profile. Advanced 2D Profile. References.
Simple 2D Profile. Audio Profile. 3D Audio Profile. Basic 2D Profile. Core 2D Profile. Advanced 2D Profile. Main 2D Profile. References.
Scene APIs. Resource and Decoder APIs. Network APIs. Section Filtering APIs.
The last decade has shown the quick growth of multimedia applications and services, with audiovisual information playing an increasingly important role. Today's existence of tens of millions of digital audiovisual content users and consumers is tightly linked to the maturity of such technological areas as video and audio compression and digital electronics and to the timely availability of appropriate audiovisual coding standards. These standards allow the industry to make major investments with confidence in new products and applications and users to experience easy consumption and exchange of content.
In this environment, the Moving Picture Experts Group (MPEG) is playing an important role, thanks to the standards it has been developing. After developing the MPEG-1 and MPEG-2 standards, which are omnipresent in diverse technological areas and markets (such as digital television, video recording, audio broadcasting, and audio players and recorders), MPEG decided to follow a more challenging approach, moving away from the traditional representation models by adopting a new model based on the explicit representation of objects in a scene. The new object-based audiovisual representation model is much more powerful in terms of functionalities that it can support. The flexibility of this new model not only opens new doors to existing multimedia applications and services, it also allows the creation of a wide range of new ones, offering novel capabilities to users that extend or redefine their relationship with audiovisual information.
The MPEG-4 standard is the first audiovisual coding standard that benefits from a representation model in which audiovisual information is represented in a sophisticated and powerful way that is close not only to the way we experience "objects" in the real world but also to the way digital content is created. In a way, MPEG-4 is the first digital audiovisual coding standard in which technology goes beyond a simple translation of analog techniques to the digital world and exploits the full power of digital technologies.
With the MPEG-4 standard emerging as the next milestone in audiovisual representation, interested people worldwide are looking for reference texts that, while not providing the level of scrutiny of the standard itself, give a detailed overview of the technology standardized in MPEG-4. Because it takes advantage of many technologies, MPEG-4 may seem a large and complex standard to learn about. However, it has a clear structure that can be understood by interested people.
The purpose of this book is to explain the standard clearly, precisely, and completely without getting lost in the details. Although surely there will be other good references on MPEG-4, we tried hard to make this the reference by creating a book exclusively dedicated to MPEG-4, which addresses all parts of the standard, as timely and complete as possible, written and carefully reviewed by the foremost experts: those who designed and wrote the standard during many years of joint work, frustration, and satisfaction.
To help readers find complementary or more detailed information, the chapters include a large number of references. Some of these references are MPEG documents not readily available to the public. For access to these, first check the MPEG Web page at mpeg.telecomitalialab.com. Some of the most important MPEG documents are available from that site. If that does not work, contact the MPEG "Head of Delegation" from your country (check www.iso.ch/addresse/address.html), who should be able to help you get access to documents that were declared "publicly available" but still may be hard to obtain.
The book is organized in three major parts: the introductory chapters, the standard specification chapters, and the complementary chapters.
The introductory chapters, Chapters 1 and 2, introduce the reader to the MPEG-4 standard. Chapter 1 presents the motivation, context, and objectives of the MPEG-4 standard and reviews the process followed by MPEG to arrive at its standards. Chapter 2 gives a short overview of the MPEG-4 standard, highlighting its design goals. It also describes the end-to-end creation, delivery, and consumption processes, and it explains the relation of MPEG-4 to other relevant standards and technologies. Lastly, it proposes three example applications.
The standard specification chapters describe and explain the MPEG-4 normative technology, as specified in the various parts of the standard. The first batch of these chapters addresses the technologies associated with the layers below the audiovisual coding layer. Chapter 3 addresses the means to manage and synchronize the potentially large numbers of elementary streams in an MPEG-4 presentation. Essential MPEG-4 concepts and tools such as object descriptors, the Sync layer, the system decoder model, and timing behavior are presented. Chapter 4 is dedicated to the MPEG-4 scene description format, a major innovation, supporting MPEG-4's object-based data representation model. It uses a number of examples to explain the BInary Format for Scenes, or BIFS format. Chapter 5 explains how it is possible to use the Java language to control features of an MPEG-4 player through the MPEG-J application engine. This chapter presents the MPEG-J architecture and describes the functions of an application engine. It also introduces the new Java APIs specific to MPEG-4 (Terminal, Scene, Resource, Decoder, and Network) that were designed to communicate with the MPEG-4 player. Chapter 6 presents the Extensible MPEG-4 Textual Format (XMT) framework, which consists of two levels of textual syntax and semantics: the XMT-A format, providing a one-to-one deterministic mapping to the MPEG-4 Systems binary representation, and the XMT-Ω format, providing a high-level abstraction of XMT-A to content authors so they can preserve the original semantic information. Chapter 7 describes the general approach and some specific mechanisms for the delivery of MPEG-4 presentations. It introduces the Delivery Multimedia Integration Framework (DMIF), which specifies the interfaces to mechanisms to transport MPEG-4 data, and describes two DMIF instances: MPEG-4 over MPEG-2 and MPEG-4 over IP. (Of course, MPEG-4 presentations can be delivered over other transport protocols as needed.) Finally, the chapter introduces the delivery-related tools included in the MPEG-4 Systems standard, notably the FlexMux tool and the MPEG-4 file format.
While Chapters 3-7 address technologies specified in MPEG-4 Part 1: Systems, Chapters 8-12 focus on the media representation technologies specified in MPEG-4 Visual and MPEG-4 Audio, as well as a few synthetic audio techniques specified in MPEG-4 Systems. Chapter 8 introduces all the tools related to video and texture coding for rectangular and shaped objects, and it presents the tools for important functionalities such as error resilience and scalability. Chapter 9 presents the coding tools specified by MPEG-4 to support the representation of synthetic visual content. These tools address face and body animation, 2D and 3D mesh coding, and view-dependent scalability. Chapter 10 introduces the coding tools for natural speech. To address a wide range of bit rates, quality levels, speech bandwidths, and other functionalities, MPEG-4 specifies two coding algorithms: CELP and HVXC. Chapter 11 addresses the general audio coding tools. Here, three coding algorithms are adopted to fulfill the requirements: an AAC-based algorithm with some extensions over MPEG-2 advanced audio coding (AAC); TwinVQ, which is a vector quantization algorithm suitable for very low bit rates; and HILN, which is a parametric coding algorithm providing additional functionalities. Chapter 12 presents the MPEG-4 audio synthetic-natural hybrid coding (SNHC) and composition and presentation tools. The main SNHC audio tools are structured audio and the text-to-speech interface. The audio composition and presentation tools are known as AudioBIFS and Advanced AudioBIFS.
Profiling and conformance are the major topics addressed in Chapter 13. Profiles and levels provide technical solutions for classes of applications with similar functional and operational requirements, allowing interoperability with reasonable complexity and cost. Moreover, they allow conformance to be tested, which is essential for determining if bitstreams and terminals are compliant.
Chapter 14 presents the concept of reference software in MPEG-4 and elaborates on the software architecture of the MPEG-4 Systems player, included in Part 5 of the MPEG-4 standard.
The complementary chapters (15 and 16) address the validation testing of the MPEG-4 video and audio technology. Although they do not cover MPEG-4 normative technology, they provide important information about the standard's performance from various points of view and for various potential applications.