Decomposing AV Foundation
One of the biggest early challenges in learning AV Foundation is making sense of the large number of classes the framework provides. The framework contains more than 100 classes, a large collection of protocols, and a variety of functions and constants. This can seem overwhelming at first, but when you decompose the framework into its functional units it becomes much more understandable. Let’s look at the key areas of functionality it provides.
Audio Playback and Recording
If you look back at Figure 1.1, you’ll see a small box in the upper-right corner of the AV Foundation box labeled Audio-Only Classes. Some of the earliest functionality provided by AV Foundation relates to audio. AVAudioPlayer and AVAudioRecorder provide easy ways of incorporating audio playback and recording into your applications. These aren’t the only ways of playing and recording audio in AV Foundation, but they are the easiest to learn and provide some powerful features.
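To make the audio-only classes concrete, here is a minimal Swift sketch of setting up a player and a recorder. The helper names makeAudioPlayer and makeAudioRecorder and the recording settings are illustrative assumptions, not part of the framework. On iOS you would normally also configure an AVAudioSession and request microphone permission before recording; that step is omitted here.

```swift
import AVFoundation

// Hypothetical helper: creates a player for a local audio file.
// The caller must keep a strong reference to the returned player,
// otherwise playback stops as soon as the player is deallocated.
func makeAudioPlayer(for fileURL: URL) throws -> AVAudioPlayer {
    let player = try AVAudioPlayer(contentsOf: fileURL)
    player.numberOfLoops = 0   // play once; -1 would loop indefinitely
    player.volume = 1.0
    player.prepareToPlay()     // preload buffers to reduce start-up latency
    return player
}

// Hypothetical helper: creates a recorder that writes mono AAC audio.
// The settings below are example values, not recommendations.
func makeAudioRecorder(writingTo fileURL: URL) throws -> AVAudioRecorder {
    let settings: [String: Any] = [
        AVFormatIDKey: Int(kAudioFormatMPEG4AAC),
        AVSampleRateKey: 44_100.0,
        AVNumberOfChannelsKey: 1
    ]
    let recorder = try AVAudioRecorder(url: fileURL, settings: settings)
    recorder.prepareToRecord() // creates the file and readies the hardware
    return recorder
}
```

With these in place, playback is a call to play() on the returned player, and recording is a call to record() followed later by stop() on the returned recorder.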
Media Inspection
AV Foundation provides the capability to inspect the media you are using. You can inspect media assets to determine their suitability for a particular task, such as whether they can be used for playback or whether they can be edited or exported. You can retrieve technical attributes of the media, such as its duration, its creation date, or its preferred playback volume. Additionally, the framework provides powerful metadata support built around the AVMetadataItem class. This enables you to read and write descriptive metadata about the media, such as album and artist information.
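As a rough illustration of asset inspection, the following Swift sketch loads a few properties of an asset and looks up its artist metadata. It assumes the asynchronous load(_:) API, which requires iOS 15 or macOS 12 and later; the MediaSummary type and the summarize(assetAt:) function are hypothetical names introduced only for this example.

```swift
import AVFoundation

// Hypothetical value type collecting the inspected attributes.
struct MediaSummary {
    let durationSeconds: Double
    let isPlayable: Bool
    let isExportable: Bool
    let artist: String?
}

// Assumes iOS 15 / macOS 12 or later for the async load(_:) API.
func summarize(assetAt url: URL) async throws -> MediaSummary {
    let asset = AVURLAsset(url: url)

    // Load suitability flags and duration without blocking the caller.
    let (duration, isPlayable, isExportable) =
        try await asset.load(.duration, .isPlayable, .isExportable)

    // Common metadata is exposed as an array of AVMetadataItem objects.
    let metadata = try await asset.load(.commonMetadata)
    let artistItems = AVMetadataItem.metadataItems(
        from: metadata,
        filteredByIdentifier: .commonIdentifierArtist
    )

    var artist: String? = nil
    if let item = artistItems.first {
        artist = try await item.load(.stringValue)
    }

    return MediaSummary(
        durationSeconds: duration.seconds,
        isPlayable: isPlayable,
        isExportable: isExportable,
        artist: artist
    )
}
```

Loading properties asynchronously matters because an asset may need to read from disk or the network before it can answer these questions.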
Video Playback
One of the more common uses of AV Foundation is video playback, which is a primary or secondary use case in many media applications. The framework enables you to play video assets from either a local file or a remote stream and to control the playback and display of the video content. The central classes in this area are AVPlayer and AVPlayerItem, which enable you to control the playback of an asset and to incorporate more advanced features, such as subtitles, chapter information, and access to alternate audio and video tracks.
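The relationship between these playback classes can be sketched in a few lines of Swift. The makeVideoPlayback(for:) function is a hypothetical helper, and marking it @MainActor is a cautious assumption so that the player objects are created on the main thread. The sketch stops at creating the objects; adding the layer to a view hierarchy is left to the host application.

```swift
import AVFoundation

// Hypothetical helper: wires an item, a player, and a display layer together.
// The URL may point to a local file or a remote stream.
@MainActor
func makeVideoPlayback(for url: URL) -> (player: AVPlayer, layer: AVPlayerLayer) {
    // AVPlayerItem models the presentation state of one asset.
    let item = AVPlayerItem(url: url)

    // AVPlayer drives playback of its current item.
    let player = AVPlayer(playerItem: item)

    // AVPlayerLayer renders the player's video output.
    let layer = AVPlayerLayer(player: player)
    layer.videoGravity = .resizeAspect // preserve aspect ratio inside the layer bounds

    return (player, layer)
}
```

Once the layer is installed in a view, calling play() on the player starts playback and pause() suspends it.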
Media Capture
These days, almost all Macs and all iOS devices include built-in cameras. These are high-quality devices that can capture both still images and video. AV Foundation provides a rich set of APIs that give you fine-grained control over the capabilities of these devices. The central class in capture scenarios is AVCaptureSession, which acts as the hub for routing camera device output to movie and image files as well as media streams. This has always been a robust area of functionality within AV Foundation and has been significantly enhanced again in the most recent release of the framework.
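The routing role of AVCaptureSession can be illustrated with a small Swift sketch that connects the default camera to a movie file output. The CaptureSetupError type and the makeMovieCaptureSession() function are hypothetical names. The sketch assumes the app has already been granted camera access and declares NSCameraUsageDescription in its Info.plist; it does not start the session or begin recording.

```swift
import AVFoundation

// Hypothetical error type for this example.
enum CaptureSetupError: Error {
    case noCamera
    case cannotAddInput
    case cannotAddOutput
}

// Hypothetical helper: builds a session that routes camera input to a movie file output.
func makeMovieCaptureSession() throws
    -> (session: AVCaptureSession, output: AVCaptureMovieFileOutput) {
    let session = AVCaptureSession()

    // Batch the configuration changes and commit them together,
    // even if an error is thrown part-way through.
    session.beginConfiguration()
    defer { session.commitConfiguration() }

    session.sessionPreset = .high

    // Input side: the system's default video capture device.
    guard let camera = AVCaptureDevice.default(for: .video) else {
        throw CaptureSetupError.noCamera
    }
    let input = try AVCaptureDeviceInput(device: camera)
    guard session.canAddInput(input) else {
        throw CaptureSetupError.cannotAddInput
    }
    session.addInput(input)

    // Output side: writes captured media to a movie file.
    let output = AVCaptureMovieFileOutput()
    guard session.canAddOutput(output) else {
        throw CaptureSetupError.cannotAddOutput
    }
    session.addOutput(output)

    return (session, output)
}
```

After calling startRunning() on the session, typically from a background queue, you would ask the output to start recording to a file URL and supply a recording delegate.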
Media Editing
AV Foundation also provides very strong support for media composition and editing. It enables you to create applications that can compose multiple tracks of audio and video together, trim and edit individual media clips, modify audio parameters over time, and add animated title and transition effects. Tools such as Final Cut Pro X and iMovie for the Mac and iPad are prime examples of the kind of applications that can be built using this functionality.
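As a small taste of composition, the following Swift sketch trims the start of several clips and places them back to back on a single video track. The CompositionError type, the composeClips(from:maxClipLength:) function, and its parameters are hypothetical names for this example. It assumes iOS 15 or macOS 12 and later for the asynchronous loading calls, and it ignores audio, transitions, and titles entirely.

```swift
import AVFoundation

// Hypothetical error type for this example.
enum CompositionError: Error {
    case cannotCreateTrack
}

// Assumes iOS 15 / macOS 12 or later for the async loading API.
func composeClips(from urls: [URL], maxClipLength: CMTime) async throws
    -> AVMutableComposition {
    let composition = AVMutableComposition()

    // One video track in the composition receives every clip in sequence.
    guard let videoTrack = composition.addMutableTrack(
        withMediaType: .video,
        preferredTrackID: kCMPersistentTrackID_Invalid
    ) else {
        throw CompositionError.cannotCreateTrack
    }

    var cursor = CMTime.zero // where the next clip will be inserted

    for url in urls {
        let asset = AVURLAsset(url: url)

        // Skip assets that have no video track.
        guard let sourceTrack = try await asset.loadTracks(withMediaType: .video).first else {
            continue
        }

        // Trim: keep at most maxClipLength from the start of each clip.
        let duration = try await asset.load(.duration)
        let length = CMTimeMinimum(duration, maxClipLength)
        let range = CMTimeRange(start: .zero, duration: length)

        try videoTrack.insertTimeRange(range, of: sourceTrack, at: cursor)
        cursor = CMTimeAdd(cursor, length)
    }

    return composition
}
```

Because a composition is itself an asset, the result can be handed to an AVPlayerItem for preview or to an export session for writing to disk.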
Media Processing
Although much can be accomplished in AV Foundation without getting too deeply into the bits and bytes of the media, at times you need to get access to this level of detail. Fortunately, when you need to perform more advanced media processing, you can do so using the AVAssetReader and AVAssetWriter classes. These classes provide direct access to the video frames and audio samples, so you can perform any kind of advanced processing you require.
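To show what direct access to video frames looks like, here is a minimal Swift sketch that uses AVAssetReader to walk through the decoded frames of an asset's first video track and count them. The ProcessingError type and the countVideoFrames(in:) function are hypothetical, the BGRA pixel format is just an example choice, and the track-loading call assumes iOS 15 or macOS 12 and later. A real application would process each pixel buffer rather than merely count it.

```swift
import AVFoundation

// Hypothetical error type for this example.
enum ProcessingError: Error {
    case noVideoTrack
    case cannotAddOutput
    case readFailed
}

// Assumes iOS 15 / macOS 12 or later for the async loading API.
func countVideoFrames(in url: URL) async throws -> Int {
    let asset = AVURLAsset(url: url)
    guard let track = try await asset.loadTracks(withMediaType: .video).first else {
        throw ProcessingError.noVideoTrack
    }

    let reader = try AVAssetReader(asset: asset)

    // Ask for decoded frames in 32-bit BGRA (an example format).
    let settings: [String: Any] = [
        kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA)
    ]
    let output = AVAssetReaderTrackOutput(track: track, outputSettings: settings)
    guard reader.canAdd(output) else {
        throw ProcessingError.cannotAddOutput
    }
    reader.add(output)

    guard reader.startReading() else {
        throw reader.error ?? ProcessingError.readFailed
    }

    var frameCount = 0
    // copyNextSampleBuffer() returns nil when the track is exhausted or reading fails.
    while let sample = output.copyNextSampleBuffer() {
        // Each sample that carries an image buffer is one decoded video frame.
        if CMSampleBufferGetImageBuffer(sample) != nil {
            frameCount += 1
        }
    }

    if reader.status == .failed {
        throw reader.error ?? ProcessingError.readFailed
    }
    return frameCount
}
```

AVAssetWriter follows the mirror-image pattern: you append sample buffers to writer inputs, and the writer encodes them into a new media file.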