Inside Oracle VM VirtualBox
Oracle VM VirtualBox ("VirtualBox") is a high-performance, cross-platform virtualization engine for use on computers running Microsoft Windows, the most popular Linux distributions, Oracle Solaris, or Mac OS X. Designed for use on Intel and AMD x86 systems, VirtualBox can be deployed on desktop or server hardware. As a hosted hypervisor, it extends the existing operating system installed on the hardware rather than replacing it.
VirtualBox includes a hypervisor for the host platform, an application programming interface (API) and software development kit (SDK) for managing guest virtual machines, a command-line tool for managing guests locally, a web service for remote management of guests, a wizard-style graphical tool to manage guests, a graphical console for displaying guest applications on the local host, and a built-in Remote Desktop Protocol (RDP) server that provides complete access to a guest from a remote client.
As shown in Figure 5.1, VirtualBox can run on a wide variety of host platforms. Binaries are available for these operating systems, most of them in 32-bit and 64-bit versions:
- Solaris 10 5/08 and newer, and OpenSolaris 2008.05 and newer
- Oracle Enterprise Linux (32-bit)
- Microsoft Windows (XP, Vista, 7) and Windows Server 2003 and 2008
- Mac OS X 10.5 and newer (Intel only)
- Linux distributions, including SuSE 9 and newer, Ubuntu, Red Hat Enterprise Linux 4 and newer, and others
Figure 5.1 Platforms Supported by Oracle VM VirtualBox
There are no specific limitations on the guest operating system, but supported guests include all of the host operating systems plus FreeBSD, OS/2, and legacy Windows versions (NT, Windows 98, Windows 3.1, DOS). No special hardware is required to run VirtualBox, other than an Intel x86-compatible system and adequate memory to run the guests. If the system has Intel VT-x or AMD-V hardware virtualization extensions and they are enabled in the BIOS, VirtualBox can use these extensions to run guests more reliably and efficiently.
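Whether a host CPU advertises these extensions can be checked before any guest is installed. The following is a minimal sketch, assuming a Linux host that exposes CPU flags in /proc/cpuinfo, where the vmx flag indicates Intel VT-x and the svm flag indicates AMD-V. The function name hw_virt_extension is hypothetical and is not part of VirtualBox.

```python
from pathlib import Path


def hw_virt_extension(cpuinfo_text):
    """Return 'VT-x', 'AMD-V', or None based on the CPU flags in cpuinfo_text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:   # Intel VT-x
                return "VT-x"
            if "svm" in flags:   # AMD-V
                return "AMD-V"
    return None


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # assumption: Linux host
    if cpuinfo.exists():
        print(hw_virt_extension(cpuinfo.read_text()) or "no VT-x/AMD-V flag reported")
    else:
        print("/proc/cpuinfo is not available on this host")
```

The flag shows only that the processor supports the extension. It does not show whether the extension is enabled in the BIOS, which still has to be verified separately.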
The modular design of VirtualBox provides a consistent set of features across a wide range of host platforms. As a consequence, a virtual machine or disk image created on one host can be loaded and run on any supported host. In addition, a user or administrator who is familiar with managing guest virtual machines on one type of host can manage guests on any of the other supported systems.
Advanced desktop features such as Seamless Mode and Shared Clipboard give users a closely integrated experience when interacting with locally running guests. The built-in RDP server makes VirtualBox ideal for consolidating and hosting remote desktop systems. Recent improvements in disk and network performance, especially when combined with the advanced resource management features available in Oracle Solaris, make VirtualBox an excellent choice for hosting server workloads.
This chapter assumes general knowledge of PC hardware. It also assumes the use of VirtualBox version 3.1.4.
5.1 How Oracle VM VirtualBox Works
Virtualizing an operating system on an x86 processor is a difficult task, especially without Intel VT-x or AMD-V hardware features. Before describing how VirtualBox works, a quick review of the x86 storage protection model is necessary.
The Intel x86 architecture defines four levels of storage protection called rings, which are numbered from 0 (the most privileged) to 3 (the least privileged). These rings are used by operating systems to protect critical system memory from programming errors in less-privileged user applications. Of these four levels, ring 0 is special in that it allows software to access real processor resources such as registers and page tables, and to service interrupts. Most operating systems execute user programs in ring 3 and their kernel services in ring 0.
VirtualBox runs a single process on the host operating system for each virtual guest. All of the guest user code is run natively in ring 3, just as it would be if it were running in the host. As a result, user code will perform at native speed when running in a guest virtual machine.
To protect the host against failures in the guest, the guest kernel code is not allowed to run in ring 0 but instead runs in ring 1 if there is no hardware virtualization support, or in a VT-x ring 0 context if such support is available. This presents a problem because the guest may be executing instructions that are permitted only in ring 0 while other instructions behave differently when run in ring 1. To maintain proper operation of the guest kernel, the VirtualBox Virtual Machine Monitor (VMM) scans the ring 1 code and either replaces the troublesome code paths with direct hypervisor calls or executes them in a safe emulator.
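The placement rules described so far can be summarized in a few lines. This is only an illustrative sketch of the mapping stated in the text, not VirtualBox code; the function name guest_execution_context and its string labels are hypothetical.

```python
def guest_execution_context(code_kind, hw_virt):
    """Return where guest code executes, following the rules in the text.

    code_kind: "user" or "kernel"
    hw_virt:   True when hardware virtualization support is available
    """
    if code_kind == "user":
        # Guest user code always runs natively in ring 3.
        return "ring 3"
    if code_kind == "kernel":
        # Guest kernel code is kept out of the host's ring 0.
        return "VT-x ring 0 context" if hw_virt else "ring 1"
    raise ValueError("code_kind must be 'user' or 'kernel'")


for kind in ("user", "kernel"):
    for hw in (False, True):
        print(kind, hw, guest_execution_context(kind, hw))
```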
In some situations, the VMM may not be able to determine exactly what the relocated ring 1 guest code is doing. In these cases, VirtualBox makes use of a QEMU-based emulator to achieve the same general goals. Examples include BIOS code, real-mode operations early in guest booting, code that runs while the guest has interrupts disabled, and instructions known to cause a trap that may require emulation.
Because this emulation is slow compared to the direct execution of guest code, the VMM includes a code scanner that is unique for each supported guest. As mentioned earlier, this scanner will identify code paths and replace them with direct calls into the hypervisor for a more correct and efficient implementation of the operation. In addition, each time a guest fault occurs, the VMM will analyze the cause of the fault to see if the offending code stream can be replaced by a less expensive method in the future. As a consequence of this approach, VirtualBox performs better than a typical emulator or code recompiler. It can also run a fully virtualized guest at nearly the same speed as one that is assisted by Intel VT-x or AMD-V features.
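The idea of emulating a troublesome instruction once and then replacing it with a cheaper hypervisor call can be illustrated with a toy model. The sketch below is purely conceptual: the class ToyVMM, its fields, and the instruction names used with it are hypothetical and do not reflect VirtualBox's internal data structures.

```python
class ToyVMM:
    """Conceptual model of patch-on-fault; not VirtualBox's real design."""

    def __init__(self, patchable):
        self.patchable = set(patchable)  # instructions the scanner can replace
        self.patched = set()             # code paths already rewritten
        self.log = []                    # (instruction, action) history

    def run(self, instruction, privileged):
        if not privileged:
            action = "direct"            # executes natively at full speed
        elif instruction in self.patched:
            action = "hypercall"         # cheap path installed earlier
        else:
            action = "emulate"           # slow path on the first encounter
            if instruction in self.patchable:
                # Analyze the fault and patch the code for future runs.
                self.patched.add(instruction)
        self.log.append((instruction, action))
        return action


vmm = ToyVMM(patchable={"cli"})
for name, privileged in [("add", False), ("cli", True), ("cli", True), ("lgdt", True)]:
    print(name, vmm.run(name, privileged))
```

In this model the second encounter with a patchable instruction avoids emulation entirely, which mirrors why repeated code paths become cheaper over time, while instructions the scanner cannot replace continue to be emulated.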
Some operating systems may run device drivers in ring 1, which can cause a conflict with the relocated guest kernel code. These types of guests will require hardware virtualization.
5.1.1 Oracle VM VirtualBox Architecture
VirtualBox uses a layered architecture consisting of a set of kernel modules for running virtual machines, an API for managing the guests, and a set of user programs and services. At the core is the hypervisor, implemented as a ring 0 (privileged) kernel service. Figure 5.2 shows the relationships between all of these components. The kernel service consists of a device driver named vboxdrv and several loadable hypervisor modules. The driver is responsible for tasks such as allocating physical memory for the guest virtual machine. The hypervisor modules handle tasks such as saving and restoring the guest process context when a host interrupt occurs, turning control over to the guest OS to begin execution, and deciding when VT-x or AMD-V events need to be handled.
Figure 5.2 Oracle VM VirtualBox Architecture
The hypervisor does not get involved with the details of the guest operating system scheduling. Instead, those tasks are handled completely by the guest during its execution. The entire guest is run as a single process on the host system and will run only when scheduled by the host. Where the host provides them, an administrator can use resource controls such as scheduling classes and CPU caps or reservations to give very predictable execution of the guest machine.
Additional device drivers will be present to allow the guest machine access to other host resources such as disks, network controllers, and audio and USB devices. The hypervisor itself does relatively little work. Most of the interesting work in running the guest machine is done in the guest process. Thus the host's resource controls and scheduling methods can be used to control the guest machine behavior.
In addition to the kernel modules, several processes on the host are used to support running guests. All of these processes are started automatically when needed.
- VBoxSVC is the VirtualBox service process. It keeps track of all virtual machines that are running on the host. It is started automatically when the first guest boots.
- vboxzoneaccess is a daemon unique to Solaris that allows the VirtualBox device to be accessed from an Oracle Solaris Container.
- VBoxXPCOMIPCD is the XPCOM process used on non-Windows hosts for interprocess communication between guests and the management applications. On Windows hosts, the native COM services are used.
- VirtualBox is the process that actually runs the guest virtual machine when started. One of these processes exists for every guest that is running on the host. If host resource limits are desired for the guest, this process enforces those controls.
5.1.2 Interacting with Oracle VM VirtualBox
There are two primary methods for a user to interact with VirtualBox: a simple graphical user interface (GUI) and a very complete and detailed command-line interface (CLI). The GUI allows the user to create and manage guest virtual machines as well as set most of the common configuration options. When a guest machine is started from this user interface, a graphical console window opens on the host that allows the user to interact with the guest as if it were running on real hardware. To start the graphical interface, type the command VirtualBox at any shell prompt. On Oracle Solaris, this command is found in /usr/bin and is available to all users.
The CLI is the VBoxManage command. VBoxManage has many subcommands and options, some of which are discussed in the following sections. To get a list of all valid options, type VBoxManage with no arguments at any shell prompt. When a VBoxManage command successfully completes, it will print out a banner similar to the one in the following example:
% VBoxManage list vms
Sun VirtualBox Command Line Management Interface Version 3.1.4
(C) 2005-2010 Sun Microsystems, Inc.
All rights reserved.

"Windows XP" {4ec5efdc-fa76-49bb-8562-7c2a0bac8282}
If the banner fails to print, an error occurred while processing the command. Usually, diagnostic information will be displayed instead of the banner. If the banner is the only output, the command successfully completed. In the examples in the remainder of this chapter, the banner output has been omitted for the sake of brevity.
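When VBoxManage output is consumed by a script, the banner and the VM entries can be separated programmatically. The following is a minimal sketch, assuming the banner and each VM entry appear on their own lines and that entries have the form of a quoted name followed by a UUID in braces, as in the example above. The names parse_list_vms and VM_LINE are hypothetical helpers, not part of VirtualBox.

```python
import re

# Matches lines such as: "Windows XP" {4ec5efdc-fa76-49bb-8562-7c2a0bac8282}
VM_LINE = re.compile(r'^"(?P<name>.*)"\s+\{(?P<uuid>[0-9a-fA-F-]{36})\}$')


def parse_list_vms(output):
    """Return (banner_seen, vms) for text printed by `VBoxManage list vms`.

    vms is a list of (name, uuid) tuples in the order they were printed.
    """
    banner_seen = False
    vms = []
    for line in output.splitlines():
        line = line.strip()
        if "Command Line Management Interface" in line:
            banner_seen = True   # first line of the banner
            continue
        match = VM_LINE.match(line)
        if match:
            vms.append((match.group("name"), match.group("uuid")))
    return banner_seen, vms


sample = """\
Sun VirtualBox Command Line Management Interface Version 3.1.4
(C) 2005-2010 Sun Microsystems, Inc.
All rights reserved.

"Windows XP" {4ec5efdc-fa76-49bb-8562-7c2a0bac8282}
"""
print(parse_list_vms(sample))
```

A script could treat a missing banner as a sign that the command failed, in line with the behavior described above, and fall back to displaying the raw output as diagnostic information.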