Hypervisor Clustering
How can a virtual server survive the failure of its hosting hypervisor or physical server?
Problem | The failure of a hypervisor or its underlying physical server cascades to all hosted virtual servers, which in turn causes their hosted IT resources to fail.
Solution | Hypervisors are clustered across multiple physical servers, so that if one fails, its active virtual servers are transferred to another.
Application | Heartbeat messages are passed between clustered hypervisors and a central VIM to maintain status monitoring. Shared storage is provided for the clustered hypervisors and is also used to store virtual server disks.
Mechanisms | Cloud Storage Device, Hypervisor, Logical Network Perimeter, Resource Cluster, Resource Replication, Virtual Infrastructure Manager (VIM), Virtual Server, Virtual Switch, Virtualization Monitor
Problem
Virtual servers run on a hypervisor, and hardware resources are emulated for the virtual servers via the hypervisors. If the hypervisor fails or if the underlying physical server fails (thereby causing the hypervisor to fail), the failure condition cascades to all of its hosted virtual servers.
The following steps are shown in Figure 4.7:
- Physical Server A hosts a hypervisor that hosts Virtual Servers A and B.
- When Physical Server A fails, the hypervisor and the two virtual servers fail as well.
Figure 4.7 Two virtual servers experience failure after their host physical server goes down.
Solution
A high-availability hypervisor cluster is created to establish a group of hypervisors that span physical servers. As a result, if a given physical server or hypervisor becomes unavailable, hosted virtual servers can be moved to another physical server or hypervisor (Figure 4.8).
Figure 4.8 Physical Server A becomes unavailable, thereby bringing down its hypervisor. Because the hypervisor is part of a cluster, Virtual Server A is migrated to a different host (Physical Server B), which has another hypervisor that is part of the same cluster.
Application
A hypervisor cluster architecture is established and controlled via a central VIM, which sends regular heartbeat messages to the hypervisors to confirm that they are up and running. Any heartbeat messages that are not successfully acknowledged can lead the VIM to initiate the live VM migration program in order to dynamically move affected virtual servers to a new host. The hypervisor cluster utilizes a shared cloud storage device, which is used during the live migration of virtual servers by different hypervisors in the cluster.
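The heartbeat check described above can be illustrated with a minimal sketch. It assumes the VIM records the time of each acknowledged heartbeat and treats a hypervisor as failed once it has been silent for longer than a timeout. The `HeartbeatMonitor` class, the 15-second timeout, and the hypervisor names are hypothetical and are not part of the pattern or of any particular VIM product.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical VIM-side helper that tracks heartbeat acknowledgements."""

    timeout_s: float  # assumption: silence longer than this marks a failure
    last_ack: dict[str, float] = field(default_factory=dict)

    def record_ack(self, hypervisor: str, now: float) -> None:
        """Store the time at which a hypervisor last acknowledged a heartbeat."""
        self.last_ack[hypervisor] = now

    def failed_hypervisors(self, now: float) -> list[str]:
        """Return hypervisors whose last acknowledgement is older than the timeout."""
        return sorted(
            name
            for name, acked_at in self.last_ack.items()
            if now - acked_at > self.timeout_s
        )


# Timestamps are passed in explicitly so the example is deterministic.
monitor = HeartbeatMonitor(timeout_s=15.0)
for name in ("hypervisor-a", "hypervisor-b", "hypervisor-c"):
    monitor.record_ack(name, now=0.0)

# Hypervisors A and C keep acknowledging; hypervisor B goes silent.
monitor.record_ack("hypervisor-a", now=10.0)
monitor.record_ack("hypervisor-c", now=10.0)

failed = monitor.failed_hypervisors(now=20.0)
print(failed)  # ['hypervisor-b']
```

In a real deployment the result of such a check would be the trigger for moving the affected virtual servers to another host. The sketch only covers detection.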
Figures 4.9 to 4.12 provide examples of the results of applying the Hypervisor Clustering pattern, accompanied by numbered steps.
Figure 4.9 A cloud architecture resulting from the application of the Hypervisor Clustering pattern (Part I). These initial steps detail the assembly of required components.
Figure 4.10 A cloud architecture resulting from the application of the Hypervisor Clustering pattern (Part II).
Figure 4.11 A cloud architecture resulting from the application of the Hypervisor Clustering pattern (Part III).
Figure 4.12 A cloud architecture resulting from the application of the Hypervisor Clustering pattern (Part IV).
- Hypervisors are installed on the three physical servers.
- Virtual servers are created by the hypervisors.
- A shared cloud storage device containing virtual server configuration files is positioned so that all hypervisors have access to it.
- The hypervisor cluster is enabled on the three physical server hosts via a central VIM.
- The physical servers exchange heartbeat messages with each other and the VIM, based on a predefined schedule.
- Physical Server B fails and becomes unavailable, jeopardizing Virtual Server C.
- The VIM and the other physical servers stop receiving heartbeat messages from Physical Server B.
- Based on the available capacity of other hypervisors in the cluster, the VIM chooses Physical Server C as the new host to take ownership of Virtual Server C.
- Virtual Server C is live-migrated to the hypervisor running on Physical Server C, where it may need to be restarted before continuing to operate normally.
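The host-selection step in the list above can be sketched as follows. The sketch assumes that "available capacity" is measured by a single metric (free memory in GB) and that the VIM picks the surviving host with the most free capacity. The `Host` class, the `choose_new_host` and `fail_over` functions, and all sizes are illustrative assumptions. An actual VIM may weigh CPU, storage, affinity rules, and other factors.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Host:
    """Hypothetical model of a physical server running a clustered hypervisor."""

    name: str
    capacity_gb: int  # assumption: memory is the only capacity metric
    alive: bool = True
    vms: dict[str, int] = field(default_factory=dict)  # VM name -> memory in GB

    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.vms.values())


def choose_new_host(hosts: list[Host], required_gb: int) -> Optional[Host]:
    """Pick the live host with the most free capacity that can fit the VM."""
    candidates = [h for h in hosts if h.alive and h.free_gb() >= required_gb]
    return max(candidates, key=Host.free_gb, default=None)


def fail_over(hosts: list[Host], failed: Host) -> dict[str, Optional[str]]:
    """Mark a host as failed and reassign each of its VMs to a new host."""
    failed.alive = False
    placements: dict[str, Optional[str]] = {}
    for vm, size_gb in list(failed.vms.items()):
        target = choose_new_host(hosts, size_gb)
        if target is None:
            placements[vm] = None  # no capacity left; the VM stays down
            continue
        target.vms[vm] = size_gb
        del failed.vms[vm]
        placements[vm] = target.name
    return placements


server_a = Host("physical-server-a", 32, vms={"virtual-server-a": 8, "virtual-server-b": 8})
server_b = Host("physical-server-b", 16, vms={"virtual-server-c": 8})
server_c = Host("physical-server-c", 32, vms={"virtual-server-d": 8})
cluster = [server_a, server_b, server_c]

placements = fail_over(cluster, server_b)
print(placements)  # {'virtual-server-c': 'physical-server-c'}
```

With these made-up numbers, Physical Server C has 24 GB free against 16 GB on Physical Server A, so it becomes the new host for Virtual Server C, mirroring the scenario in the steps.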
Mechanisms
- Cloud Storage Device – The cloud storage device mechanism acts as a central repository that hosts the virtual server folders, so that the folders and virtual server configurations are accessible to all of the hypervisors participating in the cluster.
- Hypervisor – The hypervisor is the primary mechanism by which this pattern is applied. It acts as a member of the cluster and hosts the virtual servers. If a hypervisor fails, one of the other available hypervisors restarts the affected virtual servers to recover them from the failure.
- Logical Network Perimeter – This mechanism creates logical boundaries that ensure that none of the hypervisors of other cloud consumers are accidentally included in a given cluster.
- Resource Cluster – The resource cluster is the fundamental mechanism used to create and initiate hypervisor clusters.
- Resource Replication – Each hypervisor informs the others in the cluster about its status and availability. When part of the cluster configuration needs to be changed, for instance when a virtual switch is created, deleted, or modified, the update may need to be replicated to all hypervisors via the VIM.
- Virtual Infrastructure Manager (VIM) – This mechanism is used to create and configure the hypervisor cluster, add cluster members to the cluster, and cascade the cluster configuration to cluster members.
- Virtual Server – Virtual servers represent the type of mechanism that is protected by the application of this pattern.
- Virtual Switch – This mechanism is used to ensure that any virtual servers recovered from hypervisor failure remain accessible to cloud consumers.
- Virtualization Monitor – This mechanism is responsible for actively monitoring the hypervisors, and sending alerts whenever one of the hypervisors in the cluster fails.
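The configuration cascade mentioned under Resource Replication and Virtual Infrastructure Manager (VIM) can be sketched as below. This is one possible approach, assuming the VIM keeps a version counter for the cluster configuration and pushes the current set of virtual switches to any member that is behind. The `Vim` and `ClusterMember` classes, their fields, and the switch name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterMember:
    """Hypothetical hypervisor-side view of the cluster configuration."""

    name: str
    config_version: int = 0
    virtual_switches: set[str] = field(default_factory=set)


@dataclass
class Vim:
    """Hypothetical VIM that owns the authoritative cluster configuration."""

    members: list[ClusterMember]
    config_version: int = 0
    virtual_switches: set[str] = field(default_factory=set)

    def add_virtual_switch(self, switch: str) -> list[str]:
        """Change the cluster configuration, then cascade it to all members."""
        self.virtual_switches.add(switch)
        self.config_version += 1
        return self.replicate()

    def replicate(self) -> list[str]:
        """Copy the current configuration to every member that is out of date."""
        updated = []
        for member in self.members:
            if member.config_version < self.config_version:
                member.virtual_switches = set(self.virtual_switches)
                member.config_version = self.config_version
                updated.append(member.name)
        return updated


members = [ClusterMember("hypervisor-a"), ClusterMember("hypervisor-c")]
vim = Vim(members=members)
updated = vim.add_virtual_switch("vswitch-prod")  # hypothetical switch name
print(updated)  # ['hypervisor-a', 'hypervisor-c']
```

Keeping every member on the same virtual switch configuration is what allows a recovered virtual server to reconnect to the same network on its new host.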