- 11.1 Workload Distribution Architecture
- 11.2 Resource Pooling Architecture
- 11.3 Dynamic Scalability Architecture
- 11.4 Elastic Resource Capacity Architecture
- 11.5 Service Load Balancing Architecture
- 11.6 Cloud Bursting Architecture
- 11.7 Elastic Disk Provisioning Architecture
- 11.8 Redundant Storage Architecture
- 11.9 Case Study Example
11.3 Dynamic Scalability Architecture
The dynamic scalability architecture is an architectural model based on a system of predefined scaling conditions that trigger the dynamic allocation of IT resources from resource pools. Dynamic allocation enables variable utilization as dictated by usage demand fluctuations, since unnecessary IT resources are efficiently reclaimed without requiring manual interaction.
The automated scaling listener is configured with workload thresholds that dictate when new IT resources need to be added to the workload processing. This mechanism can be provided with logic that determines how many additional IT resources can be dynamically provided, based on the terms of a given cloud consumer’s provisioning contract.
The following types of dynamic scaling are commonly used:
- Dynamic Horizontal Scaling – IT resource instances are scaled out and in to handle fluctuating workloads. The automatic scaling listener monitors requests and signals resource replication to initiate IT resource duplication, as per requirements and permissions.
- Dynamic Vertical Scaling – IT resource instances are scaled up and down when there is a need to adjust the processing capacity of a single IT resource. For example, a virtual server that is being overloaded can have its memory dynamically increased or it may have a processing core added.
- Dynamic Relocation – The IT resource is relocated to a host with more capacity. For example, a database may need to be moved from a tape-based SAN storage device with 4 GB per second I/O capacity to another disk-based SAN storage device with 8 GB per second I/O capacity.
Figures 11.5 to 11.7 illustrate the process of dynamic horizontal scaling.
Figure 11.5 Cloud service consumers are sending requests to a cloud service (1). The automated scaling listener monitors the cloud service to determine if predefined capacity thresholds are being exceeded (2).
Figure 11.6 The number of requests coming from cloud service consumers increases (3). The workload exceeds the performance thresholds. The automated scaling listener determines the next course of action based on a predefined scaling policy (4). If the cloud service implementation is deemed eligible for additional scaling, the automated scaling listener initiates the scaling process (5).
Figure 11.7 The automated scaling listener sends a signal to the resource replication mechanism (6), which creates more instances of the cloud service (7). Now that the increased workload has been accommodated, the automated scaling listener resumes monitoring and detracting and adding IT resources, as required (8).
The dynamic scalability architecture can be applied to a range of IT resources, including virtual servers and cloud storage devices. Besides the core automated scaling listener and resource replication mechanisms, the following mechanisms can also be used in this form of cloud architecture:
- Cloud Usage Monitor – Specialized cloud usage monitors can track runtime usage in response to dynamic fluctuations caused by this architecture.
- Hypervisor – The hypervisor is invoked by a dynamic scalability system to create or remove virtual server instances, or to be scaled itself.
- Pay-Per-Use Monitor – The pay-per-use monitor is engaged to collect usage cost information in response to the scaling of IT resources.