Multipathing
Multipathing can be defined as a solution that uses redundant components, such as adapters and switches, to create logical paths between a server and a storage device.
Pluggable Storage Architecture
Pluggable Storage Architecture (PSA) is a collection of APIs that allows storage manufacturers to insert code directly into the VMkernel layer. Third-party software (for example, EMC PowerPath VE) can thus be developed to offer more advanced load-balancing functionality tailored to the vendor’s storage array technology. VMware itself provides a basic, standard multipathing mechanism called native multipathing (NMP), which is divided into two kinds of plug-ins: the Storage Array Type Plug-in (SATP), which is in charge of communicating with the storage array; and the Path Selection Plug-in (PSP), which handles path selection and load balancing between paths.
As shown in Figure 3.22, VMware offers three PSPs:
- Most Recently Used (MRU): Selects the first path discovered upon the boot of ESXi. If this path becomes inaccessible, ESXi chooses an alternate path.
- Fixed: Uses the path designated as the preferred path, if one has been configured. Otherwise, it uses the path found at boot. When it can no longer use this path, it selects another available path at random. When the preferred path becomes available again, ESXi switches back to it.
- Round Robin (RR): Automatically selects all available paths and sends the I/O to each in a circular fashion, which allows basic load balancing.
PSA coordinates the operation of NMP and of any third-party multipathing plug-in (MPP) software.
Figure 3.22. PSPs offered by VMware.
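The behavior of the three PSPs can be sketched as a small simulation. This is a simplified model of the descriptions above, not VMware code: the `Path` class, the policy class names, and the path labels are hypothetical, and "first path discovered at boot" is modeled as the first element of the list.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Path:
    name: str           # hypothetical label for one server-to-LUN path
    alive: bool = True  # False models an inaccessible path


def _alive(paths: List[Path]) -> List[Path]:
    """Return the usable paths, or fail if every path is down."""
    live = [p for p in paths if p.alive]
    if not live:
        raise RuntimeError("all paths to the device are down")
    return live


class MostRecentlyUsed:
    """Stay on the current path; move only when it fails (no failback)."""

    def __init__(self, paths: List[Path]):
        self.paths = paths
        self.current = paths[0]  # assumption: first path discovered at boot

    def select(self) -> Path:
        if not self.current.alive:
            self.current = _alive(self.paths)[0]
        return self.current


class Fixed:
    """Use the preferred path whenever it is available (fails back)."""

    def __init__(self, paths: List[Path], preferred: Optional[Path] = None):
        self.paths = paths
        # No preferred path configured: fall back to the path found at boot.
        self.preferred = preferred if preferred is not None else paths[0]
        self.current = self.preferred

    def select(self) -> Path:
        if self.preferred.alive:
            self.current = self.preferred
        elif not self.current.alive:
            self.current = random.choice(_alive(self.paths))
        return self.current


class RoundRobin:
    """Rotate through all available paths in a circular fashion."""

    def __init__(self, paths: List[Path]):
        self.paths = paths
        self.index = -1

    def select(self) -> Path:
        _alive(self.paths)  # raises if nothing is usable
        while True:
            self.index = (self.index + 1) % len(self.paths)
            if self.paths[self.index].alive:
                return self.paths[self.index]
```

The difference between MRU and Fixed shows up only after a failed path recovers: in this model, `MostRecentlyUsed.select()` keeps returning the alternate path, whereas `Fixed.select()` returns to the preferred path.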
The NMP Round Robin path-selection policy has a parameter known as the I/O operation limit, which controls the number of I/Os sent down each path before switching to the next path. The default value is 1000; therefore, NMP switches from one path to the next after sending 1000 I/Os down any given path. Tuning the I/O operation limit can significantly improve the performance of certain workloads (such as online transaction processing [OLTP]). In environments that have random and OLTP workloads, setting the limit to a lower number yields the best throughput, although the improvement is not as significant as it is for sequential workloads. For these reasons, some storage vendors recommend lowering the NMP Round Robin I/O operation limit, in some cases to as little as 1.
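The effect of the I/O operation limit can be illustrated with a short calculation. The function below is a simplified model, not the NMP implementation: `round_robin_schedule` and the path labels are hypothetical names, and every I/O is treated as identical.

```python
from collections import Counter
from typing import List


def round_robin_schedule(paths: List[str], total_ios: int,
                         io_limit: int = 1000) -> List[str]:
    """Return the path used for each I/O, switching paths every io_limit I/Os."""
    if io_limit < 1:
        raise ValueError("io_limit must be at least 1")
    return [paths[(i // io_limit) % len(paths)] for i in range(total_ios)]


paths = ["path-A", "path-B"]  # hypothetical path labels

# Default limit of 1000: a burst of 200 I/Os never leaves the first path.
print(dict(Counter(round_robin_schedule(paths, 200))))
# {'path-A': 200}

# Default limit, 1500 I/Os: the split is still uneven.
print(dict(Counter(round_robin_schedule(paths, 1500))))
# {'path-A': 1000, 'path-B': 500}

# Limit of 1: consecutive I/Os alternate between the two paths.
print(dict(Counter(round_robin_schedule(paths, 1500, io_limit=1))))
# {'path-A': 750, 'path-B': 750}
```

In this model, a lower limit spreads even short bursts of I/O across the paths, which is consistent with the vendor recommendations mentioned above.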
Third-party software solutions use more advanced algorithms, because a limitation of Round Robin is that it distributes I/O automatically without taking into account the actual activity on each path. Some software performs dynamic load balancing and is designed to use all paths at all times, unlike Round Robin, which at any given moment relies on a single path to bear the entire I/O burden.
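One simple way to take path activity into account is to send each new I/O down the path with the fewest outstanding requests. The sketch below illustrates that general idea only; it is not the algorithm of any particular product, and the class name `LeastQueueDepth` and its methods are hypothetical.

```python
from typing import Dict, List


class LeastQueueDepth:
    """Pick the path with the fewest in-flight I/Os (illustrative policy)."""

    def __init__(self, paths: List[str]):
        # Number of I/Os issued on each path that have not completed yet.
        self.in_flight: Dict[str, int] = {p: 0 for p in paths}

    def dispatch(self) -> str:
        """Choose a path for a new I/O and count it as outstanding."""
        # On a tie, min() returns the path that was listed first.
        path = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[path] += 1
        return path

    def complete(self, path: str) -> None:
        """Record that one I/O on the given path has finished."""
        if self.in_flight[path] == 0:
            raise ValueError(f"no outstanding I/O on {path}")
        self.in_flight[path] -= 1
```

With this policy, a path whose I/Os complete slowly accumulates outstanding requests and therefore receives fewer new ones, whereas Round Robin would keep sending it the same share.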
Modes
Access to data stored on shared storage space is fundamental in a virtual environment. VMware strongly recommends implementing several access paths to the LUN. Two paths are the minimum, but VMware recommends using four. Multipathing reduces service interruptions by providing redundant access paths to the LUNs. When a path is not available, another is used, without service interruption. These switching mechanisms are called multipath I/O (MPIO).
In VMware, as shown in Figure 3.23, storage can adopt various modes:
- Active/active: At a given moment, a LUN is connected to several storage controllers at the same time. The I/O can arrive from several controllers simultaneously.
- Active/passive: At a given moment, a single controller owns a given LUN (owned LUN). No other controller can send I/O to this LUN as long as it is owned by that controller.
- ALUA (Asymmetric Logical Unit Access): A LUN can be reached through more than one controller, but not in the same way. Access through the controller that owns the LUN is direct (optimized); access through the secondary controller is indirect (nonoptimized).
Figure 3.23. Storage modes.
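The practical consequence of each mode is the set of paths a host can use for I/O to a given LUN. The sketch below models this in a simplified way; the `ArrayMode` enum, the `Path` class, the `candidate_paths` function, and the controller and path labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ArrayMode(Enum):
    ACTIVE_ACTIVE = "active/active"
    ACTIVE_PASSIVE = "active/passive"
    ALUA = "alua"


@dataclass(frozen=True)
class Path:
    name: str        # hypothetical path label
    controller: str  # storage controller on which the path terminates


def candidate_paths(mode: ArrayMode, paths: List[Path], owner: str) -> List[Path]:
    """Return the paths usable for I/O to a LUN, most preferred first.

    `owner` is the controller that currently owns the LUN; it is ignored
    for active/active arrays, where any controller can serve the LUN.
    """
    if mode is ArrayMode.ACTIVE_ACTIVE:
        return list(paths)
    owned = [p for p in paths if p.controller == owner]
    if mode is ArrayMode.ACTIVE_PASSIVE:
        # Only the owning controller accepts I/O for this LUN.
        return owned
    # ALUA: every path works, but paths through the owner are optimized.
    nonoptimized = [p for p in paths if p.controller != owner]
    return owned + nonoptimized


paths = [Path("p1", "SP-A"), Path("p2", "SP-B")]
for mode in ArrayMode:
    print(mode.value, [p.name for p in candidate_paths(mode, paths, owner="SP-B")])
# active/active ['p1', 'p2']
# active/passive ['p2']
# alua ['p2', 'p1']
```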