Disk Technology Considerations
This section examines a number of the factors to consider when deciding on the disk technology to use in your environment.
Supported Disk Types
As you have seen, storage architecture is important, and the disk technology plays an important part. ESXi supports a variety of disks, including SSD, SAS, FC, SATA, NL-SAS, IDE, USB, and SCSI.
Many options are available, making it possible to adapt the technology according to several criteria. As shown in Table 3.5, several parameters should be considered when choosing a disk technology: rotational speed in revolutions per minute (RPM), I/O operations per second (IOPS), and transfer bandwidth.
Table 3.5. Average Speed Parameters for Disk Types (May Vary)
| Disks | RPM | IOPS |
|---|---|---|
| SSD | N/A | 3000 |
| SAS | 15 K | 180 |
| SAS | 10 K | 130 |
| NL-SAS | 7.2 K | 100 |
| SATA | 5.4 K | 50 |
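The averages in Table 3.5 can be turned into a rough sizing estimate. The following is a minimal sketch, assuming identical disks and ignoring RAID overhead and controller cache; the function names and the dictionary keys are illustrative, not part of any VMware tool.

```python
import math

# Average per-disk IOPS taken from Table 3.5 (real disks may vary).
IOPS_PER_DISK = {
    "SSD": 3000,
    "SAS 15K": 180,
    "SAS 10K": 130,
    "NL-SAS 7.2K": 100,
    "SATA 5.4K": 50,
}


def raw_iops(disk_type: str, disk_count: int) -> int:
    """Raw back-end IOPS of a group of identical disks (no RAID or cache effects)."""
    if disk_count < 0:
        raise ValueError("disk_count must not be negative")
    return IOPS_PER_DISK[disk_type] * disk_count


def disks_needed(disk_type: str, target_iops: int) -> int:
    """Smallest number of identical disks whose raw IOPS reach target_iops."""
    if target_iops < 0:
        raise ValueError("target_iops must not be negative")
    return math.ceil(target_iops / IOPS_PER_DISK[disk_type])
```

For example, `raw_iops("SAS 15K", 8)` gives the raw capability of eight 15 K SAS disks, and `disks_needed("SAS 10K", 1000)` estimates how many 10 K SAS disks a 1000 IOPS workload would need before any RAID penalty is applied.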
Solid-State Drives (SSDs) are high-performance drives composed of flash memory. These disks are nonmechanical. They are less likely to experience failures, they consume less energy, and they heat up much less than traditional disks. Their access time is low, with very high IOPS (3000 IOPS). They are ideal for reading but not well adapted to a large quantity of writes.
These disks are typically used for log files (for example, for databases). They are often used to extend the cache memory of controllers. (EMC calls them FAST Cache disks, and NetApp calls them Flash Cache disks.) In a VMware environment, these high-performance disks are ideal for storing the swap memory of VMs. They are also very useful for absorbing the load during activity spikes, for example in a VDI environment when all VMs boot simultaneously. (This phenomenon is called a boot storm.) Disk sizes currently available are 100 GB, 200 GB, and 400 GB. Soon, 800 GB will also be available.
Serial Attached SCSI (SAS) disks replace Fibre Channel disks. These disks are directly connected to the controller, point to point. Rotational speeds are high: 10,000 RPM or 15,000 RPM. They are ideal for applications with random access, typically databases, and they handle small I/Os of 8 KB to 16 KB well. The stream is bidirectional. Current disk sizes are 300 GB, 600 GB, and 900 GB.
Today, SAS disks are best adapted to virtual environments, and they offer the best price–performance ratio. Although FC disks are still widely found in production environments, the trend is to replace them with SAS disks.
Near-Line SAS (NL-SAS) disks use the mechanics of SATA disks mounted on SAS interfaces. Their advantage over SATA is that they transmit data in full duplex. Read and write can be performed simultaneously, contrary to SATA, which allows only a single read or write at a time. These disks offer features that allow the interpretation of SCSI commands, such as command queuing (to reduce read-head movements), and they provide better control of errors and reporting than SATA disks.
Serial ATA (SATA) disks offer large capacities: 2 TB, 3 TB, and soon 4 TB. They are recommended for the sequential transfer of large files (for example, backups and video files) but are not suitable for applications with high random I/O, such as databases (for example, Oracle, Microsoft SQL Server, MySQL). They are unidirectional and allow a single read or write at a time. Depending on the storage array manufacturer, SATA may or may not be recommended for critical production VMs; check with the manufacturer. SATA disks are well suited for test VMs and for storing ISO images, templates, or backups.
RAID
Table 3.6 lists recommendations for RAID types and associated traditional uses.
Table 3.6. RAID Types and Traditional Uses
| RAID Level | Write | Read | Use | Protection |
|---|---|---|---|---|
| RAID0 | Excellent | Excellent | Real-time workstation | None (striping) |
| RAID1 | Excellent | Excellent | DB log file, operating system, ESXi Hypervisor | Mirror |
| RAID5 | Good | Very good | DB, ERP, web server, file server, mail | Parity |
| RAID6 | Average | Very good | Archiving, backup, file server | Double parity |
| RAID10 | Excellent | Excellent | Large DB, application servers | Striping + mirror |
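The write ratings in Table 3.6 reflect the extra back-end I/Os each RAID level performs per host write. The sketch below is only an approximation: the write-penalty values (1 for RAID0, 2 for RAID1 and RAID10, 4 for RAID5, 6 for RAID6) are commonly cited rules of thumb and do not come from this chapter, and the function name is illustrative. Cache effects are ignored.

```python
# Commonly cited back-end I/Os per host write (rule-of-thumb assumption,
# not taken from the chapter; arrays with write cache behave better).
WRITE_PENALTY = {
    "RAID0": 1,
    "RAID1": 2,
    "RAID10": 2,
    "RAID5": 4,
    "RAID6": 6,
}


def usable_iops(raw_iops: float, raid_level: str, read_fraction: float) -> float:
    """Estimate host-visible IOPS for a disk group.

    raw_iops      -- sum of the per-disk IOPS in the group
    raid_level    -- key of WRITE_PENALTY
    read_fraction -- share of reads in the workload, between 0.0 and 1.0
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0.0 and 1.0")
    write_fraction = 1.0 - read_fraction
    penalty = WRITE_PENALTY[raid_level]
    # Each read costs one back-end I/O; each write costs `penalty` I/Os.
    return raw_iops / (read_fraction + write_fraction * penalty)
```

With eight 15 K SAS disks (about 1440 raw IOPS) and a 70% read workload, this estimate gives roughly 758 IOPS in RAID5 against roughly 1108 IOPS in RAID10, which is consistent with RAID10 being listed for large, write-intensive databases.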
Storage Pools
In a physical environment, a LUN is dedicated to a server and, thus, to a specific application. In this case, the RAID level can be chosen to match the application's access pattern, whether sequential or random. This method is not well adapted to a virtual environment. Because of the dynamic nature of a virtual environment, keeping the same LUN-attribution logic based on the application becomes difficult: VMs are mobile and move from one datastore to another, so the RAID level underneath a VM may not remain the same. Instead of using dedicated RAID levels, some manufacturers suggest using storage pools. This method is preferable because it offers excellent performance and simplifies management.
Automatic Disk Tiering
Typically, only about 20% of a LUN’s data is frequently accessed. Statistics also show that about 80% of data is unused after two weeks. Through automatic tiering, frequently used data is automatically placed on high-performance SSD or SAS disks, while less frequently used data is stored on lower-performance disks such as SATA or NL-SAS.
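The placement idea behind automatic tiering can be illustrated with a small sketch. It assumes a hypothetical per-block access counter and only two tiers; real arrays use their own granularity, statistics, and scheduling, none of which is described here.

```python
import math
from typing import Dict


def assign_tiers(access_counts: Dict[str, int], hot_fraction: float = 0.2) -> Dict[str, str]:
    """Place the most frequently accessed blocks on the fast tier.

    access_counts -- hypothetical map of block id to number of accesses
    hot_fraction  -- share of blocks to keep on the fast tier (20% by default)
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0.0 and 1.0")
    # Rank blocks from most to least accessed; ties are broken by block id
    # so the result is deterministic.
    ranked = sorted(access_counts, key=lambda block: (-access_counts[block], block))
    hot_count = math.ceil(len(ranked) * hot_fraction)
    return {
        block: ("fast" if position < hot_count else "capacity")
        for position, block in enumerate(ranked)
    }
```

Here "fast" stands for SSD or SAS disks and "capacity" for SATA or NL-SAS disks. Rerunning the function on fresh counters moves blocks between tiers as their activity changes.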
Performance
In virtual environments, monitoring performance is complex because of resource pooling and the various layers (for example, applications, hypervisor, storage array). Throughput in IOPS and bandwidth in MBps depend on the type and number of disks on which the datastore is hosted. Storage activity should be monitored to determine whether queues form on either of these criteria (queue length). At the hypervisor or vCenter level, the most reliable and simplest performance indicator for identifying contention is the device access time.
Access time through all HBAs should be below 20 ms for both reads and writes. Another indicator to monitor is the Stop Disk value, which reveals contention by highlighting activity that the associated storage cannot absorb. This value should always be 0. If it is higher than 0, the load should be rebalanced. There are usually two causes:
- VM activity is too high for the storage device.
- Storage has not been properly configured. (For example, make sure there is no zoning issue, that all paths to storage are available, that the activity is well distributed among all paths, and that the storage cache is not set to forced flush.)
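The two indicators above can be checked mechanically once the values have been collected. The sketch below assumes samples have already been exported from a monitoring tool into a simple structure; the `DeviceSample` class and its field names are hypothetical and do not correspond to a vSphere API.

```python
from dataclasses import dataclass
from typing import List, Tuple

LATENCY_LIMIT_MS = 20.0  # from the text: access time should stay below 20 ms


@dataclass
class DeviceSample:
    """One hypothetical monitoring sample for a storage device."""
    device: str
    read_latency_ms: float
    write_latency_ms: float
    stop_disk: int  # Stop Disk value for the interval; expected to be 0


def find_contention(samples: List[DeviceSample]) -> List[Tuple[str, List[str]]]:
    """Return (device, reasons) for every sample that breaks a threshold."""
    findings = []
    for sample in samples:
        reasons = []
        if sample.read_latency_ms >= LATENCY_LIMIT_MS:
            reasons.append("read latency")
        if sample.write_latency_ms >= LATENCY_LIMIT_MS:
            reasons.append("write latency")
        if sample.stop_disk > 0:
            reasons.append("stop disk")
        if reasons:
            findings.append((sample.device, reasons))
    return findings
```

A device that appears in the result is a candidate for rebalancing, after the configuration points listed above have been ruled out.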
Additional Recommendations
Following are additional recommendations that can improve disk performance:
- Using a solid-state cache absorbs a significant share of the I/O that would otherwise go to disk. The cache serves as leverage because the major part of the read and write I/O activity occurs in the cache. Databases require very high I/O rates in 4 KB, 8 KB, or 16 KB random access, while video-file and backup servers require high throughput with large block sizes (32 KB, 64 KB, 128 KB, or 256 KB).
- Sequential access and random access should not be mixed on the same disk. If possible, I/Os should be separated by type (read, write, sequential, random). For example, each of three VMs hosting a transactional DBMS-type database should have three datastores:
  - One datastore for the OS in RAID 5. Separating the OS means a VM can be booted without drawing on the I/Os available to the database's RAID 5 datastore.
  - One datastore for the database in RAID 5 if the read/write ratio is about 70%/30%. If not, the RAID type should be changed. A database generally generates about 70% random reads.
  - One datastore for logs in RAID 1 because writes are sequential (or RAID 10 for large databases with a high write rate).
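The contrast between small-block database I/O and large-block sequential transfers comes down to simple arithmetic: bandwidth is the I/O rate multiplied by the block size. The sketch below assumes block sizes expressed in kilobytes and 1 MB = 1024 KB; the function name is illustrative.

```python
def throughput_mb_per_s(iops: float, block_size_kb: float) -> float:
    """Bandwidth in MB/s produced by `iops` operations of `block_size_kb` each."""
    if iops < 0 or block_size_kb <= 0:
        raise ValueError("iops must be >= 0 and block_size_kb must be > 0")
    return iops * block_size_kb / 1024.0
```

At the same 1000 IOPS, 8 KB blocks move about 7.8 MB/s while 256 KB blocks move 250 MB/s. This is why database workloads are sized mainly on IOPS and backup or video workloads mainly on bandwidth, and why mixing them on the same disks makes both harder to predict.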