Solaris 10 ZFS Essentials: Managing Storage Pools
- 2.1 ZFS Pool Concepts
- 2.2 Creating a Dynamic Stripe
- 2.3 Creating a Pool with Mirrored Devices
- 2.4 Creating a Pool with RAID-Z Devices
- 2.5 Creating a Spare in a Storage Pool
- 2.6 Adding a Spare Vdev to a Second Storage Pool
- 2.7 Replacing Bad Devices Automatically
- 2.8 Locating Disks for Replacement
- 2.9 Example of a Misconfigured Pool
ZFS storage pools are the foundation of the ZFS system. In this chapter, I cover the basic concepts, configurations, and administrative tasks of ZFS pools; Chapter 5 covers advanced topics in ZFS pool administration.
The zpool command manages ZFS storage pools: it creates, modifies, and destroys them.
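As a first orientation, here is a minimal sketch of that life cycle; the pool name tank and device names such as c1t0d0 are placeholders for your own hardware:

```
# Create a pool named "tank" backed by a single disk.
zpool create tank c1t0d0

# Inspect the pool's health and layout.
zpool status tank

# Destroy the pool and release its disks.
zpool destroy tank
```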
Redundant configurations supported in ZFS are mirror (RAID-1), RAID-Z (similar to RAID-5), and RAID-Z2 with double parity (similar to RAID-6). All traditional RAID-5-like algorithms (RAID-4, RAID-6, RDP, and EVEN-ODD, for example) suffer from a problem known as the RAID-5 write hole: if only part of a RAID-5 stripe is written and power is lost before all blocks have made it to disk, the parity is left out of sync with the data (and is therefore useless) forever, unless a subsequent full-stripe write overwrites it. In RAID-Z, ZFS uses variable-width RAID stripes so that every write is a full-stripe write. This design is possible only because ZFS integrates file system and device management in such a way that the file system's metadata has enough information about the underlying data redundancy model to handle variable-width stripes. RAID-Z is the world's first software-only solution to the RAID-5 write hole.
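As a sketch of these redundant configurations (device names are again placeholders):

```
# Two-way mirror (RAID-1): survives the loss of one disk.
zpool create tank mirror c1t0d0 c1t1d0

# Single-parity RAID-Z: survives the loss of one disk in the group.
zpool create tank raidz c1t0d0 c1t1d0 c1t2d0

# Double-parity RAID-Z2: survives the loss of any two disks in the group.
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0
```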
2.1 ZFS Pool Concepts
ZFS pools are composed of virtual devices (vdevs); ZFS abstracts every physical device into a virtual device. A vdev can be a disk, a slice of a disk, a file, or even a logical volume presented by another volume manager such as Solaris Volume Manager (SVM) or a LUN from a hardware RAID device.
These are the virtual device types:
- Dynamic stripe: A dynamic stripe is a nonredundant configuration consisting of a single disk or of data striped dynamically across multiple disks.
- Redundant group (mirror, RAID-Z1, or RAID-Z2): A mirror can be two-way or three-way. A RAID-Z group should have no more than nine disks; if you have more disks, split them across multiple vdevs. RAID-Z requires a minimum of two disks, and RAID-Z2 requires a minimum of three. (Note that RAID-Z and RAID-Z1 are interchangeable terms. With the introduction of the RAID-Z2 feature, the term RAID-Z evolved into RAID-Z1 to differentiate it from RAID-Z2. I use both terms in this book.)
- Spare: A spare vdev provides a hot-standby disk that can replace a failed disk.
- Log: A log vdev holds the ZFS Intent Log (ZIL); placing the ZIL on a dedicated device increases ZFS write performance. Only dynamic stripe and mirrored configurations are supported for this vdev type.
- Cache: A cache vdev speeds up random reads from a RAID-Z-configured pool and is intended for read-heavy workloads. There is currently no redundancy support for this vdev type; if a read error occurs, ZFS simply reads from the original storage pool.
Log and cache vdevs are used with solid-state disks (SSDs) in the Sun Storage 7000 series to build hybrid storage pools (HSPs; see http://blogs.sun.com/ahl/entry/fishworks_launch). A sketch of adding these special-purpose vdevs follows.
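Here is a rough sketch of adding each special-purpose vdev to an existing pool named tank (the device names, chosen to suggest dedicated disks or SSDs, are assumptions):

```
# Add a hot-standby spare disk.
zpool add tank spare c2t0d0

# Add a mirrored log vdev for the ZIL on dedicated devices.
zpool add tank log mirror c3t0d0 c3t1d0

# Add a cache vdev to absorb random reads.
zpool add tank cache c4t0d0
```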
Best-practice guidelines for ZFS pools include the following:
- Do not mirror beyond a three-way mirror; consider a RAID-Z configuration instead.
- Use RAID-Z or RAID-Z2 virtual device groups with fewer than ten disks in each vdev.
- Use whole disks: ZFS works best with "just a bunch of disks" (JBOD).
- Use disk slices in vdevs only for boot disks.
- Use disks of a terabyte or less for boot devices.
- Use matched-capacity disks (mixed geometry is OK) to get the most usable storage.
- Use vdevs of matching size in a ZFS pool: match the number of disks and the redundancy group in each vdev for best performance, as in the sketch after this list.
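For example, a pool following these guidelines might combine two RAID-Z vdevs of the same width built from matched-capacity disks (a sketch; device names are placeholders):

```
# Two matched four-disk RAID-Z vdevs; ZFS stripes writes dynamically across them.
zpool create tank \
    raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
    raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0
```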
Creating and adding new vdevs is the most unforgiving part of ZFS administration: once committed, some operations cannot be undone. The zpool command will warn you, however, if an operation is not what it expects. A force option (-f) bypasses these warnings, but do not use it unless you are certain you will not need to reverse the operation.
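The -n option to zpool create and zpool add previews the resulting layout without committing anything, which makes it a safe first step; a sketch:

```
# Dry run: print the layout this operation would produce, but change nothing.
zpool add -n tank raidz c3t0d0 c3t1d0 c3t2d0

# The force option bypasses warnings (for example, mismatched redundancy).
# Use it only when you are certain about the outcome.
zpool add -f tank c3t0d0
```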
These are the rules for ZFS pools:
- Once a normal (dynamic stripe) vdev is added to a ZFS pool, it cannot be removed.
- Only the special-use vdevs can be removed: spare, log, and cache.
- A disk in a vdev can be replaced only by a disk of the same size or larger.
- A disk can be attached to a single-disk or mirrored vdev to form a two-way or three-way mirror (see the sketch after this list).
- New disks cannot be added to an existing RAID-Z or RAID-Z2 vdev configuration.
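Sketches of the operations these rules do allow, again with placeholder device names:

```
# Attach a disk to a single-disk vdev, converting it into a two-way mirror.
zpool attach tank c1t0d0 c1t1d0

# Replace a disk with one of the same size or larger.
zpool replace tank c1t2d0 c2t2d0

# Remove a special-use vdev (spare, log, or cache); normal vdevs cannot be removed.
zpool remove tank c2t0d0
```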