- 2.1 ZFS Pool Concepts
- 2.2 Creating a Dynamic Stripe
- 2.3 Creating a Pool with Mirrored Devices
- 2.4 Creating a Pool with RAID-Z Devices
- 2.5 Creating a Spare in a Storage Pool
- 2.6 Adding a Spare Vdev to a Second Storage Pool
- 2.7 Replacing Bad Devices Automatically
- 2.8 Locating Disks for Replacement
- 2.9 Example of a Misconfigured Pool
2.7 Replacing Bad Devices Automatically
ZFS can replace a failed disk in a pool automatically, without intervention by the administrator. This feature, known as autoreplace, is turned off by default. When enabled, it allows ZFS to replace a bad disk with a disk from the spares pool automatically, so the pool keeps operating with full redundancy and the administrator can replace the failed drive at a later time. The manual disk replacement procedure is covered later in this chapter.
If you list the properties of the ZFS pool with the get subcommand, you can see that the autoreplace feature is turned off:
```
# zpool get all mpool
NAME   PROPERTY       VALUE                 SOURCE
mpool  size           234M                  -
mpool  used           111K                  -
mpool  available      234M                  -
mpool  capacity       0%                    -
mpool  altroot        -                     default
mpool  health         DEGRADED              -
mpool  guid           11612108450022771594  -
mpool  version        13                    default
mpool  bootfs         -                     default
mpool  delegation     on                    default
mpool  autoreplace    off                   default
mpool  cachefile      -                     default
mpool  failmode       wait                  default
mpool  listsnapshots  off                   default
```
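If you need this check in a script, the four-column listing format is easy to parse. Here is a minimal Python sketch (the function name is my own, not part of any ZFS tooling) that turns `zpool get all` output into a dictionary:

```python
def parse_zpool_get(output: str) -> dict:
    """Parse `zpool get all <pool>` output into a {property: value} dict.

    Assumes the standard four-column NAME/PROPERTY/VALUE/SOURCE layout
    shown above: the property name is the second column, its value the third.
    """
    props = {}
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 4:
            props[fields[1]] = fields[2]
    return props

# Abridged sample taken from the listing above.
sample = """\
NAME   PROPERTY     VALUE     SOURCE
mpool  health       DEGRADED  -
mpool  autoreplace  off       default
"""
print(parse_zpool_get(sample)["autoreplace"])  # off
```

In practice the output would come from running the command, e.g. via `subprocess.run(["zpool", "get", "all", "mpool"], capture_output=True, text=True)`.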
To turn on the autoreplace feature, use the following command line:
```
# zpool set autoreplace=on mpool
```
To simulate a bad device, I shut down the system and removed c5t4d0 from the system. The system now cannot contact the disk and has marked the removed disk as unavailable. With the system rebooted, you can examine the output of the zpool status command:
```
 1  $ zpool status mpool
 2    pool: mpool
 3   state: DEGRADED
 4  status: One or more devices could not be opened. Sufficient replicas exist for
 5          the pool to continue functioning in a degraded state.
 6  action: Attach the missing device and online it using 'zpool online'.
 7     see: http://www.sun.com/msg/ZFS-8000-2Q
 8   scrub: resilver completed after 0h0m with 0 errors on Mon Apr 6 00:52:36 2009
 9  config:
10
11          NAME           STATE     READ WRITE CKSUM
12          mpool          DEGRADED     0     0     0
13            mirror       ONLINE       0     0     0
14              c5t2d0     ONLINE       0     0     0
15              c5t3d0     ONLINE       0     0     0
16            mirror       DEGRADED     0     0     0
17              spare      DEGRADED     0     0     0
18                c5t4d0   UNAVAIL      0    89     0  cannot open
19                c5t14d0  ONLINE       0     0     0  31K resilvered
20              c5t5d0     ONLINE       0     0     0  31K resilvered
21          spares
22            c5t14d0      INUSE     currently in use
23
24  errors: No known data errors
```
On line 3, the state of the pool has been degraded, and line 4 tells you that the pool can continue operating in this state. Lines 6 and 7 tell you what action to take, and the Web site referenced on line 7 gives a more detailed message explaining how to correct the problem. Line 19 shows that the spare disk c5t14d0 has been resilvered from its mirror partner c5t5d0. Line 22 gives the new status of the spare disk c5t14d0: it is now in use.
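The status output also lends itself to simple monitoring. As a sketch (the function names are my own, not part of ZFS), you could flag a degraded pool and list any spares currently in use by scanning the `zpool status` text:

```python
import re

def pool_state(status_output: str) -> str:
    """Extract the pool state (e.g. ONLINE, DEGRADED) from `zpool status` text."""
    m = re.search(r"^\s*state:\s*(\S+)", status_output, re.MULTILINE)
    return m.group(1) if m else "UNKNOWN"

def spares_in_use(status_output: str) -> list:
    """Return device names marked INUSE (spares currently in service)."""
    return [m.group(1) for m in
            re.finditer(r"^\s*(\S+)\s+INUSE\b", status_output, re.MULTILINE)]

# Sample abridged from the degraded-pool listing above.
sample = """\
  pool: mpool
 state: DEGRADED
config:
        NAME        STATE     READ WRITE CKSUM
        mpool       DEGRADED     0     0     0
        spares
          c5t14d0   INUSE     currently in use
"""
print(pool_state(sample))     # DEGRADED
print(spares_in_use(sample))  # ['c5t14d0']
```

A cron job built on such a check could alert you that a spare has kicked in, since autoreplace otherwise makes the failure invisible to day-to-day operation.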
The original meaning of resilver is the process of restoring a glass mirror with a new silver backing. In ZFS, it is the re-creation of data by copying from one disk to another; other volume management systems call the process resynchronization. Continuing the example, you can shut down the system and attach a new disk in the same location as the missing disk. The new disk at location c5t4d0 is automatically resilvered into the mirrored vdev, and the spare disk is returned to an available state:
```
$ zpool status mpool
  pool: mpool
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Mon Apr 6 02:21:05 2009
config:

        NAME         STATE     READ WRITE CKSUM
        mpool        ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c5t2d0   ONLINE       0     0     0
            c5t3d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c5t4d0   ONLINE       0     0     0  23K resilvered
            c5t5d0   ONLINE       0     0     0  23K resilvered
        spares
          c5t14d0    AVAIL

errors: No known data errors
```
If the original disk is reattached to the system instead, ZFS does not handle the case with the same grace. Once the system is booted, the original disk must be detached from the ZFS pool. Next, the spare disk (c5t14d0) must be replaced with the original disk. The last step is to place the spare disk back into the spares group.
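If you find yourself scripting this recovery, the three commands can be generated from the pool and device names. A minimal sketch (the helper name is hypothetical; the device names are the ones used in this example):

```python
def spare_recovery_commands(pool: str, original_disk: str, spare_disk: str) -> list:
    """Build the zpool command sequence for the three-step recovery above:
    detach the reattached original, swap it in for the in-use spare,
    then return the spare to the spares group."""
    return [
        ["zpool", "detach", pool, original_disk],
        ["zpool", "replace", pool, spare_disk, original_disk],
        ["zpool", "add", pool, "spare", spare_disk],
    ]

for cmd in spare_recovery_commands("mpool", "c5t4d0", "c5t14d0"):
    print(" ".join(cmd))
```

Each command list could then be run in order with `subprocess.run`, checking the return code before proceeding to the next step.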
```
$ pfexec zpool detach mpool c5t4d0
$ pfexec zpool replace mpool c5t14d0 c5t4d0
$ pfexec zpool add mpool spare c5t14d0
$ zpool status mpool
  pool: mpool
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Mon Apr 6 02:50:25 2009
config:

        NAME         STATE     READ WRITE CKSUM
        mpool        ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c5t2d0   ONLINE       0     0     0
            c5t3d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c5t4d0   ONLINE       0     0     0  40.5K resilvered
            c5t5d0   ONLINE       0     0     0  40.5K resilvered
        spares
          c5t14d0    AVAIL

errors: No known data errors
```