Hardware Replication Challenges
As the business value of the data increases, the replication technique becomes more important. Consequently, more and more information technology (IT) infrastructures rely on data replication for availability, access, and security of vital data. Data replication consists of cloning a set of data stored on a primary disk onto a secondary disk. Storage administrators set up data replication for various reasons:
Disaster recovery plans: store a clone of the production data off site.
Live data manipulation: process data without overloading the production system.
Online backups: back up online data without disrupting the production system.
This article covers the following topics:
- "I/O Stack and Data Replication"
- "Hardware Replication"
- "Hardware Replication Problems"
- "Example 1: Logical Device With SVM"
- "Example 2: Logical Device With VxVM"
- "Conclusion"
This article is intended for use by intermediate-level system administrators.
The I/O stack model provides a programmatic view of how data is treated at each stage of an I/O operation. Different types of replication exist along the I/O stack. This article describes the challenging aspects of hardware replication. Commonly used in large environments, this type of replication is attractive because it does not overload a server and because it is operating system (OS) independent. However, accessing the replicated data is not straightforward: doing so without sacrificing the performance and flexibility of a Logical Volume Manager (LVM) requires additional steps. As an example, this article details how to access, on a single host, both primary and secondary (replicated) data, using ShadowImage on the Sun StorEdge 99x0 systems and the VERITAS Volume Manager (VxVM). The article is as generic as possible, so that this example can apply to other platforms.
I/O Stack and Data Replication
When dealing with storage systems, it is important to keep in mind the I/O architecture within a server. This I/O architecture, better called the I/O stack, describes how data stored on physical media is accessed by an application. The I/O path corresponds to the software stack an I/O request has to go through. Replication of data consists of copying an I/O request from a primary I/O stack to a secondary one. This replication can occur within different layers of the I/O stack, either locally or remotely. The following paragraphs briefly describe the I/O layers and their corresponding replication modes.
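The idea of copying each I/O request from a primary stack to a secondary one can be illustrated with a minimal sketch. It is not how any particular product works: the BlockDevice and ReplicatedDevice classes are hypothetical in-memory stand-ins, and the copy is assumed to be synchronous (the write returns only after both sides are updated).

```python
class BlockDevice:
    """Hypothetical in-memory stand-in for a disk made of fixed-size blocks."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def write(self, block_no, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.blocks[block_no] = bytes(data)

    def read(self, block_no):
        return self.blocks[block_no]


class ReplicatedDevice:
    """Copies every write request to both a primary and a secondary device."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, block_no, data):
        # Assumption: synchronous replication, both copies updated per request.
        self.primary.write(block_no, data)
        self.secondary.write(block_no, data)

    def read(self, block_no):
        # Reads are served from the primary copy only.
        return self.primary.read(block_no)


primary = BlockDevice(num_blocks=8)
secondary = BlockDevice(num_blocks=8)
pair = ReplicatedDevice(primary, secondary)

payload = b"A" * 512
pair.write(3, payload)
```

After the write, block 3 holds the same content on both devices. Which layer of the stack performs this duplication is what distinguishes the replication modes described in this article.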
FIGURE 1 I/O Stack Layers
Layer 1: Physical Layer
This I/O layer includes all of the storage hardware components: storage subsystems, the connectivity, disk drives, and storage area network (SAN) switches. When the replication occurs at this level, it is called hardware replication, and it is handled by a physical component such as a disk array (Sun StorEdge 99x0 systems, for example) or even a SAN switch (Sun PSX-100).
Layer 2: Device Drivers Layer
Device drivers are software components specific to a given piece of hardware. There is a device driver for every hardware device the operating system interacts with (controller bus, device type, ...). Device drivers make a physical disk drive visible to the OS as a disk device (/dev/rdsk/c5t, ...). No replication products operate at the device driver level.
Layer 3: Logical Volume Manager Layer
The role of the logical volume manager (LVM) is to organize and manage disk devices, so as to optimize the performance, availability, and capacity of a storage space. This software layer determines which physical device an I/O request goes to. This decision is handled by different algorithms such as RAID-1, RAID-5, and so on. This layer is very complex and, in many cases, it is optional. Different products on the market offer an LVM-based replication type. These include the Sun StorEdge Availability Suite Software and the VERITAS Replicator.
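As an illustration of the kind of algorithm this layer applies, the following sketch shows the XOR parity principle that RAID-5 relies on: a parity block computed across the data blocks of a stripe allows any single lost block to be rebuilt. The helper name xor_blocks and the two-byte sample blocks are assumptions made for the example. Real volume managers also handle stripe layout and parity placement, which are not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks belonging to one stripe (tiny blocks for readability).
data_blocks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]

# The parity block that would be stored on a fourth device.
parity = xor_blocks(data_blocks)

# Simulate losing the second block and rebuilding it from the survivors
# plus the parity block.
rebuilt = xor_blocks([data_blocks[0], data_blocks[2], parity])
```

Here rebuilt equals the lost second block, because XOR-ing the parity with the surviving blocks cancels them out.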
Layer 4: File System Layer
On top of a logical volume is the file system. This software layer provides an interface to block addressing. Indeed, to store data, it is necessary to determine a suitable location on the media. The file system translates a block address to a logical address, such as a file or directory. The file system is also responsible for managing the storage space by offering different services such as metadata, error checking, and clean-up. Products that offer replication at this I/O layer include the Solaris Operating System's built-in filesync and GNU rsync.
Layer 5: Application Layer
This layer corresponds to the application stack running on a given server. Replication is sometimes handled directly by the application. ORACLE products and Lotus Notes include replication mechanisms for their own data.
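The five layers and the replication options named above can be summarized as a small lookup table. This is only a restatement of the text in code form. The dictionary name IO_STACK_REPLICATION and the helper layers_with_replication are hypothetical names chosen for the sketch.

```python
# Layer number -> (layer name, replication examples named in this article).
IO_STACK_REPLICATION = {
    1: ("Physical", ["Sun StorEdge 99x0 systems (disk array)",
                     "Sun PSX-100 (SAN switch)"]),
    2: ("Device Drivers", []),  # no replication products at this level
    3: ("Logical Volume Manager", ["Sun StorEdge Availability Suite",
                                   "VERITAS Replicator"]),
    4: ("File System", ["filesync", "rsync"]),
    5: ("Application", ["ORACLE products", "Lotus Notes"]),
}


def layers_with_replication(stack):
    """Return the names of layers offering replication, from the bottom up."""
    return [name for _, (name, products) in sorted(stack.items()) if products]


replicating_layers = layers_with_replication(IO_STACK_REPLICATION)
```

Every layer except the device driver layer appears in the result, which matches the descriptions above.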