Performance in a Campus Cluster Environment
Data services running in a campus cluster environment may encounter performance degradation due to the latencies imposed by the distance between the nodes and the storage subsystems. Over long distances, the signal travel time alone increases significantly compared with short distances. For example, a laser beam travels through a fiber optic cable at approximately 0.2 km/μs. Thus, a 40 km round trip adds a latency of 200 μs, or 0.2 ms, to all remote disk I/O operations and network transmissions between the sites. Network equipment may add further latency.
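Spelled out, the figure is simply distance divided by propagation speed; any delay added by switches or multiplexers comes on top of this baseline:

\[
t = \frac{d_{\text{round trip}}}{v} = \frac{40\ \text{km}}{0.2\ \text{km}/\mu\text{s}} = 200\ \mu\text{s} = 0.2\ \text{ms}
\]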
Basic Performance Considerations
Because data must be mirrored to the remote site, some latency is unavoidable. The Sun Cluster 3.0 campus cluster environment requires a volume manager product (such as Solaris Volume Manager software) for host-based remote data mirroring. The synchronous mirroring process introduces latency because a write operation completes only after the write operations to all mirrors have completed. Read operations, however, are not affected if the preferred plex property is employed. This property can be used to direct the volume manager to a local plex for read operations.
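A minimal sketch of how such a read policy might be set, assuming a VERITAS Volume Manager volume appvol in disk group appdg whose plex appvol-01 resides on the local storage array (all names are placeholders):

    # vxvol -g appdg rdpol prefer appvol appvol-01

Solaris Volume Manager expresses a similar idea through the mirror read option; for example, directing reads of mirror d10 to its first submirror, assuming that submirror is built on the local array:

    # metaparam -r first d10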
Traffic routed over the cluster interconnect is affected by the distance between the nodes in the cluster infrastructure. This traffic includes intracluster traffic, network packets for scalable services, global file service (GFS) operations, including data and replication information, and possibly user application traffic.
The GFS, also called the cluster file system (CFS), uses the concept of a single primary server node for a CFS. Nodes that are not the primary server access the CFS through the cluster interconnect. Most applications today do not make heavy concurrent use of the CFS. A recommended practice for campus cluster environments is to ensure that the primary server node of a CFS runs on the same node as the application that uses that CFS. A special resource type available in Sun Cluster 3.0 software called HAStoragePlus can be used to help ensure colocation of the storage and the services using that storage. HAStoragePlus has a special resource property called AffinityOn that, if set to "True," provides exactly this functionality.
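A minimal sketch of such a configuration, assuming an existing resource group app-rg, an application resource app-rs, and a cluster file system mounted at /global/appdata (all names are placeholders; the scrgadm syntax shown is from Sun Cluster 3.x):

    # scrgadm -a -t SUNW.HAStoragePlus
    # scrgadm -a -j app-hasp-rs -g app-rg -t SUNW.HAStoragePlus \
      -x FilesystemMountPoints=/global/appdata -x AffinityOn=True
    # scrgadm -c -j app-rs -y Resource_dependencies=app-hasp-rs

With AffinityOn set to True, the device group backing the file system is switched to the node where the resource group comes online, keeping the application and the storage primary colocated; the resource dependency ensures the application starts only after the file system is available.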
A failover file system (FFS, also implemented through the HAStoragePlus resource type) may be deployed if the data does not need to be globally accessible through a CFS.
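A sketch of the FFS variant, again with placeholder names: the file system is listed in /etc/vfstab on the relevant nodes as a local (non-global) mount with mount-at-boot set to no, and HAStoragePlus then mounts it only on the node where the resource group is online:

    /dev/md/appds/dsk/d100 /dev/md/appds/rdsk/d100 /local/appdata ufs 2 no logging

    # scrgadm -a -j app-ffs-rs -g app-rg -t SUNW.HAStoragePlus \
      -x FilesystemMountPoints=/local/appdata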
Heartbeats that use the cluster interconnect typically have timeouts that are orders of magnitude higher than the latency, even over a long distance. However, because of the complexity of networks spanning long distances and the latencies introduced by additional hardware, the probability of a failure is much higher than in a local environment, so monitoring software must be configured accordingly.
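As a sketch of what configuring monitoring accordingly can mean at the data-service level, the standard probe and retry properties of a resource can be relaxed (the resource name and values are placeholders; the useful set of tunables depends on the agent):

    # scrgadm -c -j app-rs -y Thorough_probe_interval=120 \
      -y Retry_interval=600 -y Retry_count=3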
Oracle Parallel Server and Oracle9i RAC
Enterprise environments increasingly deploy parallel databases to achieve greater scalability and higher levels of service availability. In campus cluster environments, many companies deploy the Oracle Parallel Server (OPS) and Oracle9i Real Application Clusters (RAC) technologies. If the performance of an OPS/RAC configuration is latency bound on the interconnect or the storage, the longer distance between sites in a campus cluster most likely impacts database performance negatively.
Performance Recommendations
The following recommendations help reduce possible performance impacts of wide-area campus clusters:
Use the "preferred plex" property of a volume manager to achieve good read performance.
Use the HAStoragePlus resource type to configure the colocation of application and storage (using the AffinityOn resource property) or to configure failover filesystems if the information does not need to be accessed globally.
Do intensive performance testing, especially for peak application usage levels, to ensure that unexpected performance degradation does not adversely affect the production environment.