- Introduction
- Windows Clustering 101
- Forest Creation Process
- Installation
- Installation of Root Domain
- Quality Assurance
- Forest Preparation, DNS, and Exchange
- Installation of Bridgehead Servers and the Child Domain
- Installing DHCP and WINS Services
- Patching and Updating Domain Controllers
- Exchange Domain Preparation
- Creation of Initial Service and Administration Resources
- Clustering
- Time-Out
Clustering
A number of steps must be completed before a multiple-node cluster is up and running. Figure 6.6 provides a flow-chart of the steps to perform.
Figure 6.6 Cluster creation flow-chart.
Create Shared Disk Resources
The first step in creating the cluster is the configuration of the disk drives. Obviously, the cluster creation fails if it does not find or recognize drives. If you are going to cluster on a SAN or a SCSI-shared storage array, then you first need to install your host bus adapters (HBAs) in the servers and configure them. This step might entail installing special drivers for the cards, management software, and any patches that may be necessary to get them working in Windows Server 2003.
After the adapters are installed and the interface management software sees the controllers working, you'll connect them to the SCSI array or to the switches of the SAN fabric. By now, your disk arrays are installed and ready to go.
This looks like a small step in the flow-chart in Figure 6.6, but it's not. It can take a lot of time and effort to set up the SAN devices, and the effort can vary greatly between different SANs, SCSI arrays, or disk replication solutions, such as the one provided by NSI-Software (see Chapters 9 and 10). The installation and configuration of the SAN, fabric, zoning, and so on is very complex and beyond the scope of this book.
Make sure your servers see the external or replicated drives. If everything is configured properly, your Windows servers see the drives as if they were installed locally. You can manage the new drives from the SAN management software and from various server utilities, including the Computer Management utility. This is demonstrated in Figure 6.7.
Figure 6.7 Computer Management recognizing the shared disk drives.
In Figure 6.7, notice the presence of the P and Q drives. In this case, we have configured the P drive for the application, and the Q drive holds the so-called Quorum resource, which is essential to the cluster.
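If you prefer to double-check from the command line that a node sees the shared drives, a quick look with the diskpart utility (included with Windows Server 2003) is usually enough; the P and Q drives from our example should appear in the volume listing on the node that currently owns the storage:

    diskpart
    DISKPART> list disk
    DISKPART> list volume
    DISKPART> exit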
Prepare the Cluster Network
The next step is to prepare the cluster network or interconnect between the nodes in the cluster. If you are installing a 2-node (active-passive) cluster, it is possible to install an interconnect network between the nodes using a single network cable attached NIC to NIC. The network link needs to be crossed over, but you may not need a cross-over cable because most modern servers employ NICs that recognize the need to cross over the datapaths.
Your interconnect IP configuration must be different from that of the LAN NICs. In other words, you should set up a private subnet between the servers (unless you are setting up geo-clusters and don't have enough cable to stretch your cluster from NY to LA). For example, if your LAN is on a subnet configured as 10.10.20.0, then put the interconnect on a 192.168.0.0 subnet. The interconnect NIC on one node is, thus, 192.168.0.1, and the NIC on the other node is 192.168.0.2. Leave the gateway addresses on both NICs vacant. As long as .1 can ping .2, your interconnect is ready.
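A minimal command-line sketch of this configuration follows; the connection name Interconnect is only an example (use whatever you renamed the heartbeat NIC to in Network Connections), and the addresses are the ones from the paragraph above. Note that no gateway is supplied.

On the first node:

    netsh interface ip set address name="Interconnect" source=static addr=192.168.0.1 mask=255.255.255.0

On the second node:

    netsh interface ip set address name="Interconnect" source=static addr=192.168.0.2 mask=255.255.255.0

Then, from the first node, confirm the link:

    ping 192.168.0.2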
If you are going to install an N+1 configuration or any configuration comprising more than two nodes, then you need to use a hub for your interconnect network. This issue was discussed in Chapter 4. Remember, you don't need a switch.
One last word: Make sure your interconnect NICs' IP addresses do not end up in the DNS configuration as belonging to your virtual server (the cluster name), because that can result in problems for clients connecting to the name resource. In other words, they can look up the resource IP address, but they are unable to connect to it.
Start Server Cluster Wizard
You can install a cluster interactively using the GUI of the Server Cluster Wizard, or from the command line by passing parameters to the cluster executable (cluster /create). We recommend that until you know enough about what makes the Cluster service tick, you work with the wizard. The remainder of this chapter discusses installing a cluster using the wizard.
At this point in the cluster configuration and installation, shut down all potential cluster nodes except the first node. It is important to install the first node without the possibility that other nodes might interfere with the installation process. The cluster is created on the first node so that it can gain exclusive use of the shared resources, and the wizard creates a new cluster only if it discovers that it is working with the first node. After the cluster has been created, subsequent nodes are added to the existing cluster, and the procedure is different.
Also, when you power on and start the operating system, make sure that only the first node has access to the cluster disks. If another server can see and access the disks, the data on them can easily be destroyed and the disks would have to be reformatted. To prevent the corruption of the cluster disks, you should shut down all but the node you are going to make the first node in the cluster. You can use other techniques (such as Logical Unit Number (LUN) masking, selective presentation, or zoning) to protect the cluster disks before creating the cluster, but we have learned that it's safer to simply power down all the other nodes until you have a cluster. After the Cluster service is running properly on one node, the other nodes can be powered up and added to the cluster as needed.
When you create a cluster, physical disk resources are automatically created for cluster disks that use drive letters. As mentioned earlier, follow a sound naming convention for all your resources and keep the names consistent. This is critical, as you will see in the final chapter of this book when we configure and use Microsoft Operations Manager. The alerts and logs are not much help if you can't identify the devices, and the servers they are on, in your MOM data.
To get started, open Cluster Administrator from Administrative Tools. Select File, Open Connection, and then select Create New Cluster from the Action list in the dialog box that appears. This is demonstrated in Figure 6.8. Click OK to launch the Server Cluster Wizard.
Figure 6.8 The Create New Cluster option in Cluster Administrator.
The first request the wizard makes is for the domain name of the cluster and the cluster name. For the deployment shown in this chapter, make sure you have the correct domain name and that the name you use for the virtual server is the cluster name. This is demonstrated in Figure 6.9. You can set up additional network names for the actual application resources (such as SQL Server), as we show in Part II of this book.
Figure 6.9 Cluster domain and cluster name.
Enter the domain name and cluster name and then click Next. The Select Computer Name dialog box appears. Enter the name of the node you are installing (typically the server on which you started Cluster Administrator), and then click Next. The wizard now analyzes the configuration to check whether it has everything it needs to create a cluster. This is demonstrated in Figure 6.10, which shows an analysis that has failed for a variety of reasons. When you see a lot of red and yellow in the dialog box, it's a sign you have work to do before you can move forward.
If the analysis fails, you can simply go back or cancel out of the wizard and proceed to fix the problems that were discovered in the configuration analysis stage. The wizard can then be restarted at any time. Figure 6.11 shows that the configuration analysis has now succeeded. You are looking for check marks in all areas and a solid green line on the progress bar. When you have a clean analysis, click Next to continue installing the cluster.
Figure 6.10 Cluster configuration analysis has failed.
Figure 6.11 Cluster configuration analysis has succeeded.
The next dialog box prompts you for an IP address that Cluster Administrator can connect to. This is shown in Figure 6.12. Enter the IP address and click Next.
Figure 6.12 Cluster configuration IP address requirement.
The Cluster Service Account dialog box now appears, as shown in Figure 6.13. You need to enter an account name, its password, and the domain before continuing. Create an account specifically for the cluster service (create a separate account for each cluster).
Figure 6.13 Cluster service account configuration requirement.
In the example shown, it is clear from the account name that the cluster service account is intended for the first SQL Server cluster. Under no circumstances should you make the account a member of Domain Admins. Making the cluster service account a member of Domain Admins was a common practice with earlier versions of Windows. With Windows Server 2003, the account only needs administrative rights on each node of the cluster. Upon entering the account data, click Next. The proposed cluster configuration is presented in the next dialog box. You can confirm the configuration, or go back to make a change if needed. If everything checks out, click Next to begin the installation. Upon successful creation of the cluster, the dialog box shown in Figure 6.14 appears. When you again have a solid green line in the progress bar, you have yourself a cluster. The next dialog box gives you an opportunity to examine the cluster installation log.
Figure 6.14 The cluster has been successfully created.
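If you have not already created the dedicated cluster service account described above, a minimal sketch from the command line follows; the account name SQLClusterSvc and the domain name MYDOMAIN are placeholders for your own naming convention (you can just as easily create the account in Active Directory Users and Computers).

Create the domain account (the * prompts for a password):

    net user SQLClusterSvc * /add /domain

Then, on each cluster node, give the account local administrative rights:

    net localgroup Administrators MYDOMAIN\SQLClusterSvc /add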
Now you can close Cluster Administrator and then reopen it to attach to the local node, where you now have a single-node cluster running. You can attach Cluster Administrator to the name of the cluster, or you can use the . (dot) notation, which is shorthand for the local node. If Cluster Administrator attaches to the node successfully, the cluster can be accessed and your configuration can continue. This is demonstrated in Figure 6.15.
Figure 6.15 Attaching to the cluster.
You can now continue to build the cluster by adding additional nodes to it. In Cluster Administrator, click File, select New, and then select Node from the child menu. This is shown in Figure 6.16.
Figure 6.16 Adding nodes to the cluster.
Upon selecting New, the Add Nodes Wizard appears and prompts you to enter the name of the server that will be added as a new node to the cluster. This dialog box is shown in Figure 6.17.
Figure 6.17 The Add Nodes Wizard.
Add the computer details and click Next. From here on, the process is the same as it was for the first node. You are asked again for cluster service account information, and you must provide the same service account used for the first node in the cluster. The Add Nodes Wizard again performs a configuration analysis. When you see a green progress bar, you have a two-node cluster and are ready to begin configuring resources for the cluster.
Cluster Administrator is not the only way to tell whether the cluster is operating properly. Open a command prompt and enter the command cluster resource. This action lists the status of the available resources of the cluster. (You can issue this command even before you have added the second node to the cluster.) This is illustrated in Figure 6.18.
Figure 6.18 Checking cluster resource status.
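Two related subcommands of the same cluster executable are worth running at this point as a quick health check:

    cluster node     (lists each node and whether it is Up or Down)
    cluster group    (lists each resource group, the node that owns it, and its status)

If the cluster name or cluster IP address resource shows up as Offline or Failed in these listings, sort that out before you go any further.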
It is also important to check whether the cluster has been registered in DNS and can be accessed from the network. You can do this by simply pinging the cluster name from the command line as demonstrated in Figure 6.19.
Figure 6.19 Ping the virtual server or cluster name on the network.
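For example, assuming the cluster name is sqlcluster1 (a placeholder; substitute your own virtual server name), the checks from any machine on the LAN look like this:

    ping sqlcluster1
    nslookup sqlcluster1

The replies should come from the cluster IP address you gave the wizard, not from an interconnect address; nslookup shows you exactly what DNS is handing out, which is an easy way to catch the registration problem warned about earlier.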
During the cluster creation process (using the Quorum button on the Proposed Cluster Configuration page), you are able to select a quorum resource type (that is, a Local Quorum resource, a Physical Disk or other storage-class device resource, or a Majority Node Set resource).
Troubleshooting
If things go bad and the cluster fails, you can simply back out of the clustering process, fix the errors, and restart the process. Usually the clustering process then starts again with no issues. It is possible, however, to corrupt the cluster database or contaminate it with invalid data. You may then have to back a node out of the cluster, and it may not be possible to do this cleanly.
If you need to evict a node from the cluster, you can do this from the Cluster Administrator. Figure 6.20 shows the process of evicting a node that has for some reason become inoperable.
Figure 6.20 Evicting a node from the cluster.
Now, if the database is corrupt, it might not be possible to evict the node, and you may have to destroy the cluster and start all over again. When you shut down a cluster, you do not remove the cluster database (like the WINS or DHCP database, it is always there). It remains on the disk, and it can remain in a corrupt state. You are unable to re-create a cluster until the database is clean again. If you have cause to blow away the cluster and start all over again with a clean database, then perform the following steps.
Open a command window on each node and change the directory to the Cluster folder under the system root (such as C:\Windows\Cluster). Then run the cluster node command with the /forcecleanup option. The exact command is important and not easily remembered; see Figure 6.21, which demonstrates it.
Figure 6.21 Cleaning up the cluster database on a node.
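For reference, the cleanup command takes the name of the node being cleaned; run on the node itself, the session looks roughly like this (%COMPUTERNAME% expands to the local computer name):

    cd /d %SystemRoot%\Cluster
    cluster node %COMPUTERNAME% /forcecleanup

This forcibly resets the Cluster service configuration on that node so it can later join a new or rebuilt cluster.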
You have now seen the Cluster command used for more than one purpose. Clearly, you can call the cluster executable from a script and configure a cluster after an unattended setup. As soon as the operating system is online, you can run a script to invoke the cluster /create command and supply the necessary configuration parameters at the command line. Imagine that you drop a CD into a blank server and go have a cup of coffee. When you return, you have a cluster running and serving tens of thousands of users.
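A rough sketch of such a script follows. Every value is a placeholder, and the option names are the short forms we are used to seeing from cluster /create /?; check that help on your own build before trusting this in an unattended installation, because a mistyped option will simply fail.

    cluster /cluster:SQLCLUSTER1 /create /unattend /node:NODE1 /user:MYDOMAIN\SQLClusterSvc /pass:YourPasswordHere /ipaddr:10.10.20.50

The /ipaddr value can also carry the subnet mask and the public network name, separated by commas. A follow-up call such as cluster /cluster:SQLCLUSTER1 resource in the same script gives you a quick pass/fail check before the coffee is finished.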