- What's New in SQL Server AlwaysOn and Availability Groups
- SQL Server AlwaysOn and Availability Groups
- Summary
SQL Server AlwaysOn and Availability Groups
Typical use cases (scenarios) for AlwaysOn and availability groups include the following:
- Need for High Availability nearing five-nines (99.999% available). This means that your database layer must be super resilient to failure and have nearly no data loss in the case of failure.
- You have disaster recovery (DR) needs that need to replicate data to another site (perhaps on the other side of the country or planet), but you can tolerate a little bit of data loss (and data latency).
- You have a performance need to offload some operational functions such as database backups away from your primary database. These must be completely accurate and have the highest integrity for recovery purposes.
- You have a performance and availability need to offload read-only processing/access away from your primary transactional database, and you can tolerate a bit of latency. Even when the primary is down, you still provide read-only access to your applications.
All of the these can be addressed by this new AlwaysOn Availability Groups feature, and it’s easier to do than you think.
Windows Failover Cluster Services
Windows Failover Cluster Services (WFCS) is actually considered a part of the basic HA foundation components described earlier, except that it’s possible to build an HA system without it (for example, a system that uses numerous redundant hardware components and disk mirroring or RAID for its disk subsystem). Microsoft has made WFCS the cornerstone of its clustering capabilities, and WFCS is used by applications that are cluster enabled. A prime example of a cluster-enabled technology is Microsoft SQL Server 2014 (and most of its components). For more on clustering or SQL clustering, see Chapter 44, SQL Server Failover Clustering.
WFCS can be set up in an active/passive or active/active mode. Essentially, in an active/passive mode, one server sits idle (passive) while the other is doing the work (active). If the active server fails, the passive one takes over the shared disk and the cluster-aware applications instantaneously.
AlwaysOn Failover Clustering Instances
Being always on is a pretty powerful statement and commitment. It is now possible to mostly achieve this commitment with infrastructure, instance, database, and client-connectivity-level HA. Built on WSFC (or Failover Cluster Server (FCS) as it is also known), the SQL Server AlwaysOn configuration leverages the tried-and-true experience and technology components of SQL clustering and database mirroring under the covers (and repackaged). This new packaging and enhancement has allowed Microsoft to deprecate database mirroring in favor of these new AlwaysOn options (FCI and availability groups). AlwaysOn Failover Cluster Instances (FCI) is the server-level instance portion of the AlwaysOn HA capability. As you can see on the left side of Figure 45.1, the FCI is a two-server (node) SQL cluster that will also be utilized in an availability group configuration to give the primary (the application database that your application uses) the highest level of resilience possible. We didn’t have to make the primary have SQL instance-level availability, but are showing you that you can do more than just simple AlwaysOn configurations. This SQL Server clustered instance shares the database as a part of its clustering configuration for the instance. Server A and Server B form the SQL cluster and are configured as an active/passive cluster. The entire SQL cluster will be Node 1 in our availability group configuration. More on the availability group (database-level availability) in the next section.
FIGURE 45.1 AlwaysOn and availability group components.
The FCI becomes fault tolerant at the server instance level for maximum availability of SQL Server itself (just as you know and love SQL clustering from the past) and is using a single storage location that is shared (owned) by the two underlying SQL instances for the clustered instances’ database storage. Okay, let’s continue exploring the concepts with AlwaysOn Availability Groups.
AlwaysOn Availability Groups
Figure 45.1 also shows the availability group capability. Availability groups are focused on database-level failover and availability by utilizing a data redundancy approach. Again, borrowing from the database mirroring experience (and underlying technologies), a transactionally consistent secondary replica is made that can be used for both read-only access (active for use at all times) and for failover if the primary database (primary replica) fails for any reason. In Figure 45.1, you can see a SQL Server AlwaysOn Availability Group being used for HA and even for distributing the read-only workload off of the primary SQL Server instance to the secondary replica. You can have up to four secondary replicas in an availability group with the first secondary replica being used for automatic failover (using the synchronous-commit mode), and then other secondary replicas available for workload distribution and manual failover use. Remember, this is storing data redundantly, and you can sure burn up a lot of disk storage fast. When in synchronous-commit mode, that secondary replica can also be used to make database backups because it is completely consistent with the primary replica. Outstanding!
Modes
As with database mirroring, two primary modes are used to move data via the transaction log from the primary replica to the secondary replicas; synchronous mode and asynchronous mode. Synchronous mode means that the data writes of any database change must be done in not only the primary replica but also the secondary replica as a part of one logical committed transaction. This can be costly in the sense of doubling the writes, so the connection between the primary and secondary should be fast and nearby (within the same subnet). However, for this reason, the primary and secondary replicas are in a transactionally consistent state at all times, which makes failover nearly instantaneous. Synchronous mode is what is used for automatic failover between the primary replica and the secondary replica. You can have up to 3 nodes in Synchronous mode (essentially two secondaries and one primary at once). Figure 45.1 shows that Node 1 and Node 2 are configured to use automatic failover mode (synchronous). As previously mentioned, because of this transactional consistency, it is also possible to do database backups against the secondary replica with 100% accuracy and integrity.
Asynchronous mode does not have the commit transaction requirement that synchronous mode has and is actually pretty lightweight (from a performance and overhead point of view). We have observed that, even in asynchronous mode, transactions made it to the secondary replicas pretty quickly (in seconds) in most cases. Network traffic and the number of transactions determine this. Asynchronous mode can also be used about anywhere you need within your stable network (across the country or even to another continent if you have decent network speeds).
The AlwaysOn Availability Groups feature also takes advantage of transaction record compression, which allows for compression of the transaction log records used in database mirroring and AlwaysOn configurations to increase the speed of transmission to the mirror or replicas.
In addition, as with database mirroring, during the data replication of the transaction, if data page errors are detected, the data pages on the secondary replica are repaired as a part of the transaction writes to the replica and raise the overall database stability even further (if you had not been replicating). Nice feature.
Read-Only Replicas
As you can also see in Figure 45.1, creating more secondary replicas is possible (up to eight). However, these must be asynchronous replicas. You can easily add these to the availability group and provide distribution of workload and significant mitigation to your performance. Figure 45.1 shows two additional secondary replicas used to handle all the read-only data accesses that would normally be hitting the primary database. These read-only replicas will have near-real-time data and can be pretty much anywhere you want (from a stable network point of view).
For Disaster Recovery
Figure 45.2 shows a typical AlwaysOn with availability groups configuration for DR purposes. It has a primary replica in Data Center 1, and its secondary replica is in Data Center 2. We use the asynchronous mode because of the distance and network speeds.
FIGURE 45.2 An AlwaysOn and availability group configuration for disaster recovery.
This means that the secondary replica at the DR site is only as good as the most recent transaction that was written asynchronously. Some data loss may happen in this mode, but for the DR purposes, this is easily meeting the DR requirements and service levels needed. As a bonus (almost free actually), the DR site (secondary replica) can also be used for read-only data access (as also shown in Figure 45.2). You are essentially leveraging a copy of the primary database that would normally not have been considered available if other DR technologies were being used. Most are in continuous update mode and do not support read-only modes at all.
Availability Group Listeners
Looking back at Figure 45.1, we also see that the virtual network names (VNNs) that are created for the WSFCs will be used when the availability group is created. In particular, the availability group must know the virtual network names (that reference the individual instances) of all nodes in the availability group. These can be used directly to reference the primary or the secondary replicas. But for more stability (and consistency), you can create an availability group listener as part of the availability group that abstracts these VNNs away from the application that must use the databases. In this way, the application sees only one connection name at all times, and the underlying failover state is completely insulated away from the application, yielding even higher consistency and availability from the application point of view.
Endpoints
Availability groups also leverage the endpoint concept for all communication (and visibility) from one node to another node in an availability group configuration. They are the exposed point used by the availability group communication between nodes. This is also the case with database mirroring. Availability group endpoints will be created as a part of each availability group node configuration (for each replica).
Combining Failover with Scale-Out Options
SQL Server 2014 pushes combinations of options to achieve higher availability levels. Building up an AlwaysOn FCI configuration with AlwaysOn Availability Groups with two or more replicas launches you into distributed workload scalability and maximum HA.
Building a Multinode AlwaysOn Configuration
Now, let’s build a multinode AlwaysOn configuration, create the clustering configuration, define the availability group, specify the databases roles, replicate the databases, and get our availability group listener up and running, as follows:
- Verify SQL Server instances are alive and well (exist).
- Set up WSFC.
- Prepare the database.
- Enable AlwaysOn HA.
- Back up the primary databases.
- Create the availability group.
- Select the databases for the availability group.
- Identify the primary and secondary replicas.
- Synchronize the data (primary to replica).
- Set up the listener (availability group listener).
- Connect using the listener.
- Fail over to a secondary.
Figure 45.3 shows the basic configuration that we will build. We have three nodes to work with but will focus on getting the primary and one secondary up and running in this example. The failover cluster will be named sqlproduction, the availability group will be named DXD-AG1, and the listener will be named DXD-AG1Listener. We’ll create and use a database named CompSales2014. The third node can be used to create another secondary replica for read-only accesses if you want.
FIGURE 45.3 The DXD AlwaysOn configuration details.
We’ve also indicated how the availability group will be used: primary (Node 1) for both read/write operations, secondary (Node 2) for backup and failover. The other secondary (Node 3, if you add it) can be used for read-only access. Okay, let’s get going.
Verify SQL Server Instances
We will assume that you have installed and have running at least two SQL Server instances on separate nodes that can be clustered for this configuration. These don’t have to be mirror images of each other, just viable SQL Server instances that can be enabled for AlwaysOn (Enterprise or Developer Editions). Verify that the SQL Server instances are alive and well.
Set Up Failover Clustering
For each of the servers (nodes), you need to configure WSFC.
Chapter 44 showed you how to do this via the Server Manager of each node. We’ll not cover this here except to show you what the configured feature should look like before you begin creating your AlwaysOn and Availability Group configuration.
As you can see in Figure 45.4, the Failover Cluster feature must be installed for each node.
FIGURE 45.4 The installed Failover Clustering feature as shown from the Server Manager.
You likely have also run a validation of the cluster configuration as well.
A number of extensive tests are performed on each node in the cluster that you are configuring. These tests take a bit of time, so go get tea or coffee, and then make sure that you look through the summary report for any true errors. You’ll likely see a few warnings that refer to items that were not essential to the configuration (some TCP/IP or network-related things are usually here). Figure 45.5 shows this cluster configuration validation running for the newly created cluster configuration. You are now ready to get into the AlwaysOn business.
FIGURE 45.5 Validation test for the cluster configuration.
We’ve already created the cluster group access point (named sqlproduction) as you can see in Figure 45.6. The IP address for this access point is: 192.168.1.120.
FIGURE 45.6 Access point for administering the cluster and cluster name.
This cluster will contain three nodes; SQLONE, SQLTWO, and SQLTHREE, as shown in the Failover Cluster Manager in Figure 45.7.
FIGURE 45.7 Failover Cluster Manager with the sqlproduction cluster and three nodes.
Okay, time to start the AlwaysOn configuration on the SQL Server side of the equation.
Prepare the Database
We first want to make sure that you have a primary database that can be used for this example. In the sample files and code listings folder for this book on the Web (and for this chapter), you’ll find a script named CreateDB4AlwaysOn.sql that you can pull into SQL Server Management Studio (SSMS) right now. We will create a database on the SQLONE node and add a test table for you to play with for this example. If you already have another database you want to use, go ahead and use it. Figure 45.8 shows our script in a query window that was just executed. It created a database named CompSales2014 (as mentioned earlier), created a Product table, added data to the table for testing purposes, and did an initial database backup (to NULL).
FIGURE 45.8 SSMS execution of create db, create table , and initial database backup (to NULL).
Enable AlwaysOn HA
For each of the SQL Server instances that you want to include in the AlwaysOn configuration, you need to enable their instances for AlwaysOn (if is turned off by default). From each node, bring up the SQL Server 2014 Configuration Manager and select the SQL Server Services node in the Services pane. Right-click the SQL Server instance for this node (a default instance name of MSSQLSERVER in this example) and choose Properties. Figure 45.9 shows the properties of this SQL Server instance. Click the AlwaysOn High Availability tab and choose Enable. Notice that the cluster name appears in this dialog box because this server was identified already in the cluster configuration step. Now, click OK (or apply) and you’ll receive a note about having to restart the service for this option to be used. After you have closed the Properties dialog, go ahead and right-click the SQL Server instance service again, but this time choose the Restart option. This will enable the AlwaysOn HA feature. For each of the other nodes (SQLTWO and SQLTHREE), do the same for their SQL Server configuration and the SQL Server instance that is to be included in the AlwaysOn configuration.
FIGURE 45.9 Enabling AlwaysOn HA via SQL Server Configuration Manager.
You could have also done this from PowerShell. At the prompt, just type in SQLPS and press Enter. You then enter the following:
Enable -SqlAlwaysOn PATH SQLSERVER:\SQL\EMU-SQL1\Default -FORCE
It will even restart the SQL Server service for you.
Back Up the Database
Before we venture on to create the availability group, you want to do a full database backup of the primary database (on Node 1: SQLONE). This backup will be used to replicate the database to the replicas. From the database node of the primary database (in SSMS), choose to perform a full backup of the database by right-clicking the database, choosing Tasks, and then clicking Back Up. Figure 45.10 shows this Back Up Database dialog. Click OK to perform the full backup.
FIGURE 45.10 Doing a full database backup for the primary database (CompSales2014).
When you finish this, you can move on to creating the availability group.
Create the Availability Group
From Node 1 (SQLONE node in the example), expand out the AlwaysOn High Availability node for this SQL Server instance (in SSMS). As you can see in Figure 45.11, you can right-click the Availability Group node and choose to create a new availability group (via the wizard). This is where all the action will be in creating the entire availability group.
FIGURE 45.11 Invoking the New Availability Group Wizard from SSMS.
This starts the New Availability Group Wizard, which will select the databases, specify the replicas, select the data synchronization, and then do validation. Initially, there is a splash page for the wizard on which you’ll just click Next. This brings you to the Specify Availability Group Name dialog. Figure 45.12 shows this dialog and the availability group name of DXD-AG1 being specified. Click Next.
FIGURE 45.12 Specifying the availability group name.
Select the Databases for the Availability Group
Next you are asked to identify which application databases you want to include in this availability group. Figure 45.13 shows our list of databases and the one we’ve selected (CompSales2014). Click Next.
FIGURE 45.13 Specifying the databases for the availability group.
Identify the Primary and Secondary Replicas
Specifying the replicas and how they will be used is next. Initially, there will only be one server instance (the primary). Click the Add Replicas button in the lower left (below the Server Instances list) and choose the secondary replication instance you want (SQLTWO in this example, Node 2). Now, both the primary and secondary instances should be listed. You also want to specify that each of these should be using automatic failover (up to two) by checking the check boxes. We also want the synchronous commits (up to three) option for both to get this HA feature. Figure 45.14 shows each failover and commit option specified for each server instance.
FIGURE 45.14 Specifying the instance of SQL Server to host a secondary replica and failover options.
If you click on the Endpoints tab, you will see the endpoints that are getting generated for use by the instances to communicate with each other (hadr_endpoint for each SQL Server instance). We’ll make no changes to these (take the default).
If you also click the Backup Preferences tab, you can indicate how (and where) you want database backups to be performed once the availability group is formed and the replicas are active. As you can see in Figure 45.15, we’ll keep the Prefer Secondary as our option for doing database backups, thus relieving the primary from this overhead task. If the secondary isn’t available for doing a backup, the primary is used (with this option).
FIGURE 45.15 Specifying database backup preferences: Prefer Secondary option.
If you click on the Listener tab, you can see that you have two options here: not to set up the availability group listener at this time or to create one now. We’ll actually do this a bit later, so skip this for now (specify Do Not Create an Availability Group Listener Now) and click Next.
Synchronize the Data
Now you set up your data synchronization preferences. Figure 45.16 shows your various options. You can use the Full option, which will do full database and log backups for each selected database. Then these databases get restored to each secondary and joined to the availability group. The Join Only option starts data synchronization up where you have already restored a database and log backup. (In other words, you already restored a database at a secondary, and you just want to have that secondary join the availability group.) The Skip Initial Data Synchronization option simply says you will do the full backups for the primary databases. We’ve chosen the Full option and must specify a shared network location accessible by all replicas.
FIGURE 45.16 Specifying the initial data synchronization options for the replicas in the availability group.
It is important that this shared location be fully accessible by the service accounts from all nodes in the availability group (the service account being used by the SQL Server services on each node). After choosing your option, click Next. This finishes the availability group creation, and you’ll now see the new group under the Availability Group node in SSMS. The availability group is functional now. You will see the primary and secondary replicas, the databases within the availability group, and an indication at the database node level as to whether the database is synchronized. If so, you are in business. However, to complete the abstraction of instance names away from the applications, you want to create the availability group listener to complete this configuration.
Set Up the Listener
The cluster’s VNN gets bound to the availability group listener name in this process. It is also the name exposed to the applications to connect to the availability group. As long as at least one node is functioning in the availability group, the applications never know that any node has failed.
Right-click the availability group we created and choose to create a new listener (Add Listener). You can now specify the listener DNS name (DXD-AG1Listener in this example), a port (use 1444), and specify to use a static IP address for this listener. A small dialog box will appear (as you can see in Figure 45.17) that indicates the IPv4 address of this listener (192.168.1.213 in this example). Click OK.
FIGURE 45.17 Specifying the new availability group listener.
Figure 45.18 shows the availability group and the new availability group listener we just configured.
FIGURE 45.18 The new availability group listener (ready to use).
Connect Using the Listener
Let’s do a quick test with SSMS to connect to this new availability group listener as if it were its own SQL Server instance. As shown in Figure 45.19, we’ve started a new connection dialog that specifies the availability group listener name we just created. Go ahead and connect.
FIGURE 45.19 Connection using the availability group listener (DXD-AG1Listener).
Then, after we connect, we open up a new query window (as shown in Figure 45.20) so that we can do a simple SELECT statement to query for the server name (SELECT @@servername;) and the first 1000 rows from the Product table in CompSales2014. As you can see, it returns the underlying SQL Server node name of SQLONE, which is the current primary node. In Figure 45.20, you can also see that the database is in a synchronized state and is fully functional within the availability group configuration. We are in business!
FIGURE 45.20 Select @@servername and a Product table select using the availability group listener connect.
Fail Over to a Secondary
You can fail over from within SQL Server by right-clicking the primary replica of the Availability Group node and selecting Failover. Or, you can do this from the Failover Manager. Let’s jump back over to the Failover Manager for Node 1 and see what things look like from there. Figure 45.21 shows the option to move this clustered role to another node (which is just another way of saying fail it over to another node). It also shows your failover options of Best Possible (the normal secondary replica) or to a specific replica (Move to Node SQLTWO or Move to Node SQLTHREE). When you select one of these (like Best Possible), you’ll get a Please Confirm Action warning. Remember, when you fail over, client connections will be broken and re-established to the new node and may have to rerun any changes that were not committed yet. Figure 45.22 shows the execution of the same query (via the listener) and the different node (SQLTWO) that is satisfying the client connection successfully. That is High Availability at work.
FIGURE 45.21 Failover Cluster Manager moving (failover) from one node to another.
FIGURE 45.22 Select @@servername and a Product table select using the availability group listener connection.
Adding Replicas
To add another replica to the configuration, you simply right-click the Availability Replicas node within the availability group you want to add it to and choose Add Replica. You’ll have to connect to existing replicas first and then identify (connect) to the new replica you want to add.
Then, as you can see in Figure 45.23, you identify how you want this replica to be used in the availability group (read-only Yes for this example). We also want this to be updated asynchronously, so do not check the Automatic Failover or Synchronous Commit boxes. Finish the wizard the same way we did for the first replica (data synchronization, shared network location for backups, and so on).
FIGURE 45.23 Adding secondary replicas to your existing availability group.
Then, after this is finished, you’ll have another replica available to use for read-only access (as was originally outlined in our three-node cluster in Figure 45.3). AlwaysOn and availability groups turn out to be pretty easy to configure and leverage. No reason for you to wait.
As you can see in Figure 45.24, it is now super easy to just query the Product table from the secondary (read-only) replica of SQLTHREE directly without impacting any other node in the configuration.
FIGURE 45.24 Querying the Product table via the Read-Only Secondary (SQLTHREE).
Dashboard and Monitoring
Various system views, a dashboard, and dynamic management views have been added for monitoring (and debugging) the AlwaysOn Availability Groups feature. The AlwaysOn dashboard (right-click the availability group and select Show Dashboard) is used to obtain an at-a-glance view of the health of an AlwaysOn availability group, its availability replicas, and databases.
You can use this dashboard to do the following:
- Choose a replica for a manual failover.
- Estimate data loss if you force failover.
- Evaluate data-synchronization performance.
- Evaluate the performance impact of a synchronous-commit secondary replica.
The dashboard also provides key availability group states and performance indicators, including the following:
- Replica roll-up state
- Synchronization mode and state
- Estimated data loss
- Estimated recovery time (redo catch-up)
- Database replica details
- Synchronization mode and state
- Time to restore log
SQL Server AlwaysOn and availability groups dynamic management views are also available with SQL Server 2014, including the following:
- sys.dm_hadr_auto_page_repair
- sys.dm_hadr_cluster_networks
- sys.dm_hadr_availability_group_states
- sys.dm_hadr_database_replica_cluster_states
- sys.dm_hadr_availability_replica_cluster_nodes
- sys.dm_hadr_database_replica_states
- sys.dm_hadr_availability_replica_cluster_states
- sys.dm_hadr_instance_node_map
- sys.dm_hadr_availability_replica_states
- sys.dm_hadr_name_id_map
- sys.dm_hadr_cluster
- sys.dm_tcp_listener_states
- sys.dm_hadr_cluster_members
And finally, SQL Server AlwaysOn and Availability Groups catalog views make it easy to see key components of the configuration. These include the following:
- sys.availability_databases_cluster
- sys.availability_groups_cluster
- sys.availability_group_listener_ip_addresses
- sys.availability_read_only_routing_lists
- sys.availability_group_listeners
- sys.availability_replicas
- sys.availability_groups