Database Replication
The Database Replication pattern may be the most prevalent EAI pattern in use today. Database replication involves maintaining copies of the same data in two or more databases, which makes the data deliberately redundant. Companies replicate databases for several reasons. One is that many organizations are becoming more distributed in their operations and need copies of the same data at several physical locations. Replication is also a means of data recovery: many organizations maintain an active secondary database so that, if the production database must be recovered, the replicated copy can be used. The same applies to "high availability" systems, where a redundant copy of "live" data is maintained so that the redundant database system can be activated if the first system becomes unavailable.
The two general categories for database replication are synchronous and asynchronous replication.
Synchronous Replication
Synchronous replication involves maintaining absolute consistency between source and target databases. The primary objective is real-time data consistency: a change is visible in every database at the same moment, which is known as achieving zero latency between the data resources. In practice, this calls for transaction processing technology to ensure absolute data consistency. Figure 3.1 depicts the use of a transaction processing monitor in replication.
Figure 3.1 Synchronous replication with transaction processing.
Transactions must conform to what is commonly known as the ACID properties. This means that transaction operations must be Atomic, Consistent, Isolated, and Durable:
Description of the ACID properties
Atomic: A transaction is atomic when the system treats each transaction discretely as a single call that either succeeds or fails.
Consistent: This attribute means that the transaction component or object is changed from one valid state to another.
Isolated: The operation of a transaction is isolated from other transactions.
Durable: Transactions that are committed are permanent even if the system fails.
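The atomic and consistent properties can be illustrated on a single database before replication enters the picture. The following is a minimal sketch, assuming an in-memory SQLite database and a hypothetical account table with a rule that balances may not go negative; none of these names come from the text. The transfer credits one account and then debits another, so a failed debit would leave a partial change behind unless the transaction is rolled back as a whole.

```python
import sqlite3

# Hypothetical schema for illustration only: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic transaction."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit was undone too.
        return False


def balances(conn):
    """Return {account id: balance} for inspection."""
    return dict(conn.execute("SELECT id, balance FROM account"))
```

Calling transfer(conn, 1, 2, 500) fails on the debit, and balances(conn) then shows both accounts unchanged: the database moved from one valid state to the same valid state rather than to a half-applied one.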
Using a transactional protocol means that the brokering of data across the databases must be accomplished as a single unit of work. Each discrete data change to Database A is made to Database B within the same transaction. If the changes succeed for Database A but fail for Database B, the changes for Database A are "rolled back" and both systems are returned to the previously consistent state. As noted above, this kind of transaction processing between both systems is achieved through the use of a transaction processing monitor (TPM) such as CICS (IBM) or Tuxedo (BEA).
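The rollback behavior described above can be sketched in a few lines. This is an illustration under stated assumptions, not how CICS or Tuxedo works internally: two in-memory SQLite databases stand in for Database A and Database B, and the helper names open_db and replicate_write are hypothetical. The change is applied to both databases and committed only if both accept it.

```python
import sqlite3


def open_db():
    """Create an in-memory database standing in for a real server."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def replicate_write(db_a, db_b, sql, params):
    """Apply one change to both databases, or to neither."""
    try:
        db_a.execute(sql, params)  # change is pending on A
        db_b.execute(sql, params)  # change is pending on B
    except sqlite3.Error:
        # Either side failed: undo the pending work on both.
        db_a.rollback()
        db_b.rollback()
        return False
    # Both sides accepted the change. A real transaction monitor would use
    # a coordinated commit protocol (for example two-phase commit) here,
    # because a crash between these two commits would leave the copies
    # inconsistent. This sketch does not handle that case.
    db_a.commit()
    db_b.commit()
    return True
```

If Database B already holds a row with the same primary key, the insert fails on B and the pending insert on A is rolled back, so a later query against A finds no trace of the attempted change.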
Asynchronous Replication
Asynchronous replication has a much looser latency requirement. The target databases may lag behind the source, so the time required for all systems to be "in sync" or consistent is a measurable interval rather than zero. This doesn't mean that the need to maintain transactional integrity is diminished in any way. It is still necessary to ensure that discrete data elements are moved as a single unit of work. Asynchronous message queuing products such as IBM MQSeries are often used to preserve transactional semantics as part of the replication process. They do so through the use of transactional queues. A transactional queue guarantees that delivery is not considered complete until the data packet placed on the queue by the source database has been de-queued and committed to the target data resource.
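The delivery guarantee of a transactional queue can be sketched as follows. This is a simplified model and not the MQSeries API: the queue is a SQLite table, the message format is JSON, and the names open_queue, open_target, enqueue, and deliver_next are assumptions made for the example. The key point is the ordering: a message is removed from the queue only after the change it carries has been committed to the target.

```python
import json
import sqlite3


def open_queue():
    """Queue modeled as a table; seq preserves arrival order."""
    q = sqlite3.connect(":memory:")
    q.execute(
        "CREATE TABLE queue (seq INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
    )
    q.commit()
    return q


def open_target():
    """Target database with a hypothetical customer table."""
    t = sqlite3.connect(":memory:")
    t.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    t.commit()
    return t


def enqueue(queue, change):
    """Source side: place one change on the queue."""
    queue.execute("INSERT INTO queue (body) VALUES (?)", (json.dumps(change),))
    queue.commit()


def deliver_next(queue, target):
    """Apply the oldest queued change to the target; remove it only on success."""
    row = queue.execute("SELECT seq, body FROM queue ORDER BY seq LIMIT 1").fetchone()
    if row is None:
        return False  # nothing to deliver
    seq, body = row
    change = json.loads(body)
    try:
        # INSERT OR REPLACE makes redelivery harmless if a crash happens
        # after the target commit but before the message is removed.
        target.execute(
            "INSERT OR REPLACE INTO customer (id, name) VALUES (?, ?)",
            (change["id"], change["name"]),
        )
        target.commit()
    except sqlite3.Error:
        target.rollback()
        return False  # message stays on the queue for a later retry
    queue.execute("DELETE FROM queue WHERE seq = ?", (seq,))
    queue.commit()
    return True
```

A change the target rejects, such as one with a missing name, remains at the head of the queue instead of being lost. A production queue manager would typically move such a message aside after repeated failures; that handling is left out of this sketch.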