Data Synchronization

Whenever more than one copy of data exists, keeping the copies synchronized becomes a problem. This section describes data synchronization issues.

Throughout this book, the concept of ownership of data is important. Ownership identifies the single entity that holds the authoritative view of the data. Using a single, authoritative owner of data is useful for understanding the intricacies of modern clustered systems. In the event of failures, the ownership can migrate to another entity. "Synchronization" describes how the Sun Cluster 3.0 architecture handles the complex synchronization problems and issues that the following sections describe.

Data Uniqueness

Data uniqueness poses a problem for computer system architectures or clusters that use duplication of data to enhance availability. The representation of the data to people requires uniqueness. Yet there are multiple copies of the data that are identical and represent a single view of the data, which must remain synchronized.

Complexity and Reliability

Since the first vacuum tube computers were built, the reliability of computing machinery has improved significantly. The increase in reliability resulted from technology improvements in the design and manufacturing of the devices themselves. But increases in individual component reliability also increase complexity. In general, the more complex the system is, the less reliable it is. Increasing complexity to satisfy the desire for new features causes a dilemma because it works against the desire for reliability (perfection of existing systems).

As you increase the number of components in the system, the reliability of the system tends to decrease. Another way to look at the problem of clusters is to realize that a fully redundant cluster has more than twice as many components as a single system. Thus, the cost of a clustered system is almost twice the cost of a single system. However, the reliability of a cluster system is less than half the reliability of a single system. Though this may seem discouraging, it is important to understand that the reliability of a system is not the same as the availability of the service provided by the system. The difference between reliability and availability is that the former only deals with one event, a failure, whereas the latter also takes recovery into account. The key is to build a system in which components fail at normal rates, but which recovers from these failures quickly.
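The distinction between reliability and availability can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a simple series model (any component failure counts as a system failure) and the common steady-state formula availability = MTBF / (MTBF + MTTR). The component count, per-component reliability, and repair times are hypothetical numbers chosen only for illustration.

```python
def series_reliability(component_reliabilities):
    """Probability that no component fails during the interval (series model)."""
    result = 1.0
    for r in component_reliabilities:
        result *= r
    return result


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the fraction of time the service is usable."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical systems: the cluster has twice as many components.
single = series_reliability([0.99] * 10)
cluster = series_reliability([0.99] * 20)

# Same failure rate (MTBF), but very different recovery times (MTTR).
slow_recovery = availability(mtbf_hours=1000.0, mttr_hours=10.0)
fast_recovery = availability(mtbf_hours=1000.0, mttr_hours=0.1)

print(f"reliability: single={single:.3f} cluster={cluster:.3f}")
print(f"availability: slow={slow_recovery:.5f} fast={fast_recovery:.5f}")
```

In this model, adding components lowers the chance of getting through an interval with no failure at all, while shortening recovery time raises availability even though components fail at the same rate.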

An important technique for recovering from failures in data storage is data duplication. Data duplication occurs often in modern computer systems. The most obvious examples are backups, disk mirroring, and hierarchical storage management solutions. In general, data is duplicated to increase its availability. At the same time, duplication uses more components, thus reducing the overall system reliability. Also, duplication introduces synchronization fault opportunities. Fortunately, for most cases, the management of duplicate copies of data can be reliably implemented as processes. For example, the storage and management of backup tapes is well understood in modern data centers.

A special case of the use of duplicate data occurs in disk mirrors. Most disk mirroring software or hardware implements a policy in which writes are committed to both sides of the mirror before returning an acknowledgement of the write operation. Read operations only occur from one side of the mirror. This increases the efficiency of the system because twice as many read operations can occur for a given data set size. This duplication also introduces a synchronization failure mode, in which one side of the mirror might not actually contain the same data as the other side. This is not a problem for write operations because the data will be overwritten, but it is a serious problem for read operations.

Depending on the read policy, the side of the mirror that satisfies a given read operation may not be predictable. Two solutions are possible: periodically check the synchronization, or always check the synchronization. Using the former solution maintains the performance improvements of read operations while periodic synchronization occurs in the background, preferably during times of low utilization. The latter solution does not offer any performance benefit but ensures that all read operations are satisfied by synchronized data. This solution is more common in fault tolerant systems.
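The mirror behavior described above can be sketched as a toy model. This is an illustration only, not how any particular volume manager is implemented: the `Mirror` class, its round-robin read policy, the `verify` flag, and the `scrub` method are all hypothetical names introduced here.

```python
class MirrorOutOfSync(Exception):
    """Raised when the two sides of the mirror disagree on a verified read."""


class Mirror:
    """Toy two-sided mirror: a write goes to both sides, a read uses one side."""

    def __init__(self, num_blocks):
        self.sides = [[b""] * num_blocks, [b""] * num_blocks]
        self._next_side = 0  # round-robin read policy (an assumption)

    def write(self, block, data):
        # Commit to both sides before acknowledging the write.
        for side in self.sides:
            side[block] = data
        return True

    def read(self, block, verify=False):
        # verify=True models the "always check" policy: compare on every read.
        if verify and self.sides[0][block] != self.sides[1][block]:
            raise MirrorOutOfSync(f"block {block} differs between sides")
        side = self.sides[self._next_side]
        self._next_side = 1 - self._next_side
        return side[block]

    def scrub(self):
        """Models the periodic check: list the blocks whose copies differ."""
        return [i for i, (a, b) in enumerate(zip(*self.sides)) if a != b]
```

If one side is silently corrupted, an unverified `read` returns good or bad data depending on which side happens to serve it, `scrub` reports the block the next time it runs, and a verified `read` fails immediately at the cost of touching both sides.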

RAID 5 protection of data also represents a special case of duplication in which the copy is virtual. There is no direct, bit-for-bit copy of the original data. However, there is enough information to re-create the original data. This information is spread across the other data and the parity, which RAID 5 distributes across the disks in the set. The original data can be re-created by a mathematical manipulation of the other data and parity.
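The mathematical manipulation is typically a bytewise exclusive OR (XOR). The sketch below shows the idea on one stripe of three small data blocks; the block contents and the helper name `xor_blocks` are made up for illustration, and a real array also handles striping layout, parity rotation, and partial writes.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]  # three data blocks in one stripe
parity = xor_blocks(data)           # parity block stored on another disk

# Suppose the disk holding data[1] fails. XOR of the surviving data
# blocks and the parity block re-creates the lost block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR is its own inverse, the same function computes the parity and rebuilds any single missing block.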

Synchronization Techniques

Modern computer systems use synchronization extensively. Fortunately, only a few synchronization techniques are used commonly. Thus, the topic is researched and written about extensively, and once you understand the techniques, you begin to understand how they function when components fail.

Microprocessor Cache Coherency

Microprocessors designed for multiprocessor computers must maintain a consistent view of the memory among themselves. Because these microprocessors often have caches, the synchronization is done through a cache-coherency protocol. The term coherence describes the values returned by a read operation to the same memory location. Consistency describes the congruity of a read operation returning a written value. Coherency and consistency are complementary: coherence defines the behavior of reads and writes to the same memory location, and consistency defines the behavior of reads and writes with respect to accesses to other memory locations. In terms of failures, loss of either coherency or consistency is a major problem that can corrupt data and increase recovery time.

UltraSPARC™ processors use two primary types of cache-coherency protocols: snooping and distributed directory-based coherency.

  • Snooping protocol is used by all multiprocessor SPARC implementations. No centralized state is kept. Every processor cache maintains metadata tags that describe the shared status of each cache line along with the data in the cache line. All of the caches share one or more common address buses. Each cache snoops the address bus to see which processors might need a copy of the data owned by the cache.

  • Distributed directory-based coherency protocol is used in the UltraSPARC III processor. The status of each cache line is kept in a directory that has a known location. This technique releases the restriction of the snooping protocol that requires all caches to see all address bus transactions. The distributed directory protocol scales to larger numbers of processors than the snooping protocol and allows large, multiprocessor UltraSPARC III systems to be built. The Oracle 9i Real Application Cluster (Oracle 9i RAC) database implements a distributed directory protocol for its cache synchronization. "Synchronization" describes this protocol in more detail.

As demonstrated in the Sun Fire™ server, both protocols can be used concurrently. The Sun Fire server uses the snooping protocol among the four processors on a board and the directory-based coherency protocol between boards. Regardless of the cache-coherency protocol, UltraSPARC processors have an atomic test-and-set operation, ldstub, which is used by the kernel. An atomic operation is guaranteed either to complete successfully or not to happen at all. The test-and-set operation implements simple locks, including spin locks.
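A spin lock built on test-and-set can be sketched as follows. Python has no direct equivalent of ldstub, so this simulation uses a `threading.Lock` purely as a stand-in for the hardware guarantee that the read-modify-write is indivisible. The class names `AtomicByte` and `SpinLock` are hypothetical, and the value 0xFF mirrors the all-ones byte that ldstub stores.

```python
import threading
import time


class AtomicByte:
    """Simulated memory byte with an atomic test-and-set operation."""

    def __init__(self):
        self._value = 0
        self._atomic = threading.Lock()  # stand-in for hardware atomicity

    def test_and_set(self):
        # Return the old value and leave the byte set, as one indivisible step.
        with self._atomic:
            old = self._value
            self._value = 0xFF
            return old

    def clear(self):
        with self._atomic:
            self._value = 0


class SpinLock:
    def __init__(self):
        self._byte = AtomicByte()

    def acquire(self):
        # Spin until the old value was 0, which means this caller set it.
        while self._byte.test_and_set() != 0:
            time.sleep(0)  # yield to other threads while spinning

    def release(self):
        self._byte.clear()


counter = 0
lock = SpinLock()


def worker(iterations):
    global counter
    for _ in range(iterations):
        lock.acquire()
        counter += 1  # critical section
        lock.release()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)
```

Only the caller that observes the old value 0 owns the lock; every other caller sees a nonzero value and keeps spinning until the owner clears the byte.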

Kernel-Level Synchronization

The Solaris operating environment kernel is re-entrant [2], which means that many threads can execute kernel code at the same time. The kernel uses a number of lock primitives that are built on the test-and-set operation [3]:

  • Mutual exclusion (mutex) locks provide exclusive access semantics. Mutex locks are one of the simplest locking primitives.

  • Reader/writer locks are used when multiple threads can read a memory location concurrently, but only one thread can write.

  • Kernel semaphores are based on Dijkstra's [4] implementation, in which the semaphore is a non-negative integer that can be incremented or decremented by an atomic operation. A thread that attempts to decrement a semaphore whose value is zero blocks until another thread increments the semaphore. Semaphores are used sparingly in the kernel.

  • Dispatcher locks allow synchronization that is protected from interrupts and are primarily used by the kernel dispatcher.

Higher-level synchronization facilities, such as condition variables (also called queuing locks), which are used to implement the traditional UNIX® sleep/wake-up facility, are built on these primitives.
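The reader/writer semantics in the list above can be illustrated at user level. The following is a minimal sketch, not the kernel's implementation: it builds a hypothetical `ReadWriteLock` class on Python's `threading.Condition` (a condition variable paired with a mutex), and it makes no attempt to give writers priority over readers.

```python
import threading


class ReadWriteLock:
    """Many concurrent readers or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()  # condition variable plus its mutex
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()  # sleep until the writer releases
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake any waiting writer

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()  # wake waiting readers and writers
```

The `wait` and `notify_all` calls play the role of the sleep/wake-up facility: a thread that cannot proceed sleeps on the condition variable instead of spinning.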

Application-Level Synchronization

The Solaris operating environment offers several application program interfaces (APIs) that you can use to build synchronization into multithreaded and multiprocessing programs.

The System Interface Guide [5] introduces the API concept and describes the process control, scheduling control, file input and output, interprocess communication (IPC), memory management, and real-time interfaces. POSIX and System V IPC APIs are described; these include message queues, semaphores, and shared memory. The System V IPC API is popular, being widely implemented on many operating systems. However, the System V IPC semaphore facility used for synchronization has more overhead than the techniques available in multithreaded programs.

The Multithreaded Programming Guide [6] describes POSIX and Solaris threads APIs, programming with synchronization objects, compiling multithreaded programs, and finding analysis tools for multithreaded programs. The threads-level synchronization primitives are very similar to those used by the kernel. This guide also discusses the use of shared memory for synchronizing multiple multithreaded processes.

Synchronization Consistency Failures

Condition variables offer an economical method of protecting data structures being shared by multiple threads. The data structure has an added condition variable, which is used as a lock. However, broken software may indiscriminately alter the data structure without checking the condition variables, thereby ignoring the consistency protection. This represents a software fault that may be latent and difficult to detect at runtime.

Two-Phase Commit

The two-phase commit protocol ensures an atomic write of a single datum to two or more different memories. This solves a problem similar to the consistency problem described previously, but applied slightly differently. Instead of multiple processors or threads synchronizing access to a single memory location, the two-phase commit protocol replicates a single memory location to another memory. These memories have different, independent processors operating on them. However, the copies must remain synchronized.

In phase one, the memories confirm their ability to perform the write operation. Once all of the memories have confirmed, phase two begins and the writes are committed. If a failure occurs, phase one does not complete and some type of error handling may be required. For example, the write may be discarded and an error message returned to the requestor.
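The two phases can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Memory` class, its `healthy` flag, and the `two_phase_write` function are hypothetical names, and the sketch ignores timeouts, logging, and failures that occur during phase two.

```python
class Memory:
    """One participant holding a replica of the datum."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.value = None
        self._pending = None

    def prepare(self, value):
        # Phase one: confirm the ability to perform the write.
        if not self.healthy:
            return False
        self._pending = value
        return True

    def commit(self):
        # Phase two: make the prepared write visible.
        self.value = self._pending
        self._pending = None

    def abort(self):
        # Discard a prepared write that will never be committed.
        self._pending = None


def two_phase_write(memories, value):
    """Return True if every memory committed, False if the write was discarded."""
    prepared = []
    for m in memories:
        if m.prepare(value):
            prepared.append(m)
        else:
            for p in prepared:
                p.abort()
            return False  # error returned to the requestor
    for m in memories:
        m.commit()
    return True
```

If any memory cannot confirm, no memory changes its committed value, so the replicas remain identical whether the write succeeds or is discarded.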

The two-phase commit is one of the simplest synchronization protocols and is used widely. However, it has scalability problems. The time to complete the confirmation is based on the latency between the memories. For many systems, this is not a problem, but in a wide area network (WAN), the latency between memories may be significant. Also, as the number of memories increases, the time required to complete the confirmation tends to increase. Attempts to relax these restrictions are available in some software products, but this relaxation introduces the risk of loss of synchronization, and thus the potential for data corruption. Recovery from such a problem may be difficult and time-consuming, so you must carefully consider the long-term risks and impact of relaxing these restrictions. For details on how Sun Cluster 3.0 uses the two-phase commit protocol, see "Mini-Transactions."

Systems also use the two-phase commit for three functions: disk mirroring (RAID 1), mirrored cache such as in the Sun StorEdge™ T3 array and Sun StorEdge™ Network Data Replicator (SNDR software), and the Sun Cluster cluster configuration repository (CCR).

Locks and Lock Management

Locks that are used to ensure consistency require lock management and recovery when failures occur. For node failures, the system must store the information about the locks and their current state in shared, persistent memory or communicate it through the interconnect to a shadow agent on another node.

Storing the state information in persistent memory can lead to performance and scalability issues because the latency to perform the store can affect performance negatively. These locks work best when the state of the lock does not change often. For example, locking a file tends to cause much less lock activity than locking records in the file. Similarly, locking a database table creates less lock activity than locking rows in the table. In either case, the underlying support and management of the locks does not change, but the utilization of the locks can change. High lock utilization is an indication that the service or application will have difficulty scaling.

An alternative to storing the state information in persistent memory is to use shadow agents: processes that receive updates on lock information from the owner of the locks. This state information is kept in volatile, main memory, which has much lower latency than shared, persistent storage. If the lock owner fails, the shadow agent already knows the state of the locks and can begin to take over the lock ownership very quickly.
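The shadow agent idea can be sketched as a lock owner that forwards every state change to a second object. This is an illustration only: the `LockOwner` and `ShadowAgent` classes are hypothetical, a direct method call stands in for a message over the interconnect, and failure detection is left out.

```python
class ShadowAgent:
    """Keeps an in-memory copy of the lock state held by the owner."""

    def __init__(self):
        self.locks = {}  # lock name -> holder

    def update(self, name, holder):
        if holder is None:
            self.locks.pop(name, None)
        else:
            self.locks[name] = holder

    def take_over(self, new_shadow):
        """Become the new owner using the state already held in memory."""
        owner = LockOwner(new_shadow)
        for name, holder in self.locks.items():
            owner.locks[name] = holder
            new_shadow.update(name, holder)
        return owner


class LockOwner:
    """Primary lock manager that forwards every state change to a shadow."""

    def __init__(self, shadow):
        self.locks = {}
        self.shadow = shadow

    def acquire(self, name, holder):
        if name in self.locks:
            return False
        self.locks[name] = holder
        self.shadow.update(name, holder)  # stands in for an interconnect message
        return True

    def release(self, name):
        del self.locks[name]
        self.shadow.update(name, None)
```

Every acquire and release costs one extra message, which is why this approach, like the persistent-memory approach, works best when lock state changes infrequently.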

Lock Performance

Most locking and synchronization software provides a method for monitoring its utilization. For example, databases provide performance tables for monitoring lock utilization and contention. The mpstat(1M), vmstat(1M), and iostat(1M) commands give some indications of lock or synchronization activity, though this is not their specialty. The lockstat(1M) command provides detailed information on kernel lock activity: it monitors lock contention events, gathers frequency and timing data on the events, and presents the data.
