Requirements of Software Integration Architectures
Integration architectures must specify how to manage and control the creation, software integration, and installation of integrated software stacks. They should provide a framework for enabling and facilitating the automation, integration, and installation of complex software stacks. Additionally, integration architectures must provide repositories for the hardware and software configuration information that is necessary for creating and implementing an integrated software stack. It is not sufficient for integrated stacks to simply install software components. To be useful and viable, integrated products must install and correctly configure all slabs that make up an integrated software stack.
Essentially, an integration architecture specifies how to manage and control the initial installation and configuration of a predefined set of software. You would then clone the initial installation, creating a software stack to rapidly mass produce identical or similar systems. Additionally, an integration architecture should provide a framework wherein individual slabs of the software stack can be easily replaced, modified, or augmented, facilitating the creation of new software stacks.
Integration architectures essentially explain how to manage the creation of integrated stacks through the use of software bills of materials, often referred to as recipes. An integrated stack recipe specifies the ingredients (the necessary hardware and software components), as well as the order and measure (the configuration and installation information) by which those ingredients are to be combined.
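The recipe idea can be sketched as a small data structure. This is only an illustration, not a format defined by any Sun tool: the class names (Ingredient, Step, Recipe), the field names, and the sample stack are all hypothetical. The sketch records the ingredients and the ordered steps, and checks that every step refers to a listed ingredient.

```python
from dataclasses import dataclass, field


@dataclass
class Ingredient:
    """One hardware or software component named by the recipe (hypothetical)."""
    name: str
    kind: str          # "hardware" or "software"
    version: str = ""


@dataclass
class Step:
    """One installation or configuration action; steps run in list order."""
    slab: str                      # name of the ingredient the step applies to
    action: str                    # for example "install" or "configure"
    settings: dict = field(default_factory=dict)


@dataclass
class Recipe:
    name: str
    ingredients: list
    steps: list

    def missing_ingredients(self):
        """Return slabs that a step refers to but the recipe never lists."""
        known = {ingredient.name for ingredient in self.ingredients}
        return sorted({step.slab for step in self.steps} - known)


# Sample data is invented for illustration only.
recipe = Recipe(
    name="database-stack",
    ingredients=[
        Ingredient("solaris", "software"),
        Ingredient("volume-manager", "software"),
        Ingredient("boot-disk", "hardware"),
    ],
    steps=[
        Step("solaris", "install"),
        Step("volume-manager", "install"),
        Step("volume-manager", "configure", {"mirror_boot_disk": True}),
        Step("database", "install"),   # deliberately absent from ingredients
    ],
)
print(recipe.missing_ingredients())
```

Keeping the steps as an ordered list, separate from the ingredient list, mirrors the distinction between what goes into the stack and the order and measure in which it is combined.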
Capturing and Encoding Configuration Information
For all but the most trivial software stacks, configuring an integrated stack or individual slab is necessary to bind a software stack. For a software stack to be reproducible and to ensure adherence to data center standards, slab configuration and binding the integrated stack must be automated.
Because some software products or applications are complex or rely on graphics-based installation programs (often called wizards), automating slab configuration can be a challenge. To fully automate slab configuration, you might need to create wrappers with tools, such as expect, to drive application installation programs.
Installer wrappers might not be able to determine all slab configuration or software stack binding information programmatically. When creating installer wrappers that require human input, keep the following recommendations in mind:
Be certain to require all information to be provided only at the beginning of the configuration process. Requiring all information up-front helps ensure that configuration and binding processes can be started and then left to run unattended.
Log all configuration information that is used so that the correctness of the slab configuration and software stack binding can be validated after binding is complete.
Correlate configuration information between slabs. This correlation helps minimize the number of times that configuration data needs to be provided. For example, if the relational database slab and the application server slab need to know the primary network interface device (or name), this information should need to be provided only once.
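The three recommendations above can be sketched together as follows. This is a minimal illustration under stated assumptions: the slab names, the key names in SLAB_NEEDS, and the helper functions are hypothetical, and a real wrapper would pass a prompting function (such as input) where the sketch passes canned answers.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("stack-binding")

# Hypothetical declaration of the answers each slab needs.
SLAB_NEEDS = {
    "database": ["primary_interface", "db_data_dir"],
    "app_server": ["primary_interface", "app_port"],
}


def required_keys(slab_needs):
    """Return each distinct key once, in first-seen order."""
    seen = []
    for keys in slab_needs.values():
        for key in keys:
            if key not in seen:
                seen.append(key)
    return seen


def collect_answers(slab_needs, ask):
    """Gather every answer before any slab is configured, logging each one."""
    answers = {}
    for key in required_keys(slab_needs):
        answers[key] = ask(key)
        log.info("config %s=%s", key, answers[key])
    return answers


def slab_config(slab, slab_needs, answers):
    """Select the shared answers that a single slab needs."""
    return {key: answers[key] for key in slab_needs[slab]}


# Canned answers stand in for an operator typing at a prompt.
canned = {"primary_interface": "qfe0", "db_data_dir": "/data", "app_port": "8080"}
answers = collect_answers(SLAB_NEEDS, canned.__getitem__)
print(slab_config("app_server", SLAB_NEEDS, answers))
```

Because the keys are de-duplicated before anything is asked, the primary network interface is requested once even though both slabs need it, and the log gives a record against which the finished binding can be validated.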
Section 4, "Software Stack Management and Deployment Frameworks," on page 14 describes possible solutions to the challenges of automating the deployment and binding of integrated software stacks.
Providing Explicit and Implicit Configuration Information
It is important to note the value of knowing the configuration of the hardware where the software stack is to be bound. Although binding requirements are often overlooked, it is common for the binding of a software stack to require specific hardware components. Complex software stacks also require you to provide detailed knowledge of the target hardware configuration in order for the software stack to complete its configuration. For example, when you mirror a system's boot disk with a logical volume manager (LVM), such as Solaris Volume Manager (SVM) or VERITAS Volume Manager (VxVM) software, you must provide information about the physical location of the boot disk and selection of a disk to be used as the boot mirror. For the selected disk to fulfill the reliability, availability, and serviceability (RAS) requirements for a boot mirror, it must not share hardware components (I/O board, controller, bus, physical disk drive, or power supply) with the boot disk.
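The boot mirror constraint can be expressed as a simple check over the hardware on each disk's data path. The sketch below is illustrative only: the Disk class, its fields, and the sample component identifiers are hypothetical, and a real inventory would come from probing the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    """A disk and the hardware on its data path (hypothetical model)."""
    name: str
    io_board: str
    controller: str
    bus: str
    power_supply: str

    def components(self):
        """Return the data-path components, tagged by type."""
        return {
            ("io_board", self.io_board),
            ("controller", self.controller),
            ("bus", self.bus),
            ("power_supply", self.power_supply),
        }


def mirror_candidates(boot, disks):
    """Return disks that share no listed component with the boot disk."""
    return [
        disk for disk in disks
        if disk.name != boot.name            # must be a different physical drive
        and not (disk.components() & boot.components())
    ]


# Invented inventory for illustration.
boot = Disk("c0t0d0", "ib0", "c0", "bus0", "ps0")
inventory = [
    boot,
    Disk("c0t1d0", "ib0", "c0", "bus0", "ps0"),   # same controller as boot
    Disk("c1t0d0", "ib0", "c1", "bus1", "ps0"),   # same I/O board and power
    Disk("c2t0d0", "ib1", "c2", "bus2", "ps1"),   # fully independent path
]
print([disk.name for disk in mirror_candidates(boot, inventory)])
```

In this invented inventory only c2t0d0 qualifies, because the other disks share at least one component with the boot disk.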
When developing an integration architecture, you must gather, and in some instances infer, the hardware and software configuration information required for each integrated product or software stack. The integration architecture needs to capture and store this configuration information for each integrated stack.
This configuration information might be explicit or implicit.
Explicit configuration information is data such as "c0t0d0 is the root disk and c2t0d0 is to be used as the root mirror."
Implicit configuration information is data such as "network interface qfe2 is to be connected to the administration subnetwork" or "all disks on controllers c3 and c4 are to be used for application data."
Not all configuration information can be determined automatically or programmatically. This problem and recommendations for solving it were discussed in "Capturing and Encoding Configuration Information" on page 8.
After all necessary information has been gathered, ensure that the integration architecture addresses the need to implement and configure the software stack. After this installation is completed, the system can clone the installation, creating a layered snapshot of that software stack to be used for mass production of identical or similar integrated stacks. It is important to note that this layered snapshot can also be used as the basis for new or customized integrated products.
To achieve the necessary modularity and flexibility, the integration architecture must encode and store the necessary configuration information at a level of abstraction (or generalization) above the simple responses to software configuration prompts. For example, in addition to storing the configuration information that c0t0d0 is the root disk and c1t0d0 is the root mirror, the abstraction must store the information that the boot disk must be mirrored to a disk with a data path that shares no hardware components with the boot disk.
As a more detailed example, consider a Sun StorEdge™ T3 disk array used with the above database integrated product; the database is to be stored on a T3 configured to present two (hardware) RAID5 logical unit numbers (LUNs) to the Solaris OE. Each of the two RAID5 LUNs is then mirrored with VxVM to an identical RAID5 LUN in a different physical T3 enclosure. For the integration architecture to effectively modify the software stack to replace VxVM with SVM, it is not sufficient to simply provide the LUNs that should be mirrored. The implicit information of this configuration (the RAID5 LUN configuration information, that the LUNs must be mirrored to distinct T3s, and the fact that the T3s are to be used for the database) must also be encoded and stored.
A large measure of this configuration information is gathered by a software component of the integration architecture referred to as the probe. However, the nature of this information might preclude the automated gathering of all information. The integration architecture needs to provide a mechanism for specifying or preconfiguring information that cannot be gathered automatically.
For example, information like the following might be preconfigured for the integrated stack being implemented:
qfe0 is to be used for the administration subnet and its IP address is 192.168.8.100
The T3s on c4 and c5 are to be used for the database and redo logs
It is anticipated that most, if not all, of the information discovered and required by the integration architecture should be encoded in a hardware-neutral format (such as XML). This information is either stored on the integration architecture or stored in a software registry and made accessible by the integration architecture over the network.
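As an illustration of a hardware-neutral encoding, the preconfigured information from the example above could be written as XML and read back. The element and attribute names here (stack-config, interface, storage, role, and so on) are hypothetical and do not correspond to a schema defined by this article; the sketch uses only the Python standard library.

```python
import xml.etree.ElementTree as ET


def build_preconfig():
    """Encode the preconfigured facts as an XML tree (invented schema)."""
    root = ET.Element("stack-config")
    nic = ET.SubElement(root, "interface", name="qfe0", role="administration")
    ET.SubElement(nic, "address").text = "192.168.8.100"
    storage = ET.SubElement(root, "storage", role="database-and-redo-logs")
    for controller in ("c4", "c5"):
        ET.SubElement(storage, "controller", name=controller)
    return root


def interfaces_for_role(root, role):
    """Return the names of interfaces assigned to the given role."""
    return [
        element.get("name")
        for element in root.findall("interface")
        if element.get("role") == role
    ]


# Round trip: serialize to text, then parse it as a consumer would.
document = ET.tostring(build_preconfig(), encoding="unicode")
parsed = ET.fromstring(document)
print(interfaces_for_role(parsed, "administration"))
```

Storing the role ("administration") alongside the device name keeps the intent available if the stack is later bound to hardware where the interface has a different name.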
Meeting our design goal of maintaining a rigorous separation of software installation and configuration information will require the determination of both explicit and implicit configuration information.
For example, the following types of configuration information need to be determined or provided:
c0t0d0 is the root disk and c2t0d0 is to be used as the root mirror.
Network interface qfe2 is to be connected to the administration subnet.
All disks on controllers c3 and c4 are to be used for application data.
Some implicit configuration information might be impossible or prohibitively difficult to ascertain automatically. For this type of indeterminate information, you can preconfigure (or prespecify) the information and place it in the catalog of available software.
Unconfiguring Software
Just as some software applications require specific information and procedures to complete their configuration, some software applications have specific uninstallation and unconfiguration procedures. Typically, unconfiguration consists of removing host-specific information, such as host or device names, from configuration files.
If an integrated stack or individual slabs are being cloned from an installed and configured system, it might be necessary to unconfigure software before creating the integrated software stack. Unconfiguration is necessary to help ensure that the software stack is completely generalized and that it does not contain host-specific information from the master system. The procedure for unconfiguring software is commonly referred to as inducing system amnesia.
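One way to check whether host-specific information remains is to scan a staged copy of the master system for known identifiers. The following is a rough sketch, not a complete audit: the function name, the sample file, and the identifiers (master01 and an example address) are hypothetical, and a plain substring search will miss encoded or binary representations.

```python
import os
import tempfile


def find_identity_traces(root, identifiers):
    """Return (path, line_number, identifier) for each line naming the master."""
    traces = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "r", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        for identifier in identifiers:
                            if identifier in line:
                                traces.append((path, number, identifier))
            except OSError:
                continue   # unreadable file: skip it rather than abort the scan
    return traces


# Demonstration against a throwaway directory, never the live system.
with tempfile.TemporaryDirectory() as staging:
    os.makedirs(os.path.join(staging, "etc"))
    with open(os.path.join(staging, "etc", "app.conf"), "w") as handle:
        handle.write("listen=0.0.0.0\nhostname=master01\n")
    traces = find_identity_traces(staging, ["master01", "192.168.8.100"])
    found = [(os.path.relpath(p, staging), n, i) for p, n, i in traces]
print(found)
```

A scan like this only reports candidates; whether a given match must be removed or rewritten still depends on the application that owns the file.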
Inducing Stack Amnesia
An integrated stack created for deployment on many systems needs to be given amnesia. It needs to lose or forget its identity.
With many software stacks, inducing amnesia in the base of the stack (the Solaris OS) might not be sufficient. It is important to keep in mind that some or all applications or slabs might need to have amnesia induced.
A software stack containing such applications might require additional work to remove traces of the master system's identity before the system is cloned. Items to consider include:
Configuration files. Some applications store their configuration information in files that might not be cleared by the sys-unconfig command. Of particular note are configuration files that contain authentication or authorization information.
Log files. Often, applications write identifying information to log files. This might include host names, Internet Protocol (IP) addresses, user names, and so on. Clean these files to ensure that the clone system does not have log records from the master system. Examples of log files include:
/var/adm/lastlog
/var/adm/messages.*
/var/adm/sulog
State files. Some applications might retain state information in files. These might include files used to flag events or configuration files. If application state information is retained in files, reconcile these files on the master system prior to cloning the system.
Backup files. Some applications that modify files create backup copies of the files before modifying them. For example, the useradd(1m) command creates backup copies of the /etc/passwd and /etc/shadow files. If these backup files exist, and if they contain information that identifies the master system, reconcile them.
Temporary files. Some applications create temporary files that are intended to be persistent across reboots. These files can be placed in spool directories or in application-specific directories. In particular, either exclude the /var/tmp directory from the software stack or empty it.
Queue files. Some applications copy files or data to a queue directory. Examples include the sendmail(1m) mail queue and the Solaris OE print service. These queue directories are not cleared by sys-unconfig. Clear these directories of data files before creating the integrated stack.
Mail subsystem files. The sys-unconfig command does not clear the /var/mail directory or user mail files therein. Clear the user mail files from this directory before cloning the system.
System accounting information. System accounting information might not be cleared by the sys-unconfig command. If this is the case, clear the accounting data from the system accounting directory (typically /var/adm/sa) before creating the integrated stack.
It is also important to keep in mind that if any locally developed applications or tools use any of the preceding file types, those files must also be cleaned. To help enable locally developed system applications and tools to automatically clean up after themselves on a reconfiguration boot, register them with the sysidconfig command. Please consult the sysidconfig(1m) man page for details about registering applications.
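Several of the file categories listed above lend themselves to a scripted cleanup of a staged copy of the master system. The sketch below is a minimal illustration under stated assumptions: the scrub function and the two path lists are hypothetical, the paths are taken from the examples in the list, and which of them apply is site-specific. It works relative to an image root so that it is never pointed at a live system by accident.

```python
import glob
import os
import shutil
import tempfile

# Relative paths drawn from the list above; adjust for the site.
TRUNCATE = ["var/adm/lastlog", "var/adm/messages.*", "var/adm/sulog"]
EMPTY_DIRS = ["var/tmp", "var/mail", "var/adm/sa"]


def scrub(image_root):
    """Truncate listed log files and empty listed directories under image_root."""
    changed = []
    for pattern in TRUNCATE:
        for path in sorted(glob.glob(os.path.join(image_root, pattern))):
            if os.path.isfile(path):
                open(path, "w").close()      # keep the file, drop its contents
                changed.append(os.path.relpath(path, image_root))
    for directory in EMPTY_DIRS:
        full = os.path.join(image_root, directory)
        if not os.path.isdir(full):
            continue                         # directory absent from this image
        for entry in sorted(os.listdir(full)):
            target = os.path.join(full, entry)
            if os.path.isdir(target) and not os.path.islink(target):
                shutil.rmtree(target)
            else:
                os.remove(target)
            changed.append(os.path.relpath(target, image_root))
    return changed


# Demonstration against a throwaway tree.
with tempfile.TemporaryDirectory() as image:
    os.makedirs(os.path.join(image, "var", "adm"))
    os.makedirs(os.path.join(image, "var", "tmp"))
    with open(os.path.join(image, "var", "adm", "sulog"), "w") as handle:
        handle.write("SU 01/01 00:00 + console root-admin\n")
    with open(os.path.join(image, "var", "tmp", "scratch"), "w") as handle:
        handle.write("leftover\n")
    changed = scrub(image)
    sulog_size = os.path.getsize(os.path.join(image, "var", "adm", "sulog"))
    tmp_entries = os.listdir(os.path.join(image, "var", "tmp"))
print(changed)
```

Log files are truncated rather than deleted so that applications expecting them to exist still find them on the cloned system; the returned list can be logged as a record of what was cleaned.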