System Controllers
The two SCs, which are an integral part of the system, are considered to be critical to its operation. Much like earlier system service processors, their primary goals are to provide power and configuration for the domain and platform, to monitor the environmental events within the system, and to take corrective action in the event of failures. It is best practice to leave the SCs as much as possible in their original configuration.
SC Disk Configuration
Both Sun Fire 15K server SCs are configured with two internal 18-gigabyte disks that are mirrored using Solstice DiskSuite™ software for the Solaris™ 8 Operating System and Solaris™ Volume Manager software for the Solaris 9 Operating System. This configuration has been qualified by our software QA department and has been used to verify the operation of all phases of the system.
The EIS CD contains a script, SF15k-sc-bootdisks-start.sh, that assumes the disk is already formatted with the default partitioning. It sets up Solstice DiskSuite slices and state databases. A second script, SF15k-sc-bootdisks-finish.sh, completes the process by attaching and synchronizing the metadevices. It is considered best practice to leave the SC boot devices configured as delivered from the factory.
As a part of monitoring an SC for failover, the amounts of free disk space, swap space, and memory are checked at intervals. Any modification made to these boot devices could result in improper operation of the platform.
Systems in the existing installed base have been shipped with 18-gigabyte disk drives. Some new installations might have 36-gigabyte or 72-gigabyte disk drives. The default partitioning for the 18-gigabyte SC internal disks is as follows:
Partition | Mount Point | Minimum Size
0 | / | 8 GB
1 | swap | 2 GB
2 | overlap | 8 GB
4 | metadb* | 10 MB
5 | metadb* | 10 MB
7 | /export/install | 8 GB
* database for mirroring
TIP
Do not change the default partitions set by the manufacturer.
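As an illustration only, the following Python sketch compares an observed slice layout against the minimum sizes in the table above. It is not part of the EIS scripts or of SMS. The function name, the dictionary format, and the sample sizes are assumptions made for this example, and 1 GB is taken as 1024 MB. The observed sizes would have to be transcribed by hand, for example from prtvtoc output.

```python
# Minimum sizes from the default partitioning table, in megabytes.
# Assumption for this sketch: 1 GB is treated as 1024 MB.
DEFAULT_MINIMUMS_MB = {
    0: ("/", 8 * 1024),
    1: ("swap", 2 * 1024),
    2: ("overlap", 8 * 1024),
    4: ("metadb", 10),
    5: ("metadb", 10),
    7: ("/export/install", 8 * 1024),
}


def undersized_slices(actual_mb):
    """Return (slice, mount point, actual MB, minimum MB) for each slice
    that is smaller than its minimum. A missing slice counts as 0 MB."""
    problems = []
    for slice_no, (mount, minimum) in sorted(DEFAULT_MINIMUMS_MB.items()):
        size = actual_mb.get(slice_no, 0)
        if size < minimum:
            problems.append((slice_no, mount, size, minimum))
    return problems


if __name__ == "__main__":
    # Hypothetical layout in which slice 7 has been shrunk.
    observed = {0: 8192, 1: 2048, 2: 17500, 4: 10, 5: 10, 7: 4096}
    for slice_no, mount, size, minimum in undersized_slices(observed):
        print(f"slice {slice_no} ({mount}): {size} MB < minimum {minimum} MB")
```

A check of this kind only reports deviations. The recommendation itself remains to leave the factory partitioning untouched.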
Network Planning
The Sun Fire 15K server has 20 dedicated, private, and internal network connections, not including those assigned to domain network interfaces. A maximum of 18 of these connections are for 18 possible domains. The other two are for SC-to-SC connections that are used to facilitate failover, pass heartbeat, and synchronize system critical data.
The Site Planning Guide contains a template for defining the configuration of these network connections. It is best practice to assign IP addresses and host names using this template.
Also, it is considered best practice to configure the SC to use IP Multipathing (IPMP) across the external interfaces hme0 and eri1. Additionally, configure a logical hostname to be used for failover such that you can always reach the active main SC using the same hostname.
Setting up IPMP with a logical host requires the assignment of seven public IP addresses:
Two for each SC, assigned to hme0 and eri1, respectively.
One for each SC for local failover.
One floating or community IP to follow the active or main SC.
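The count of seven addresses can be laid out as a simple plan. The following Python sketch is an illustration under stated assumptions: the labels sc0 and sc1, the starting host offset, and the subnet 192.0.2.0/24 (a range reserved for documentation) are hypothetical, and real addresses should come from the Site Planning Guide template.

```python
import ipaddress

# The seven roles described above: two interface addresses and one local
# failover address per SC, plus one floating address.
# "sc0" and "sc1" are hypothetical labels used only in this sketch.
ROLES = [
    ("sc0", "hme0 address"),
    ("sc0", "eri1 address"),
    ("sc0", "local failover address"),
    ("sc1", "hme0 address"),
    ("sc1", "eri1 address"),
    ("sc1", "local failover address"),
    ("floating", "logical hostname for the main SC"),
]


def ipmp_address_plan(subnet, first_host=10):
    """Assign consecutive addresses from one subnet to the seven roles."""
    network = ipaddress.ip_network(subnet)
    base = network.network_address + first_host
    plan = []
    for offset, (owner, role) in enumerate(ROLES):
        address = base + offset
        # Reject addresses outside the subnet or equal to its broadcast address.
        if address not in network or address == network.broadcast_address:
            raise ValueError(f"{subnet} is too small for the plan")
        plan.append((owner, role, str(address)))
    return plan


if __name__ == "__main__":
    for owner, role, address in ipmp_address_plan("192.0.2.0/24"):
        print(f"{address:15} {owner:9} {role}")
```

Keeping the seven addresses consecutive is a convenience of the sketch, not a requirement of IPMP.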
TIP
Use the template in the Site Planning Guide to define the management network (MAN) configuration. Use IPMP across the external interfaces hme0 and eri1. Configure a logical hostname for failover.
SC Console
A Terminal Interface Program (TIP) port is available on the SC that allows direct console access. The cable supplied with each SC is wired so that it plugs directly into a 25-pin serial port such as the one on the back of an Ultra™ 10 workstation. It is possible to connect the console to other devices by using the appropriate adapter(s) to make the connection. You must also determine if such devices use a straight-wired vs. a null-modem wired connection, which differs among the various terminal concentrators available.
If the console is connected to a device that sends a break sequence during various events such as reset, power on or power off, etc., it might also be necessary to change the Solaris Operating System (Solaris OS) break sequence in the SC configuration file /etc/default/kbd as follows:
KEYBOARD_ABORT=alternate
To facilitate failover, the BREAK sequence to stop the system has been changed from STOP-A to the alternate [RETURN] [TILDE] [CONTROL B].
NOTE
There must be an interval of more than 0.5 seconds between characters, and the entire string must be entered in less than 5 seconds.
If the console connection is broken, the alternate sequence prevents the SC from dropping to the OpenBoot™ PROM (OBP) prompt.
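The timing rule in the NOTE can be expressed as a small check. The following Python sketch is illustrative only: the function name and the idea of recording keystroke times in seconds are assumptions made for this example, and the two limits are taken directly from the NOTE.

```python
MIN_GAP_SECONDS = 0.5    # each gap between characters must exceed this
MAX_TOTAL_SECONDS = 5.0  # the whole sequence must take less than this


def break_timing_ok(timestamps):
    """Check the times (in seconds) at which the three characters of the
    alternate break sequence were typed."""
    if len(timestamps) != 3:
        return False
    gaps = [later - earlier
            for earlier, later in zip(timestamps, timestamps[1:])]
    total = timestamps[-1] - timestamps[0]
    return (all(gap > MIN_GAP_SECONDS for gap in gaps)
            and total < MAX_TOTAL_SECONDS)


if __name__ == "__main__":
    print(break_timing_ok([0.0, 1.0, 2.0]))  # one second apart
    print(break_timing_ok([0.0, 0.3, 1.0]))  # first gap is too short
    print(break_timing_ok([0.0, 2.5, 5.5]))  # sequence takes too long
```

The two limits together leave a comfortable window: typing the characters roughly one second apart satisfies both.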
It is best practice to connect the SC console through a terminal concentrator, which would make the consoles available from any workstation on the network.
TIP
Connect the SC console ports to a terminal concentrator.
OpenBoot PROM Parameters
The SC CPU board is a Solaris OS host running its own copy of the standard Solaris OS. It also has an OBP and built-in power-on self tests, which execute at various levels of increasingly intensive diagnostics and verbosity. The diag-level variable controls these features. The recommended standard settings are as follows.
For extended CPU and SC diagnostics:
diag-level=pmax-epmax
For minimum CPU diagnostics:
diag-level=pmin
Please keep in mind that pmin tests only the CPU board.
For extended CPU and minimum SC diagnostics:
diag-level=pmax-epmin
For extended CPU and SC verbosity and diagnostics:
diag-level=pmax-epvmax
With diag-level set to the minimum (pmin), it is possible to boot the SC and log in. However, if hardware problems go undetected because of the reduced diagnostic level, the board will not function as an SC.
TIP
Use diag-level=pmax-epmax on the main SC and diag-level=pmax-epvmax on the spare SC.
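The four settings and the recommendation in the TIP can be captured in a lookup table. The following Python sketch is a hedged illustration: the function name and the role labels "main" and "spare" are assumptions made for this example, and obtaining the current value (for instance, by reading it from the OBP variables on the SC) is left outside the sketch.

```python
# The diag-level settings described above and what each one selects.
DIAG_LEVELS = {
    "pmin": "minimum CPU diagnostics (CPU board only)",
    "pmax-epmin": "extended CPU diagnostics, minimum SC diagnostics",
    "pmax-epmax": "extended CPU and SC diagnostics",
    "pmax-epvmax": "extended CPU and SC diagnostics with verbose output",
}

# Recommended settings from the TIP, keyed by SC role.
RECOMMENDED = {"main": "pmax-epmax", "spare": "pmax-epvmax"}


def check_diag_level(role, current):
    """Return None if the setting matches the recommendation for the role,
    otherwise return a short explanatory message."""
    if role not in RECOMMENDED:
        raise ValueError(f"unknown SC role: {role!r}")
    if current not in DIAG_LEVELS:
        return f"{current!r} is not one of the settings listed above"
    wanted = RECOMMENDED[role]
    if current != wanted:
        return (f"{role} SC uses {current} ({DIAG_LEVELS[current]}); "
                f"recommended is {wanted}")
    return None


if __name__ == "__main__":
    print(check_diag_level("main", "pmax-epmax"))
    print(check_diag_level("spare", "pmin"))
```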
SMS-SVC User Configuration
The sms-svc user account should be configured to have platform administrative privileges as well as domain administrative privileges for each domain. The SC's root user account should not be assigned the same SMS privileges as the sms-svc user account. The EIS CD provides a script that performs this task. Run it as root on each SC. For more information, refer to the System Management Services (SMS) 1.4 Installation Guide.
TIP
Configure the sms-svc account with domain and platform administrative privileges. Do not assign sms-svc privileges to the root user account.
Failover
SC failover functionality is controlled by the interaction of daemons running on each of the SCs. These daemons communicate across a private network built into the Sun Fire 15K server frame. It is a best practice, and crucial to the high availability (HA) environment for the Sun Fire 15K server, to always have failover enabled and to keep the data on the two SCs synchronized. Both SCs should be running the same versions of the OS and SMS, and they should be maintained at the same patch level.
TIP
For both SCs, always have failover enabled, run the same OS and SMS software, and maintain the latest patches.
When installing patches, carefully read the installation instructions provided with the patch descriptions. Also, every SMS software release provides new functionality and enhancements along with bug fixes. It is highly recommended to run the latest SMS software version on the SCs.
Solaris OS and SMS Software on SC
The Solaris OS and the SMS software packages running on the SC are vital elements in the operation of the Sun Fire 15K server. A complete and current backup image of the SC boot disk can prevent unnecessary downtime and possible domain or platform outages. Standard practices for system backups should be adhered to and performed at regular intervals. Also, it is recommended that you always perform backups after any OS or SMS patch updates are applied. Use the ufsdump command to back up the OS to the local drive.
The SMS software provides smsbackup and smsrestore utilities to preserve the current SMS configuration, which includes domain configuration, state, and history data as well as other configuration data. As a best practice, treat SMS software patches as mandatory and use good change-control planning to keep them at current levels. If you need to reinstall the software, be certain to reapply any patches you had previously applied. Those who have SunSolve℠ accounts can follow the link for external access to patches available for SMS on the SMS web site at:
http://www.sun.com/servers/highend/sms.html
Observe the following guidelines when applying patches:
Disable failover before installing the patches.
The system should be stable.
No dynamic reconfiguration (DR) operations should be in progress.
No domain bringup or shutdown operations should be in progress.
No user-initiated datasync or cmdsync operations should be in progress.
Complete any domain, board, or configuration changes before beginning the patch installation.
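The guidelines above amount to a preflight checklist. The following Python sketch shows one way to record it. It is an illustration only: the key names and the idea of an operator filling in a status dictionary by hand are assumptions made for this example, and the sketch does not query SMS.

```python
# One entry per guideline: (hypothetical status key, message if not met).
PREFLIGHT_CHECKS = [
    ("failover_disabled", "failover has not been disabled"),
    ("system_stable", "the system is not stable"),
    ("no_dr_in_progress", "a DR operation is in progress"),
    ("no_domain_bringup_or_shutdown",
     "a domain bringup or shutdown is in progress"),
    ("no_user_sync_in_progress",
     "a user-initiated datasync or cmdsync is in progress"),
    ("changes_complete",
     "domain, board, or configuration changes are incomplete"),
]


def patch_blockers(status):
    """Return the messages for every guideline not confirmed as met.
    A key missing from status is treated as not confirmed."""
    return [message for key, message in PREFLIGHT_CHECKS
            if not status.get(key, False)]


if __name__ == "__main__":
    # Hypothetical operator answers: failover is still enabled.
    answers = {key: True for key, _ in PREFLIGHT_CHECKS}
    answers["failover_disabled"] = False
    for message in patch_blockers(answers):
        print("Do not patch yet:", message)
```

Treating an unanswered item as a blocker is a deliberate choice in this sketch, so that a skipped check cannot pass silently.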
TIP
Please read all patch instructions carefully before attempting this procedure. Keep SMS software patches up to date. Always back up at regular intervals and after applying patches.
Third-Party Software Running on the SC
One of the differences between the Sun Fire 15K server and the rest of the Sun Fire server product line is in the architecture of the SCs. The Sun Fire 15K server SCs run the Solaris OS with the SMS loaded on top. The Sun Fire 15/12K Open System Controller (OpenSC) White Paper describes how to set up and verify third-party software on SC boards. Third-party applications are expected to be lightweight, such as monitoring and backup agents, and not to demand intensive system resources. Described in the white paper are the SMS resource requirements, the maximum permitted resource consumption for third-party software, and techniques to help ensure that SMS receives the resources it needs to function properly.
TIP
Avoid running resource-intensive third-party applications on the SCs.