Sun CRE Infrastructure
Sun CRE is a full-featured parallel environment that supports multiprocess jobs in general, and Sun MPI jobs in particular. It consists of a set of daemons that run on each node of a cluster. Users and programs interact with these daemons through commands such as mprun, which starts a parallel job; mpps, which shows the status of currently running jobs; and mpkill, which signals or aborts a job. In addition, Sun CRE provides parallel standard I/O: it broadcasts input typed at the controlling TTY to all processes, and it collects output from all processes and redirects it to the standard output of the mprun command.
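The parallel standard I/O behavior described above can be illustrated with a minimal sketch. It is not Sun CRE code: it assumes ordinary local child processes in place of cluster processes, and the names run_with_parallel_stdio and CHILD_CODE are hypothetical. It only shows the idea of sending one input to every process and gathering each process's output in one place.

```python
import subprocess
import sys

# Hypothetical stand-in for an application process: echo stdin in upper case.
CHILD_CODE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def run_with_parallel_stdio(nprocs, input_text):
    """Send the same input to every process and gather each one's output."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(nprocs)
    ]
    outputs = []
    for index, proc in enumerate(procs):
        # communicate() writes the input, closes stdin, and reads all output.
        out, _ = proc.communicate(input_text)
        outputs.append((index, out))
    return outputs


if __name__ == "__main__":
    # Label each line with the (hypothetical) index of the process it came from.
    for index, out in run_with_parallel_stdio(3, "hello\n"):
        print(f"[{index}] {out}", end="")
```

A real parallel environment forwards input and output over sockets between hosts rather than over local pipes; the sketch keeps everything on one machine so that it stays self-contained.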
Sun CRE monitors jobs and cleanly terminates all the processes in a job if any process exits abnormally. This is one of the main advantages of having a parallel environment. In contrast, parallel libraries that do not rely on a parallel environment, such as the freely available MPICH library, are prone to leaving stray processes behind when a job fails, and it is then the user's responsibility to find and terminate them.
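The cleanup policy described above can be sketched on a single machine as follows. This is an illustration under stated assumptions, not the Sun CRE implementation: the function name run_job and the polling approach are hypothetical, and local child processes stand in for the processes of a parallel job.

```python
import subprocess
import sys
import time


def run_job(commands, poll_interval=0.05):
    """Start every command; if any exits with a nonzero status, stop the rest.

    Returns the list of exit codes, in the order the commands were given.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            codes = [p.poll() for p in procs]  # None means still running
            if any(c not in (None, 0) for c in codes):
                break  # abnormal exit detected: fall through to cleanup
            if all(c == 0 for c in codes):
                break  # every process finished normally
            time.sleep(poll_interval)
    finally:
        # Terminate whatever is still running so no process is left behind.
        for p in procs:
            if p.poll() is None:
                p.terminate()
        for p in procs:
            p.wait()
    return [p.returncode for p in procs]


if __name__ == "__main__":
    job = [
        [sys.executable, "-c", "import time; time.sleep(30)"],
        [sys.executable, "-c", "import sys; sys.exit(3)"],
    ]
    # The second process fails, so the sleeping one is terminated early.
    print(run_job(job))
```

Without the cleanup in the finally clause, the sleeping process would outlive the failed job, which is the stray-process problem the text attributes to libraries that run without a parallel environment.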
Sun CRE also sets up the parallel infrastructure that other Sun HPC ClusterTools components depend on for correct operation. This includes a database that contains both persistent information about the cluster configuration and transient information about currently running jobs. Sun HPC ClusterTools components access this database for a variety of purposes. For example, the Prism parallel debugger uses it to attach to running jobs.
The structure of Sun CRE is shown in FIGURE 1, which illustrates the sequence of events that take place when a new job is started.
FIGURE 1 Actions From the mprun Command
In step 1, the mprun command starts a new parallel job by contacting the persistent master daemon, mpmd. In step 2, mpmd in turn contacts the persistent spmd daemon on each host. In step 3, each spmd daemon creates an iod process, which is responsible for creating and monitoring the application processes on that host. In steps 4 and 5, each iod establishes a socket to the mprun process to handle the application's standard input and standard output, and also connects to the persistent database daemon, rdb. In step 6, each iod forks several a.out processes, which execute the user's parallel application. In step 7, each a.out creates a "query socket" back to its parent iod. The query socket is the application's connection to the parallel environment; for example, requests over this socket allow a process to find its peers at application initialization time.
For clarity, FIGURE 1 shows the submission node, master node, and execution nodes as separate host systems. In practice, however, any execution node might also serve as a submission node or as the master node.
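The startup sequence described above can be written out as a small model that lists the events for a given set of hosts. This is only a sketch of the ordering given in the text: the function name launch_trace, the message wording, and the process numbering are hypothetical, and the real daemons act on many hosts at the same time rather than one after another.

```python
def launch_trace(hosts, procs_per_host):
    """Return the startup events for one job, grouped by step number."""
    events = ["1: mprun contacts mpmd to start a new job"]
    for host in hosts:
        events.append(f"2: mpmd contacts spmd on {host}")
    for host in hosts:
        events.append(f"3: spmd on {host} creates iod")
    for host in hosts:
        events.append(
            f"4-5: iod on {host} opens stdio socket to mprun and connects to rdb"
        )
    # Hypothetical numbering: processes are counted host by host.
    placement = [
        (host, index)
        for host_number, host in enumerate(hosts)
        for index in range(
            host_number * procs_per_host, (host_number + 1) * procs_per_host
        )
    ]
    for host, index in placement:
        events.append(f"6: iod on {host} forks a.out #{index}")
    for host, index in placement:
        events.append(f"7: a.out #{index} opens query socket to iod on {host}")
    return events


if __name__ == "__main__":
    for line in launch_trace(["node0", "node1"], 2):
        print(line)
```

Only step 1 involves a single actor; steps 2 through 5 occur once per host, and steps 6 and 7 once per application process.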
The Sun CRE architecture is designed for scalability. The multiple iod processes operate in parallel, and can start a large number of application processes simultaneously. In the HPC ClusterTools 5 software release, Sun CRE can launch a job containing 2048 processes spread across 256 hosts.
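For a sense of scale, 2048 processes on 256 hosts is 8 processes per host if the job is spread evenly, so each iod would start only 8 of the 2048 processes. The sketch below works through that arithmetic. The even split and the function name processes_per_host are assumptions made for illustration; the text does not describe how Sun CRE actually places processes on hosts.

```python
def processes_per_host(nprocs, nhosts):
    """Spread nprocs over nhosts as evenly as possible."""
    base, extra = divmod(nprocs, nhosts)
    # The first `extra` hosts take one additional process each.
    return [base + 1 if i < extra else base for i in range(nhosts)]


if __name__ == "__main__":
    counts = processes_per_host(2048, 256)
    print(min(counts), max(counts), sum(counts))  # 8 8 2048
```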