- JMS Feature Classification
- JMS Architectural Overview
- Cluster Runtime Environment
- JMS Comparative Analysis
- Other Capabilities
- Recommendations
- Summary
JMS Architectural Overview
A typical JMS is a software package that sits in a layer above the operating system. This section provides a brief introduction to the software architecture of the three JMSs discussed in this article.
Sun™ Grid Engine Architecture
The Sun™ Grid Engine package was named Codine before Sun's acquisition of Gridware. It is a software package that provides a batch queuing framework for a wide variety of architectures.
FIGURE 1 briefly describes the software architecture of the Sun Grid Engine JMS. The master daemon (cod_qmaster) and the separate scheduler (cod_schedd) both live on the master host. A shadow host is ready to take over if the master host fails. Every execution node in the cluster has a communication daemon (cod_commd) and an execution daemon (cod_execd).
This execution daemon creates a shepherd process whose mission is to spawn the task to be executed; the shepherd process terminates upon completion of the task. The execution daemon also reports load information to the scheduler. The communication daemon handles all communications between the components of the Sun Grid Engine product. More information about the Sun Grid Engine architecture can be found in the documentation set4 and in the freely available web-based course5.
FIGURE 1 Sun Grid Engine Architecture
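To make this flow concrete, the following Python sketch submits a batch script to Sun Grid Engine through the qsub client; the script name and job name are hypothetical, and the sketch assumes the Sun Grid Engine command line tools are installed and a default queue is configured.

```python
import subprocess

# Hypothetical batch script and job name; assumes the Sun Grid Engine
# client tools (qsub) are installed and on the PATH.
SCRIPT = "render_frames.sh"
JOB_NAME = "render_test"

def submit_sge_job(script, name):
    """Hand a batch script to the master daemon via the qsub client.

    The master daemon queues the request until the scheduler selects
    it and dispatches it to an execution daemon on a suitable host.
    """
    result = subprocess.run(
        ["qsub", "-N", name, script],
        capture_output=True, text=True, check=True,
    )
    # qsub prints a confirmation line containing the assigned job id.
    return result.stdout.strip()

if __name__ == "__main__":
    print(submit_sge_job(SCRIPT, JOB_NAME))
```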
Platform's Load Sharing Facility
Load Sharing Facility (LSF) is the second JMS researched in this article. LSF is another batch queuing system with a rich set of features that allow system administrators to effectively optimize the use of the available resources at their computer sites.
FIGURE 2 below describes the LSF high-level software architecture by illustrating a parallel batch job submission. LSF, too, has a client/server architecture based on a set of daemons that allows it to perform its functions. The master daemon (mbatchd) typically picks up a request and retrieves a list of suitable hosts from the load information manager (lim) daemon. The slave daemon (sbatchd) on the first execution host receives its task from mbatchd and starts the parallel application manager (pam), a process that enables the execution of a parallel program.
FIGURE 2 LSF Software Architecture
pam starts the remote execution server (res) on each execution host allocated to the batch job, and res finally starts the tasks on each of those hosts. For an interactive parallel submission, the user's request is submitted directly to pam instead of mbatchd; pam in turn queries the master lim for the job placement. Further information about LSF can be found in the documentation set at Platform's web site2.
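For comparison, a similar Python sketch can submit a parallel batch request to LSF through the bsub client; the executable, queue name, and slot count are assumptions, and the sketch presumes the LSF client commands are installed.

```python
import subprocess

# Hypothetical executable and queue; assumes the LSF client tools
# (bsub) are installed and mbatchd is reachable.
EXECUTABLE = "./wave_sim"
QUEUE = "normal"

def submit_lsf_job(command, queue, slots=4):
    """Submit a parallel batch request that mbatchd will dispatch.

    mbatchd consults lim for suitable hosts; the sbatchd on the first
    execution host then starts pam, which starts res and the tasks on
    every allocated host.
    """
    result = subprocess.run(
        ["bsub", "-q", queue, "-n", str(slots), command],
        capture_output=True, text=True, check=True,
    )
    # bsub echoes the job id and the queue the job was accepted into.
    return result.stdout.strip()

if __name__ == "__main__":
    print(submit_lsf_job(EXECUTABLE, QUEUE))
```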
Portable Batch System
The Portable Batch System (PBS) is the third and last JMS covered in this article. PBS is also a batch queuing and workload management system that operates in networked, multi-platform UNIX™ environments.
The PBS high-level software architecture is depicted in FIGURE 3 below. A job request is picked up by the pbs_server daemon, which talks to the scheduler (pbs_sched) to schedule the job according to the policy adopted at the site. The scheduler gets information about the available resources from pbs_mom to make a final scheduling decision. The job is then dispatched from pbs_server to pbs_mom for execution on the same host, or forwarded from the local pbs_mom to a remote pbs_mom for execution on another host.
FIGURE 3 PBS Software Architecture
The interested reader is referred to the documentation3 for further technical details about PBS.
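As a concrete illustration of this flow, the Python sketch below hands a batch script to pbs_server through the qsub client and requests two nodes with the -l resource option; the script name and node count are illustrative only, and the sketch assumes the PBS client commands are available.

```python
import subprocess

# Hypothetical batch script; assumes the PBS client commands (qsub)
# are installed and pbs_server is running on the default server host.
SCRIPT = "assemble_genome.sh"

def submit_pbs_job(script, nodes=2):
    """Queue a script with pbs_server via the qsub client.

    pbs_server holds the job until pbs_sched selects it, then
    dispatches it to a pbs_mom on the chosen execution host(s).
    """
    result = subprocess.run(
        ["qsub", "-l", f"nodes={nodes}", script],
        capture_output=True, text=True, check=True,
    )
    # qsub prints the job identifier assigned by pbs_server.
    return result.stdout.strip()

if __name__ == "__main__":
    print(submit_pbs_job(SCRIPT))
```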
Sun HPC ClusterTools Software
A compute site that uses a JMS and also relies on parallel programming will most likely need additional software, such as the Sun HPC ClusterTools software, which provides support for the Message Passing Interface (MPI). FIGURE 4 illustrates an overview of Sun HPC ClusterTools software.
FIGURE 4 Sun HPC ClusterTools Software Overview
The Sun HPC ClusterTools1 product consists of the following components:
Cluster Runtime Environment (CRE)
Parallel Filesystem (PFS)
Message Passing Interface (MPI)
MPI I/O
Prism™ Programming Environment
Scalable Scientific Subroutine Library (S3L)
CRE is the component of this product most relevant here; it is briefly described in the next section because of the way it interacts with the JMS in use.
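As a small runtime-side illustration, the Python sketch below launches an already compiled MPI binary under CRE using the mprun command; the binary name and process count are hypothetical, and the sketch assumes the ClusterTools commands are installed on the cluster nodes.

```python
import subprocess

# Hypothetical MPI binary; assumes Sun HPC ClusterTools is installed
# and its CRE launcher (mprun) is on the PATH.
MPI_BINARY = "./heat_solver"

def launch_mpi_job(binary, processes=4):
    """Start an MPI program under CRE using mprun.

    CRE spawns the requested number of processes across the cluster
    nodes and provides the runtime services the MPI library relies on.
    """
    result = subprocess.run(
        ["mprun", "-np", str(processes), binary],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    print(launch_mpi_job(MPI_BINARY))
```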