Global Grid Connectivity Using Globus Toolkit With Solaris Operating System
This article describes how to integrate grid computing with Globus Toolkit software for a site using Sun N1™ Grid Engine software (formerly Sun Grid Engine) as a local resource manager. This article provides background information and step-by-step instructions for installing, configuring, integrating, and testing Globus Toolkit software with Sun N1 Grid Engine software on an x86 architecture using the Solaris™ 9 Operating System (Solaris 9 OS).
The article contains the following topics:
"Introduction"
"Prerequisites"
"Globus Toolkit Installation"
"Globus Toolkit Configuration and Testing"
"Testing Globus Toolkit Services"
"Sun N1 Grid Engine Software Installation"
"Integrating Sun N1 Grid Engine Software With Globus Toolkit"
"Integration Testing"
"Troubleshooting"
"About the Authors"
"Related Resources"
"Ordering Sun Documents"
"Accessing Sun Documentation Online"
Introduction
This section provides background and introductory material for grid computing, the Globus Toolkit middleware, and Sun N1 Grid Engine software.
Grid Computing and Middleware
Grids are emerging as a new infrastructure for Internet-based parallel and distributed computing. They enable the sharing, exchange, discovery, and aggregation of resources distributed across multiple administrative domains, organizations, and enterprises. To accomplish this, grids need an infrastructure that supports services such as security, uniform access, resource management, scheduling, application composition, computational economy, and accounting.
The concept of grid computing is becoming popular with the emergence of the Internet as a ubiquitous medium and the widespread availability of powerful computers and networks as low-cost commodity components. Clusters of computers connected by a local area network (LAN) have been employed to solve computationally intensive problems; however, they alone cannot offer the computational power that some applications demand. Geographically distributed resources need to be logically coupled so that they work as a unified resource.
Grid middleware comes into play here. The most comprehensive grid middleware software currently available is the Globus Toolkit, version 3.0.2. The Globus Toolkit software offers resource management, data management, and information services, all layered on top of one security layer, the Grid Security Infrastructure (GSI).
FIGURE 1 Three Key Pillars for Grid Computing on Top of the Security Infrastructure
The Globus Toolkit architecture and infrastructure have evolved radically from one version to the next. Globus Toolkit version 3.0.2 implements the Open Grid Services Architecture (OGSA) and Open Grid Services Infrastructure (OGSI) specifications, taking grid computing to a new concept: grid services as a particular type of Web service. This approach creates a uniform interface to grid resources, which benefits both grid application developers and grid users. The grid services are available through a grid services container, and communication is based on the Simple Object Access Protocol (SOAP) and HTTP, which are already standards of the World Wide Web. This approach allows for easy addition and integration of new services to the grid.
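To make the SOAP-over-HTTP idea concrete, the following minimal Python sketch builds the kind of XML envelope that a Web service request carries. It is an illustration only and does not use any Globus Toolkit API: the service namespace (urn:example:grid-service), the operation name (submitJob), and the parameter name are hypothetical. Only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
EXAMPLE_NS = "urn:example:grid-service"  # hypothetical namespace, not a Globus one


def build_soap_request(operation: str, params: dict) -> bytes:
    """Wrap a hypothetical service call in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{EXAMPLE_NS}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, f"{{{EXAMPLE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# "submitJob" and "executable" are made-up names used only for illustration.
request = build_soap_request("submitJob", {"executable": "/bin/hostname"})
print(request.decode("utf-8"))
```

In a real deployment, a message of this shape would be sent as the body of an HTTP POST to the grid services container, which is what lets grid services reuse existing Web infrastructure.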
However, this innovative approach is still in its infancy, and some existing grid applications, especially high performance computing (HPC) applications, gain nothing in the short term from moving to this new infrastructure. For this reason, the Globus Toolkit 3.0.2 distribution includes both Globus Toolkit version 2 and version 3 components. The Globus Toolkit version 2 components are not OGSA/OGSI compliant, but are easier to understand, manage, and install. This article addresses all components shipped with the Globus Toolkit 3.0.2 distribution. Installation, configuration, and testing are presented in parallel for both versions.
The three key pillars for grid computing presented in FIGURE 1 are implemented by different components in Globus Toolkit versions 2.4 and 3.x, but the Grid Security Infrastructure (GSI) remains largely unchanged. GSI provides secure authentication and communication services on the grid. It is based upon Secure Sockets Layer (SSL), public key infrastructure (PKI), and X.509 digital certificates. The main functions implemented by GSI are single/mutual authentication, confidential communication, authorization, and delegation.
Globus Toolkit Version 2.4
The main Globus Toolkit version 2.4 components are as follows:
Grid Resource Allocation Manager (GRAM) is responsible for resource allocation, job submission and execution, and job status and progress management. GRAM makes use of Global Access to Secondary Storage (GASS) to stage input/output (I/O) files and executables.
Monitoring and Discovery Service (MDS), based on the Lightweight Directory Access Protocol (LDAP), provides support for collecting information about the grid and responding to queries from clients. The two MDS services are Grid Resource Information Service (GRIS) and Grid Index Information Service (GIIS). GRIS is responsible for collecting the information from information providers and registering information to GIIS. The GIIS enables the creation of hierarchical directory structures that can efficiently store and distribute information.
GridFTP is a secure and high-performance data transfer tool; both partial and complete transfers are supported through the Globus Replica Catalog and Management features.
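The GRIS and GIIS relationship described for MDS can be pictured with a small toy model. The Python sketch below is not the MDS implementation and does not use LDAP; the class names, host name, and the cpu_count provider are hypothetical. It only illustrates the flow of data: information providers feed a resource-level service, which registers with an index that can itself register with a higher-level index.

```python
from dataclasses import dataclass, field


@dataclass
class Gris:
    """Toy resource-level service: gathers data from information providers."""
    host: str
    providers: dict = field(default_factory=dict)  # name -> zero-argument callable

    def collect(self) -> dict:
        return {name: provider() for name, provider in self.providers.items()}


@dataclass
class Giis:
    """Toy index: aggregates registered Gris instances and child indexes."""
    name: str
    resources: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def register(self, member) -> None:
        # Child indexes and resource-level services are tracked separately.
        target = self.children if isinstance(member, Giis) else self.resources
        target.append(member)

    def query(self) -> dict:
        # Answer from directly registered resources, then descend the hierarchy.
        result = {gris.host: gris.collect() for gris in self.resources}
        for child in self.children:
            result.update(child.query())
        return result


# Hypothetical two-level hierarchy: node -> site index -> top-level index.
node = Gris("node1.example.org", {"cpu_count": lambda: 2})
site_index = Giis("site-index")
site_index.register(node)
top_index = Giis("top-index")
top_index.register(site_index)
print(top_index.query())
```

A client that queries the top-level index sees the data of every resource registered below it, which is the property that makes a hierarchical directory useful for discovery.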
Globus Toolkit Version 3.x
Compared to Globus Toolkit 2.4, the Globus Toolkit 3.x counterparts are as follows:
Master Managed Job Factory Service (MMJFS) provides job submission, execution, and management services.
Index Services provide a way to produce and query service data; they are mainly used in discovery operations.
Reliable File Transfer (RFT) services, or multiRFT, are part of the Data Management implementation, together with GridFTP and Replica Location Service (RLS), and provide the interface for reliable file transfers on grid servers.
NOTE
Globus Toolkit 3.x implements new OGSA/OGSI components to replace some components in Globus Toolkit 2.4. For example, Index Services in Globus Toolkit 3.x replace the GRIS in Globus Toolkit 2.4. The service data are saved in XML instead of LDAP Data Interchange Format (LDIF). However, the data can be ported from GRIS to Globus Toolkit 3.x. In Globus Toolkit 2.4, file transfer is provided by the globus-url-copy command, whereas in Globus Toolkit 3.x reliable file transfer is implemented as a grid service.
Sun N1 Grid Engine Software
Sun N1 Grid Engine software is a distributed management product that optimizes utilization of software and hardware resources. It can increase utilization of available resources to as much as 98 percent. Sun N1 Grid Engine software is both a job manager and a job scheduler for clusters of computers. The Sun N1 Grid Engine Enterprise Edition software can harness computing power across multiple clusters (campus grids).
Hosts running Sun N1 Grid Engine software can be master hosts, execution hosts, submission hosts, and administration hosts. These roles are not mutually exclusive; it is possible for a host to perform all four functions. A typical cluster configuration has one master host, running the sge_qmaster (manager) and sge_schedd (scheduler) daemons, and the other hosts running sge_execd (execution) daemons. All Sun N1 Grid Engine software hosts communicate through TCP/IP; for this purpose, a special daemon, sge_commd, runs on each host.
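Because the four host roles can be combined freely, they map naturally onto a set of flags. The following Python sketch is a conceptual aid, not part of Sun N1 Grid Engine software: the Role type and the daemons_for function are hypothetical names, and the mapping encodes only the daemon placement described in the paragraph above.

```python
from enum import Flag, auto


class Role(Flag):
    """Host roles; a host may hold any combination of them."""
    MASTER = auto()
    EXECUTION = auto()
    SUBMISSION = auto()
    ADMINISTRATION = auto()


def daemons_for(roles: Role) -> list:
    """Return the daemons expected on a host with the given roles."""
    daemons = ["sge_commd"]  # the communication daemon runs on each host
    if Role.MASTER in roles:
        daemons += ["sge_qmaster", "sge_schedd"]
    if Role.EXECUTION in roles:
        daemons.append("sge_execd")
    return daemons


# A single host performing all four functions.
all_in_one = Role.MASTER | Role.EXECUTION | Role.SUBMISSION | Role.ADMINISTRATION
print(daemons_for(all_in_one))
print(daemons_for(Role.EXECUTION))
```

The submission and administration roles add no daemons in this sketch because the text above associates daemons only with the master and execution roles.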
Computing resources are modeled by Sun N1 Grid Engine software as job execution queues. Each queue can have specific attributes and can support multiple parallel environments. The most frequent parallel environments used are Message Passing Interface (MPI) and parallel virtual machine (PVM).
The Globus Toolkit is a grid middleware technology that enables the usage of heterogeneous resources distributed across large geographical areas. It has to cope with stringent requirements such as tight security and a complex infrastructure. It is difficult for a single software program to deal with the particularities of all involved systems. At the other extreme, you would not want all computing resources to be directly connected to the grid. This could result in management challenges and increased communication overhead.
To overcome these issues, it is better to take a hierarchical approach. Let every site or organization manage resources individually, using local policy, and allow access to these sites as a single entity. In other words, use tools such as Sun N1 Grid Engine software, Portable Batch System (PBS) or Load Sharing Facility (LSF) for local resource management, and grid middleware like Globus Toolkit for interconnecting sites.
This way, we only need one point of access (only one machine, the gatekeeper) for each site in a grid. This approach provides a performance gain, because local schedulers and job managers can use resources under their administration at close to 100 percent utilization rates, due to optimizations that are possible in homogeneous environments like clusters.
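The hierarchical approach can be summarized in a short toy model. The Python sketch below is an assumption-laden illustration, not Globus Toolkit or Sun N1 Grid Engine code: the Site and Grid classes, the host names, and the round-robin placement (which stands in for whatever local policy a site applies) are all hypothetical. The point it shows is that the grid sees one gatekeeper per site, while node selection stays a local decision.

```python
from collections import deque


class Site:
    """Toy site: one gatekeeper in front of a locally managed set of nodes."""

    def __init__(self, gatekeeper: str, nodes: list):
        self.gatekeeper = gatekeeper
        self._nodes = nodes
        self._pending = deque()

    def submit(self, job: str) -> None:
        self._pending.append(job)

    def schedule(self) -> dict:
        """Place queued jobs round-robin; a stand-in for the local policy."""
        placement = {}
        index = 0
        while self._pending:
            job = self._pending.popleft()
            placement[job] = self._nodes[index % len(self._nodes)]
            index += 1
        return placement


class Grid:
    """Toy grid layer: knows sites only by their gatekeeper."""

    def __init__(self):
        self._sites = {}

    def add_site(self, site: Site) -> None:
        self._sites[site.gatekeeper] = site

    def access_points(self) -> list:
        return sorted(self._sites)

    def submit(self, gatekeeper: str, job: str) -> None:
        self._sites[gatekeeper].submit(job)


# Hypothetical site with three nodes behind a single gatekeeper.
site = Site("gatekeeper.site-a.example.org", ["n1", "n2", "n3"])
grid = Grid()
grid.add_site(site)
for job in ("job-1", "job-2", "job-3", "job-4"):
    grid.submit("gatekeeper.site-a.example.org", job)
print(grid.access_points())
print(site.schedule())
```

Adding nodes to the site changes only the list held by the site; the number of access points the grid layer has to manage stays at one per site.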