Integrating Sun N1 Grid Engine Software With Globus Toolkit
Globus does not currently provide software that integrates the Globus Toolkit with Sun N1 Grid Engine software (as it does for other cluster job managers such as PBS or LSF). However, a few open source packages enable Globus Toolkit to submit jobs to Sun N1 Grid Engine software job managers.
For this article, we used the integration software provided by the London e-Science Centre. For more information, refer to http://www.lesc.ic.ac.uk/projects/epic-gt-sge.html.
To Integrate Sun N1 Grid Engine Software With Globus Toolkit
Download the following three packages:
globus_gram_job_manager_setup_sge-0.11.tar.gz, which contains Perl code to generate SGE scripts from an RSL specification
mmjfs_sge_setup-0.0.tar.gz, which configures the MasterSGEManagedJobFactoryService as a Globus Toolkit 3.x service
mjs_sge_setup-0.0.tar.gz, which provides the job execution service used by MMJFS
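Before copying the archives, it can help to confirm that all three downloads are present. The following is a minimal sketch and not part of the integration software; the function name check_packages and its directory argument are hypothetical and introduced here only for illustration.

```shell
# Hypothetical helper: report which of the three integration archives
# are missing from a given directory (default: current directory).
# Returns 0 when all three are present, 1 otherwise.
check_packages() {
    dir="${1:-.}"
    missing=0
    for pkg in \
        globus_gram_job_manager_setup_sge-0.11.tar.gz \
        mmjfs_sge_setup-0.0.tar.gz \
        mjs_sge_setup-0.0.tar.gz
    do
        if [ ! -f "$dir/$pkg" ]; then
            echo "missing: $pkg"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, running check_packages with the download directory as its argument prints one line per missing archive and returns a nonzero status if any archive is absent.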
Copy the packages to your $<GLOBUS_SOURCE_INSTALLER> directory, then install them:
$ gpt-build globus_gram_job_manager_setup_sge-0.11.tar.gz
$ gpt-build mmjfs_sge_setup-0.0.tar.gz
$ gpt-build mjs_sge_setup-0.0.tar.gz
$ gpt-postinstall
Your MPI distribution might not be detected properly during the installation. If it is not detected, edit the <GLOBUS_LOCATION>/lib/perl/Globus/GRAM/JobManager/sge.pm file and check the line that defines the mpirun variable. This line should point to your mpirun executable. If it does not, modify the value of the variable to do so.
Check the pe_mpi variable to ensure that it has the value mpi or mpich, corresponding to the MPI parallel environment configured for Sun N1 Grid Engine software.
The integration code contains one minor bug, in the section that translates RSL requests to Sun N1 Grid Engine software job scripts, that causes MPI jobs to fail. To fix it, edit the sge.pm script at the section containing the lines "Where to write output and error?" and change the following lines on the else branch:
$sge_job_script->print("#\$ -o " . $description->stdout() . ".real\n");
$sge_job_script->print("#\$ -e " . $description->stderr() . ".real\n");
to read as follows:
$sge_job_script->print("#\$ -o " . $description->stdout() . "\n");
$sge_job_script->print("#\$ -e " . $description->stderr() . "\n");
The problem is that, for some job runs, the script tries to create the /dev/null.real files and fails, and the job terminates in error. Removing the .real suffix from these two lines solves the problem.
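The lines of sge.pm that this procedure asks you to review (the mpirun and pe_mpi definitions, and the lines that append .real) can be located with a short read-only helper. This is a sketch under assumptions: the function name inspect_sge_pm is hypothetical, and the search patterns simply match the variable names and the .real suffix mentioned in this article, so they may also show unrelated lines in your copy of the file.

```shell
# Hypothetical helper: print, with line numbers, the lines of sge.pm
# that mention mpirun or pe_mpi, and the lines that still contain the
# literal string ".real". It only reads the file; edits are left to you.
inspect_sge_pm() {
    file="$1"
    echo "== mpirun / pe_mpi lines =="
    grep -n -E 'mpirun|pe_mpi' "$file"
    echo "== lines containing .real =="
    grep -n -F '.real' "$file"
    return 0
}
```

Assuming the GLOBUS_LOCATION environment variable is set, you might call it as inspect_sge_pm "$GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/sge.pm". To compare the pe_mpi value against the parallel environments actually defined, qconf -spl lists them on the Sun N1 Grid Engine side.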