- What is a Compute Cluster?
- Different Types of Compute Jobs
- Building a Compute Cluster
- Computing Resources Needed
- Price Per CPU
- Optimal Solution Economics
- Beowulf Solution
- Beowulf Cluster on SPARC Hardware
- SUN Supported Beowulf Cluster
- How To Build Your Compute Cluster
- Advantages of a Sun Based Cluster
- Grid Computing
- Conclusion
- Compute Cluster Software
Different Types of Compute Jobs
Compute jobs can be classified into three different groups:
- Single threaded: The traditional UNIX process with one address space and one execution thread.
- Multithreaded: One address space but multiple execution threads that can run in parallel if multiple CPUs are available in the machine. A multithreaded program can handle both fine-grain and coarse-grain parallelism.
- Multiprocess: Multiple processes executing the same program simultaneously. Each process can be single threaded or multithreaded. Communication between processes is done using message passing. This solution is suitable only for coarse-grain parallelism; otherwise, the communication becomes dominant.
Currently, most computer programs are still single threaded, which yields sufficient performance for most uses. Only when the program execution takes too long does the programmer make the extra effort required to parallelize the program.
A multithreaded program has only one address space (just as in a single-threaded program), but there are multiple execution threads, each with its own stack and so forth. Parallelizing a program into a multithreaded program can be done in two different ways. One way is to direct the compiler to parallelize the program automatically. In some cases, this method might be enough to achieve sufficient performance, but in most cases more work is needed. The second way is to help the compiler by adding compiler directives to the program. Compiler directives are hints that tell the compiler which parts of the code are safe to run in parallel. You can also use compiler directives in some parts of a program and let the compiler handle the other parts automatically. Currently, the most common development environment for shared memory parallelization is OpenMP.
Single-threaded and multithreaded applications can be developed using standard compilers; no special runtime environment is required. Multithreaded applications running in a single address space must be run on a multiprocessor computer if you want to utilize more than one CPU for a single job.
For some applications, depending on the structure of the application, it is better to use the message passing approach. This approach does, however, require more programming skill because the programmer must manually structure the code and insert all subroutine calls for parallelization, synchronization, and so forth. In the past, the parallel virtual machine (PVM) subroutine library was commonly used, but currently most parallel programs are written using an implementation of the message passing interface (MPI). Both open source implementations of MPI, such as MPICH, and commercial versions, such as Sun HPC ClusterTools software, are available. TABLE 1 lists the software required for these applications.
TABLE 1 Software Required for Single-Threaded, Multithreaded, and Multiprocess Applications

| Type        | Single threaded | Multithreaded | Multiprocess |
|-------------|-----------------|---------------|--------------|
| Development | Forte software  | Forte software | Forte software plus Sun HPC ClusterTools software |
| Execution   | Solaris operating environment (Solaris OE) | Solaris OE | Solaris OE plus Sun HPC ClusterTools software |
Of course, it is possible to mix methods and create an application that is both multithreaded and multiprocessed; that is, multiple processes in which each process is multithreaded, running on different nodes.