Jini and JavaSpaces: Enabling the Grid
An exciting vision of the future of distributed systems, grid computing promises to provide seamless access to powerful resources spread across diverse platforms and geographic distances. In response to the great interest grid computing has generated, several tools and frameworks have emerged to support the development of grid-aware services. Specifically, Sun Microsystems provides the Jini platform, billed as "network plug and play," which offers a set of core services that can support a grid-like system. Jini includes the JavaSpaces service, a distributed shared associative memory mechanism whose simple yet expressive paradigm can enable the development of useful services that run on truly distributed systems such as grids.
This article is part one of a two-part series on grid computing. It introduces grid computing and gives an overview of Jini and JavaSpaces, setting the stage for the second article, which will focus on the development and analysis of a realistic, useful, distributed grid service that converts XSL data to PDF documents.
What Is a Grid?
The concept of a computing "grid" is analogous to (and named after) the electric power grid. Indeed, the idea of a computing grid can be traced back to an idea expressed by Len Kleinrock in 1969 when he stated: "We will probably see the spread of 'computer utilities,' which, like present electric and telephone utilities, will service individual homes and offices across the country." On the electric grid, distributed generators produce electricity, and consumers (users) "plug in" wherever they are, without knowing or caring how or where the power was produced. For example, your power could be coming from a windmill, a coal boiler, or a nuclear reactor.
Original conceptualizations of computing grids were formulated around a single consumable resource (like the electricity of the power grid): CPU cycles. Today, however, grids are thought of as providing any sort of useful computing resources, generically referred to as "services." These services could be CPU time, use of a device such as a printer or digital camera, access to specialized software components, use of persistent storage space, or access to real-time data. In fact, a grid is now considered to be heterogeneous; specialized grids that focus on a single service, such as data grids, computational grids, utility grids, and scavenging grids, are a separate notion. In any grid implementation, a key identifying point is the distribution of heterogeneous services and, perhaps more importantly, the fact that access to this heterogeneity is mostly transparent to grid consumers.
Another defining characteristic of grids is the potential for the use of commodity hardware. The premise here is that grids composed of smaller and less-expensive computing resources can generate as much (and, in some cases, more) computing power than expensive, traditional, vertically scaled computing devices.
NOTE
For example, the SETI (Search for Extraterrestrial Intelligence) project, which is an example of a scavenging grid, realizes approximately 15 teraflops of CPU power at an average annual cost of $500,000. By comparison, a centralized solution to the SETI problem built on IBM's ASCI White supercomputer could cost as much as $110,000,000 while delivering "only" 12 teraflops of CPU power. (Clearly, there are other less-expensive solutions that lie between a grid implementation and a single server implementation.)
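To make the comparison concrete, the figures quoted above can be reduced to a cost per teraflop. The following is only a back-of-the-envelope sketch: the class and method names are illustrative, and it deliberately glosses over the fact that one figure is an annual operating cost while the other is a system price, so the result is a rough indication rather than a rigorous benchmark.

```java
// Back-of-the-envelope comparison using only the figures quoted in the note.
// CostPerTeraflop and its methods are hypothetical names for illustration.
class CostPerTeraflop {

    // Dollars spent per teraflop of delivered computing power.
    static double costPerTeraflop(double costUsd, double teraflops) {
        return costUsd / teraflops;
    }

    // How many times more expensive the second option is per teraflop.
    static double ratio(double gridCostPerTf, double centralCostPerTf) {
        return centralCostPerTf / gridCostPerTf;
    }

    public static void main(String[] args) {
        double grid = costPerTeraflop(500_000, 15);          // SETI scavenging grid
        double central = costPerTeraflop(110_000_000, 12);   // ASCI White
        System.out.println("Grid, USD per teraflop:        " + Math.round(grid));
        System.out.println("Centralized, USD per teraflop: " + Math.round(central));
        System.out.println("Centralized / grid ratio:      " + Math.round(ratio(grid, central)));
    }
}
```

Under these figures the grid comes out at roughly $33,000 per teraflop against roughly $9.2 million for the centralized system, a difference of about 275 times.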
More formally defined, a grid is a collection of distributed and heterogeneous resources that are dynamically discovered and allocated. Resources are allocated by matching node-level publications, which describe the service-level thresholds each node can meet, against user-defined service-level requirements. In many of today's grid implementations, this matching is performed by centralized or decentralized resource brokers, or meta-schedulers, that connect grid clients requesting a service with providers able to satisfy the service requirements.
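The matching of publications against requirements can be illustrated with a minimal, in-memory sketch of a centralized broker. This is not the Jini lookup service or any real grid scheduler. The class names (Publication, Requirement, Broker) and the attributes chosen (service type, CPU count, latency) are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Toy centralized resource broker; all names and attributes are hypothetical.
class GridBrokerSketch {

    // Node-level publication: the service-level thresholds a provider advertises.
    static final class Publication {
        final String node;
        final String serviceType;
        final int availableCpus;
        final double latencyMs;   // worst-case latency the node promises

        Publication(String node, String serviceType, int availableCpus, double latencyMs) {
            this.node = node;
            this.serviceType = serviceType;
            this.availableCpus = availableCpus;
            this.latencyMs = latencyMs;
        }
    }

    // User-defined service-level requirement submitted by a grid client.
    static final class Requirement {
        final String serviceType;
        final int minCpus;
        final double maxLatencyMs;   // worst latency the client will tolerate

        Requirement(String serviceType, int minCpus, double maxLatencyMs) {
            this.serviceType = serviceType;
            this.minCpus = minCpus;
            this.maxLatencyMs = maxLatencyMs;
        }
    }

    static final class Broker {
        private final List<Publication> registry = new ArrayList<>();

        // Providers register dynamically; nothing is configured in advance.
        void publish(Publication p) {
            registry.add(p);
        }

        // Return the first published node whose thresholds satisfy the requirement.
        Optional<Publication> match(Requirement r) {
            for (Publication p : registry) {
                boolean sameService = p.serviceType.equals(r.serviceType);
                boolean enoughCpus = p.availableCpus >= r.minCpus;
                boolean fastEnough = p.latencyMs <= r.maxLatencyMs;
                if (sameService && enoughCpus && fastEnough) {
                    return Optional.of(p);
                }
            }
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Broker broker = new Broker();
        broker.publish(new Publication("node-a", "render", 2, 50.0));
        broker.publish(new Publication("node-b", "render", 8, 20.0));

        Requirement request = new Requirement("render", 4, 30.0);
        // node-a has too few CPUs and is too slow; node-b satisfies both thresholds.
        System.out.println(broker.match(request).map(p -> p.node).orElse("no provider"));
    }
}
```

The client states only what it needs and never names a node, which is the transparency the definition calls for. A real broker would also have to handle leases, node failures, and ranking among several acceptable providers, all of which this sketch omits.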
Geographically distributed grid resources are network-connected using common communication protocols. The grid resources are accessed and executed by service consumers without any knowledge of the underlying implementation decisions.