Grid technology may well be the next big leap in computing, but how do you get started?
A typical grid setup is made up of several components. Chief among them is a supply of CPU cycles, which can come from any available machine on the network: PCs, servers of any size or operating system, or one of the newer hardware configurations, blade servers. A blade server looks much like a typical rack-mounted server. Each blade includes its own CPU, memory, and storage, but shares power, fans, switches, external drives (floppy, CD-ROM), and ports with the other blades through the chassis. These blades are mounted in racks of 160 or more to provide more than 100 gigaflops of processing power (FLOPS = floating-point operations per second; a gigaflop is a billion of them). You can add and remove blades dynamically as the need arises. IBM and Sun both offer versions of blade servers, as do others in the market.
The sources of CPU cycles have to be held together by means of a network. For most businesses, this would be in the form of a local area network (LAN). Ideally, the Internet will eventually become the network, much as the electrical power grid in the U.S. is the major network over which electricity is accessed, but this scenario is unlikely in the near future. Companies will be slow to allow sensitive data to be carried over the Internet just to make use of commodity CPU cycles.
What use is the grid without data? Networked storage (SAN, NAS, or any other kind of mass storage) is the hardware on which the data resides. Currently much of the data resides in flat files, but more and more information will be coming from the databases that are the core of most companies' data storage. One of the selling points of grid computing is the ability of an end user to access any piece of information transparently, regardless of the platform, type, or location of the database, or the amount of processing required to come up with the final answer. To make that access easy, portals spare the user from having to retrieve each piece individually and deliberately pull the pieces together later. The portal setup allows for more transparency and an interface that's more logical to the end user.
Oracle's newest database offering, 10g, is supposed to be the first step toward a truly grid-enabled database. Meanwhile, DB2, SQL Server, and even MySQL databases are already accessible by means of the grid; NASA, for one, is currently accessing data stored in MySQL databases as part of its grid architecture.
These grid pieces are made to play nicely together by a grid manager, a software product that analyzes your request's requirements, selects the best-suited and least-loaded systems that meet your needs for timing, pricing, and quality, and then routes your request to those resources.
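To make the grid manager's selection step concrete, here is a minimal sketch in Python. It is an illustration only, not the way any particular grid product works: the Resource and Request classes, their fields, and the select_resource function are all hypothetical names invented for this example. The sketch assumes each machine reports a load figure, an hourly price, an estimated run time, and a reliability score, and it simply filters out machines that miss the request's limits before picking the least-loaded one that remains.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    """A machine offering CPU cycles to the grid (hypothetical model)."""
    name: str
    load: float           # 0.0 = idle, 1.0 = fully busy
    cost_per_hour: float  # price charged for one hour of work
    est_hours: float      # estimated time to finish the request
    reliability: float    # 0.0 to 1.0, a stand-in for "quality"


@dataclass
class Request:
    """The limits a user places on a job (hypothetical model)."""
    max_hours: float
    max_cost: float
    min_reliability: float


def select_resource(request: Request,
                    resources: List[Resource]) -> Optional[Resource]:
    """Return the least-loaded resource that meets every limit, or None."""
    eligible = [
        r for r in resources
        if r.est_hours <= request.max_hours                      # timing
        and r.cost_per_hour * r.est_hours <= request.max_cost    # pricing
        and r.reliability >= request.min_reliability             # quality
    ]
    if not eligible:
        return None
    # Among machines that qualify, prefer the one doing the least work now.
    return min(eligible, key=lambda r: r.load)


if __name__ == "__main__":
    pool = [
        Resource("blade-01", load=0.90, cost_per_hour=1.0, est_hours=2.0, reliability=0.99),
        Resource("blade-02", load=0.20, cost_per_hour=1.0, est_hours=2.0, reliability=0.99),
        Resource("pc-17", load=0.05, cost_per_hour=0.1, est_hours=10.0, reliability=0.90),
    ]
    job = Request(max_hours=4.0, max_cost=5.0, min_reliability=0.95)
    chosen = select_resource(job, pool)
    print(chosen.name if chosen else "no suitable resource")
```

A real grid manager would also have to route the request, track it, and react when machines join or leave; the sketch covers only the matching decision described above.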
Finally, the software toolkit allows programmers to take advantage of all this power at their fingertips. The de facto standard is the open source Globus Toolkit, which is recommended by IBM, Sun, Oracle, and others, and has been the toolkit of choice for much of the development already done on grid computing. The Globus Alliance is an excellent starting point from which to access further information on the state of the grid today, its roots and history, and where we can go from here.