Other Pieces of the Puzzle
In the world of cloud computing, there is a growing number of companies and services from which to choose. Each service provider seeks to align its offerings with a broader strategy. With Amazon, that strategy includes providing very basic infrastructure building blocks for users to assemble into customized solutions. AWS encourages you to use more than one service by making the different services work well together and by offering fast and free data transfer between services in the same region. This section describes three other Amazon Web Services, along with some ways you might find them useful in conjunction with SimpleDB.
Adding Compute Power with Amazon EC2
AWS sells computing power by the hour via the Amazon Elastic Compute Cloud (Amazon EC2). This computing power takes the form of virtual server instances running on top of physical servers within Amazon data centers. These server instances come in varying amounts of processor horsepower and memory, depending on your needs and budget. What makes this compute cloud elastic is the fact that users can start up, and shut down, dozens of virtual instances at a moment's notice.
These general-purpose servers can fulfill the role of just about any server. Some of the popular choices include web server, database server, batch-processing server, and media server. The use of EC2 can result in a large reduction in ongoing infrastructure maintenance when compared to managing private in-house servers. Another big benefit is the elimination of up-front capital expenditures on hardware in favor of paying for only the compute power that is used.
The sweet spot between SimpleDB and EC2 is the high-bandwidth data application. For apps that need fast access to high volumes of data in SimpleDB, EC2 is the platform of choice. The free same-region data transfer can add up to sizable cost savings for large data sets, but the biggest win comes from the consistently low latency. AWS does not guarantee any particular latency numbers, but round-trip times are typically in the neighborhood of 2 to 7 milliseconds between EC2 instances and SimpleDB in the same region. These numbers are on par with the latencies others have reported between EC2 instances. For contrast, additional latencies of 50 to 200 milliseconds or more are common when using SimpleDB across the open Internet. When you need fast SimpleDB, EC2 has a lot to offer.
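To get a feel for what these latency figures mean in practice, the following Python sketch multiplies them out for a client that issues its requests one at a time. The workload size of 10,000 requests and the function name are assumptions chosen for illustration. Only the millisecond ranges come from the text above, and real latencies will vary.

```python
def total_wait_seconds(request_count, round_trip_ms):
    """Network wait for requests issued strictly one after another."""
    return request_count * round_trip_ms / 1000.0


REQUEST_COUNT = 10_000          # hypothetical workload, not a measured figure
SAME_REGION_MS = (2, 7)         # typical EC2-to-SimpleDB range quoted above
INTERNET_EXTRA_MS = (50, 200)   # additional latency across the open Internet

# Low and high estimates for a client running on EC2 in the same region.
same_region = [total_wait_seconds(REQUEST_COUNT, ms) for ms in SAME_REGION_MS]

# Low and high estimates once the extra Internet latency is added on top.
over_internet = [
    total_wait_seconds(REQUEST_COUNT, base + extra)
    for base, extra in zip(SAME_REGION_MS, INTERNET_EXTRA_MS)
]

print("same region:   %.0f to %.0f seconds" % tuple(same_region))
print("open Internet: %.0f to %.0f seconds" % tuple(over_internet))
```

Under these assumptions, the same-region client spends roughly 20 to 70 seconds waiting on the network, while the same work across the open Internet spends roughly 9 to 35 minutes. Issuing requests concurrently narrows the gap, but the per-request penalty remains.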
Storing Large Objects with Amazon S3
Amazon Simple Storage Service (Amazon S3) is a web service that enables you to store an unlimited number of files and charges (low) fees only for the storage space and data transfer you actually use. As you might expect, data transfer between S3 and other Amazon Web Services in the same region is fast and free. S3 is easy to understand, easy to use, and has a multitude of great uses. You can keep the files you store in S3 private, but you can also make them publicly available from the web. Many websites use S3 as a media-hosting service to reduce the load on their web servers.
EC2 virtual machine images are stored in and loaded from S3. EC2 copies storage volumes to S3 and loads storage volumes from it. The Amazon CloudFront content delivery network can serve frequently accessed web files in S3. The Amazon Elastic MapReduce service runs MapReduce jobs stored in S3. Publicly visible files in S3 can be served up via the BitTorrent peer-to-peer protocol. The list of uses goes on and on. S3 is really a common-denominator cloud service.
SimpleDB users can also find good uses for S3. Because of the high speed within the Amazon cloud, S3 is an obvious place to store SimpleDB import and export data. It is also a solid location for SimpleDB backup files.
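As a rough illustration of the backup idea, the Python sketch below packs SimpleDB-style items (an item name plus multi-valued attributes) into a compressed file of one JSON record per line, which is a convenient shape for a single S3 object. The file format, the function names, and the `upload` callable are all assumptions made for this example. A real program would pass in a function that writes to S3 through whichever client library it uses.

```python
import gzip
import json


def export_items(items):
    """Serialize {item_name: {attribute: [values]}} to gzipped JSON lines."""
    lines = [
        json.dumps({"name": name, "attributes": attrs}, sort_keys=True)
        for name, attrs in sorted(items.items())
    ]
    return gzip.compress("\n".join(lines).encode("utf-8"))


def import_items(blob):
    """Reverse export_items, rebuilding the item dictionary."""
    items = {}
    for line in gzip.decompress(blob).decode("utf-8").splitlines():
        record = json.loads(line)
        items[record["name"]] = record["attributes"]
    return items


def backup_domain(items, upload, key):
    """Export the items and hand the bytes to upload(key, data).

    `upload` is a hypothetical callable standing in for an S3 write.
    """
    upload(key, export_items(items))


# Usage with a plain dictionary standing in for an S3 bucket.
sample_items = {
    "video-001": {"title": ["Launch Demo"], "tag": ["demo", "launch"]},
    "video-002": {"title": ["Quarterly Update"], "tag": ["internal"]},
}
fake_bucket = {}
backup_domain(sample_items, fake_bucket.__setitem__, "backups/videos.jsonl.gz")
restored = import_items(fake_bucket["backups/videos.jsonl.gz"])
```

Keeping one item per line means a restore job can stream the file and write items back to SimpleDB in batches rather than loading everything into memory at once.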
Queuing Up Tasks with Amazon SQS
Amazon Simple Queue Service (Amazon SQS) is a web service that reliably stores messages between distributed computers. Placing a robust queue between the computers allows them to work independently. It also opens the door to dynamically scaling the number of machines that push messages and the number that retrieve messages.
Although there is no direct connection between SQS and SimpleDB, SQS does have some complementary features that can be useful in SimpleDB-based applications. When there are multiple concurrent SimpleDB clients, the semantics of reliable messaging can make it easier to coordinate them than when using SimpleDB alone. For example, you might have multiple servers that are encoding video files and storing information about those videos in SimpleDB. SimpleDB makes a great place to store that data, but it would be cumbersome to use for telling each server which file to process next. The reliable message delivery of SQS would be much more appropriate for that task.
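The division of labor in the video example can be sketched locally in Python. Here a standard-library `queue.Queue` stands in for the SQS queue and a dictionary stands in for the SimpleDB domain. Neither is the real service, and the function and field names are invented for illustration. With real SQS, a worker would typically delete a message only after finishing its work so that an interrupted task can be retried, a detail this sketch leaves out.

```python
import queue
import threading


def run_workers(video_files, worker_count=3):
    """Hand each file to exactly one worker and record the results."""
    tasks = queue.Queue()   # local stand-in for an SQS queue
    metadata = {}           # local stand-in for a SimpleDB domain
    lock = threading.Lock()

    for name in video_files:
        tasks.put(name)

    def worker(worker_id):
        while True:
            try:
                name = tasks.get_nowait()   # claim the next file, if any
            except queue.Empty:
                return                      # nothing left to do
            # Placeholder for the actual encoding work.
            with lock:
                metadata[name] = {
                    "status": "encoded",
                    "encoded_by": "worker-%d" % worker_id,
                }

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return metadata


videos = ["intro.mov", "keynote.mov", "demo.mov", "outro.mov"]
results = run_workers(videos)
```

The queue decides which worker gets which file, so the workers never need to consult the metadata store to find their next task. The store only receives the finished results.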