Design Rules of Thumb
You can use a number of rules of thumb to design a system. Properly using these rules requires a firm grasp of your needs; that is, you should have completed a statement of your requirements using the information and tables in Chapter 1 and Chapter 2.
This section describes the following design rules of thumb:
Spread your I/O devices across as many PCI buses as possible.
Decide how many CPUs the system needs.
Decide how much memory the system needs.
A well-designed system should seldom page, and never swap.
The system should always have some idle time.
Whenever you add additional CPUs, you should also add memory.
I/O Devices
You should always determine your I/O design first because this, along with your application needs, determines your computing requirements. To get the best performance and reliability from your Sun Fire server, you should lay out the I/O carefully. An easy rule of thumb is:
Spread your I/O devices across as many PCI buses as possible.
Doing so distributes your I/O load across as many different controllers as possible, which improves performance. It also reduces the number of single points of failure that could cause your data to go offline. Unfortunately, this rule of thumb has many caveats. Unlike CPUs and memory, the layout of your I/O intimately affects the reliability of your machine and whether or not you can use features such as dynamic reconfiguration (DR). Chapter 4 discusses the issue of I/O design in detail, taking all of these factors into consideration.
CPUs
Regardless of what your tasks are (NFS service, CAD simulations, or compiling software builds), handling each request requires a certain baseline of time, as the receptionist example shows. Not only is the type of request important, but so is the quantity of requests. In fact, it is often harder for a system to handle 100 small requests than 10 large ones because of the inherent overhead of handling each request.
How many CPUs are enough? The rules of thumb you can use to help you determine how many CPUs you need are:
One-half CPU per network card
One-eighth CPU per I/O device (disk or tape)
Two CPUs per application for mostly I/O-based applications (NFS, web servers and so forth)
Four+ CPUs per application for mostly CPU-based applications (simulations, databases and so forth).
These figures assume a moderate load on your system. If you are expecting a high load on certain aspects, you should double the corresponding numbers. For example, if a system is going to have a lot of network traffic, you should have one CPU for every network card to handle the interrupts. Conversely, if you are designing a system you expect to have a very light load, cut the numbers in half or consider whether the tasks that system is going to be performing could be combined with another server to lower overhead.
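The load adjustment described above can be sketched as a small scaling function. This is only an illustration, not a sizing tool: the function name `cpus_per_unit` and the load labels `light`, `moderate`, and `heavy` are assumptions made for this example. Only the per-card and per-device figures and the double/halve adjustment come from the rules above.

```python
from fractions import Fraction

# Baseline CPUs per unit at moderate load, from the rules of thumb above.
BASELINE = {
    "network_card": Fraction(1, 2),
    "io_device": Fraction(1, 8),
}

# Hypothetical load labels: heavy doubles the baseline, light halves it.
LOAD_FACTOR = {
    "light": Fraction(1, 2),
    "moderate": Fraction(1),
    "heavy": Fraction(2),
}


def cpus_per_unit(kind, load="moderate"):
    """Return the CPU allowance for one unit of `kind` at the given load."""
    return BASELINE[kind] * LOAD_FACTOR[load]


if __name__ == "__main__":
    # A heavily loaded network card is allotted one full CPU.
    print(cpus_per_unit("network_card", "heavy"))
    print(cpus_per_unit("io_device", "light"))
```

Exact fractions are used instead of floating-point values so that figures such as one-eighth of a CPU add up without rounding error.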
To get an idea of how many CPUs you need, add up each of the criteria that affect you, then round up to the nearest multiple of two. We recommend that you buy only the four-CPU boards for your Sun Fire system. Purchasing a two-CPU board limits your future expansion room because it takes up the same amount of space as a four-CPU board. However, the two-CPU board has merits if you do not need expansion room, and the examples later in the book demonstrate a good use for it.
So, if you are designing an NFS server with a gigabit network card and six Sun StorEdge T3 arrays, you have the following CPU requirements (TABLE 3-1).
TABLE 3-1 CPU Requirements
Description | Number
Gigabit network card (1) | 1/2
Disk arrays (6) | 3/4
NFS server | 2
Total | 3 1/4
Rounding up, you should buy a four-CPU board to run this system.
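The arithmetic behind this example can be reproduced with a short sketch. It assumes a moderate load throughout, and the function name `estimate_cpus` and its parameter names are hypothetical, chosen only for this illustration.

```python
import math
from fractions import Fraction


def estimate_cpus(network_cards, io_devices, app_cpus):
    """Sum the moderate-load rules of thumb.

    Returns a tuple: (raw total, total rounded up to a multiple of two).
    """
    raw = (Fraction(1, 2) * network_cards   # one-half CPU per network card
           + Fraction(1, 8) * io_devices    # one-eighth CPU per I/O device
           + app_cpus)                      # 2 for I/O-based, 4+ for CPU-based
    rounded = 2 * math.ceil(raw / 2)
    return raw, rounded


if __name__ == "__main__":
    # NFS server: one gigabit network card, six disk arrays, I/O-based application.
    raw, rounded = estimate_cpus(network_cards=1, io_devices=6, app_cpus=2)
    print(raw, rounded)  # prints: 13/4 4
```

The raw total of 13/4 is the 3 1/4 shown in the table, and rounding up to a multiple of two gives the four-CPU board.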
NOTE
These rules work well for average systems. However, for high-intensity applications such as online transaction processing (OLTP), data mining, and so forth, you should research your needs more carefully. For details, see "Analyzing an Existing System" on page 8.
When buying CPUs, you should make sure you have enough memory to accommodate them, or else you run the risk of thrashing. This means that the system spends all its time moving things around in memory, and never does any real work. This is like the receptionist who spends time picking up phone lines and saying "Please hold," without actually fulfilling any requests. "Memory" discusses this in detail.
Finally, in terms of speed, getting the fastest processor you can buy is always an advantage. In addition to the speed of the processor, you should also consider the size of its cache. Generally this is decided for you by the processor model, but you should get as large a cache as possible. The cache determines how much data and how many instructions the processor can work with before it has to make a trip back out to system memory. Processor cache is far faster than system memory, so a large cache is always beneficial.
Memory
Memory is perhaps the single most important part of a computer system, and has the most direct impact on performance. The more memory you have, the more things you can do, and the faster you can do them, because less disk access is needed. Perceived performance usually correlates more strongly with memory than with processors. With relational databases, for instance, fitting as much of the database in memory as possible can yield a big improvement in performance.
If your system is running slowly, you should probably buy more memory, not more processors. It is more likely that your system is running out of memory, not processor cycles, and is having to use swap space to run your applications.
It is possible to waste money by overbuying memory as well, though, so here are some specific rules. On the Sun Fire system, memory is tied to a processor (see TABLE 1-5 in Chapter 1), so you cannot buy a board with just memory and no CPUs. This actually simplifies the design process considerably because there are only two decisions to make:
Whether to half-populate or fully-populate each CPU board
Whether to buy larger or smaller DIMMs
Fully-populating a CPU board allows you to put more memory on it. You also get better interleaving, which increases performance. Thus, the rules of thumb for memory are:
For I/O-based applications, half-populate the CPU/Memory board.
For CPU-based applications, fully-populate the CPU/Memory board.
Then, choose the appropriate DIMM size to provide enough memory for your application. Following these rules naturally leads to smaller memory sizes in NFS servers (where memory is used almost solely for the file buffer cache), and larger, faster memory configurations in database and compute servers. Most systems tend to be in one category or the other, but if there is a mix, fully-populate all boards.
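The population rules above amount to a simple decision, sketched here for illustration. The function name `board_population` and the workload labels `"io"` and `"cpu"` are assumptions made for this example; choosing the DIMM size remains a separate step that depends on how much memory your application needs.

```python
def board_population(workloads):
    """Choose how to populate the CPU/Memory boards.

    `workloads` is a collection of hypothetical labels, each "io" or "cpu".
    Returns "half" or "full".
    """
    kinds = set(workloads)
    if not kinds or not kinds <= {"io", "cpu"}:
        raise ValueError("workloads must contain only 'io' and/or 'cpu'")
    # Purely I/O-based systems half-populate the boards.
    # CPU-based systems, and systems with a mix of both, fully-populate them.
    return "half" if kinds == {"io"} else "full"


if __name__ == "__main__":
    print(board_population(["io"]))         # NFS server
    print(board_population(["cpu"]))        # compute server
    print(board_population(["io", "cpu"]))  # mixed workload
```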
Remember that, as discussed previously, paging is undesirable. So, another good rule of thumb is:
A well-designed system should seldom page, and never swap.
It is possible, in fact, to run a large memory system with very little (if any) swap space. This advice differs from what you may read elsewhere. One commonly repeated rule is "Your swap space should be double the size of your physical memory." Consider this for a moment. You can easily design a Sun Fire system that has 64 gigabytes of memory. If you were to follow this rule, you would need 128 gigabytes of swap space. While a few vendors may require you to have a large swap space, you should not rely on swap for real-world memory usage because it is too slow. When designing a system, make sure that you purchase enough memory so that your system does not swap. If it does, you need more memory.