- Logical and Physical Models
- Database Application Types
- Using Efficient Datatypes
- Partitioning Large Tables
- Bigger, Better, Faster
- Summary
Partitioning Large Tables
Unless you follow some type of data archiving procedure in your OLTP applications, sooner or later your database tables become very large. Large is a relative term; as it applies to SQL Server tables, large means tables with ten millions or more rows. Retrieving data from such tables will take longer than you'd want. Building proper indexes can help performance greatly; we'll discuss indexing strategies in a separate article. If nothing else works, partition the large table horizontally based on the record date, demographics, or some other characteristic. For instance, if you have a patient table containing many millions of records, you might want to split it into several tables based on the hospital where each patient was treated. If you have millions of patients per hospital, you can further split tables based on the year when the patients were accepted into each hospital. Of course, doing so involves rewriting some queries and some DBA work for maintaining additional tables, but the performance benefits you gain will far outweigh the maintenance costs.
In addition to splitting large tables into several smaller tables, SQL Server 2000 supports the Distributed Partitioned Views (DPV), which is a way to spread the query-processing load over several servers. Note though that the DPV only work with the Enterprise and Developer Editions of SQL Server. Feel free to check out SQL Server Books On-Line for additional information about DPV.
Some of your queries can benefit from vertical partitioning as well. If most of your queries affect 5 out of 100 columns of your table, consider splitting the table into two tables with a one-to-one relationship. Having 100 columns in a single table is poor design for most business applications, but some applications have a business need to store a couple of hundred columns. If you're using the Enterprise Edition of SQL Server 2000, we recommend building an indexed view on such tables. You can create a view that contains the most frequently used columns, and build a unique clustered index on this view. The beauty of an indexed view is that SQL Server can take advantage of the index built on a view even if the query doesn't explicitly reference the view.