- SQL Server Reference Guide
- Introduction
- SQL Server Reference Guide Overview
- Table of Contents
- Microsoft SQL Server Defined
- SQL Server Editions
- SQL Server Access
- Informit Articles and Sample Chapters
- Online Resources
- Microsoft SQL Server Features
- SQL Server Books Online
- Clustering Services
- Data Transformation Services (DTS) Overview
- Replication Services
- Database Mirroring
- Natural Language Processing (NLP)
- Analysis Services
- Microsot SQL Server Reporting Services
- XML Overview
- Notification Services for the DBA
- Full-Text Search
- SQL Server 2005 - Service Broker
- Using SQL Server as a Web Service
- SQL Server Encryption Options Overview
- SQL Server 2008 Overview
- SQL Server 2008 R2 Overview
- SQL Azure
- The Utility Control Point and Data Application Component, Part 1
- The Utility Control Point and Data Application Component, Part 2
- Microsoft SQL Server Administration
- The DBA Survival Guide: The 10 Minute SQL Server Overview
- Preparing (or Tuning) a Windows System for SQL Server, Part 1
- Preparing (or Tuning) a Windows System for SQL Server, Part 2
- Installing SQL Server
- Upgrading SQL Server
- SQL Server 2000 Management Tools
- SQL Server 2005 Management Tools
- SQL Server 2008 Management Tools
- SQL Azure Tools
- Automating Tasks with SQL Server Agent
- Run Operating System Commands in SQL Agent using PowerShell
- Automating Tasks Without SQL Server Agent
- Storage – SQL Server I/O
- Service Packs, Hotfixes and Cumulative Upgrades
- Tracking SQL Server Information with Error and Event Logs
- Change Management
- SQL Server Metadata, Part One
- SQL Server Meta-Data, Part Two
- Monitoring - SQL Server 2005 Dynamic Views and Functions
- Monitoring - Performance Monitor
- Unattended Performance Monitoring for SQL Server
- Monitoring - User-Defined Performance Counters
- Monitoring: SQL Server Activity Monitor
- SQL Server Instances
- DBCC Commands
- SQL Server and Mail
- Database Maintenance Checklist
- The Maintenance Wizard: SQL Server 2000 and Earlier
- The Maintenance Wizard: SQL Server 2005 (SP2) and Later
- The Web Assistant Wizard
- Creating Web Pages from SQL Server
- SQL Server Security
- Securing the SQL Server Platform, Part 1
- Securing the SQL Server Platform, Part 2
- SQL Server Security: Users and other Principals
- SQL Server Security – Roles
- SQL Server Security: Objects (Securables)
- Security: Using the Command Line
- SQL Server Security - Encrypting Connections
- SQL Server Security: Encrypting Data
- SQL Server Security Audit
- High Availability - SQL Server Clustering
- SQL Server Configuration, Part 1
- SQL Server Configuration, Part 2
- Database Configuration Options
- 32- vs 64-bit Computing for SQL Server
- SQL Server and Memory
- Performance Tuning: Introduction to Indexes
- Statistical Indexes
- Backup and Recovery
- Backup and Recovery Examples, Part One
- Backup and Recovery Examples, Part Two: Transferring Databases to Another System (Even Without Backups)
- SQL Profiler - Reverse Engineering An Application
- SQL Trace
- SQL Server Alerts
- Files and Filegroups
- Partitioning
- Full-Text Indexes
- Read-Only Data
- SQL Server Locks
- Monitoring Locking and Deadlocking
- Controlling Locks in SQL Server
- SQL Server Policy-Based Management, Part One
- SQL Server Policy-Based Management, Part Two
- SQL Server Policy-Based Management, Part Three
- Microsoft SQL Server Programming
- An Outline for Development
- Database
- Database Services
- Database Objects: Databases
- Database Objects: Tables
- Database Objects: Table Relationships
- Database Objects: Keys
- Database Objects: Constraints
- Database Objects: Data Types
- Database Objects: Views
- Database Objects: Stored Procedures
- Database Objects: Indexes
- Database Objects: User Defined Functions
- Database Objects: Triggers
- Database Design: Requirements, Entities, and Attributes
- Business Process Model Notation (BPMN) and the Data Professional
- Business Questions for Database Design, Part One
- Business Questions for Database Design, Part Two
- Database Design: Finalizing Requirements and Defining Relationships
- Database Design: Creating an Entity Relationship Diagram
- Database Design: The Logical ERD
- Database Design: Adjusting The Model
- Database Design: Normalizing the Model
- Creating The Physical Model
- Database Design: Changing Attributes to Columns
- Database Design: Creating The Physical Database
- Database Design Example: Curriculum Vitae
- NULLs
- The SQL Server Sample Databases
- The SQL Server Sample Databases: pubs
- The SQL Server Sample Databases: NorthWind
- The SQL Server Sample Databases: AdventureWorks
- The SQL Server Sample Databases: Adventureworks Derivatives
- UniversalDB: The Demo and Testing Database, Part 1
- UniversalDB: The Demo and Testing Database, Part 2
- UniversalDB: The Demo and Testing Database, Part 3
- UniversalDB: The Demo and Testing Database, Part 4
- Getting Started with Transact-SQL
- Transact-SQL: Data Definition Language (DDL) Basics
- Transact-SQL: Limiting Results
- Transact-SQL: More Operators
- Transact-SQL: Ordering and Aggregating Data
- Transact-SQL: Subqueries
- Transact-SQL: Joins
- Transact-SQL: Complex Joins - Building a View with Multiple JOINs
- Transact-SQL: Inserts, Updates, and Deletes
- An Introduction to the CLR in SQL Server 2005
- Design Elements Part 1: Programming Flow Overview, Code Format and Commenting your Code
- Design Elements Part 2: Controlling SQL's Scope
- Design Elements Part 3: Error Handling
- Design Elements Part 4: Variables
- Design Elements Part 5: Where Does The Code Live?
- Design Elements Part 6: Math Operators and Functions
- Design Elements Part 7: Statistical Functions
- Design Elements Part 8: Summarization Statistical Algorithms
- Design Elements Part 9:Representing Data with Statistical Algorithms
- Design Elements Part 10: Interpreting the Data—Regression
- Design Elements Part 11: String Manipulation
- Design Elements Part 12: Loops
- Design Elements Part 13: Recursion
- Design Elements Part 14: Arrays
- Design Elements Part 15: Event-Driven Programming Vs. Scheduled Processes
- Design Elements Part 16: Event-Driven Programming
- Design Elements Part 17: Program Flow
- Forming Queries Part 1: Design
- Forming Queries Part 2: Query Basics
- Forming Queries Part 3: Query Optimization
- Forming Queries Part 4: SET Options
- Forming Queries Part 5: Table Optimization Hints
- Using SQL Server Templates
- Transact-SQL Unit Testing
- Index Tuning Wizard
- Unicode and SQL Server
- SQL Server Development Tools
- The SQL Server Transact-SQL Debugger
- The Transact-SQL Debugger, Part 2
- Basic Troubleshooting for Transact-SQL Code
- An Introduction to Spatial Data in SQL Server 2008
- Performance Tuning
- Performance Tuning SQL Server: Tools and Processes
- Performance Tuning SQL Server: Tools Overview
- Creating a Performance Tuning Audit - Defining Components
- Creating a Performance Tuning Audit - Evaluation Part One
- Creating a Performance Tuning Audit - Evaluation Part Two
- Creating a Performance Tuning Audit - Interpretation
- Creating a Performance Tuning Audit - Developing an Action Plan
- Understanding SQL Server Query Plans
- Performance Tuning: Implementing Indexes
- Performance Monitoring Tools: Windows 2008 (and Higher) Server Utilities, Part 1
- Performance Monitoring Tools: Windows 2008 (and Higher) Server Utilities, Part 2
- Performance Monitoring Tools: Windows System Monitor
- Performance Monitoring Tools: Logging with System Monitor
- Performance Monitoring Tools: User Defined Counters
- General Transact-SQL (T-SQL) Performance Tuning, Part 1
- General Transact-SQL (T-SQL) Performance Tuning, Part 2
- General Transact-SQL (T-SQL) Performance Tuning, Part 3
- Performance Monitoring Tools: An Introduction to SQL Profiler
- Performance Tuning: Introduction to Indexes
- Performance Monitoring Tools: SQL Server 2000 Index Tuning Wizard
- Performance Monitoring Tools: SQL Server 2005 Database Tuning Advisor
- Performance Monitoring Tools: SQL Server Management Studio Reports
- Performance Monitoring Tools: SQL Server 2008 Activity Monitor
- The SQL Server 2008 Management Data Warehouse and Data Collector
- Performance Monitoring Tools: Evaluating Wait States with PowerShell and Excel
- Practical Applications
- Choosing the Back End
- The DBA's Toolbox, Part 1
- The DBA's Toolbox, Part 2
- Scripting Solutions for SQL Server
- Building a SQL Server Lab
- Using Graphics Files with SQL Server
- Enterprise Resource Planning
- Customer Relationship Management (CRM)
- Building a Reporting Data Server
- Building a Database Documenter, Part 1
- Building a Database Documenter, Part 2
- Data Management Objects
- Data Management Objects: The Server Object
- Data Management Objects: Server Object Methods
- Data Management Objects: Collections and the Database Object
- Data Management Objects: Database Information
- Data Management Objects: Database Control
- Data Management Objects: Database Maintenance
- Data Management Objects: Logging the Process
- Data Management Objects: Running SQL Statements
- Data Management Objects: Multiple Row Returns
- Data Management Objects: Other Database Objects
- Data Management Objects: Security
- Data Management Objects: Scripting
- Powershell and SQL Server - Overview
- PowerShell and SQL Server - Objects and Providers
- Powershell and SQL Server - A Script Framework
- Powershell and SQL Server - Logging the Process
- Powershell and SQL Server - Reading a Control File
- Powershell and SQL Server - SQL Server Access
- Powershell and SQL Server - Web Pages from a SQL Query
- Powershell and SQL Server - Scrubbing the Event Logs
- SQL Server 2008 PowerShell Provider
- SQL Server I/O: Importing and Exporting Data
- SQL Server I/O: XML in Database Terms
- SQL Server I/O: Creating XML Output
- SQL Server I/O: Reading XML Documents
- SQL Server I/O: Using XML Control Mechanisms
- SQL Server I/O: Creating Hierarchies
- SQL Server I/O: Using HTTP with SQL Server XML
- SQL Server I/O: Using HTTP with SQL Server XML Templates
- SQL Server I/O: Remote Queries
- SQL Server I/O: Working with Text Files
- Using Microsoft SQL Server on Handheld Devices
- Front-Ends 101: Microsoft Access
- Comparing Two SQL Server Databases
- English Query - Part 1
- English Query - Part 2
- English Query - Part 3
- English Query - Part 4
- English Query - Part 5
- RSS Feeds from SQL Server
- Using SQL Server Agent to Monitor Backups
- Reporting Services - Creating a Maintenance Report
- SQL Server Chargeback Strategies, Part 1
- SQL Server Chargeback Strategies, Part 2
- SQL Server Replication Example
- Creating a Master Agent and Alert Server
- The SQL Server Central Management System: Definition
- The SQL Server Central Management System: Base Tables
- The SQL Server Central Management System: Execution of Server Information (Part 1)
- The SQL Server Central Management System: Execution of Server Information (Part 2)
- The SQL Server Central Management System: Collecting Performance Metrics
- The SQL Server Central Management System: Centralizing Agent Jobs, Events and Scripts
- The SQL Server Central Management System: Reporting the Data and Project Summary
- Time Tracking for SQL Server Operations
- Migrating Departmental Data Stores to SQL Server
- Migrating Departmental Data Stores to SQL Server: Model the System
- Migrating Departmental Data Stores to SQL Server: Model the System, Continued
- Migrating Departmental Data Stores to SQL Server: Decide on the Destination
- Migrating Departmental Data Stores to SQL Server: Design the ETL
- Migrating Departmental Data Stores to SQL Server: Design the ETL, Continued
- Migrating Departmental Data Stores to SQL Server: Attach the Front End, Test, and Monitor
- Tracking SQL Server Timed Events, Part 1
- Tracking SQL Server Timed Events, Part 2
- Patterns and Practices for the Data Professional
- Managing Vendor Databases
- Consolidation Options
- Connecting to a SQL Azure Database from Microsoft Access
- SharePoint 2007 and SQL Server, Part One
- SharePoint 2007 and SQL Server, Part Two
- SharePoint 2007 and SQL Server, Part Three
- Querying Multiple Data Sources from a Single Location (Distributed Queries)
- Importing and Exporting Data for SQL Azure
- Working on Distributed Teams
- Professional Development
- Becoming a DBA
- Certification
- DBA Levels
- Becoming a Data Professional
- SQL Server Professional Development Plan, Part 1
- SQL Server Professional Development Plan, Part 2
- SQL Server Professional Development Plan, Part 3
- Evaluating Technical Options
- System Sizing
- Creating a Disaster Recovery Plan
- Anatomy of a Disaster (Response Plan)
- Database Troubleshooting
- Conducting an Effective Code Review
- Developing an Exit Strategy
- Data Retention Strategy
- Keeping Your DBA/Developer Job in Troubled Times
- The SQL Server Runbook
- Creating and Maintaining a SQL Server Configuration History, Part 1
- Creating and Maintaining a SQL Server Configuration History, Part 2
- Creating an Application Profile, Part 1
- Creating an Application Profile, Part 2
- How to Attend a Technical Conference
- Tips for Maximizing Your IT Budget This Year
- The Importance of Blue-Sky Planning
- Application Architecture Assessments
- Transact-SQL Code Reviews, Part One
- Transact-SQL Code Reviews, Part Two
- Cloud Computing (Distributed Computing) Paradigms
- NoSQL for the SQL Server Professional, Part One
- NoSQL for the SQL Server Professional, Part Two
- Object-Role Modeling (ORM) for the Database Professional
- Business Intelligence
- BI Explained
- Developing a Data Dictionary
- BI Security
- Gathering BI Requirements
- Source System Extracts and Transforms
- ETL Mechanisms
- Business Intelligence Landscapes
- Business Intelligence Layouts and the Build or Buy Decision
- A Single Version of the Truth
- The Operational Data Store (ODS)
- Data Marts – Combining and Transforming Data
- Designing Data Elements
- The Enterprise Data Warehouse — Aggregations and the Star Schema
- On-Line Analytical Processing (OLAP)
- Data Mining
- Key Performance Indicators
- BI Presentation - Client Tools
- BI Presentation - Portals
- Implementing ETL - Introduction to SQL Server 2005 Integration Services
- Building a Business Intelligence Solution, Part 1
- Building a Business Intelligence Solution, Part 2
- Building a Business Intelligence Solution, Part 3
- Tips and Troubleshooting
- SQL Server and Microsoft Excel Integration
- Tips for the SQL Server Tools: SQL Server 2000
- Tips for the SQL Server Tools – SQL Server 2005
- Transaction Log Troubles
- SQL Server Connection Problems
- Orphaned Database Users
- Additional Resources
- Tools and Downloads
- Utilities (Free)
- Tool Review (Free): DBDesignerFork
- Aqua Data Studio
- Microsoft SQL Server Best Practices Analyzer
- Utilities (Cost)
- Quest Software's TOAD for SQL Server
- Quest Software's Spotlight on SQL Server
- SQL Server on Microsoft's Virtual PC
- Red Gate SQL Bundle
- Microsoft's Visio for Database Folks
- Quest Capacity Manager
- SQL Server Help
- Visual Studio Team Edition for Database Professionals
- Microsoft Assessment and Planning Solution Accelerator
- Aggregating Server Data from the MAPS Tool
As I mentioned in my last overview on Stored Procedures, the two main categories of database objects are those that store data and those that access or provide access to data. Tables are the only objects that actually store data; all other objects in the database provide access to data.
It seems obvious that objects such as views and stored procedures provide access to data, but in this article I’ll describe another object that provides a round-about way of data access: indexes. In this overview, I’ll explain index theory, index structures, and how indexes are implemented. To put these theories into practice, we have other articles here at InformIT regarding maintain indexes. In other words, this article is more theory; the "how to" portions are in the sections I'll reference at the bottom of this overview.
Indexes Defined
So just what is an index? An index is a structure in the database that SQL Server can use to find and access data quickly.
Think about a fairly thick book. If I asked you to find a particular topic or even a word in that book, you’d turn to the index. The index in the book is very similar to what a SQL Server Index.
In the case of the book’s index, you can find the page numbers where a word or phrase is located. You simply find the word (which is listed in alphabetical order) and the page or pages where that word or phrase appears is listed just to the right. In fact, most indexes are even broken out into alphabetical groupings in a bold heading, which is a visual cue for you to locate that section quickly. In addition to alphabetical order, sometimes books have additional indexes based on content type or another grouping.
If you didn’t have an index or table of contents in a book when you were asked to locate a word, you’d have to read each and every page to find it. And even when you found it, you’d have to keep reading to make sure that it didn’t occur again. You would have to do this each and every time I asked you to find a word or phrase, since you probably can't memorize every location of something as you look it up. Each time you're asked where a word is, you have to scan the entire book.
In the case of SQL Server, you have the same issue. You have a lot of data, and it may surprise you to learn that by its very definition, the data in a relational database isn't stored in any particular order. It's just scattered all over the data pages (which are 8K blocks on the hard drive) with no particular order other than what was quickest to insert at the time. That makes the job of finding the data very difficult.
In fact, without some sort of help, the server’s Query Optimization engine has to do a "scan" operation, which is the same as what you and I would do without an index in a book — it would have to read every page of data to locate the field you're asked for. With tables under a thousand rows or so, this isn't really a problem, and in fact is the default behavior in SQL Server. But with larger tables, those with thousands or millions of rows, this takes a lot of time. It also has some other bad side-effects. Because all of those rows don't fit into memory, the server has to read from the I/O system, which is among the slowest components on your server. While data are being read from the I/O system, the database can't write to the tables while it is reading from them. That slows down not only reads but writes.
It gets worse. In a database the data is changing all the time. Imagine that you’re asked to find a word in a book, but someone keeps writing new words in the book, and changing where the old ones were.
What you can do to mitigate this scanning problem is to create an Index. The index in SQL Server works the same as the index in a book. It stores the words you're interested in along with a number (which is really a pointer to a block on the I/O system) where the word or phrase is found. Now all SQL Server has to do when you ask for data is to look up the word or phrase in the index, which is much smaller, and find where it lives on the hard drive.
The physical arrangement of indexes in SQL Server is just a table-structure in the database, and on the hard drive. When you create an index, a new physical structure is created and new rows of data are put into that table. You can tell SQL Server when you create the index which fields in the database table you want to track, just like you would mark words in a book that need to be tracked for its index. You don't put all data into an index, since it that would create the same problem you're trying to solve. You only index the fields that users are likely to look up, or fields that are useful for your programs to use. I'll explain more about this later.
The physical arrangement of indexes fall into two types: Clustered and Non-Clustered. There are important differences in these two types, and which one you choose should be carefully considered during the design phase.
A clustered index actually changes the way the data is stored in the table, and the index. In fact, if you create a clustered index on a table, the table itself simply becomes the index.
Let’s take a look at a very simple table:
A |
D |
C |
B |
E |
One tenet of a relational database is that the data isn’t stored in any particular order, and SQL Server holds to that tenet as you can see here. Tables where the data is in no particular order and with no indexes are called heap tables. Once you apply a clustered index to the columns, however, the table changes physically on the hard drive to this:
A |
B |
C |
D |
E |
As you can see, the data can only be physically arranged one way. You can’t have the same data arranged physically as A-F and also F-A. Since the data is arranged this way, the server doesn't need to store another table for the index pointing to the data — it just calls the table the index, and the index the table.
The advantage here is that data access is extremely fast. It's like rearranging a book to the point where every word is in alphabetical order. That wouldn't help the storyline much, but you would be able to find anything quickly. Come to think of it, there is a book like that — the dictionary.
Of course, this speed comes at a price. Since the table is physically ordered by the index fields, each time you insert a new row of data the I/O subsystem has to rearrange all the data in the table. So if there are a lot of data changes, the system can slow down to a crawl. If the data doesn't change that often or is arranged in large groups, a clustered index might be the best solution.
The other type of index in SQL Server is called Non-Clustered. This type doesn’t physically re-arrange the data; it just makes another table-like structure which points to where the data is physically located.
So if we continue the book analogy, a clustered index is similar to the table of contents, since the data is arranged that way. A non-clustered index would be similar to a book’s index in the back.
Let's examine our table again, but this time I'll add some more data and show a hidden column: the block address of each row:
AEF1 |
A |
BFE2 |
D |
AEF1 |
C |
BFE2 |
B |
BFE2 |
E |
BFE2 |
A |
You'll notice that there are two "A" values, stored at different locations. At this point if you asked for all of the "A" values the engine would have to read the entire table structure, stored on all these block fragments. If there were thousands of blocks and only two values of "A", the engine would still read them all just to make sure. That would take a lot of time, and create lots of locks as it did so.
Now we'll create an index on the table, which makes another internal table structure that looks like this:
AEF1, BFE2 |
A |
BFE2 |
B |
AEF1 |
C |
BFE2 |
D |
BFE2 |
E |
Now when you look up all the "A" values the engine sees that there is an index, uses that index to find the data, and quickly calls up just those two blocks where the data lives. That's a faster access process, and takes far fewer locks on the rows or tables.
Types of Indexes
There are many types of indexes in SQL Server. The one that you'll work with most often is the user-defined index. This is the Index object type in the database, and they are stored in one of the physical arrangements I just explained.
But there are other types of indexes as well. You see, even if you don't specify an index on a table, the SQL Server Query Optimizer wants to be able to find data quickly. So it will "watch" what you are querying on and make temporary indexes of its own, called "statistical indexes" or more commonly "statistics". You can tell SQL Server to create and maintain these statistics automatically, or you can control that manually. You can make those changes on the database settings in the Properties tab for the database. I recommend that you allow the server to create and maintain statistics, unless the database is quite large and you'll run this process manually. I'll cover this topic more in the other articles referenced at the end of this overview.
Another type of index is the Full-Text index. This structure is used for Full-Text fields and isn't stored in SQL Server at all, and I've covered that information in more depth here.
Finally, in SQL Server 2005, you can even index views, computed columns, and create a special XML index. I'll cover those in other overviews as well.
Using Indexes Effectively
The basic premise in any of these situations is that you need to create an index on the columns that the users access the most for searching. For instance, in an address table, the last name, city, and state are normally prime candidates for an index. In that case, you might make an index for each of those three fields. You might also make a single index that has all three of these fields. I’ll describe that further in a moment. First, take a look at this table:
1 |
1 |
1 |
1 |
2 |
2 |
2 |
2 |
3 |
Now look at this table:
1 |
2 |
3 |
4 |
5 |
See the difference? If you have an index on first field of the second table, you'll end up with the same values in the index as in the table itself. In the first filed of the first table, you have a lot of the same values, and they will potentially end up all over the drive. If users need to look up that data frequently, that's the column you should index. We'll cover these decisions more in the other tutorials I've mentioned.
You’ll normally want to create indexes on the fields in a query’s predicate (the WHERE clause). You may include more than one column in an index. Including more than one column in an index is called a composite index. If your users are including the AND predicate, then you may need to include all those columns in a single index. Don’t go crazy with that, as anything above five or so columns will usually cost you.
To follow on with that concept, sometimes you’ll create a compound index that has all of the fields a user might select at one time. Creating an index on all those fields is called a covering index. What this means is that all of the data you're query needs is included in the index table itself, so it never has to use the pointer at all – it's kind of like having the advantages of a clustered index without as much of the expense. In some cases that can really speed up the query.
If your database is used in an application that performs many inserts, edits, and deletes, then speedy access isn’t always the driving factor. Databases that are used in transaction-oriented solutions, such as a sales database, fit this model. In a database where sales are recorded, there aren’t a lot of look-up activities, other than perhaps checking inventory. These types of database systems are called On-Line Transaction Processing systems, or OLTP. The decisions regarding the use of indexes is important in OLTP databases, because adding an index slows down INSERT operations, as the index is kept up to date when the data changes. So indexes on this type of database are less useful, and sometimes can impact performance. That isn't to say you shouldn't have any indexes, because even insert operations are sometimes based on lookups.
The second type of database is more reporting or look-up oriented. Databases that match this description are those used at a public library, or those used to provide reporting information. When those systems provide multidimensional data lookups they are called On-Line Analytical Processing systems, or OLAP. It’s easier to design indexes for these databases, since the data doesn’t change as often and you’re usually familiar with the columns the user is searching on.
The next type is the hybrid transaction/reporting database. This type of database is used for both entry and reporting. This is the most difficult situation for designing indexes, because you need them to speed up access for reporting but you don’t need them because they will slow down access for data entry. Unfortunately, this is also the type you’ll encounter most often.
Creating indexes is more difficult to do with OLAP databases than with OLTP databases. As a matter of fact, it’s usually best to minimize indexes for OLTP. Of course, that’s assuming your database is used purely for transactions and not also for reporting. Unfortunately, that’s not normally the case.
The next question is how to determine which kind of physical index to use — clustered or non-clustered. There are some other good indicators that the index you’re making should be clustered. Since the table is physically re-arranged if it’s clustered, then it’s best if the data doesn’t change a lot. If the table with a clustered index on the first field looks like this:
1 |
3 |
4 |
6 |
7 |
Then to insert the values "2" and "5" into the table (which you'll remember is the index as well) everything has to move around to accommodate the insertion. It’s better if the data that will be inserted comes after the data that’s already there — at least on the columns where a clustered index is defined. That way no movement happens at all.
So now you can see that perhaps a great candidate for the one (and only) clustered index on a table is the primary key — the fields you have set as the ones that uniquely identify the rows in the table. And you also can see that picking a "meaningful" value like an identification number for a person is such a bad idea for a Primary Key. If the number changes or the person leaves, the whole table would need to be rearranged physically on the drive. If you pick a meaningless, incrementing number for the Primary Key it becomes a natural target for a clustered index, since so many lookups are based on the Primary Key.
Another consideration for making the index clustered is the length of the data. If the data length is short, and even better if it’s made up of integers, the insert operations are quick and the system can move the data around easily. It’s also a good idea if the data is unique. When the values are unique, there is less chance that you'll have to move the data to insert it.
You should also consider a clustered index if the data is within a range that the user will search on. Queries such as:
SELECT au_lname FROM authors WHERE au_lname like ’[A-L]%’
Work well with a clustered index on the au_lname column.
If the order of the data is similar to the order in which the user wants to get results, it’s also a great candidate for a clustered index.
If a column isn’t a candidate for a clustered index after all these considerations, then of course you’re looking at creating a non-clustered index. Luckily, this is the default option.
Check the references section that follows for more strategies on how you can create and maintain good indexes. What I've shown here are more guidelines than hard and fast rules. You mileage will vary.
Informit Articles and Sample Chapters
An overview of the mechanics of implementing effective indexes is here.
To learn more about index operations, see this tutorial.
SQL Server: Optimizing Database Performance Through Indexes, by Kevin Kline, Baya Pavliashvili.
Online Resources
SQL Server Performance asks: Will the SQL Server Query Optimizer will use your carefully crafted indexes?
Create and Modify SQL Server Indexes Properly — Learn how to determine the state of your indexes and understand whether that state should be changed. See how evaluating the construction and location of your system’s indexes and deciding whether to adjust their fill factors and padding can improve the system’s performance, by Drew Georgopulos.
Here are some tips on optimizing your SQL Server indexes.
Microsoft explains the architecture of indexes here.