The Cost of GUIDs as Primary Keys
In this article, I discuss the pros and cons of using globally unique identifiers (GUIDs) as the data type for primary keys in SQL Server 2000. You will see concrete test results that give an indication of the performance characteristics. Along the way, I also discuss a special type of GUID that I "invented," called COMBs. Before we start discussing GUIDs, though, I'd like to say a few words about natural and surrogate keys.
Natural or Surrogate Keys
When you do the physical design of a relational database, it's very important to decide upon which style to use for the primary keys. Some people prefer to use natural keys, that is, keys that are found in the domain that the database models. Others prefer to use surrogate keys, which are constructed keys with no other purpose than to be just keys (and which are not found in the domain). An example of a natural key is a Social Security number. A value incrementing by 1 for each row is a typical example of a surrogate key.
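To make the distinction concrete, here is a minimal sketch of the two styles. It is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for SQL Server 2000, and the table names, column names, and sample values are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server 2000.
conn = sqlite3.connect(":memory:")

# Natural key: the primary key is a value taken from the domain itself.
conn.execute("""
    CREATE TABLE person_natural (
        ssn  TEXT PRIMARY KEY,      -- hypothetical column: Social Security number
        name TEXT NOT NULL
    )
""")

# Surrogate key: the primary key is a generated value with no domain meaning.
conn.execute("""
    CREATE TABLE person_surrogate (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- increments by 1 for each row
        ssn  TEXT NOT NULL UNIQUE,               -- domain value kept as ordinary data
        name TEXT NOT NULL
    )
""")

# Placeholder values, not real Social Security numbers.
people = [("000-00-0001", "Ann"), ("000-00-0002", "Bob")]
for ssn, name in people:
    conn.execute("INSERT INTO person_natural (ssn, name) VALUES (?, ?)", (ssn, name))
    conn.execute("INSERT INTO person_surrogate (ssn, name) VALUES (?, ?)", (ssn, name))

natural_keys = [row[0] for row in conn.execute("SELECT ssn FROM person_natural ORDER BY ssn")]
surrogate_ids = [row[0] for row in conn.execute("SELECT id FROM person_surrogate ORDER BY id")]
```

In the first table the key is the domain value itself. In the second, the key is an arbitrary counter, and the Social Security number needs its own UNIQUE constraint if it must remain unique.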
Using natural keys is the traditional approach, in line with Codd's original relational model. When you use them, you have only natural data that means something to users. This is good if users will run ad hoc queries directly against the database in raw SQL. You can also often reduce the number of joins when using natural keys because you don't have to go to a lookup table to convert an ID to a description. One more advantage is that you get the minimum number of constraints because you don't have to protect the uniqueness of the natural keys separately. You already did this when you used them as primary keys.
Surrogate keys can be seen as a newer approach. This approach does not conflict with the relational model, but, in a way, it is a step closer to a more object-based approach in which each object has an ID and the structure of all IDs is of the same type. When you use surrogate keys, you often get smaller foreign keys, which reduces the size of the database. There is no risk of users changing the values of the primary keys, and the programming can be more consistent because all keys are of the same format.
NOTE
With cascading updates/deletes in SQL Server 2000, the problem of users changing the values of primary keys is not so great anymore because you don't have to program the UPDATE of dependent rows manually.
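The effect of cascading referential actions can be sketched as follows. This is an illustration under stated assumptions, not SQL Server code: it again uses Python's sqlite3 module as a stand-in, and the customer and orders tables are hypothetical. The ON UPDATE CASCADE and ON DELETE CASCADE clauses express the same idea as the SQL Server 2000 feature.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server 2000.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical parent table with a natural (user-changeable) primary key.
conn.execute("CREATE TABLE customer (code TEXT PRIMARY KEY, name TEXT NOT NULL)")

# Hypothetical dependent table; the foreign key cascades updates and deletes.
conn.execute("""
    CREATE TABLE orders (
        order_no      INTEGER PRIMARY KEY,
        customer_code TEXT NOT NULL
            REFERENCES customer (code) ON UPDATE CASCADE ON DELETE CASCADE
    )
""")

conn.execute("INSERT INTO customer (code, name) VALUES ('ACME', 'Acme Ltd')")
conn.execute("INSERT INTO orders (order_no, customer_code) VALUES (1, 'ACME')")

# Changing the primary key value updates the dependent row automatically.
conn.execute("UPDATE customer SET code = 'ACME2' WHERE code = 'ACME'")
codes_after_update = [row[0] for row in conn.execute("SELECT customer_code FROM orders")]

# Deleting the parent row removes the dependent row as well.
conn.execute("DELETE FROM customer WHERE code = 'ACME2'")
orders_left = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

No UPDATE is ever issued against the orders table here; the database propagates the key change on its own.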
That was a brief description of the different kinds of keys. Now let's assume that we choose to use surrogate keys when we design a new database.