Normalization
Back in the late 1980s, it occurred to me that the relational model had been in existence for almost 20 years and that people had been designing databases using the same basic methodology for about 12 years. (And I’m still surprised we’re using it some 20+ years later.) I was using the traditional design methodology at that time, but I occasionally found it difficult to employ. The two things that bothered me the most about it were the normalization process (as a whole) and the seemingly endless iterations it took to arrive at a proper design. Of course, these seemed to be sore points with most of the other database developers whom I knew, so I certainly wasn’t alone in my frustrations. I thought about these problems for quite some time, and then I came up with a solution.
I already knew that the purpose of normalization is to take an improperly or poorly designed table and transform it into a table with a sound structure. I also understood the process: Take a given table and test it against the normal forms to determine whether it is properly designed. If it isn’t designed properly, make the appropriate modifications, retest it, and repeat the entire process until the table structure is sound. Figure 2.2 shows how I visualized the process at this point.
I kept these facts in mind and then posed the following questions:
If we assume that a thoroughly normalized table is properly and efficiently designed, couldn’t we identify the specific characteristics of such a table and state these to be the attributes of an ideal table structure?
Couldn’t we then use that ideal table as a model for all tables we create for the database throughout the design process?
The answer to both questions, of course, is yes, so I began in earnest to develop the basis for my “new” design methodology. I first compiled distinct sets of guidelines for creating sound structures by identifying the final characteristics of a well-defined database that successfully passed the tests of each normal form. I then conducted a few tests, using the new guidelines to create table structures for a new database and to correct flaws in the table structures of an existing database. These tests went very well, so I decided to apply this technique to the entire traditional design methodology. I formulated guidelines to address other issues associated with the traditional design method, such as domains, subtypes, relationships, data integrity, and referential integrity. After I completed the new guidelines, I performed more tests and found that my methodology worked quite well.
The main advantage of my design methodology is that it removes many aspects of the traditional design methodology that new database developers find intimidating. For example, normalization, in the traditional sense, is now transparent to the designer because it is incorporated (via the new guidelines) throughout the design process. Another major advantage is that the methodology is clear and easy to implement. I believe much of this is due to the fact that I’ve written all the guidelines in plain English, making them easy for most anyone to understand.
It’s important for you to understand that this design methodology will yield a fully normalized database structure only if you follow it as faithfully as you would any other design methodology. You cannot shortcut, circumvent, de-emphasize, or omit any part of this methodology (or any design methodology, for that matter) and expect to develop a sound structure. You must go through the process diligently, methodically, and completely in order to reap the expected rewards.
You’ll have to learn a few basic terms before you delve into the design process, and we’ll cover them in the next chapter.