Normalization Applied: Review the TEB Database and Refine the Design
With a good understanding of normalization, it is time to apply the concepts to the current project, the Time Entry and Billing Database. To review, the following tables have already been identified:
- Case
- Client
- ClientCase
- Employee
- EmployeeCase
- Invoice
- TimeEntryDetail
Figure 4.5 shows the relationships between the entities.
Figure 4.5 Version 4 of the Time Entry and Billing Database model represents the completed first draft of the database model.
Identifying New Columns and Tables
As you might guess, the need will arise for new tables. Before getting to that step, the existing tables must be outfitted with columns. It is also important to note that not every time-entry issue has been contemplated. The sample you are working through contains the basic information required to illustrate and teach the database concepts covered in this book. Let's now build up the data model started in Chapter 3, "An Introduction to Database Design."
EmployeeCase and ClientCase Tables
If you recall, the EmployeeCase and ClientCase tables are many-to-many resolver tables. A case can be assigned more than one employee. In addition, an employee can be assigned to multiple cases. The same type of relationship exists between clients and cases. Because of the nature of these tables, no additional columns are required.
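A resolver table can be sketched in a few lines. The following is a minimal illustration using Python's built-in sqlite3 module, not the book's own DDL; the column types, the composite primary key, and the sample rows are assumptions made for the example. Note that Case is a reserved word in SQL, so the table name is quoted here.

```python
import sqlite3

# Illustrative sketch only: types, keys, and sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY, LastName TEXT);
CREATE TABLE "Case" (CaseID INTEGER PRIMARY KEY, Title TEXT);
-- The resolver table holds nothing but the two foreign keys.
CREATE TABLE EmployeeCase (
    EmployeeID INTEGER NOT NULL REFERENCES Employee(EmployeeID),
    CaseID     INTEGER NOT NULL REFERENCES "Case"(CaseID),
    PRIMARY KEY (EmployeeID, CaseID)
);
""")
conn.executemany("INSERT INTO Employee VALUES (?, ?)",
                 [(1, "Adams"), (2, "Baker")])
conn.executemany('INSERT INTO "Case" VALUES (?, ?)',
                 [(10, "Smith v. Jones"), (11, "Acme patent")])
conn.executemany("INSERT INTO EmployeeCase VALUES (?, ?)",
                 [(1, 10), (2, 10), (1, 11)])


def employees_on_case(case_id):
    """Last names of every employee assigned to one case."""
    rows = conn.execute(
        "SELECT e.LastName FROM Employee e "
        "JOIN EmployeeCase ec ON ec.EmployeeID = e.EmployeeID "
        "WHERE ec.CaseID = ? ORDER BY e.LastName", (case_id,)).fetchall()
    return [r[0] for r in rows]


def cases_for_employee(employee_id):
    """Titles of every case assigned to one employee."""
    rows = conn.execute(
        'SELECT c.Title FROM "Case" c '
        "JOIN EmployeeCase ec ON ec.CaseID = c.CaseID "
        "WHERE ec.EmployeeID = ? ORDER BY c.Title", (employee_id,)).fetchall()
    return [r[0] for r in rows]
```

The same pattern would apply to ClientCase, with ClientID in place of EmployeeID.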
Employee Table
Let's begin with the Employee table. What type of information needs to be stored in the Employee table? At a minimum, the following data elements are required:
- Employee ID
- First Name
- Middle Initial
- Last Name
- Social Security Number
- Address 1
- Address 2
- City, State, ZIP Code
- Home Phone and Work Extension
In addition to those previously listed, a few more elements of information are required. One element is the employee classification. In the firm, you have partners, associates, paralegals, legal secretaries, and other administrative staff. Any time you run into a classification item like this, it is a good bet that another table will be required. Do you remember the discussion on the third normal form? An employee classification is a generic item that is not dependent on the EmployeeID key. With this in mind, an EmployeeClass table must be added, along with an EmployeeClassID foreign key in the Employee table.
Law firms typically have multiple departments. For example, a law firm might have some or all of the following departments: tax, antitrust, civil litigation, intellectual property, insurance, and municipal, to name a few. Sounds like another classification item, doesn't it? Yes, you will be adding another table to the database, called Department. In addition, you must add a field called DepartmentID to the Employee table.
NOTE
Just to keep track of things, the Employee table has the following structure:
- Employee ID (primary key)
- Department ID (foreign key to Department table)
- Employee Class ID (foreign key to Employee Class table)
- First Name
- Middle Initial
- Last Name
- Social Security Number
- Address 1
- Address 2
- City, State, ZIP Code
- Home Phone and Work Extension
- E-Mail Address
In conjunction with the new foreign keys, the following tables have been added to the data model:
- Department
- EmployeeClass
The Department table, in addition to having a DepartmentID field, also has a Description field. The EmployeeClass table has an EmployeeClass ID and a Description field.
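To make the lookup-table idea concrete, here is a minimal sketch using Python's sqlite3 module. The column types, the trimmed-down Employee columns, the sample rows, and the helper function names are all assumptions for illustration; only the table and key names come from the structure described above.

```python
import sqlite3

# Illustrative sketch only: types, sample rows, and helper names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Department (
    DepartmentID INTEGER PRIMARY KEY,
    Description  TEXT NOT NULL
);
CREATE TABLE EmployeeClass (
    EmployeeClassID INTEGER PRIMARY KEY,
    Description     TEXT NOT NULL
);
CREATE TABLE Employee (
    EmployeeID      INTEGER PRIMARY KEY,
    DepartmentID    INTEGER NOT NULL REFERENCES Department(DepartmentID),
    EmployeeClassID INTEGER NOT NULL REFERENCES EmployeeClass(EmployeeClassID),
    FirstName       TEXT,
    LastName        TEXT
);
INSERT INTO Department VALUES (1, 'Tax'), (2, 'Intellectual Property');
INSERT INTO EmployeeClass VALUES (1, 'Partner'), (2, 'Paralegal');
INSERT INTO Employee VALUES (100, 2, 1, 'Dana', 'Reyes');
""")


def describe(employee_id):
    """Return (last name, department, classification) for one employee."""
    return conn.execute(
        "SELECT e.LastName, d.Description, c.Description "
        "FROM Employee e "
        "JOIN Department d ON d.DepartmentID = e.DepartmentID "
        "JOIN EmployeeClass c ON c.EmployeeClassID = e.EmployeeClassID "
        "WHERE e.EmployeeID = ?", (employee_id,)).fetchone()


def rename_class(class_id, new_description):
    """Renaming a classification touches one row, not every employee."""
    conn.execute("UPDATE EmployeeClass SET Description = ? "
                 "WHERE EmployeeClassID = ?", (new_description, class_id))


def can_insert_with_unknown_department():
    """The foreign key should reject a department that does not exist."""
    try:
        conn.execute("INSERT INTO Employee VALUES (101, 99, 1, 'Lee', 'Park')")
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the description lives in one place, a change to a classification name is a single-row update, which is the practical payoff of the third normal form discussed earlier.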
Client Table
The Client table is pretty straightforward. The following elements of data need to be stored:
- ClientID (primary key)
- Client Type (business or individual)
- First Name
- Middle Initial
- Last Name
- Organization Name
- Address 1
- Address 2
- City, State, ZIP Code
- Phone
- E-Mail Address
A few items need to be addressed. First, with respect to Client Type, do you need another table to hold the various types of clients the firm represents? In this case, the answer is no. While conducting your analysis, you have found that only two types of clients will ever exist: businesses and individuals. With this in mind, each client record can have a Client Type field that holds either "B" for business or "I" for individual. In a case like this, there is no practical benefit in having an additional lookup table.
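When a field can hold only a small, fixed set of codes, one way to enforce that without a lookup table is a column constraint. The sketch below uses a CHECK constraint in SQLite via Python; the constraint itself, the trimmed column list, and the add_client helper are assumptions for illustration, not part of the model described here.

```python
import sqlite3

# Illustrative sketch only: the CHECK constraint and helper are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Client (
    ClientID         INTEGER PRIMARY KEY,
    ClientType       TEXT NOT NULL CHECK (ClientType IN ('B', 'I')),
    LastName         TEXT,
    OrganizationName TEXT
)""")


def add_client(client_type, last_name=None, organization=None):
    """Return True if the row was stored, False if the type was rejected."""
    try:
        conn.execute(
            "INSERT INTO Client (ClientType, LastName, OrganizationName) "
            "VALUES (?, ?, ?)", (client_type, last_name, organization))
        return True
    except sqlite3.IntegrityError:
        return False
```

If a third client type ever became possible, the trade-off would shift back toward a lookup table, because changing a constraint means changing the schema rather than adding a row.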
How about contacts? There is no question that in the case of business clients, you will need to provide for contacts. The next question is whether you can have multiple contacts for each client. Most likely, the answer is yes. You will need contact information for the general counsel, a staff attorney, and the CEO, to name but a few. With this in mind, it would appear that clients could have multiple contacts. If you think a Contact table will need to be defined, you are on the right track!
NOTE
The structure of the Contact table is straightforward:
- ContactID (primary key)
- First Name
- Middle Initial
- Last Name
- Title
- Phone
- Extension
- E-Mail Address
What about the Title field and the need for a lookup table? Job titles are a generic item. At the same time, job titles can vary from company to company. Here is where you need to decide whether to store the title directly in the contact record or to keep titles in a separate table linked by a foreign key. The advantage of having predefined titles stored in a table is consistency. For example, you could control variances between CEO and C.E.O. At the same time, only a handful of job titles will probably exist. When attention turns to building user interfaces, you will be introduced to some techniques that can enforce standards and, at the same time, allow for variances. With this in mind, a separate job title table will not be created. Instead, the title itself will be stored in the contact record.
Case Table
The Case table holds basic information regarding a case, including the title, judge, court, docket number, department, start date, trial date, settlement date, and notes. Is a separate lookup table required for judge and court? The answer in this case is most likely yes. The firm wants to know which cases have been tried before specific judges. In addition, the firm wants to know the breakdown of which cases have been tried in which courts. Using the City of Philadelphia as an example, state, municipal, and federal courts exist. Within these classifications are many types of courts in which a case can be tried. The firm definitely wants to keep track of which cases are tried in which courts.
NOTE
Court cases are calendar driven. In real life, a child table would be associated with the Case table that would keep track of various date-driven items, such as depositions, pretrial motions, answers, hearings, meetings with the trial judge, settlement talks, and so on. This child table would act as a tickler file to notify users when certain items are due or when appearances have to be made. Because these items are not directly related to keeping track of time and billing, this item will be ignored.
The Case table has the following structure:
- CaseID (primary key)
- DepartmentID (foreign key to Department table)
- CourtID (foreign key to Court table)
- JudgeID (foreign key to Judge table)
- DocketNumber
- Title
- Notes
- Start Date
- Trial Date
- Settlement Date
Every case has to be assigned to a department. For example, a patent case would be assigned to the Intellectual Property Department. What does this facilitate? When it comes time to assign employees to a case, you don't want to assign just any employees to the case. Rather, you want to assign only employees who are part of that department. As you know, an employee must be assigned to a department. As you will see later, organizing data in this manner enables you to filter the list of employees. This helps prevent mistakes, such as assigning a tax attorney to an intellectual property case. Later in the book, you will be introduced to the concepts of business rules and data integrity. As you will see, and as you can probably see already, a sound database design enhances the ability to enforce business rules and data integrity.
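The filtering idea can be sketched as a single join on DepartmentID. This is a minimal illustration using Python's sqlite3 module; the trimmed column lists, the sample rows, and the eligible_employees function name are assumptions for the example.

```python
import sqlite3

# Illustrative sketch only: trimmed columns and sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE Employee (
    EmployeeID   INTEGER PRIMARY KEY,
    DepartmentID INTEGER REFERENCES Department(DepartmentID),
    LastName     TEXT
);
CREATE TABLE "Case" (
    CaseID       INTEGER PRIMARY KEY,
    DepartmentID INTEGER REFERENCES Department(DepartmentID),
    Title        TEXT
);
INSERT INTO Department VALUES (1, 'Tax'), (2, 'Intellectual Property');
INSERT INTO Employee VALUES (1, 1, 'Adams'), (2, 2, 'Baker'), (3, 2, 'Cole');
INSERT INTO "Case" VALUES (10, 2, 'Acme patent');
""")


def eligible_employees(case_id):
    """Employees whose department matches the department of the case."""
    rows = conn.execute(
        "SELECT e.LastName FROM Employee e "
        'JOIN "Case" c ON c.DepartmentID = e.DepartmentID '
        "WHERE c.CaseID = ? ORDER BY e.LastName", (case_id,)).fetchall()
    return [r[0] for r in rows]
```

With this sample data, the tax attorney never appears in the pick list for the patent case, which is the kind of mistake the design is meant to prevent.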
The new Court table contains a CourtID field and a Description field. The Judge table contains a JudgeID field, as well as a CourtID field to link with the Court table. In addition, the first name, middle initial, last name, and notes about the judge are stored as well. Even though you might not realize it, this brings up an interesting issue. The Case table has links to both the Court and Judge tables. The Judge table contains a link to the Court table, so the judge must be assigned to a court. If the judge already has a link to a court, why do you need a link to both court and judge in the Case table? This is because another judge could very likely get assigned to your case. Also, these multiple links aid in reporting.
TimeEntryDetail Table
Let's now turn to the TimeEntryDetail table. In the course of analysis, it has already been determined that a TimeEntryDetail record will link with an Invoice record when invoiced. Every TimeEntryDetail record must link to an employee-case-client combination. Remember that a case can involve multiple clients. Therefore, it is not enough to simply link to a case. The issue of which client the time gets billed to would be ambiguous. What else does the TimeEntryDetail need to store? If you have ever filled out a timesheet, you can probably guess what most of the required information will be. Elements such as description of work, date, hours, and rate are a good starting point. In this context, when rate is discussed, it is the billing rate, not the pay rate. What an employee gets paid is beyond the scope of this system. There are more issues with billing rate, so let's take a few moments to discuss those issues.
What does a client pay per hour for services performed? It depends on what the nature of the work is. For example, if administrative tasks, such as photocopying, are being performed, the client might pay $50 per hour for those services. If a paralegal is working on the case, the client might pay $100 per hour for those services, and if an attorney has to appear in court, the client might pay a flat rate of $1,500 for that service. Presently, information regarding default-billing rates is not stored in the database. A billing rate is stored in the TimeEntryDetail table. To ensure consistency and accuracy, a default rate must be stored somewhere. The question is where.
How about the Employee table? Can an employee perform different types of work? The answer is yes. An attorney might confer with the client on the phone, appear in court, or take a deposition. Each of these has different default billing rates. Based on this requirement, the nature of the work seems to define the default rate. Put another way, the default rate depends on the type of work. With this in mind, is the work category stored in the database presently? The answer is no. Therefore, a new table called WorkCategory is required. The details of the new WorkCategory table are discussed shortly. Before moving on, though, do you see how the analysis process of one table can lead to new tables and fields?
The TimeEntryDetail table has the following structure:
- TimeEntryDetailID (primary key)
- InvoiceID (foreign key to Invoice table)
- CaseID (foreign key to Case table)
- EmployeeID (foreign key to Employee table)
- ClientID (foreign key to Client table)
- Description
- Hours
- Rate
- Work Date
- Work Category
The question at this point is whether the TimeEntryDetail record will point to a related WorkCategory record or whether the information will be carried in the TimeEntryDetail record. In this case, the TimeEntryDetail record needs to be independent of changes to the WorkCategory table. For example, what happens if, when time is charged, the task costs $100 per hour and three weeks later the cost of that service increases to $120 per hour? If a link is all that exists, viewing the details of any TimeEntryDetail record would show the new rate. That scenario simply will not work. Further, the rate in the WorkCategory table is a default rate. The rate, after being added to a new TimeEntryDetail record, can be modified. For these reasons, the actual rate must be stored in the TimeEntryDetail record. Because the Work Category Description can change, that, too, is carried in the TimeEntryDetail record.
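The copy-on-entry behavior described above can be sketched as follows. This is a minimal illustration in Python with sqlite3, assuming the WorkCategory table holds a Description and a DefaultRate; the foreign key columns of TimeEntryDetail are omitted for brevity, and the function names and sample values are assumptions.

```python
import sqlite3

# Illustrative sketch only: key columns are omitted and sample values are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WorkCategory (
    WorkCategoryID INTEGER PRIMARY KEY,
    Description    TEXT NOT NULL,
    DefaultRate    REAL NOT NULL
);
CREATE TABLE TimeEntryDetail (
    TimeEntryDetailID INTEGER PRIMARY KEY,
    Description       TEXT,
    Hours             REAL,
    Rate              REAL NOT NULL,
    WorkDate          TEXT,
    WorkCategory      TEXT
);
INSERT INTO WorkCategory VALUES (1, 'Paralegal research', 100.0);
""")


def add_time_entry(work_category_id, description, hours, work_date, rate=None):
    """Copy the category description and rate into the new time entry.

    The default rate is used unless the caller supplies an override.
    """
    category, default_rate = conn.execute(
        "SELECT Description, DefaultRate FROM WorkCategory "
        "WHERE WorkCategoryID = ?", (work_category_id,)).fetchone()
    billed_rate = default_rate if rate is None else rate
    cur = conn.execute(
        "INSERT INTO TimeEntryDetail "
        "(Description, Hours, Rate, WorkDate, WorkCategory) "
        "VALUES (?, ?, ?, ?, ?)",
        (description, hours, billed_rate, work_date, category))
    return cur.lastrowid


def stored_rate(entry_id):
    """Rate recorded on one time entry."""
    return conn.execute(
        "SELECT Rate FROM TimeEntryDetail WHERE TimeEntryDetailID = ?",
        (entry_id,)).fetchone()[0]


# Charge time at the current default, then raise the default three weeks later.
entry_id = add_time_entry(1, "Review filings", 2.0, "2001-03-01")
conn.execute("UPDATE WorkCategory SET DefaultRate = 120.0 WHERE WorkCategoryID = 1")
later_id = add_time_entry(1, "Review filings", 1.5, "2001-03-22")
```

The earlier entry keeps the rate that was in effect when the time was charged, while entries created after the change pick up the new default.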
How do you know when a specific TimeEntryDetail record has been invoiced? You might think you need a separate field to indicate whether the item has been invoiced. This is where the InvoiceID field comes into play. This field serves two purposes: it links the record to its invoice, and it shows whether the record has been invoiced at all. When a new TimeEntryDetail record is created, the field has a default value of 0. After the TimeEntryDetail record has been included in an invoice, the InvoiceID field is populated with that invoice's ID. Therefore, any TimeEntryDetail record whose InvoiceID equals 0 has not been invoiced.
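A short sketch of the InvoiceID convention follows, written in Python with sqlite3. The trimmed column list, the sample rows, and the two helper functions are assumptions for illustration. The sketch deliberately declares no foreign key on InvoiceID: with enforcement turned on, a default of 0 would require an Invoice row whose key is 0.

```python
import sqlite3

# Illustrative sketch only: trimmed columns, sample rows, and helpers are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TimeEntryDetail (
    TimeEntryDetailID INTEGER PRIMARY KEY,
    InvoiceID         INTEGER NOT NULL DEFAULT 0,  -- 0 means not yet invoiced
    Description       TEXT,
    Hours             REAL,
    Rate              REAL
);
INSERT INTO TimeEntryDetail (Description, Hours, Rate) VALUES
    ('Client call', 0.5, 200.0),
    ('Photocopying', 1.0, 50.0),
    ('Deposition', 3.0, 250.0);
""")


def uninvoiced_ids():
    """IDs of time entries that have not been placed on an invoice."""
    rows = conn.execute(
        "SELECT TimeEntryDetailID FROM TimeEntryDetail "
        "WHERE InvoiceID = 0 ORDER BY TimeEntryDetailID").fetchall()
    return [r[0] for r in rows]


def mark_invoiced(invoice_id, entry_ids):
    """Attach the given entries to an invoice, skipping any already invoiced."""
    conn.executemany(
        "UPDATE TimeEntryDetail SET InvoiceID = ? "
        "WHERE TimeEntryDetailID = ? AND InvoiceID = 0",
        [(invoice_id, entry_id) for entry_id in entry_ids])
```

Calling mark_invoiced(7, [1, 3]) would leave only the second entry in the uninvoiced list.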
The new WorkCategory table has the following structure:
- WorkCategoryID (primary key)
- Description
- DefaultRate
When you view the latest version of the database model in Figure 4.6, you will not see the WorkCategory table involved in a relationship. In this scenario, the WorkCategory table acts as a template. In other words, after the data in a specific WorkCategory record is used to populate data in TimeEntryDetail, no need exists to refer to the WorkCategory table. Defining and enforcing relationships is very important. These concepts are discussed in Chapter 7, "The Basics of Referential Integrity." Suffice it to say at this point that in some cases, tables exist for the purpose of giving data-entry operations a head start. Remember, after a user picks a WorkCategory, the billing rate can be modified.
Invoice Table
Last but not least is the Invoice table. The Invoice table is very straightforward and has the following structure:
- InvoiceID (primary key)
- ClientID (foreign key to Client table)
- Invoice Number
- Invoice Date
The Invoice table in this context is nothing more than glue to tie multiple TimeEntryDetail records together. One element in the Invoice table makes it unique: the Invoice Number field. An invoice number is part static prefix and part sequential. For example, invoice 150 in the year 2001 would have the invoice number 2001-150. How do you produce the invoice number? Because you are still in the modeling phase, the mechanics of producing invoice numbers are not yet a concern.
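Although the mechanics can wait, the format itself is easy to pin down. The following Python sketch shows one possible way to build a number such as 2001-150; the function names and the rule that the sequence restarts at 1 each year are assumptions, not requirements stated in the model.

```python
def format_invoice_number(year: int, sequence: int) -> str:
    """Combine the static year prefix with the sequential part."""
    return f"{year}-{sequence}"


def next_invoice_number(year: int, existing_numbers: list[str]) -> str:
    """Return the next number for a year, given the numbers already issued.

    Assumption: the sequence restarts at 1 for each new year.
    """
    prefix = f"{year}-"
    used = [int(number[len(prefix):])
            for number in existing_numbers
            if number.startswith(prefix)]
    return format_invoice_number(year, max(used, default=0) + 1)
```

For example, format_invoice_number(2001, 150) yields the string 2001-150 used in the text.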
Reviewing the TEB Model
It is definitely time to catch your breath! Figure 4.6 illustrates the latest version of the Time Entry and Billing Database model. It is very different and more expansive than what was completed in Chapter 3. It is said that a picture is worth a thousand words. Figure 4.6 conveys a good understanding of the functions the TEB Database is going to support. It is important to note that a physical database still does not exist. In real practice, you would present this model to the primary customers of the system to ensure the requirements have been met. From a budgetary perspective, this is the time you want to catch any shortfalls in functionality that might exist. If an application can be compared to a house, the database is the foundation. If the foundation is faulty, the house will fall. You can't begin construction until the foundation has been built and has had time to settle. Rush the process, and bad things are sure to follow. This is the time in the development process to ensure your application's foundation has settled.
Figure 4.6 The latest version of the Time Entry and Billing Database has been outfitted with new columns and tables and has gone through the normalization process.
Ask Questions and Get Your Users Involved!
One way you can determine whether your design is solid is to ask your database questions and determine whether the database can provide answers. For example, ask the question, "Can I determine which attorneys are working on a specific case?" Then, see whether the database can answer that question. In this case, the database can. In analyzing the design, you can see a link between cases and employees. Furthermore, employees are categorized by type in the EmployeeClass table. Obviously, if you ask a question that can't be answered, the database design is not complete.
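The attorney question can be posed directly as a query. Below is a minimal sketch in Python with sqlite3; the trimmed column lists, the sample rows, and the assumption that "attorney" means the Partner and Associate classifications are all illustrative choices, not part of the model.

```python
import sqlite3

# Illustrative sketch only: sample rows and the definition of "attorney" are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeClass (EmployeeClassID INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE Employee (
    EmployeeID      INTEGER PRIMARY KEY,
    EmployeeClassID INTEGER REFERENCES EmployeeClass(EmployeeClassID),
    LastName        TEXT
);
CREATE TABLE "Case" (CaseID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE EmployeeCase (
    EmployeeID INTEGER REFERENCES Employee(EmployeeID),
    CaseID     INTEGER REFERENCES "Case"(CaseID),
    PRIMARY KEY (EmployeeID, CaseID)
);
INSERT INTO EmployeeClass VALUES (1, 'Partner'), (2, 'Associate'), (3, 'Paralegal');
INSERT INTO Employee VALUES (1, 1, 'Adams'), (2, 2, 'Baker'), (3, 3, 'Cole');
INSERT INTO "Case" VALUES (10, 'Smith v. Jones');
INSERT INTO EmployeeCase VALUES (1, 10), (2, 10), (3, 10);
""")

# Assumption: these classifications are the ones that count as attorneys.
ATTORNEY_CLASSES = ("Partner", "Associate")


def attorneys_on_case(case_id):
    """Last names of attorneys assigned to a case, in alphabetical order."""
    placeholders = ", ".join("?" for _ in ATTORNEY_CLASSES)
    rows = conn.execute(
        "SELECT e.LastName FROM Employee e "
        "JOIN EmployeeCase ec ON ec.EmployeeID = e.EmployeeID "
        "JOIN EmployeeClass c ON c.EmployeeClassID = e.EmployeeClassID "
        f"WHERE ec.CaseID = ? AND c.Description IN ({placeholders}) "
        "ORDER BY e.LastName",
        (case_id, *ATTORNEY_CLASSES)).fetchall()
    return [r[0] for r in rows]
```

If a question like this cannot be written as a query against the model, that is a sign a table, column, or relationship is missing.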
As you might guess, catching these design gaps now, during the modeling phase, is far less expensive than catching them after the database has been built. Whether gaps exist is determined by common sense rather than by science; no formula can answer these questions for you. If you are unsure about which questions to ask, find out what the users of the system want to know. Be sure to solicit questions from all levels of the organization. Recall the earlier discussion of the tactical versus strategic needs of a business. Your users will definitely appreciate being involved in the process. The more users are involved, the more likely they are to take ownership of the system and accept it.