Understanding the Relational Database
A relational database is a database that is divided into logical units called tables, which are related to one another within the database. Breaking data down into smaller, logical, manageable units makes maintenance easier and, when the data is well organized, improves database performance. As Figure 1.4 shows, tables in a relational database are related to one another through a common key (data value).
FIGURE 1.4
The relational database
Because tables are related in a relational database, a single query can retrieve all the data you need, even when that data exists in more than one table. Having common keys, or fields, among relational database tables allows data from multiple tables to be joined to form one large set of data. As you venture deeper into this book, you will see more of a relational database’s advantages, including overall performance and easy data access.
Taking a Glimpse into a Sample Database
All databases have a simple reason for existing: They store and maintain valuable data. In this section, you look at a simplified example data set that illustrates how data might look in a relational database and also shows how data is related through key relationships. These relationships, which are also rules for how data is stored, hold the key to the value of a relational database.
Figure 1.5 shows a table called EMPLOYEES. A table is the most basic type of object where data is stored within a relational database. A database object is a defined structure that physically contains data or has an association to data stored in the database.
FIGURE 1.5
Table structure
Fields
Every table is divided into smaller entities called fields. A field is a column in a table that is designed to maintain specific information about every record in the table. The fields in the EMPLOYEES table consist of ID, LAST_NAME, and FIRST_NAME. They categorize the specific information that is maintained in a given table. Obviously, this is a simplistic example of the data that might be stored in a table such as EMPLOYEES.
Records, or Rows of Data
A record, also called a row of data, is a horizontal entry in a table. Looking at the EMPLOYEES table in Figure 1.5, consider the first record in that table:
1 Smith Mary
The record consists of an employee identification, employee last name, and employee first name. For every distinct employee, there should be a corresponding record in the EMPLOYEES table.
A row of data is an entire record in a relational database table.
Columns
A column is a vertical entity in a table that contains all the information associated with a specific field in a table. For example, a column in the EMPLOYEES table for the employee’s last name consists of the following:
Smith
Jones
William
Mitchell
Burk
This column is based on the field LAST_NAME, the employee’s last name. A column pulls information about a certain field from every record within a table.
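The distinction between fields, records, and columns can be sketched in a few lines of code. The example below is only an illustration: it assumes Python's built-in sqlite3 module and an in-memory SQLite database, neither of which appears in the text, and it loads only the two employees named in this section (Mary Smith with ID 1 and Kelly Mitchell with ID 4). The column data types are also assumptions.

```python
import sqlite3

# In-memory SQLite database, used here only as a convenient stand-in (assumption).
conn = sqlite3.connect(":memory:")

# The three fields of EMPLOYEES become the table's columns.
# INTEGER and TEXT are assumed data types; the text does not specify any.
conn.execute("CREATE TABLE EMPLOYEES (ID INTEGER, LAST_NAME TEXT, FIRST_NAME TEXT)")

# Each tuple is one record (row of data).
conn.executemany(
    "INSERT INTO EMPLOYEES (ID, LAST_NAME, FIRST_NAME) VALUES (?, ?, ?)",
    [(1, "Smith", "Mary"), (4, "Mitchell", "Kelly")],
)

# Fields: the column names reported for the table.
cursor = conn.execute("SELECT * FROM EMPLOYEES")
fields = [description[0] for description in cursor.description]

# A record: one horizontal entry, here the row whose ID is 1.
first_record = conn.execute("SELECT * FROM EMPLOYEES WHERE ID = 1").fetchone()

# A column: the LAST_NAME value taken from every record in the table.
last_names = [row[0] for row in conn.execute("SELECT LAST_NAME FROM EMPLOYEES ORDER BY ID")]

print(fields)
print(first_record)
print(last_names)
```

Selecting one row returns a value for every field, whereas selecting one column returns a value from every row.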
Referential Integrity
Referential integrity is the hallmark of any relational database. Figure 1.6 shows two tables, EMPLOYEES and DEPENDENTS, that are related to one another in our imaginary database. The DEPENDENTS table is simply a table that contains information about dependents of each employee in the database, such as spouses and children. As in the EMPLOYEES table, the DEPENDENTS table has an ID. The ID in the DEPENDENTS table references, or is related to, the ID in the EMPLOYEES table. Again, this is a simplistic example to show how relationships work in a database and to help you understand referential integrity.
FIGURE 1.6
Table relationships
The key point to note in Figure 1.6 is that there is a relationship between the two tables through the ID field. The ID field in EMPLOYEES is related to the ID field in DEPENDENTS. The ID field in EMPLOYEES is a primary key, whereas the ID field in DEPENDENTS is a foreign key. These types of keys are critical to any relational database structure and to referential integrity.
Primary Keys
A primary key is a column that makes each row of data in the table unique in a relational database. In Figure 1.6, the primary key in the EMPLOYEES table is ID, which is typically defined during the table creation process. The primary key ensures that all employee identifications are unique, so each record in the EMPLOYEES table has its own ID. Primary keys eliminate the possibility of a duplicate record in a table and are used in other ways, which you learn more about as you progress through the book.
Primary keys are typically defined during the table creation process, although a primary key can be added later as long as duplicate data does not already exist in the table.
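A short sketch can show the uniqueness rule in action. It assumes Python's built-in sqlite3 module and an in-memory SQLite database (neither is part of the text); other database systems report the violation differently, but the outcome should be similar: the second insert with an ID that is already in use is refused.

```python
import sqlite3

# In-memory SQLite database, used only for illustration (assumption).
conn = sqlite3.connect(":memory:")

# ID is declared as the primary key when the table is created.
conn.execute(
    "CREATE TABLE EMPLOYEES (ID INTEGER PRIMARY KEY, LAST_NAME TEXT, FIRST_NAME TEXT)"
)
conn.execute("INSERT INTO EMPLOYEES VALUES (1, 'Smith', 'Mary')")

# Attempt to reuse ID 1 for a different employee.
try:
    conn.execute("INSERT INTO EMPLOYEES VALUES (1, 'Mitchell', 'Kelly')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # SQLite raises IntegrityError when a primary key value is duplicated.
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEES").fetchone()[0]
print(duplicate_rejected, row_count)
```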
Foreign Keys
A foreign key is a column in a table that references a column in another table. Primary and foreign keys establish relationships between tables in a relational database.
Columns identified as foreign keys have these characteristics:
▶ Do not have to have unique values
▶ Ensure that each entry in the column has a corresponding entry in the referenced table
▶ Ensure that data in the referenced primary key column (parent records) is never deleted while corresponding data (child records) still exists in the foreign key column
In Figure 1.6, the foreign key in the DEPENDENTS table is ID, which is the column that contains the employee ID of the corresponding employee in the EMPLOYEES table. Again, this is a simplistic example: Ideally, the DEPENDENTS table would have an ID for each dependent and a separate ID for the employee referenced. For learning purposes, though, the ID in the DEPENDENTS table is a foreign key that references the ID in the EMPLOYEES table. A record, or row of data, in the EMPLOYEES table is a parent record: It is identified by the primary key and might have child records elsewhere in the database. Likewise, a record in the DEPENDENTS table is a child record: Its foreign key requires a relationship to a parent record, identified by a primary key, somewhere in the database.
When manipulating data in the database, keep the following points in mind:
▶ You cannot add data into a column identified as a foreign key unless corresponding data already exists as a primary key in another table.
▶ You cannot delete data from a column identified as a primary key unless you first remove any corresponding data from columns identified as foreign keys that reference that primary key.
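The effect of a foreign key on inserts and deletes can be sketched as follows. This is a minimal illustration that assumes Python's built-in sqlite3 module and an in-memory SQLite database; the FIRST_NAME column in DEPENDENTS, the helper function is_rejected, and the unmatched ID 99 are hypothetical names and values chosen for the example. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued.

```python
import sqlite3

# In-memory SQLite database, used only for illustration (assumption).
conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE EMPLOYEES (ID INTEGER PRIMARY KEY, LAST_NAME TEXT, FIRST_NAME TEXT)"
)
# DEPENDENTS.ID is a foreign key that references EMPLOYEES.ID.
# The FIRST_NAME column name is an assumption made for this sketch.
conn.execute(
    "CREATE TABLE DEPENDENTS (ID INTEGER REFERENCES EMPLOYEES (ID), FIRST_NAME TEXT)"
)

conn.execute("INSERT INTO EMPLOYEES VALUES (4, 'Mitchell', 'Kelly')")
# Foreign key values do not have to be unique: two children share parent ID 4.
conn.execute("INSERT INTO DEPENDENTS VALUES (4, 'Laura')")
conn.execute("INSERT INTO DEPENDENTS VALUES (4, 'Amy')")


def is_rejected(sql):
    """Hypothetical helper: return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# A child record whose ID has no matching parent record is refused.
orphan_rejected = is_rejected("INSERT INTO DEPENDENTS VALUES (99, 'Nobody')")

# A parent record cannot be deleted while child records still reference it.
parent_delete_blocked = is_rejected("DELETE FROM EMPLOYEES WHERE ID = 4")

# After the child records are removed, the parent record can be deleted.
conn.execute("DELETE FROM DEPENDENTS WHERE ID = 4")
parent_delete_allowed = not is_rejected("DELETE FROM EMPLOYEES WHERE ID = 4")

print(orphan_rejected, parent_delete_blocked, parent_delete_allowed)
```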
Getting Data from Multiple Tables
At this point, you should understand the primary and foreign keys that have been defined in Figure 1.6. When these keys are defined upon table creation or modification, the database knows how to help maintain the integrity of the data within it and also knows how that data is referenced between tables.
Let’s say you want to know the dependents of Kelly Mitchell. You have likely already worked this out using a common-sense process while looking at Figure 1.6. This was probably your thought process:
▶ You see that Kelly Mitchell’s ID is 4.
▶ You look for Kelly Mitchell’s ID of 4 in the DEPENDENTS table.
▶ You see that Kelly Mitchell’s ID of 4 corresponds to three records in the DEPENDENTS table.
▶ Therefore, Kelly Mitchell’s dependents are Laura, Amy, and Kevin.
This common-sense process of finding information is almost exactly the process that occurs behind the scenes when you use SQL to “ask” a relational database about the data it contains.
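The same lookup can be sketched as a single query that joins the two tables on the common ID. The example assumes Python's built-in sqlite3 module and an in-memory SQLite database, and it assumes that the dependent's name is stored in a column called FIRST_NAME; the actual column names in Figure 1.6 might differ.

```python
import sqlite3

# In-memory SQLite database, used only for illustration (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEES (ID INTEGER PRIMARY KEY, LAST_NAME TEXT, FIRST_NAME TEXT)"
)
# The FIRST_NAME column name in DEPENDENTS is an assumption made for this sketch.
conn.execute(
    "CREATE TABLE DEPENDENTS (ID INTEGER REFERENCES EMPLOYEES (ID), FIRST_NAME TEXT)"
)

conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?, ?)",
    [(1, "Smith", "Mary"), (4, "Mitchell", "Kelly")],
)
conn.executemany(
    "INSERT INTO DEPENDENTS VALUES (?, ?)",
    [(4, "Laura"), (4, "Amy"), (4, "Kevin")],
)

# Find the employee by name, then follow the shared ID into DEPENDENTS.
query = """
    SELECT D.FIRST_NAME
    FROM EMPLOYEES E
    JOIN DEPENDENTS D ON D.ID = E.ID
    WHERE E.FIRST_NAME = ? AND E.LAST_NAME = ?
    ORDER BY D.FIRST_NAME
"""
dependents = [row[0] for row in conn.execute(query, ("Kelly", "Mitchell"))]
print(dependents)
```

The WHERE clause corresponds to finding Kelly Mitchell's ID, and the JOIN condition corresponds to looking for that ID in the DEPENDENTS table.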
NULL Values
NULL is the term used to represent a missing value. A NULL value in a table is a value in a field that appears to be blank; that field has no value. It is important to understand that a NULL value is different from a zero value or a field that contains spaces: A field with a NULL value has intentionally been left blank during record creation. For example, a table that contains a column called MIDDLE_NAME might allow null or missing values because not every person has a middle name. Records in tables that do not have an entry for a particular column signify a NULL value.
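The difference between a NULL value and an empty value can be sketched as follows. The example assumes Python's built-in sqlite3 module and an in-memory SQLite database; the table name PEOPLE and its sample rows are hypothetical, and only the MIDDLE_NAME column comes from the text.

```python
import sqlite3

# In-memory SQLite database, used only for illustration (assumption).
conn = sqlite3.connect(":memory:")
# PEOPLE is a hypothetical table; MIDDLE_NAME is allowed to hold NULL.
conn.execute(
    "CREATE TABLE PEOPLE (ID INTEGER PRIMARY KEY, FIRST_NAME TEXT, MIDDLE_NAME TEXT)"
)

# No entry is supplied for MIDDLE_NAME, so the field holds NULL.
conn.execute("INSERT INTO PEOPLE (ID, FIRST_NAME) VALUES (1, 'Mary')")
# An empty string is an actual value, not NULL.
conn.execute("INSERT INTO PEOPLE (ID, FIRST_NAME, MIDDLE_NAME) VALUES (2, 'Kelly', '')")

# NULL is reported to Python as None.
middle_names = [
    row[0] for row in conn.execute("SELECT MIDDLE_NAME FROM PEOPLE ORDER BY ID")
]

# IS NULL finds missing values; a comparison with '' finds only the empty string.
null_count = conn.execute(
    "SELECT COUNT(*) FROM PEOPLE WHERE MIDDLE_NAME IS NULL"
).fetchone()[0]
empty_count = conn.execute(
    "SELECT COUNT(*) FROM PEOPLE WHERE MIDDLE_NAME = ''"
).fetchone()[0]

print(middle_names, null_count, empty_count)
```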
Additional table elements are discussed in detail during the next two hours.
Logical vs. Physical Database Elements
Within a relational database, both logical and physical elements exist throughout the database lifecycle. Logical elements are typically conceived as database structures during the planning and design phase. Physical structures are objects that are created later; they make up the database itself, which stores the data that SQL and various applications access.
For example, logical elements might include the following during conception:
▶ Entities
▶ Attributes
▶ Relationships
▶ Information/data
Those logical elements later become the following physical elements during database creation:
▶ Tables
▶ Fields/columns
▶ Primary and foreign key constraints (built-in database rules)
▶ Usable data
Database Schemas
A schema is a group of related objects in a database. A schema is owned by a single database user, and objects in the schema can be shared with other database users. Multiple schemas can exist in a database. Figure 1.7 illustrates a database schema.
FIGURE 1.7
A schema