Basic Concepts for Guerrilla Oracle
If you're already familiar with the concepts and terminology associated with Oracle and RDBMS and SQL, then feel free to skip this chapter. For the rest of us, I'll try to be as brief and clear as possible while covering what I consider to be the basic information that you will need to make sense out of the remainder of this book.
I will cover the following five topics: databases, the Oracle Server environment, SQL programming languages, development tools, and the Web. Each topic builds on the previous one, and you need this basic understanding to successfully build your guerrilla database over the next few chapters.
To start, here's a quick review of some of the fundamental concepts for relational databases.
Relational Databases
RDBMS
RDBMS is an abbreviation of relational database management system. The history of databases is fascinating, and the reader is encouraged to spend some time researching. For our purposes here, know that the major hallmarks of an RDBMS are the following:
-
Data storage and retrieval are transparent to the programmer and the end user. Oracle's RDBMS handles where and how the data is stored, and how it is gathered for reports, views, and screen functions.
-
Data is contained in what are called tables, which arrange the data in rows and columns. A row contains all the data for a particular entity. Another way to look at it would be to consider each row a record, and the columns fields in the record. Proper terminology, however, is row and column, and these will be the terms used from now on. A row for a person's data might include name, address, phone, age, sex, and so forth. The row would contain all these pieces of data for a single person, and the columns would be name, address, and so forth:
Relationships
The next concept we have to be clear about is the RDBMS idea of relationships. The idea is actually very simple because we all naturally make relationships between bits of data every day of our lives.
As a concrete example, think about summarizing all the courses you and your friends have taken over the years. Some were at workshops, some online, some from books, some in formal classes. You would naturally want to just jot your name down, and then list the courses one after the other, then the next name and all of that person's courses. However, this approach would be pretty messy in a couple of years after all of you had taken several more courses, wouldn't it? You would run out of room and have to start writing in the columns and margins.
So almost intuitively you would create two tables—one for names and other personal info, and the other for courses that were taken. The information would now look like this (italic typeface identifies the primary keys, which we will discuss shortly, in the section on constraints):
PERSON Table |
COURSE Table |
---|---|
Person ID |
Course ID |
Date of Birth |
Date Taken |
Address |
Location |
Phone |
Number |
Again, what you've done is to make a natural connection, or relationship, between the two sets of data. This is very important for your understanding of how we will later create and link all the tables in your demonstration database.
To summarize, there are only a couple of types of relationships commonly used in the RDBMS world:
-
One-to-many. One person has taken many courses, one exam has many questions, one building has many rooms, and so on.
-
Many-to-many. For example, you as an Oracle expert work on many projects, and each project has many Oracle experts working on it. To create the relationship between the two tables—the one with you and all the Oracle experts and the one with all the projects—we need what is called an intersect table:
In this example of a many-to-many relationship, we have three tables: (1) the ORACLE EXPERT table, (2) the PROJECT table, and (3) the EXPERT-TO-PROJECT table. Think about this. We want to know which projects each of our experts has, and we want to know who all the experts working on a particular project are. So we can try to list all the projects for an expert in the ORACLE EXPERT table, and we can try to list all the experts on a project in the PROJECT table.
This approach quickly becomes a nightmare when project staffing changes or projects are added or dropped. If we did it this way, we would be constantly changing the two tables just trying to keep up. Instead, we create another table, the EXPERT-TO-PROJECT intersect table. Using this table, we can quickly get a list of all projects and who is working on them, and using just that little bit of information, we can go to the bigger tables and get detailed information on each expert and each employee.
This idea may be somewhat new to you, but try to work at grasping the general concept. You will see it again in later chapters, where we actually use it. For example, we will have a link table that is used to get information from both a question table and a test table. So stay tuned; Chapter 5, where this theory is put into practice, is just around the corner.
This is how tables link to each other, through relationships. To summarize, basic storage in an RDBMS is in tables. Tables have columns (Name, Address, Phone Number, and so on) and rows, where the data is stored. All the columns of a row represent an instance of that row. A column can have additional qualifications. For example, it can be a key, and constraints can be added.
Tables and relationships are the heart of an RDBMS.
Constraints
Constraints are another important piece of the RDBMS picture, one that you will be using as you develop your guerrilla database. In brief, a constraint is a rule or set of rules that apply to a column or combination of columns.
Before we terrorize you with the various kinds of constraints, let's take an English-language look at one that you've already done automatically. In the preceding PERSON table, notice the Person ID column. Without thinking, you knew that this had to be a unique column, or the table would not make any sense, right? Making this data element unique is a constraint. What you actually did was create a primary key for the table, and one of the characteristics of a primary key is that it is unique.
All the other pieces of data in a row must depend on the primary key column. In the PERSON table example, not much besides the person ID is unique. For example, several people can share the same phone number and address. Hence the person ID is the logical key, and because it is really unique for each person, it can be the primary key.
Here are some other constraints:
-
Foreign key. The foreign key refers to the primary key of another table. Using the PERSON and COURSE tables above, let's add another table called COURSE DESCRIPTION. Why? Well, suppose that you and five of your friends took the Oracle DBA course. You could enter the course number and description over and over for each of you, or you could put the description in another table, like this:
PERSON Table
COURSE DESCRIPTION Table
COURSE Table
Person ID
Course ID
Person ID
Name
Description
Course ID
Date of Birth
Cost
Date Taken
Address
Location
Phone Number
Grade
Hours
Look at the Course ID column in the COURSE table. It is the same as the Course ID column in the COURSE DESCRIPTION table. Since, as we just learned, the primary key must be unique, the Course ID column in the COURSE DESCRIPTION table has to be unique, and it is the primary key for that table. In the COURSE table, the Course ID column is a foreign key that refers back to the Course ID column in the COURSE DESCRIPTION table.
By linking both tables through a foreign key, you have also set the stage for what is called referential integrity. The value of the foreign key in the COURSE table depends on and exactly corresponds to the value of the primary key, which is the course ID in the COURSE DESCRIPTION table. Hence when you go to add a Course ID to the COURSE table, Oracle will automatically check to make sure that the ID exists in the COURSE DESCRIPTION table. This is referential integrity.
You can add the phrase “on delete cascade” to a foreign key. When you do this, Oracle will automatically delete any foreign-key rows in the table when the primary table row is deleted. For example, if you had taken a course called Intro to Oracle, with course ID 001, and decided to delete course ID 001 from the COURSE DESCRIPTION table, the “on delete cascade” statement would delete the entries for any course ID 001 in the COURSE table. If you didn't do this, you could end up having entries in the COURSE table with no match in the COURSE DESCRIPTION table.
-
Uniqueness. This concept was covered with the primary key. Just be aware that you can make other columns unique without making them keys.
-
Check, including NOT NULL. This powerful feature allows you to establish ranges for data. In the preceding example, you could say that you wanted only courses taken after 1979, when Oracle first appeared. You would then define the Date Taken column as a DATE and add CHECK (Date Taken greater than January 1, 1979).
NULL is a unique concept. It is the absence of any value; it is not zero or spaces. Adding the constraint NOT NULL to a column means that data must be entered when the row is created that uses that column. The NOT NULL constraint is a good way to make sure your users do not skip important columns when adding information to a table.
Jumping ahead a bit, you can get a look at the constraints on a table by checking the data dictionary. Appropriate commands are:
-
SELECT * FROM USER_CONSTRAINTS . . .
-
SELECT * FROM ALL_CONSTRAINTS where . . .
-
SELECT * FROM DBA_CONSTRAINTS
The data dictionary is introduced on the following page and will be covered in more detail in Chapter 10.