Building a Database
In this section, you will learn to build simple PostgreSQL databases. PostgreSQL supports a lot of functions and features to make the definition of a data structure easy. In addition, data can easily be inserted into tables using simple SQL commands.
Building Simple Data Structures
The core component of every data structure is a table. Tables are used to store the data, and most database operations are based on tables. Defining and removing tables correctly are essential steps when working with databases.
Creating Tables
In this section, you learn how to create tables with the help of SQL commands. If you want to create a table, you can use the CREATE TABLE command.
If you want to create a table called emp for storing names and salaries, you use the following command:
name=# CREATE TABLE emp(id serial, empname varchar(50), sal numeric(9,2));
The display is the following:
NOTICE: CREATE TABLE will create implicit sequence 'emp_id_seq' for SERIAL column 'emp.id' NOTICE: CREATE TABLE/UNIQUE will create implicit index 'emp_id_key' for table 'emp' CREATE
You can check to see whether the table has been created by using the \d commandin this case, \d emp:
name=# \d emp Table "emp" Attribute | Type | Modifier -----------+--------------+------------------------------------- -------- id | integer | not null default nextval('emp_id_seq'::text) empname | varchar(50) | sal | numeric(9,2) | Index: emp_id_key
The table has successfully been created and contains three columns. The first column is used as a sequence; every record will have a unique id. If the datatype serial is used for a column, a sequence will implicitly be created by the database. The second column will be used to store the name of the person. Because a name has no fixed length, you use the datatype varchar() with a maximum length of 50 characters. The third column will be used to store the salary. Salaries are usually decimal values. In this case, the salary can have up to 7 digits before the comma; 2 digits can be used after the comma. PostgreSQL automatically creates an index on the first column if no special primary key is defined.
Here is a second, slightly different CREATE TABLE command:
CREATE TABLE emp2 (id serial, empname varchar(50) UNIQUE, sal numeric(9,2) NOT NULL, currency varchar(4) DEFAULT 'USD');
The following is displayed:
NOTICE: CREATE TABLE will create implicit sequence 'emp2_id_seq' for SERIAL column 'emp2.id' NOTICE: CREATE TABLE/UNIQUE will create implicit index 'emp2_id_key' for table 'emp2' NOTICE: CREATE TABLE/UNIQUE will create implicit index 'emp2_empname_key' for table 'emp2' CREATE
We have created a table called emp2, but the column named empname can be used to store only unique names. This can be useful because it is extremely confusing to have two people with exactly the same name but with different salaries in the database. If someone tries to insert a name twice, the database will report an error. NOT NULL means that a correct salary has to be available. If no salary for a particular row is inserted into the table, an error will be displayed. The fourth column is used to store the currency of the salarythe default value is set to USD. The default value will be used if no value is inserted into the field.
If you don't know the syntax of the command by heart, you can simply use the \h CREATE TABLE command. The following lines will be displayed:
name=# \h CREATE TABLE Command: CREATE TABLE Description: Creates a new table Syntax: CREATE [ TEMPORARY | TEMP ] TABLE table ( column type [ NULL | NOT NULL ] [ UNIQUE ] [ DEFAULT value ] [column_constraint_clause | PRIMARY KEY } [ ... ] ] [, ... ] [, PRIMARY KEY ( column [, ...] ) ] [, CHECK ( condition ) ] [, table_constraint_clause ] ) [ INHERITS ( inherited_table [, ...] ) ]
Using Temporary Tables
Temporary tables are an extremely powerful feature of PostgreSQL. They can be used to store session-specific information in a comfortable way. Temporary tables can be created with the CREATE TEMPORARY TABLE command, which works like the CREATE TABLE command.
The difference between ordinary and temporary tables is that temporary tables are visible only while the session is active. If users quit the session, all temporary tables they have created will be deleted by the database.
If two users log in simultaneously, both users can create temporary tables with the same name because the temporary tables of one user can't be seen by the other user. This feature makes temporary tables easy and safe to use.
The DROP TABLE Command
The DROP TABLE command can be used to delete tables. Here is an example:
DROP TABLE emp2;
The table will be dropped immediately and no ROLLBACK can be performed. You will learn about ROLLBACK and transactions in Chapter 4, "PL/PGSQL."
Be careful when using DROP TABLE. DROP TABLE can also be used to drop multiple tables at once. If you want to drop more than one table, add a comma; the name of the table you want to drop to the statement and two tables will be deleted (as long as you have the permission and the tables are available).
DROP TABLE does not automatically drop sequences that are implicitly created when you use serials in a table. This is extremely important when you want to create a table with the same name and structure you have just dropped, because the sequence won't be overwritten. For solving the problem, use the DROP SEQUENCE command manually.
The ALTER TABLE Command
The ALTER TABLE command can be used to perform multiple operations. ALTER TABLE can be used to add columns to a table. The following example adds a column called currency to table emp (the one we created in the CREATE TABLE section):
ALTER TABLE emp ADD COLUMN currency varchar(4);
The new column is a varchar and can be up to 4 characters long. If we want to set the default value of that column to USD, we have to write a second SQL statement:
ALTER TABLE emp ALTER COLUMN currency SET DEFAULT 'USD';
The default value is set to USD, but ALTER TABLE can also be used to rename tables. Here is an example where the table emp is renamed to oldemp:
ALTER TABLE emp RENAME TO oldemp;
Columns can also be renamed:
ALTER TABLE emp RENAME COLUMN sal TO salary;
In the example, the column sal is renamed to salary. The ALTER TABLE command is indeed powerful and useful. Here is the complete definition of the command's syntax (try \h ALTER TABLE):
Command: ALTER TABLE Description: Modifies table properties Syntax: ALTER TABLE table [ * ] ADD [ COLUMN ] column type ALTER TABLE table [ * ] ALTER [ COLUMN ] column { SET DEFAULT value | DROP DEFAULT } ALTER TABLE table [ * ] RENAME [ COLUMN ] column TO newcolumn ALTER TABLE table RENAME TO newtable ALTER TABLE table ADD table constraint definition
You can see that setting or dropping default values and adding constraints can also be done easily.
Note
Renaming tables can sometimes be very tricky and can lead to confusion. We have renamed the table emp to oldemp, but what about the sequence used in the first column? It is still called emp_id_seq. If the number of tables and sequences increase, this can become a complicated and confusing issue.
Creating and Dropping Indices
Indices are used to speed up queries. PostgreSQL normally uses B-trees for indexing a column. It is also possible to create a single index for multiple columns, but let's look at the syntax of CREATE INDEX first:
persons=# \h CREATE INDEX Command: CREATE INDEX Description: Constructs a secondary index Syntax: CREATE [ UNIQUE ] INDEX index_name ON table [ USING acc_name ] ( column [ ops_name ] [, ...] ) CREATE [ UNIQUE ] INDEX index_name ON table [ USING acc_name ] ( func_name( column [, ... ]) [ ops_name ] )
Let's create an index:
CREATE UNIQUE INDEX idx_oldemp_empname ON oldemp (empname);
In this example, we create an index called idx_old_empname on table oldemp. The index is used to store unique values, which means that a name can appear only once in the table. The index is defined for the column empname. PostgreSQL uses B-trees for indices by default. If you want to use R-trees or hashes, the SQL command has to be modified slightly:
CREATE INDEX idx_oldemp_empname ON oldemp USING HASH (empname);
Note
Hashes and R-trees cannot be created on tables in combination with the UNIQUE constraint. It is also not possible to use T-trees for certain datatypes. If you try to, the following error messages is displayed:
name=# CREATE UNIQUE INDEX idx_oldemp_empname ON oldemp USING HASH (empname);
ERROR: DefineIndex: unique indices are only available with the btree access method
name=# CREATE UNIQUE INDEX idx_oldemp_empname ON oldemp USING RTREE (empname);
ERROR: DefineIndex: unique indices are only available with the btree access method
name=# CREATE INDEX idx_oldemp_empname ON oldemp USING RTREE (empname);
ERROR: DefineIndex: opclass "varchar_ops" not supported by access method "rtree"
Indices can also be defined for multiple columns; here is an example where one index is used for two columns:
CREATE INDEX idx_oldemp_salcur ON oldemp (salary,currency);
If you want to drop the index, use the following command:
DROP INDEX idx_oldemp_salcur;
You should use idx as a prefix or postfix for the name of your index (it helps you execute larger projects).