MySQL SQL Syntax and Use
- MySQL Naming Rules
- The Server SQL Mode
- Character Set Support
- Selecting, Creating, Dropping, and Altering Databases
- Creating, Dropping, Indexing, and Altering Tables
- Getting Information About Databases and Tables
- Performing Multiple-Table Retrievals with Joins
- Performing Multiple-Table Retrievals with Subqueries
- Performing Multiple-Table Retrievals with UNION
- Multiple-Table Deletes and Updates
- Performing Transactions
- Foreign Keys and Referential Integrity
- Using FULLTEXT Searches
- New Features in MySQL 5.0
Structured Query Language (SQL) is the language that the MySQL server understands, so fluency with SQL is necessary for effective communication with the server. When you use a program such as the mysql client, it functions primarily as a way for you to send SQL statements to the server to be executed. You must also know SQL if you write programs that use the MySQL interface provided by your programming language, because the interface functions as the means that allows you to communicate with the server by issuing SQL statements.
Chapter 1, "Getting Started with MySQL and SQL," presents a tutorial introduction to many of MySQL's capabilities. Now we'll build on that material to go into more detail on several areas of SQL implemented by MySQL:
-
Naming rules for referring to elements of databases
-
Changing the server SQL mode to affect its behavior
-
Support for multiple character sets
-
Creating and destroying databases, tables, and indexes
-
Obtaining information about your databases and tables
-
Retrieving data using joins, subqueries, and unions
-
Using multiple-table deletes and updates
-
Performing transactions that allow multiple statements to be treated as a unit
-
Setting up foreign key relationships
-
Using the FULLTEXT search engine
-
New features in MySQL 5.0: Views, stored procedures and functions, and triggers
Several aspects of how the MySQL server executes SQL statements can be modified by setting its SQL mode. Instructions for doing this are given in "The Server SQL Mode."
MySQL's SQL statements may be grouped into several broad categories; Table 2.1 lists representative statements for each. Some of the statements in the table are not covered here because they are more appropriately discussed in other chapters. For example, the administrative statements GRANT and REVOKE for setting up user privileges are dealt with in Chapter 11, "General MySQL Administration." Chapter 12, "MySQL and Security," provides further details on what privileges are available and what they allow. The syntax for all SQL statements implemented by MySQL is listed in Appendix E, "SQL Syntax Reference." That appendix also covers the syntax for using comments in your SQL statements. In addition, you should consult the MySQL Reference Manual for additional information, especially for changes made in the most recent versions of MySQL.
Table 2.1 Types of SQL Statements Supported by MySQL
Selecting, Creating, Dropping, and Altering Databases
USE
CREATE DATABASE
DROP DATABASE
ALTER DATABASE
Creating, Altering, and Dropping Tables and Indexes
CREATE TABLE
DROP TABLE
CREATE INDEX
DROP INDEX
ALTER TABLE
Getting Information About Databases and Tables
DESCRIBE
SHOW
Retrieving Information from Tables
SELECT
UNION
Performing Transactions
SET autocommit
START TRANSACTION
COMMIT
ROLLBACK
Modifying Information in Tables
DELETE
INSERT
LOAD DATA
REPLACE
UPDATE
Administrative Statements
FLUSH
GRANT
REVOKE
MySQL Naming Rules
Almost every SQL statement refers in some way to a database or its constituent elements. This section describes the syntax and case sensitivity rules for identifiers that refer to databases, tables, columns, indexes, and aliases.
Referring to Elements of Databases
When you use identifiers to refer to elements of databases, you are constrained by the characters you can use and the length that identifiers can be. The format of identifiers also depends on the context in which you use them. Another factor that affects naming rules is that the server can be configured to use different SQL modes.
-
Legal characters in identifiers. Unquoted identifiers can consist of any alphanumeric characters in the system default character set (utf8), plus the characters '_' and '$'. Identifiers can start with any character that is legal in an identifier, including a digit. However, an identifier cannot consist entirely of digits because that would make it indistinguishable from a number. MySQL's support for identifiers that begin with a number is somewhat unusual among database systems. If you use such an identifier, be particularly careful if it contains an 'E' or 'e' because those characters can lead to ambiguous expressions. For example, the expression 23e + 14 (with spaces surrounding the '+' sign) means column 23e plus the number 14, but what about 23e+14? Does it mean the same thing, or is it a number in scientific notation? You should also be careful about using identifiers such as 0x1020 that begin with 0x because they might be interpreted as hexadecimal constants.
Identifiers can be quoted (delimited) within backtick characters ('´'), which allows use of any character except backtick or a byte with value 0 or 255:
CREATE TABLE ´my table´ (´my column´ INT);
Quoting is useful when an identifier is an SQL keyword or contains spaces or other special characters. Quoting an identifier also allows it to be entirely numeric, something that is not true of unquoted identifiers. To include an identifier quote within a quoted identifier, double it.
For database and table identifiers, there are two additional constraints, even for identifiers that are quoted. First, you cannot use the '.' character, because it is used as the separator character in qualified name notation of the forms db_name.tbl_name and db_name.tbl_name.col_name. Second, you cannot use the Unix or Windows pathname separator characters ('/' or '\'). The pathname separator is disallowed in database and table identifiers because databases are represented on disk by directories, and tables are represented on disk by at least one file. Consequently, these types of identifiers must contain only characters that are legal in directory names and filenames. The Unix pathname separator is disallowed on Windows (and vice versa) to make it easier to transfer databases and tables between servers running on different platforms. (Suppose that you were allowed to use a slash in a table name on Windows. That would make it impossible to move the table to Unix, because filenames on that platform cannot contain slashes.)
Your operating system might impose additional constraints on database and table identifiers. See "Operating System Constraints on Database and Table Naming," in Chapter 10, "The MySQL Data Directory."
Column and table aliases can be fairly arbitrary. You should quote an alias within identifier quoting characters if it is an SQL keyword, is entirely numeric, or contains spaces or other special characters. Column aliases also can be quoted with single quotes or double quotes.
Server SQL mode. If the ANSI_QUOTES SQL mode is enabled, you can quote identifiers with double quotes (although backticks still are allowable).
CREATE TABLE "my table" ("my column" INT);
Note: Enabling ANSI_QUOTES has the additional effect that string literals must be written using single quotes. If you use double quotes, the server will interpret the value as an identifier, not as a string.
Function names normally are not reserved and can be used as identifiers without quotes. However, if the IGNORE_SPACES SQL mode is enabled, function names become reserved and must be quoted if used as identifiers.
For instructions on setting the SQL mode, see "The Server SQL Mode" later in this chapter.
Identifier length. Identifiers for databases, tables, columns, and indexes can be up to 64 characters long. Identifiers are stored using utf8 characters. (Before MySQL 4.1.5, the maximum identifier length is 64 bytes, not characters. Because utf8 characters take from one to three bytes each, the effective maximum identifier length is less than 64 characters if you use multi-byte characters.) Aliases can be up to 256 characters long.
Identifier qualifiers. Depending on context, an identifier might need to be qualified to make clear what it refers to. To refer to a database, just specify its name:
USE db_name; SHOW TABLES FROM db_name;
To refer to a table, you have two choices:
-
A fully qualified table name consists of a database identifier and a table identifier:
SHOW COLUMNS FROM db_name.tbl_name; SELECT * FROM db_name.tbl_name;
A table identifier by itself refers to a table in the default (current) database. If sampdb is the default database, the following statements are equivalent:
SELECT * FROM member; SELECT * FROM sampdb.member;
If no database has been selected, you cannot refer to a table without specifying a database qualifier because the server cannot tell which database the table belongs to.
To refer to a column, there are three choices: fully qualified, partially qualified, and unqualified.
-
A name written as db_name.tbl_name.col_name is fully qualified.
-
A partially qualified name written as tbl_name.col_name refers to a column in the named table in the default database.
-
An unqualified name written simply as col_name refers to whatever table is indicated by the surrounding context. The following two queries use the same column names, but the context supplied by the FROM clause of each statement indicates which table to select the columns from:
SELECT last_name, first_name FROM president; SELECT last_name, first_name FROM members;
It's usually unnecessary to supply fully qualified names, although it's always legal to do so if you like. If you select a database with a USE statement, that database becomes the default database and is implicit in every unqualified table reference. If you're using a SELECT statement that refers to only one table, that table is implicit for every column reference in the statement. It's necessary to qualify identifiers only when a table or database cannot be determined from context. For example, if a statement refers to tables from multiple databases, any table not in the default database must be referenced using the db_name.tbl_name form to let MySQL know which database contains the table. Similarly, if a query uses multiple tables and refers to a column name that is used in more than one table, it's necessary to qualify the column identifier with a table identifier to make it clear which column you mean.
If you use quotes when referring to a qualified name, quote individual parts of the name separately. For example:
SELECT * FROM ´sampdb´.´member´ WHERE ´sampdb´.´member´.´member_id´ > 100;
Do not quote the name as a whole. This statement is illegal:
SELECT * FROM ´sampdb.member´ WHERE ´sampdb.member.member_id´ > 100;
Case Sensitivity in SQL Statements
Case sensitivity rules in SQL statements vary for different parts of the statement, and also depend on what you are referring to and the operating system of the machine on which the server is running:
-
SQL keywords and function names. Keywords and function names are not case sensitive. They can be given in any lettercase. The following statements are equivalent:
SELECT NOW(); select now(); sElEcT nOw();
Database and table names. MySQL represents databases and tables using directories and files in the underlying filesystem on the server host. As a result, the default case sensitivity of database and table names depends on the way the operating system on that host treats filenames. Windows filenames are not case sensitive, so a server running on Windows does not treat database and table names as case sensitive. Servers running on Unix usually treat database and table names as case sensitive because Unix filenames are case sensitive. An exception is that names in HFS+ filesystems under Mac OS X are not case sensitive.
You should consider lettercase issues when you create databases and tables on a machine with case sensitive filenames if it is possible that you will someday move them to a machine where filenames are not case sensitive. Suppose that you create two tables named abc and ABC on a Unix server where those names are treated differently. You would have problems moving the tables to a Windows machine. abc and ABC would not be distinguishable there because names are not case sensitive. You would also have trouble replicating the tables from a Unix master server to a Windows slave server.
One way to avoid having case sensitivity become an issue is to pick a given lettercase and always create databases and tables using names in that lettercase. Then case of names won't be a problem if you move a database to a different server. I recommend using lowercase. This will help also if you are using InnoDB tables, because InnoDB stores database and table names internally in lowercase.
To force databases and tables to be created with lowercase names even if not specified that way in CREATE statements, configure the server by setting the lower_case_table_names system variable. See "Operating System Constraints on Database and Table Naming," in Chapter 10.
Column and index names. Column and index names are not case sensitive in MySQL. The following queries are equivalent:
SELECT name FROM student; SELECT NAME FROM student; SELECT nAmE FROM student;
Alias names. By default, table aliases are case sensitive. You can specify an alias in any lettercase (upper, lower, or mixed), but if you use it multiple times in a statement, you must use the same lettercase each time. If the lower_case_table_names variable is non-zero, table aliases are not case sensitive.
String values. Case sensitivity of a string value depends on whether it is a binary or non-binary string, and, for a non-binary string, on the collation of its character set. This is true for literal strings and the contents of string columns. For further information, see "String Values," in Chapter 3, "Working with Data in MySQL."
Regardless of whether a database or table name is case sensitive on your system, you must refer to it using the same lettercase throughout a given query. That is not true for SQL keywords, function names, or column and index names, all of which may be referred to in varying lettercase style throughout a query. However, the query will be more readable if you use a consistent lettercase rather than "ransom note" style (SelECt NamE FrOm ...).