What Is MySQL?
MySQL is a database engine that supports the SQL database query language. SQL is a standardized way of talking to databases of any sort, regardless of the underlying methods of saving and retrieving data.
SQL is pronounced multiple ways. I have seen colleges teach it pronounced sequel, and I have read books written by early pioneers of databases insist that it is pronounced ess-queue-ell.
I tend to pronounce it ess-queue-ell, because the MySQL creators pronounce each letter. I have found that everyone knows what I mean when I pronounce the letters. Some give me questioning looks when I say sequel.
MySQL saves the database in files on your hard drive. For best performance, these files must exist on a local hard drive. If you share drives across a network and place your database files on those drives, you will pay a significant performance penalty.
For your IMP project, you should put the database files on the same machine as the MySQL server. The load that IMP puts on the database is very small, but existing network traffic can cause remote drive access to be very slow.
MySQL can support large databases. The creators of MySQL are using databases as large as 50 million records. A maximum file size limit under Linux's ext2 file system is about 2GB. Therefore, each database is limited to 2GB when running on that file system. Future file systems for Linux will probably lift this limitation.
With your project, the 2GB limit allows for several tens of thousands of users. This is because of a good database design. IMP is well thought out in terms of data storage requirements.
MySQL Description
MySQL is a database server. It is fully multithreaded using kernel threads.
The term multithreaded refers to the capability to divide a job into small pieces to work on. Each piece is called a thread. Each thread can operate independently of other threads. When an application uses kernel threads on a multiple CPU machine, it can put some work off onto other CPUs for simultaneous execution. The term CPU is a brief way of referring to the processor, such as an Intel Pentium processor or an AMD K6 processor, the main computation unit in your computer.
MySQL has a robust API set supporting multiple programming languages. The languages supported are C, C++, Eiffel, Java, Perl, PHP, Python, and TCL. IMP uses the PHP API to communicate with MySQL.
API is short for Application Programming Interface. The MySQL API provides a short list of routines you can call within a program to talk to the database to save data to it or retrieve information from it. An API can be written for any type of server or operating system. For instance, the Apache Web server has a set of APIs for people who write Apache modules.
MySQL works on multiple platforms. The operating systems supported so far are as follows:
- AIX 4.x
- BSDI 2.x
- BSDI 3.0, 3.1, and 4.x
- DEC UNIX 4.x
- FreeBSD 2.x
- FreeBSD 3.x
- HP-UX 10.20
- HP-UX 11.x
- Linux 2.0+
- NetBSD 1.3/1.4 Intel and NetBSD 1.3 Alpha
- OpenBSD
- OS/2 Warp 3
- OS/2 Warp 4
- SGI Irix 6.x
- Solaris 2.5, 2.6, and 2.7 on SPARC and x86
- SunOS 4.x
- SCO OpenServer
- SCO UnixWare 7.0.1
- Tru64 Unix
- Win95
- Win98
- NT
Note
The Win95, Win98, and NT versions of MySQL version 3.22 require licenses for use. A shareware version of MySQL (version 3.22.30) has been released for a "try before buy" experience. MySQL 3.23 does not have these restrictions because it is released under the GPL license.
Read your Microsoft license agreement carefully. You might be prohibited from running MySQL on Win95, Win98, and NT workstation if more than 10 users are able to connect simultaneously.
Because IMP uses MySQL, PHP, and JavaScript, it can be installed on the platforms that support both MySQL and PHP running under a Web server. The user can be on any machine that hosts a browser capable of running JavaScript.
MySQL supports many data types. The supported data types include signed and unsigned integers in lengths of 1, 2, 3, 4, and 8 bytes. Other data types supported are floating-point numbers, variable character data, text fields, and binary data sets (called BLOBs). Also supported are date, time, and year fields. Two interesting data types supported are SET and ENUM fields. These types enable you to create lists of data that are stored in fields in the database.
You can have up to 16 indexes per table. Indexes are special lookup tables that MySQL maintains. Indexes enable you to get to data without having to do a row by row search for the data. In a large database, indexes can mean the difference between a very fast application and a disaster. You can use up to 16 columns or parts of columns for each index. Normally, MySQL uses a length of 256 bytes for each index. When creating a table, you can explicitly specify an index using fewer bytes.
All columns are created with default values with MySQL. You can use INSERT to insert a subset of values into columns. The columns you set to Null or leave out of the INSERT are set to their default values.
MySQL enables you to join tables into pseudo tables for efficient lookup of data. This is done using what is described as "an optimized one-sweep multijoin."
Under MySQL, the SQL statements SELECT and WHERE can have mathematical operators. They can also include functions. You can also mix tables from different databases in the same query.
If not being run using the --ansi flag, MySQL treats the || symbol as an OR, rather than the ANSI SQL concatenate feature. Instead, you use the CONCAT() function.
Here is an example to demonstrate some of the elements I have described. To print the first, middle, and last names for all people with monthly incomes greater than $3,000.00 and older than 29 years of age you could run this statement from the mysql command-line utility:
mysql> SELECT CONCAT(FirstName," ",MiddleI," ",LastName) mysql> FROM TheNameTable WHERE AnnualIncome/12 > 3000 AND Age >= 30;
MySQL fully supports SQL GROUP BY and ORDER BY clauses. The group functions AVG(), COUNT(), COUNT(DISTINCT), MAX(), MIN(), STD(), and SUM() are supported. Also included is ODBC and ANSI SQL syntax for LEFT OUTER JOIN.
MySQL has a privilege and password system based on system tables. This system is very flexible and requires some thought to implement more than simple rules. You can allow only certain users from certain machines to connect. The allowed users can be denied connection from other machines. Certain machines can be denied altogether, including the local machine. It is possible to lock your running programs out of the MySQL database if you get the permissions wrong.
Tip
You might find you have locked yourself out of MySQL during installation of IMP. I have done that myself. Don't worry! There is a way to get into the MySQL database, even if you have locked yourself out using the permissions tables.
If you start the mysqld server using the --skip-grant-tables command-line option, everyone is allowed access to all tables without passwords. This lets you fix the grant tables in the mysql database. You then restart the server to apply the new permissions.
MySQL fully supports the ISO-8859-1 Latin1 character set. This allows you to insert foreign language character sets into the database or into table names. All comparisons for normal string columns are not case sensitive. The sort order is also in the ISO-8859-1 Latin1 character set, although you can change that.
A few miscellaneous features are worth mentioning. MySQL also supports aliases on tables and columns. The row affecting commands (DELETE, INSERT, REPLACE, and UPDATE) returns the number of rows affected. You can also name a table the same name as a function. This requires that you not put a space between the function name and the parenthesis that follows it. For example: ABS() will work and ABS () will not work, if you have a table or column named ABS. Finally, MySQL can return error messages in many languages.
Excursion
You Should Not Use Confusing Features
Even though MySQL enables you to use function names as table names, you should not use this feature unless absolutely necessary. I have strong opinions about this from my 20 plus years of experience. Do not use confusing features!
I saw an email tagline that I loved. It said "eschew obfuscation." That is proper English. I had to use a dictionary to understand it. The term means, in its simplest form, "avoid making things hard to understand." The colloquial term is "Keep It Simple, Stupid" (KISS).
If I had not explained the above tagline, most of the population would not have understood it without going to a dictionary. When you design tables and databases and write programs, make them as simple and understandable as possible. This will help you months or years later when you must make changes.
With the power available to today's computer, it is reasonable to make the computer do most of the work, not the programmer. When you optimize for speed, only 10% of your program will need to be optimized. So, make everything you do crystal clear, even if there is another, more clever way to do it. Clever ways have a tendency to be hard to understand when a problem occurs and you are under pressure to fix the problem.
As you examine the IMP database structure, you will find the creators of IMP followed this tenet. The IMP database is a very simple database.
How It Works
The MySQL database server consists of a daemon that waits on a predetermined TCP/IP port for a client request. When a request comes in on that port, Linux runs the MySQL daemon.
A daemon is the name of a program that runs in the background. The Linux operating system has a program scheduler that checks to see which programs need to run for a little bit of time. This time period is typically in slices of 1/100th of a second (also called 10 milliseconds). That means that Linux checks for what software needs to execute 100 times a second.
How to Pronounce it
Reality is different from academia. I suspect that daemon needs to be pronounced with a long a sound, but every use of the word I have heard uses a long e sound. The dictionary backs this up, proclaiming daemon is a variant of demon, and gives the same pronunciation.
When a program needs to run, it is given a minimum of 10 milliseconds to execute. A well-designed 300MHz-Pentium system can execute around 100,000 computer instructions in 10 milliseconds.
Daemons are not allowed to run unless they need to do something. The process list is the list of daemons that the operating system has in its scheduler. The ps command gives information about the list of processes.
A TCP/IP port is a number tacked onto the end of a TCP/IP address. A real-world analogy is a suite number at an office building. If your mail is addressed to 1111 Office Blvd., Suite 3303, the mail is first delivered to your building. Then it is delivered to suite 3303. The 1111 is analogous to the TCP/IP address, and the suite is analogous to the port number. You have 65,536 possible ports available at every TCP/IP address under the IPV4 protocol.
After the MySQL daemon is executing, it holds a small dialog with the client. The client is authenticated against the MySQL authentication database. If the client is authorized, MySQL handles the request and feeds the requested data to the client. To free up the initial port to accept further requests, MySQL and the client program negotiate another port number on which to continue their conversation. After that port number is in use, the original port number is available for another connection.
You can use the ps command to determine if MySQL is running. The command ps ax | grep mysql should show you four processes running under Linux. The first one shown is the safe_mysql script that starts the server. One process shown is the one used by the Linux thread manager. Another process is to service connections from clients. The last process is to handle alarms and signals.
Whenever a program requests services from MySQL, it contacts MySQL through the network software layer, even if no physical network connection exists to the computer. When the connection stays on the local machine, it is often done through Unix sockets, which don't go through the network card drivers. The requesting program is called a client. The program being asked for a service is called a server.
MySQL reads and writes files to store its data. This means that copying all the data files to another storage device will totally back up your database. Note that the MySQL server must be shut down to do this. A raw disk storage system is under development. This storage system would not store data in files, and would require another backup method.
For your project, the simplest approach is to use a cron job to shut down the mysqld server in the early hours, copy the database file to the backup device, and then restart the mysqld server. The best way to shut down mysqld with cron is to use the system init script as follows:
/etc/rc.d/init.d/mysql stop
What It Does for You
MySQL hides the mechanics of the database storage mechanism from you. You interface to the database through the standardized SQL interface. There are very few SQL commands to learn. After you learn these commands, you can write database query applications that work regardless of the underlying database.
If you are careful to use only the SQL standard commands, the portability of your application is very high. I have used my knowledge of SQL to query databases from Informix, Oracle, Microsoft, and MySQL without having to worry about which database was running at the time.
All database engines extend the SQL standard slightly. If you study the documented extensions, you can make notes of what to avoid if portability is your main concern. Otherwise, the extensions are there to make your life easier. Don't be afraid to use them if portability is not your main goal.
Getting Online Help for MySQL
Every application in the MySQL suite responds to the --help command-line syntax. For example, running the command mysql --help prints out the list of command-line switches available to you.
Help is also available through the World Wide Web. If you go to http://www.mysql.com, you will find a documentation link. This link takes you to online Web pages that provide full documentation for MySQL. The Web pages are a bit terse at times, but every time I have checked on something the information there is correct.
The MySQL Web site also has links to several mailing lists. These mailing lists include people who understand MySQL, along with people just like you who are learning more about MySQL. These are an excellent source of help.