- Sams Teach Yourself SQL in 24 Hours, Third Edition
Other Performance Considerations
Several other performance considerations are worth noting when you tune SQL statements. The following concepts are discussed in the next sections:
- Using the LIKE operator and wildcards
- Avoiding the OR operator
- Avoiding the HAVING clause
- Avoiding large sort operations
- Using stored procedures
- Disabling indexes during batch loads
Using the LIKE Operator and Wildcards
The LIKE operator is a useful tool for placing conditions on a query in a flexible manner. The placement of wildcards in the search pattern determines how many candidate rows the database can eliminate before it returns data. Wildcards are well suited to queries that search for similar data (data that is not equivalent to an exact value specified).
Suppose you want to write a query against EMPLOYEE_TBL that selects the EMP_ID, LAST_NAME, FIRST_NAME, and STATE columns. You need to know the employee identification, name, and state for all employees with the last name Stevens. The following three SQL statements use different wildcard placements.
QUERY1:
SELECT EMP_ID, LAST_NAME, FIRST_NAME, STATE
FROM EMPLOYEE_TBL
WHERE LAST_NAME LIKE '%E%';
QUERY2:
SELECT EMP_ID, LAST_NAME, FIRST_NAME, STATE
FROM EMPLOYEE_TBL
WHERE LAST_NAME LIKE '%EVENS%';
QUERY3:
SELECT EMP_ID, LAST_NAME, FIRST_NAME, STATE
FROM EMPLOYEE_TBL
WHERE LAST_NAME LIKE 'ST%';
The SQL statements do not necessarily return the same results. More than likely, QUERY1 returns more rows than the other two queries. QUERY2 and QUERY3 are more specific about the data desired, so they eliminate more possibilities than QUERY1 and speed data retrieval. Additionally, QUERY3 is probably faster than QUERY2 because it specifies the first letters of the string you are searching for, and the LAST_NAME column is likely to be indexed. A pattern that starts with literal characters lets QUERY3 take advantage of that index, whereas the leading wildcard in QUERY1 and QUERY2 generally prevents the index from being used.
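To see how wildcard placement changes the number of rows returned, the three queries can be run against a small test table. The following is only a minimal sketch: it assumes Python's built-in sqlite3 module as a stand-in for your own implementation, and the six sample rows are invented for this illustration (they are not the book's data).

```python
import sqlite3

# In-memory SQLite database standing in for the reader's own implementation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE_TBL ("
    "EMP_ID TEXT, LAST_NAME TEXT, FIRST_NAME TEXT, STATE TEXT)"
)

# Hypothetical sample rows, invented for this illustration only.
sample_rows = [
    ("1", "STEVENS", "TINA", "IN"),
    ("2", "STEPHENS", "MARK", "IN"),
    ("3", "STALLINGS", "ANN", "OH"),
    ("4", "EVENS", "JOE", "IL"),
    ("5", "GLASS", "BRANDON", "IN"),
    ("6", "WALLACE", "MARIAH", "IN"),
]
conn.executemany("INSERT INTO EMPLOYEE_TBL VALUES (?, ?, ?, ?)", sample_rows)

# The three wildcard placements from QUERY1, QUERY2, and QUERY3.
patterns = {"QUERY1": "%E%", "QUERY2": "%EVENS%", "QUERY3": "ST%"}

row_counts = {}
for name, pattern in patterns.items():
    rows = conn.execute(
        "SELECT EMP_ID, LAST_NAME, FIRST_NAME, STATE "
        "FROM EMPLOYEE_TBL WHERE LAST_NAME LIKE ?",
        (pattern,),
    ).fetchall()
    row_counts[name] = len(rows)
    # Show which last names each pattern matched.
    print(name, pattern, [row[1] for row in rows])
```

With these sample rows, QUERY1 matches four last names, QUERY2 matches two, and QUERY3 matches three. The sketch compares only row counts; whether an index is actually used depends on your implementation, so check its query-plan tool before drawing conclusions about speed.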
Avoiding the OR Operator
Rewriting a SQL statement to use the IN predicate instead of the OR operator can improve data retrieval speed, although any gain depends on your implementation and its optimizer. Your implementation's documentation describes the tools you can use to time or compare the performance of the OR operator and the IN predicate. The following example shows how to rewrite a SQL statement by replacing the OR operator with the IN predicate.
The following is a query using the OR operator:
SELECT EMP_ID, LAST_NAME, FIRST_NAME FROM EMPLOYEE_TBL WHERE CITY = 'INDIANAPOLIS' OR CITY = 'BROWNSBURG' OR CITY = 'GREENFIELD';
The following is the same query using the IN predicate:
SELECT EMP_ID, LAST_NAME, FIRST_NAME FROM EMPLOYEE_TBL WHERE CITY IN ('INDIANAPOLIS', 'BROWNSBURG', 'GREENFIELD');
The SQL statements retrieve the very same data; however, through testing and experience, you may find that data retrieval is somewhat faster when OR conditions are replaced with the IN predicate, as in the second query.
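Before comparing speed, it is worth confirming that a rewritten query really does return the same rows. The following is a minimal sketch of that check, again assuming Python's built-in sqlite3 module in place of your own implementation; the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the reader's own implementation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE_TBL ("
    "EMP_ID TEXT, LAST_NAME TEXT, FIRST_NAME TEXT, CITY TEXT)"
)

# Hypothetical sample rows, invented for this illustration only.
sample_rows = [
    ("1", "STEVENS", "TINA", "GREENWOOD"),
    ("2", "PLEW", "LINDA", "INDIANAPOLIS"),
    ("3", "GLASS", "BRANDON", "WHITELAND"),
    ("4", "WALLACE", "MARIAH", "BROWNSBURG"),
    ("5", "SPURGEON", "TIFFANY", "GREENFIELD"),
]
conn.executemany("INSERT INTO EMPLOYEE_TBL VALUES (?, ?, ?, ?)", sample_rows)

or_query = """
    SELECT EMP_ID, LAST_NAME, FIRST_NAME
    FROM EMPLOYEE_TBL
    WHERE CITY = 'INDIANAPOLIS'
       OR CITY = 'BROWNSBURG'
       OR CITY = 'GREENFIELD'
"""

in_query = """
    SELECT EMP_ID, LAST_NAME, FIRST_NAME
    FROM EMPLOYEE_TBL
    WHERE CITY IN ('INDIANAPOLIS', 'BROWNSBURG', 'GREENFIELD')
"""

# Sort both result sets so the comparison ignores row order.
or_rows = sorted(conn.execute(or_query).fetchall())
in_rows = sorted(conn.execute(in_query).fetchall())
same_result = or_rows == in_rows

print("OR rows:", or_rows)
print("IN rows:", in_rows)
print("Same result:", same_result)
```

Once both forms are known to be equivalent, time each one with the tools your implementation provides. A table this small cannot show a meaningful difference, so run any timing test against realistic data volumes.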
Avoiding the HAVING Clause
The HAVING clause is useful; however, it does not come without cost. Using the HAVING clause gives the database extra work, because groups must be formed before they can be filtered, and that extra work takes extra time. If possible, write SQL statements without the HAVING clause.
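One common case in which HAVING can be avoided is a condition that tests a grouping column rather than an aggregate. Such a condition can be moved to the WHERE clause so that rows are filtered before they are grouped. The following is a minimal sketch of that rewrite, assuming Python's built-in sqlite3 module and hypothetical sample rows; it illustrates one possible rewrite and does not apply to every HAVING clause.

```python
import sqlite3

# In-memory SQLite database standing in for the reader's own implementation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (EMP_ID TEXT, CITY TEXT)")

# Hypothetical sample rows, invented for this illustration only.
sample_rows = [
    ("1", "INDIANAPOLIS"),
    ("2", "INDIANAPOLIS"),
    ("3", "BROWNSBURG"),
    ("4", "GREENFIELD"),
]
conn.executemany("INSERT INTO EMPLOYEE_TBL VALUES (?, ?)", sample_rows)

# Every city is grouped first, then all groups but one are discarded.
having_query = """
    SELECT CITY, COUNT(*)
    FROM EMPLOYEE_TBL
    GROUP BY CITY
    HAVING CITY = 'INDIANAPOLIS'
"""

# Rows are filtered first, so only one city is ever grouped.
where_query = """
    SELECT CITY, COUNT(*)
    FROM EMPLOYEE_TBL
    WHERE CITY = 'INDIANAPOLIS'
    GROUP BY CITY
"""

having_rows = conn.execute(having_query).fetchall()
where_rows = conn.execute(where_query).fetchall()

print("HAVING version:", having_rows)
print("WHERE version: ", where_rows)
```

Both versions return a single row showing two employees in INDIANAPOLIS. A condition on an aggregate, such as COUNT(*) > 1, cannot be moved to the WHERE clause this way, so HAVING remains necessary in those cases.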
Avoiding Large Sort Operations
Large sort operations result from the use of the ORDER BY, GROUP BY, and HAVING clauses. Whenever a sort is performed, subsets of data must be held in memory, or written to disk if the allotted memory is not large enough. Sorting data is often unavoidable, but the main point is that sort operations add to a SQL statement's response time. Because large sorts cannot always be avoided, it is best to schedule queries that perform them as periodic batch processes during off-peak database usage so that the performance of most user processes is not affected.
Using Stored Procedures
Stored procedures should be created for SQL statements executed on a regular basis—particularly large transactions or queries. Stored procedures are simply SQL statements that are compiled and permanently stored in the database in an executable format.
Normally, when a SQL statement is issued in the database, the database must check the syntax and convert the statement into an executable format within the database (called parsing). The statement, once parsed, is stored in memory; however, it is not permanent. This means that when memory is needed for other operations, the statement may be ejected from memory. In the case of stored procedures, the SQL statement is always available in an executable format and remains in the database until it is dropped like any other database object. Stored procedures are discussed in more detail in Hour 22, "Advanced SQL Topics."
Disabling Indexes During Batch Loads
When a user submits a transaction to the database (INSERT, UPDATE, or DELETE), an entry is made to both the database table and any indexes associated with the table being modified. This means that if there is an index on EMPLOYEE_TBL and a user updates EMPLOYEE_TBL, an update also occurs to the associated index. In a transactional environment, the fact that a write to an index occurs every time a write to the table occurs is usually not an issue.
During batch loads, however, an index can cause serious performance degradation. A batch load may consist of hundreds, thousands, or millions of manipulation statements or transactions. Because of their volume, batch loads take a long time to complete and are normally scheduled during off-peak hours, usually evenings or weekends. To optimize performance during a batch load (which might, for example, cut the time the load takes from 12 hours to 6 hours), it is recommended that you drop the indexes associated with the table affected by the load. With the indexes dropped, changes are written to the table much faster, so the job completes sooner. When the batch load is complete, rebuild the indexes; the rebuild populates them with all the appropriate data from the table. Although it may take a while to create an index on a large table, the overall time expended is usually less if you drop the index and rebuild it.
Another advantage of rebuilding an index after a batch load completes is the reduction of fragmentation in the index. As a database grows and records are added, removed, and updated, fragmentation can occur. For any database that experiences a lot of growth, it is a good idea to periodically drop and rebuild large indexes. When an index is rebuilt, the number of physical extents that make up the index is decreased, there is less disk I/O involved to read the index, the user gets results faster, and everyone is happy.
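The drop, load, and rebuild sequence described in this section can be sketched as follows. This is only an illustration: it assumes Python's built-in sqlite3 module in place of your own implementation, and the index name EMP_LAST_NAME_IDX, the helper function index_names, and the generated rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the reader's own implementation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (EMP_ID TEXT, LAST_NAME TEXT)")
# Hypothetical index name, chosen for this illustration only.
conn.execute("CREATE INDEX EMP_LAST_NAME_IDX ON EMPLOYEE_TBL (LAST_NAME)")


def index_names(connection):
    """Return the names of the indexes currently defined on EMPLOYEE_TBL."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'EMPLOYEE_TBL'"
    ).fetchall()
    return [row[0] for row in rows]


# Step 1: drop the index so that each insert writes only to the table.
conn.execute("DROP INDEX EMP_LAST_NAME_IDX")
indexes_during_load = index_names(conn)

# Step 2: run the batch load inside a single transaction.
# The rows are generated placeholders, not real data.
batch = [(str(i), f"NAME{i}") for i in range(10000)]
with conn:
    conn.executemany("INSERT INTO EMPLOYEE_TBL VALUES (?, ?)", batch)

# Step 3: rebuild the index; it is populated from the loaded rows in one pass.
conn.execute("CREATE INDEX EMP_LAST_NAME_IDX ON EMPLOYEE_TBL (LAST_NAME)")
indexes_after_load = index_names(conn)

row_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE_TBL").fetchone()[0]
print("Indexes during load:", indexes_during_load)
print("Indexes after load: ", indexes_after_load)
print("Rows loaded:", row_count)
```

The sketch shows only the order of the steps. Whether the sequence saves time on your system depends on the size of the load and on your implementation, so time the load both ways before adopting it for a production job.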