- Sams Teach Yourself SQL in 24 Hours, Third Edition
The GROUP BY Clause
The GROUP BY clause is used in collaboration with the SELECT statement to arrange identical data into groups. The GROUP BY clause follows the WHERE clause in a SELECT statement and precedes the ORDER BY clause.
The position of the GROUP BY clause in a query is as follows:
SELECT
FROM
WHERE
GROUP BY
ORDER BY
The GROUP BY clause must follow the conditions in the WHERE clause and must precede the ORDER BY clause if one is used.
The following is the SELECT statement's syntax, including the GROUP BY clause:
SELECT COLUMN1, COLUMN2
FROM TABLE1, TABLE2
WHERE CONDITIONS
GROUP BY COLUMN1, COLUMN2
ORDER BY COLUMN1, COLUMN2
The following sections give examples and explanations of the GROUP BY clause's use in a variety of situations.
Grouping Selected Data
Grouping data is a simple process. The selected columns (the column list following the SELECT keyword in a query) are the columns you normally reference in the GROUP BY clause. Most implementations also let you group by a column that is not in the select list, but the result is hard to read: how can you interpret groups on a report if the grouping value is not displayed?
If the column name has been qualified, the qualified name must go into the GROUP BY clause. The column name can also be represented by a number, which is discussed later in this hour. When grouping the data, the order of columns grouped does not have to match the column order in the SELECT clause.
Group Functions
Typical group functions (those that are used with the GROUP BY clause to arrange data in groups) include AVG, MAX, MIN, SUM, and COUNT. These are the aggregate functions that you learned about during Hour 9, "Summarizing Data Results from a Query." In Hour 9, each aggregate function returned a single value for the whole set of selected rows; now, you use the aggregate functions to return one value for each group.
Creating Groups and Using Aggregate Functions
The SELECT clause must meet certain conditions when you use GROUP BY. Specifically, every selected column must appear in the GROUP BY clause unless it is the argument of an aggregate function. The columns in the GROUP BY clause do not have to be in the same order as they appear in the SELECT clause. If the columns in the SELECT clause are qualified, the qualified names must be used in the GROUP BY clause. The following are some examples of syntax for the GROUP BY clause:
Example
SELECT EMP_ID, CITY
FROM EMPLOYEE_TBL
GROUP BY CITY, EMP_ID;
This SQL statement selects the EMP_ID and the CITY from the EMPLOYEE_TBL and groups the data returned by CITY and then EMP_ID.
Example
SELECT EMP_ID, SUM(SALARY)
FROM EMPLOYEE_PAY_TBL
GROUP BY SALARY, EMP_ID;
This SQL statement groups the rows by SALARY and then EMP_ID, and returns the EMP_ID and the salary total for each group.
Example
SELECT SUM(SALARY)
FROM EMPLOYEE_PAY_TBL;
This SQL statement returns the total of all the salaries from the EMPLOYEE_PAY_TBL.
Example
SELECT SUM(SALARY)
FROM EMPLOYEE_PAY_TBL
GROUP BY SALARY;
This SQL statement returns one total for each distinct salary value.
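The difference between the last two syntax examples (a single total versus one total per group) can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: SQLite is not the implementation used in this book, and the four rows below are hypothetical, not the real contents of EMPLOYEE_PAY_TBL. The per-group query also selects SALARY so that each total is labeled.

```python
import sqlite3

# Hypothetical rows (EMP_ID, SALARY); the real table contents are not shown here.
rows = [(1, 30000), (2, 40000), (3, 20000), (4, 40000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_PAY_TBL (EMP_ID INTEGER, SALARY INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE_PAY_TBL VALUES (?, ?)", rows)

# No GROUP BY: the whole table is treated as one group, so one row comes back.
grand_total = conn.execute(
    "SELECT SUM(SALARY) FROM EMPLOYEE_PAY_TBL"
).fetchall()

# GROUP BY SALARY: one row for each distinct salary value.
per_salary = conn.execute(
    "SELECT SALARY, SUM(SALARY) FROM EMPLOYEE_PAY_TBL "
    "GROUP BY SALARY ORDER BY SALARY"
).fetchall()
conn.close()

print(grand_total)  # [(130000,)]
print(per_salary)   # [(20000, 20000), (30000, 30000), (40000, 80000)]
```

The two employees who share the 40000 salary fall into the same group, which is why that group's total is 80000.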
Practical examples using real data follow. In this first example, you can see that there are three distinct cities in the EMPLOYEE_TBL table.
SELECT CITY
FROM EMPLOYEE_TBL;

CITY
-------------
GREENWOOD
INDIANAPOLIS
WHITELAND
INDIANAPOLIS
INDIANAPOLIS
INDIANAPOLIS

6 rows selected.
In the following example, you select the city and a count of all records for each city. You see a count on each of the three distinct cities because you are using a GROUP BY clause.
SELECT CITY, COUNT(*)
FROM EMPLOYEE_TBL
GROUP BY CITY;

CITY           COUNT(*)
-------------- --------
GREENWOOD             1
INDIANAPOLIS          4
WHITELAND             1

3 rows selected.
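As a rough way to see what the grouping step does, the sketch below runs the same query in an in-memory SQLite database and compares it with a plain Python tally. The six CITY values are the ones listed above; using Python and SQLite here is an assumption made for illustration, and the ORDER BY CITY is added only to make the row order deterministic.

```python
import sqlite3
from collections import Counter

# The six CITY values returned by the first query above.
cities = ["GREENWOOD", "INDIANAPOLIS", "WHITELAND",
          "INDIANAPOLIS", "INDIANAPOLIS", "INDIANAPOLIS"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE_TBL (CITY TEXT)")
conn.executemany("INSERT INTO EMPLOYEE_TBL VALUES (?)", [(c,) for c in cities])

# One output row per distinct CITY, with the number of rows in that group.
city_counts = conn.execute(
    "SELECT CITY, COUNT(*) FROM EMPLOYEE_TBL GROUP BY CITY ORDER BY CITY"
).fetchall()
conn.close()

# The same tally done by hand: GROUP BY plus COUNT(*) is a count per key.
manual_counts = Counter(cities)

print(city_counts)  # [('GREENWOOD', 1), ('INDIANAPOLIS', 4), ('WHITELAND', 1)]
```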
The following is a query from a temporary table created based on EMPLOYEE_TBL and EMPLOYEE_PAY_TBL. You will soon learn how to join two tables for a query.
SELECT *
FROM EMP_PAY_TMP;

CITY         LAST_NAM FIRST_NA       PAY_RATE SALARY
------------ -------- ---------- ------------ ------
GREENWOOD    STEPHENS TINA                     30000
INDIANAPOLIS PLEW     LINDA             14.75
WHITELAND    GLASS    BRANDON                  40000
INDIANAPOLIS GLASS    JACOB                    20000
INDIANAPOLIS WALLACE  MARIAH               11
INDIANAPOLIS SPURGEON TIFFANY              15

6 rows selected.
In the following example, you retrieve the average pay rate and salary for each distinct city using the aggregate function AVG. There is no average pay rate for GREENWOOD or WHITELAND because no employees living in those cities are paid hourly, so every PAY_RATE value in those groups is NULL.
SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
FROM EMP_PAY_TMP
GROUP BY CITY;

CITY         AVG(PAY_RATE) AVG(SALARY)
------------ ------------- -----------
GREENWOOD                        30000
INDIANAPOLIS    13.5833333       20000
WHITELAND                        40000

3 rows selected.
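The averages above depend on AVG skipping NULL values: the INDIANAPOLIS pay rate is (14.75 + 11 + 15) / 3, not a sum divided by four rows. The following sketch reproduces this with Python's sqlite3 module, which is an assumption for illustration rather than the book's environment. Only the CITY, PAY_RATE, and SALARY values from the EMP_PAY_TMP listing are loaded, and ORDER BY CITY is added to fix the row order.

```python
import sqlite3

# (CITY, PAY_RATE, SALARY) copied from the EMP_PAY_TMP listing; None is NULL.
rows = [
    ("GREENWOOD", None, 30000),
    ("INDIANAPOLIS", 14.75, None),
    ("WHITELAND", None, 40000),
    ("INDIANAPOLIS", None, 20000),
    ("INDIANAPOLIS", 11, None),
    ("INDIANAPOLIS", 15, None),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP_PAY_TMP (CITY TEXT, PAY_RATE REAL, SALARY REAL)")
conn.executemany("INSERT INTO EMP_PAY_TMP VALUES (?, ?, ?)", rows)

# AVG ignores NULLs; a group whose values are all NULL yields NULL (None).
averages = conn.execute(
    "SELECT CITY, AVG(PAY_RATE), AVG(SALARY) "
    "FROM EMP_PAY_TMP GROUP BY CITY ORDER BY CITY"
).fetchall()
conn.close()

for city, avg_rate, avg_salary in averages:
    print(city, avg_rate, avg_salary)
```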
In the next example, you combine the use of multiple components in a query to return grouped data. You still want to see the average pay rate and salary, but only for INDIANAPOLIS and WHITELAND. You must group the data by CITY because it is the only selected column that is not an aggregate. Lastly, you want to order the report by 2, and then 3, which are the average pay rate and the average salary, respectively. Study the following details and output:
SELECT CITY, AVG(PAY_RATE), AVG(SALARY)
FROM EMP_PAY_TMP
WHERE CITY IN ('INDIANAPOLIS','WHITELAND')
GROUP BY CITY
ORDER BY 2,3;

CITY         AVG(PAY_RATE) AVG(SALARY)
------------ ------------- -----------
INDIANAPOLIS    13.5833333       20000
WHITELAND                        40000
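In this output, non-NULL values are sorted before NULL values; therefore, the record for INDIANAPOLIS was displayed first. Where NULL values fall in a sort depends on the implementation, so yours might list them first instead. GREENWOOD was not selected, but if it were, its record would have been displayed before WHITELAND's record: both cities have a NULL average pay rate, so the second sort column decides, and GREENWOOD's average salary of $30,000 is lower than WHITELAND's $40,000.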
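Because the placement of NULL values varies between implementations, one way to make it explicit is to sort first on whether the value is NULL. The sketch below is an illustration in SQLite through Python's sqlite3 module (an assumption, not the implementation that produced the output above). SQLite places NULLs first in an ascending sort, so the positional ORDER BY 2, 3 gives a different row order there. All three cities are included so that the GREENWOOD case described above is visible.

```python
import sqlite3

# (CITY, PAY_RATE, SALARY) copied from the EMP_PAY_TMP listing; None is NULL.
rows = [
    ("GREENWOOD", None, 30000),
    ("INDIANAPOLIS", 14.75, None),
    ("WHITELAND", None, 40000),
    ("INDIANAPOLIS", None, 20000),
    ("INDIANAPOLIS", 11, None),
    ("INDIANAPOLIS", 15, None),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP_PAY_TMP (CITY TEXT, PAY_RATE REAL, SALARY REAL)")
conn.executemany("INSERT INTO EMP_PAY_TMP VALUES (?, ?, ?)", rows)

base = ("SELECT CITY, AVG(PAY_RATE), AVG(SALARY) "
        "FROM EMP_PAY_TMP GROUP BY CITY ")

# Positional sort as in the text; SQLite puts the NULL averages first.
default_order = [r[0] for r in conn.execute(base + "ORDER BY 2, 3")]

# "IS NULL" evaluates to 0 for a value and 1 for NULL, so sorting on it
# first moves the NULL averages to the end regardless of the default.
explicit_order = [r[0] for r in conn.execute(
    base + "ORDER BY AVG(PAY_RATE) IS NULL, AVG(PAY_RATE), AVG(SALARY)")]
conn.close()

print(default_order)   # ['GREENWOOD', 'WHITELAND', 'INDIANAPOLIS']
print(explicit_order)  # ['INDIANAPOLIS', 'GREENWOOD', 'WHITELAND']
```

In both orderings GREENWOOD comes before WHITELAND, because the tie on the NULL average pay rate is broken by the average salary.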
The last example in this section shows the use of the MAX and MIN aggregate functions with the GROUP BY clause.
SELECT CITY, MAX(PAY_RATE), MIN(SALARY)
FROM EMP_PAY_TMP
GROUP BY CITY;

CITY         MAX(PAY_RATE) MIN(SALARY)
------------ ------------- -----------
GREENWOOD                        30000
INDIANAPOLIS            15       20000
WHITELAND                        40000

3 rows selected.
Representing Column Names with Numbers
Unlike the ORDER BY clause, the GROUP BY clause in standard SQL does not accept an integer in place of a column name. Some implementations, MySQL among them, accept a column position in GROUP BY as an extension, which can be convenient in a compound query whose SELECT statements use different column names. The following is an example of representing column names with numbers:
SELECT EMP_ID, SUM(SALARY)
FROM EMPLOYEE_PAY_TBL
GROUP BY 1
UNION
SELECT EMP_ID, SUM(PAY_RATE)
FROM EMPLOYEE_PAY_TBL
GROUP BY 1;
This SQL statement returns each employee ID with its salary total, together with each employee ID with its pay rate total. The 1 in each GROUP BY clause represents EMP_ID, the first column in the select list; position 2 holds an aggregate, which cannot be grouped on. When using the UNION operator, the results of the two SELECT statements are merged into one result set, but each GROUP BY applies only to the SELECT statement that contains it.