- 10.1 Introduction
- 10.2 Grouping on One Column
- 10.3 Grouping on Two or More Columns
- 10.4 Grouping on Expressions
- 10.5 Grouping of Null Values
- 10.6 Grouping with Sorting
- 10.7 General Rules for the GROUP BY Clause
- 10.8 The GROUP_CONCAT Function
- 10.9 Complex Examples with GROUP BY
- 10.10 Grouping with WITH ROLLUP
- 10.11 Answers
10.2 Grouping on One Column
The simplest form of the GROUP BY clause is the one in which only one column is grouped. Previous chapters gave several examples of statements with such a GROUP BY clause. For the sake of clarity, we show several other examples in this section.
Example 10.1: Get all the different town names from the PLAYERS table.
SELECT TOWN FROM PLAYERS GROUP BY TOWN
The intermediate result from the GROUP BY clause could look similar to this:
TOWN PLAYERNO NAME --------- -------------------------- ---------------------- Stratford {6, 83, 2, 7, 57, 39, 100} {Parmenter, Hope, ...} Midhurst {28} {Collins} Inglewood {44, 8} {Baker, Newcastle} Plymouth {112} {Bailey} Douglas {95} {Miller} Eltham {27, 104} {Collins, Moorman}
Explanation: All rows with the same TOWN form one group. Each row in the intermediate result has one value in the TOWN column, whereas all other columns can contain multiple values. To indicate that these columns are special, the values are placed between brackets. We show those columns in this way for illustrative purposes only; MySQL probably would solve this internally in a different way. Furthermore, these two columns cannot be presented like this because a column that is not grouped is completely omitted from the end result. We return to this topic later in the chapter.
The end result of the statement is:
TOWN --------- Stratford Midhurst Inglewood Plymouth Douglas Eltham
A frequently used term in this particular context is grouping. The GROUP BY clause in the previous statement has one grouping, which consists of only one column—the TOWN column. In this chapter, we sometimes represent this as follows: The result is grouped by [TOWN]. Later in this chapter, we give examples of groupings with multiple columns and GROUP BY clauses consisting of multiple groupings.
The earlier question could be solved more easily by leaving out the GROUP BY clause and adding DISTINCT to the SELECT clause instead (work this out by yourself). Using the GROUP BY clause becomes interesting when we extend the SELECT clause with aggregation functions.
Example 10.2: For each town, find the number of players.
SELECT TOWN, COUNT(*) FROM PLAYERS GROUP BY TOWN
The result is:
TOWN COUNT(*) --------- -------- Stratford 7 Midhurst 1 Inglewood 2 Plymouth 1 Douglas 1 Eltham 2
Explanation: In this statement, the result is grouped by [TOWN]. The COUNT(*) function is now executed against each grouped row (for each town) instead of against all rows.
In this result, the data is clearly aggregated. The individual data of players cannot be displayed anymore, and the data is aggregated by TOWN. The aggregation level of this result is TOWN.
Example 10.3: For each team, get the team number, the number of matches that has been played for that team, and the total number of sets won.
SELECT TEAMNO, COUNT(*), SUM(WON) FROM MATCHES GROUP BY TEAMNO
The result is:
TEAMNO COUNT(*) SUM(WON) ------ -------- -------- 1 8 15 2 5 9
Explanation: This statement contains one grouping, consisting of the TEAMNO column.
Example 10.4: For each team that is captained by a player resident in Eltham, get the team number and number of matches that has been played for that team.
SELECT TEAMNO, COUNT(*) FROM MATCHES WHERE TEAMNO IN (SELECT TEAMNO FROM TEAMS INNER JOIN PLAYERS ON TEAMS.PLAYERNO = PLAYERS.PLAYERNO WHERE TOWN = 'Eltham') GROUP BY TEAMNO
The result is:
TEAMNO COUNT(*) ------ -------- 2 5
The column on which the result has been grouped might also appear in the SELECT clause as a parameter within an aggregation function. This does not happen often, but it is allowed.
Example 10.5: Get each different penalty amount, followed by the number of times that the amount occurs in the PENALTIES table, and also show the result of that amount multiplied by the number.
SELECT AMOUNT, COUNT(*), SUM(AMOUNT) FROM PENALTIES GROUP BY AMOUNT
The PENALTIES table is grouped on the AMOUNT column first. The intermediate result could be presented as follows:
PAYMENTNO PLAYERNO PAYMENT_DATE AMOUNT --------- -------- ------------------------ ------ {5, 6} {44, 8} {1980-12-08, 1980-12-08} 25.00 {7} {44} {1982-12-30} 30.00 {4} {104} {1984-12-08} 50.00 {2, 8} {44, 27} {1981-05-05, 1984-11-12} 75.00 {1, 3} {6, 27} {1980-12-08, 1983-09-10} 100.00
Again, the values of the columns that are not grouped are placed between brackets, and the AMOUNT column shows only one value. However, that is not entirely correct. Behind the scenes, MySQL also creates a group for this column. So the intermediate result should, in fact, be presented as follows:
PAYMENTNO PLAYERNO PAYMENT_DATE AMOUNT --------- -------- ------------------------ ---------------- {5, 6} {44, 8} {1980-12-08, 1980-12-08} {25.00, 25.00} {7} {44} {1982-12-30} {30.00} {4} {104} {1984-12-08} {50.00} {2, 8} {44, 27} {1981-05-05, 1984-11-12} {75.00, 75.00} {1, 3} {6, 27} {1980-12-08, 1983-09-10} {100.00, 100.00}
The values in the AMOUNT column are also represented as a group now. Of course, only equal values appear in each group. And because it is a group, aggregation functions can be used.
The result is:
AMOUNT COUNT(*) SUM(AMOUNT) ------ -------- ----------- 25.00 2 50.00 30.00 1 30.00 50.00 1 50.00 75.00 2 150.00 100.00 2 200.00
However, this book does not present the values of the grouped columns between brackets.
Exercise 10.1: Show the different years in which players joined the club; use the PLAYERS table.
Exercise 10.2: For each year, show the number of players who joined the club.
Exercise 10.3: For each player who has incurred at least one penalty, give the player number, average penalty amount, and number of penalties.
Exercise 10.4: For each team that has played in the first division, give the team number, number of matches, and total number of sets won.