- 10.1 Introduction
- 10.2 Grouping on One Column
- 10.3 Grouping on Two or More Columns
- 10.4 Grouping on Expressions
- 10.5 Grouping of NULL Values
- 10.6 General Rules for the GROUP BY Clause
- 10.7 Complex Examples with GROUP BY
- 10.8 Grouping with WITH ROLLUP
- 10.9 Grouping with WITH CUBE
- 10.10 Grouping Sets
- 10.11 Grouping with ROLLUP and CUBE
- 10.12 Combining Grouping Sets
- 10.13 Answers
10.10 Grouping Sets
The GROUP BY clauses that have been described so far use the short notation for the specification of groupings. SQL also has a more extensive notation. So-called grouping sets specifications indicate on which expressions groupings must be performed.
Example 10.21. For each town, get the most recent date of birth.
With the shortest notation form, this statement looks as follows:
SELECT TOWN, MIN(BIRTH_DATE) FROM PLAYERS GROUP BY TOWN
When the extensive notation form is used, this similar formulation occurs:
SELECT TOWN, MIN(BIRTH_DATE) FROM PLAYERS GROUP BY GROUPING SETS ((TOWN))
The result of both statements is:
TOWN MIN(BIRTH_DATE) --------- --------------- Stratford 1948-09-01 Midhurst 1963-06-22 Inglewood 1962-07-08 Plymouth 1963-10-01 Douglas 1963-05-14 Eltham 1964-12-28
Explanation: Behind the term GROUPING SETS, the grouping sets specifications can be found. Within such a specification, several groupings can be specified. Each grouping is placed between brackets and the whole should also be placed between brackets—hence, the double brackets.
The advantage of the extensive notation form is that it offers more ways to group data. Several groupings can be specified, among other things, and combinations of ROLLUP and CUBE can be used.
Example 10.22. For each town, get the number of players, and for each sex, get the number of players as well.
Because a grouping is needed on two different columns, this question cannot be formulated with one GROUP BY clause (in which the short notation form is used). A way to combine these two groupings in one statement is to use an UNION operator.
SELECT CAST(NULL AS CHAR), TOWN, COUNT(*) FROM PLAYERS GROUP BY TOWN UNION SELECT SEX, CAST(NULL AS CHAR), COUNT(*) FROM PLAYERS GROUP BY SEX ORDER BY 2, 1
The result is:
SEX TOWN COUNT(*) --- --------- -------- ? Stratford 7 ? Midhurst 1 ? Inglewood 2 ? Plymouth 1 ? Douglas 1 ? Eltham 2 M ? 9 F ? 5
Explanation: The rows in this intermediate result in which the TOWN column has been filled came from the first select block. The rows in which TOWN is equal to NULL form the intermediate result of the second select block. In fact, these two rows form subtotals for each sex.
To simplify this type of statement, the grouping sets specification has been added to SQL. With this, one GROUP BY clause can be used to specify several groupings. The previous statement becomes:
SELECT SEX, TOWN, COUNT(*) FROM PLAYERS GROUP BY GROUPING SETS ((TOWN), (SEX)) ORDER BY 2, 1
Explanation: Behind the words GROUPING SETS, two groupings are now specified: (TOWN) and (SEX).
If one grouping consists of one expression, the brackets can be removed. So, GROUP BY GROUPING SETS ((TOWN), (SEX)) is equivalent to GROUP BY GROUPING SETS (TOWN, SEX).
Again, the GROUP BY clause, as we discussed in the previous chapters, is, in fact, a shortened notation for the one with grouping sets. Table 10.1 contains several examples of original formulations without grouping sets and their equivalents with grouping sets.
Table 10.1. Original GROUP BY Clauses and Their Equivalent Grouping Sets Specifications
Original Specification |
Specification with Grouping Sets |
GROUP BY A |
GROUP BY GROUPING SETS ((A)) |
or |
|
GROUP BY GROUPING SETS (A) |
|
GROUP BY A, B |
GROUP BY GROUPING SETS ((A, B)) |
GROUP BY YEAR(A), SUBSTR(B) |
GROUP BY GROUPING SETS ((YEAR(A), SUBSTR(B))) |
A special grouping is (). There is no expression between the brackets. In this case, all rows are placed in one group. We can calculate a grand total with that, for example.
Example 10.23. Find for each combination of sex-town the number of players, get for each sex the number of players, and get the total number of players in the entire table.
SELECT SEX, TOWN, COUNT(*) FROM PLAYERS GROUP BY GROUPING SETS ((SEX, TOWN), (SEX), ()) ORDER BY 1, 2
The result is:
SEX TOWN COUNT(*) --- --------- -------- M Stratford 7 M Inglewood 1 M Douglas 1 M ? 9 F Midhurst 1 F Inglewood 1 F Plymouth 1 F Eltham 2 F ? 5 ? ? 14
Explanation: The last row contains the total number of players and is added because of the grouping ().
Example 10.24. Get for each team and for each player individually the number of matches played.
SELECT TEAMNO, PLAYERNO, COUNT(*) FROM MATCHES GROUP BY GROUPING SETS (TEAMNO, PLAYERNO) ORDER BY 1, 2
The result is:
TEAMNO PLAYERNO COUNT(*) ------ -------- -------- 1 ? 8 2 ? 5 ? 2 1 ? 6 3 ? 8 2 ? 27 1 ? 44 1 ? 57 1 ? 83 1 ? 104 1 ? 112 2
Explanation: The first two rows in the result have been included because of the grouping on the TEAMNO column and the other rows because of the grouping on the PLAYERNO column.
This example clearly shows that brackets are important. The specification GROUPING SETS (TEAMNO, PLAYERNO) returns a different result than GROUPING SETS ((TEAMNO, PLAYERNO)). The second grouping sets specification results in one grouping on the combination of the two columns specified and the second grouping sets specification leads to two groupings.
Finally, here are a few abstract examples of certain GROUP BY clauses, including the groupings that are executed. E1, E2, and E3 stand for random expressions, and the symbol ∪ represents the union operator.
Table 10.2. The Relationship Between Grouping Sets Specifications and Groupings
GROUP BY Clause |
Groupings |
GROUP BY E1, E2, E3 |
[E1, E2, E3] |
GROUP BY GROUPING SETS (()) |
[] |
GROUP BY GROUPING SETS ((E1, E2, E3)) |
[E1, E2, E3] |
GROUP BY GROUPING SETS (E1, E2, E3) |
[E1] ∪ [E2] ∪ [E3] |
GROUP BY GROUPING SETS ((E1), (E2), (E3)) |
[E1] ∪ [E2] ∪ [E3] |
GROUP BY GROUPING SETS ((E1, E2), (E3)) |
[E1, E2] ∪ [E3] |
GROUP BY GROUPING SETS ((E1, E2), E3) |
[E1, E2] ∪ [E3] |
GROUP BY GROUPING SETS ((E1, E2), (E3, E4)) |
[E1, E2] ∪ [E3, E4] |
GROUP BY GROUPING SETS ((E1, (E2, E3))) |
Not allowed |
Exercise 10.24: Get the total number of penalties by using a grouping sets specification.
Exercise 10.25: Get for each combination of team number and player number the number of matches, give the number of matches for each team number, and find the total number of matches as well.
Exercise 10.26: Indicate which groupings must be specified for the following GROUP BY clauses:
- GROUP BY GROUPING SETS ((), (), (E1), (E2))
- GROUP BY GROUPING SETS (E1, (E2, E3),(E3, E4, E5))
- GROUP BY GROUPING SETS ((E1, E2), (), E3, (E2, E1))