Union
So far we’ve discussed finding the items that are common to two sets (intersection) and the items that are in one set but not the other (difference). The third type of set operation involves adding two sets together (union).
Union in Set Theory
Union lets you combine two sets of similar information into one set. As a scientist, you might be interested in combining two sets of chemical or physical sample data. For example, a pharmaceutical research chemist might have two different sets of compounds that seem to provide a certain beneficial effect. The chemist can union the two sets to obtain a single list of all effective compounds.
Let’s take a look at union in action by examining two sets of numbers. The first set of numbers is as follows:
- 1, 5, 8, 9, 32, 55, 78
The second set of numbers is as follows:
- 3, 7, 8, 22, 55, 71, 99
The union of these two sets of numbers is the numbers in both sets combined into one new set:
- 1, 5, 8, 9, 32, 55, 78, 3, 7, 22, 71, 99
Note that the values common to both sets, 8 and 55, appear only once in the answer. Also, the numbers in the result set are not in any particular order. When you ask a database system to perform a UNION, the values returned won’t necessarily be in sequence unless you explicitly include an ORDER BY clause. In SQL, you can also ask for a UNION ALL if you want to keep the duplicate members.
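The number example above can be tried out directly. The following is a minimal sketch in Python, first with built-in sets and then with SQL run through an in-memory SQLite database. The table and column names (first_set, second_set, n) are made up for this illustration and are not part of the sample databases.

```python
import sqlite3

first = {1, 5, 8, 9, 32, 55, 78}
second = {3, 7, 8, 22, 55, 71, 99}

# Set union: 8 and 55 are in both sets but appear once in the result.
combined = first | second
print(sorted(combined))

# The same idea in SQL. first_set, second_set, and n are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE first_set (n INTEGER)")
conn.execute("CREATE TABLE second_set (n INTEGER)")
conn.executemany("INSERT INTO first_set VALUES (?)", [(n,) for n in first])
conn.executemany("INSERT INTO second_set VALUES (?)", [(n,) for n in second])

# UNION removes duplicates; ORDER BY makes the sequence explicit.
union_rows = [row[0] for row in conn.execute(
    "SELECT n FROM first_set UNION SELECT n FROM second_set ORDER BY n"
)]

# UNION ALL keeps duplicates, so 8 and 55 each appear twice.
union_all_rows = [row[0] for row in conn.execute(
    "SELECT n FROM first_set UNION ALL SELECT n FROM second_set ORDER BY n"
)]

print(len(union_rows), len(union_all_rows))  # 12 14
```

Without the ORDER BY clause, the database is free to return the rows in whatever sequence it finds convenient.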
The members of each set don’t have to be just single values. In fact, you’ll probably deal with sets of rows when working with SQL.
To find the union of two or more sets of complex members, all the members in each set you’re trying to union must have the same number and type of attributes. For example, suppose you have a complex set like the one below. Each row represents a member of the set (a stew recipe), and each column denotes a particular attribute (an ingredient).
| Potatoes | Water | Lamb | Peas |
| Rice | Chicken Stock | Chicken | Carrots |
| Pasta | Water | Tofu | Snap Peas |
| Potatoes | Beef Stock | Beef | Cabbage |
| Pasta | Water | Pork | Onions |
A second set might look like the following:
| Potatoes | Water | Lamb | Onions |
| Rice | Chicken Stock | Turkey | Carrots |
| Pasta | Vegetable Stock | Tofu | Snap Peas |
| Potatoes | Beef Stock | Beef | Cabbage |
| Beans | Water | Pork | Onions |
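The union of these two sets is the set of objects from both sets. Duplicates are eliminated, so the one member that appears in both sets (Potatoes, Beef Stock, Beef, Cabbage) shows up only once.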
| Potatoes | Water | Lamb | Peas |
| Rice | Chicken Stock | Chicken | Carrots |
| Pasta | Water | Tofu | Snap Peas |
| Potatoes | Beef Stock | Beef | Cabbage |
| Pasta | Water | Pork | Onions |
| Potatoes | Water | Lamb | Onions |
| Rice | Chicken Stock | Turkey | Carrots |
| Pasta | Vegetable Stock | Tofu | Snap Peas |
| Beans | Water | Pork | Onions |
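As a rough illustration of a union of complex members, each stew can be modeled as a tuple of four attributes. This is only a sketch in Python; the variable names are invented for the example.

```python
# Each member is a tuple with the same four attributes:
# (starch, stock, meat, vegetable).
set_one = {
    ("Potatoes", "Water", "Lamb", "Peas"),
    ("Rice", "Chicken Stock", "Chicken", "Carrots"),
    ("Pasta", "Water", "Tofu", "Snap Peas"),
    ("Potatoes", "Beef Stock", "Beef", "Cabbage"),
    ("Pasta", "Water", "Pork", "Onions"),
}
set_two = {
    ("Potatoes", "Water", "Lamb", "Onions"),
    ("Rice", "Chicken Stock", "Turkey", "Carrots"),
    ("Pasta", "Vegetable Stock", "Tofu", "Snap Peas"),
    ("Potatoes", "Beef Stock", "Beef", "Cabbage"),
    ("Beans", "Water", "Pork", "Onions"),
}

# A member counts as a duplicate only when every attribute matches.
stews = set_one | set_two
shared = set_one & set_two

print(len(stews))  # 9
print(shared)      # {('Potatoes', 'Beef Stock', 'Beef', 'Cabbage')}
```

Two stews that differ in even one attribute, such as the two lamb stews (Peas versus Onions), are distinct members and both survive the union.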
Combining Result Sets Using a Union
It’s a small leap from sets of complex objects to rows in SQL result sets. When you’re dealing with rows in a set of data that you fetch with SQL, the attributes are the individual columns. For example, suppose you have a set of rows returned by a query like the following one. (These are recipes from John’s cookbook.)
| Recipe | Starch | Stock | Meat | Vegetable |
| --- | --- | --- | --- | --- |
| Lamb Stew | Potatoes | Water | Lamb | Peas |
| Chicken Stew | Rice | Chicken Stock | Chicken | Carrots |
| Veggie Stew | Pasta | Water | Tofu | Snap Peas |
| Irish Stew | Potatoes | Beef Stock | Beef | Cabbage |
| Pork Stew | Pasta | Water | Pork | Onions |
A second query result set might look like this one. (These are recipes from Mike’s cookbook.)
| Recipe | Starch | Stock | Meat | Vegetable |
| --- | --- | --- | --- | --- |
| Lamb Stew | Potatoes | Water | Lamb | Peas |
| Turkey Stew | Rice | Chicken Stock | Turkey | Carrots |
| Veggie Stew | Pasta | Vegetable Stock | Tofu | Snap Peas |
| Irish Stew | Potatoes | Beef Stock | Beef | Cabbage |
| Pork Stew | Beans | Water | Pork | Onions |
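The union of these two sets is all the rows in both sets, with the rows that match in every column (Lamb Stew and Irish Stew) listed only once. Veggie Stew and Pork Stew appear twice because John’s and Mike’s versions differ in at least one column. Maybe John and Mike decided to write a cookbook together, too!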
| Recipe | Starch | Stock | Meat | Vegetable |
| --- | --- | --- | --- | --- |
| Lamb Stew | Potatoes | Water | Lamb | Peas |
| Chicken Stew | Rice | Chicken Stock | Chicken | Carrots |
| Veggie Stew | Pasta | Water | Tofu | Snap Peas |
| Irish Stew | Potatoes | Beef Stock | Beef | Cabbage |
| Pork Stew | Pasta | Water | Pork | Onions |
| Turkey Stew | Rice | Chicken Stock | Turkey | Carrots |
| Veggie Stew | Pasta | Vegetable Stock | Tofu | Snap Peas |
| Pork Stew | Beans | Water | Pork | Onions |
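To see the two cookbooks combined by an actual SQL UNION, here is a small sketch that loads both result sets into an in-memory SQLite database from Python. The table names Johns_Recipes and Mikes_Recipes are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables, one per cookbook, with identical column layouts.
for table in ("Johns_Recipes", "Mikes_Recipes"):
    conn.execute(
        f"CREATE TABLE {table} "
        "(Recipe TEXT, Starch TEXT, Stock TEXT, Meat TEXT, Vegetable TEXT)"
    )

johns = [
    ("Lamb Stew", "Potatoes", "Water", "Lamb", "Peas"),
    ("Chicken Stew", "Rice", "Chicken Stock", "Chicken", "Carrots"),
    ("Veggie Stew", "Pasta", "Water", "Tofu", "Snap Peas"),
    ("Irish Stew", "Potatoes", "Beef Stock", "Beef", "Cabbage"),
    ("Pork Stew", "Pasta", "Water", "Pork", "Onions"),
]
mikes = [
    ("Lamb Stew", "Potatoes", "Water", "Lamb", "Peas"),
    ("Turkey Stew", "Rice", "Chicken Stock", "Turkey", "Carrots"),
    ("Veggie Stew", "Pasta", "Vegetable Stock", "Tofu", "Snap Peas"),
    ("Irish Stew", "Potatoes", "Beef Stock", "Beef", "Cabbage"),
    ("Pork Stew", "Beans", "Water", "Pork", "Onions"),
]
conn.executemany("INSERT INTO Johns_Recipes VALUES (?, ?, ?, ?, ?)", johns)
conn.executemany("INSERT INTO Mikes_Recipes VALUES (?, ?, ?, ?, ?)", mikes)

# UNION compares whole rows, so only rows identical in every column collapse.
combined = conn.execute(
    "SELECT Recipe, Starch, Stock, Meat, Vegetable FROM Johns_Recipes "
    "UNION "
    "SELECT Recipe, Starch, Stock, Meat, Vegetable FROM Mikes_Recipes "
    "ORDER BY Recipe"
).fetchall()

for row in combined:
    print(row)
```

The query returns eight rows: Lamb Stew and Irish Stew once each, and both versions of Veggie Stew and Pork Stew.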
Let’s assume you have a nice database containing all your favorite recipes. You really like recipes with either beef or onions, so you want a list of recipes that contain either ingredient. Figure 7-5 (on page 238) shows you the set diagram that helps you visualize how to solve this problem.
Figure 7-5 Finding out which recipes have either beef or onions
The upper circle represents the set of recipes that contain beef. The lower circle represents the set of recipes that contain onions. The union of the two circles gives you all the recipes that contain either ingredient, with duplicates eliminated where the two sets overlap. As you probably know, you first ask SQL to fetch all the recipes that have beef. In the second query, you ask SQL to fetch all the recipes that have onions. As you’ll see later, the SQL keyword UNION links the two queries to get the final answer.
By now you know that it’s not a good idea to design a recipes database with a single table. Instead, a correctly designed recipes database will have a separate Recipe_Ingredients table with one row per recipe per ingredient. Each ingredient row will have only one ingredient, so no one row can be both beef and onions at the same time. You’ll need to first find all the recipes that have a beef row, then find all the recipes that have an onions row, and then union them.
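The beef-or-onions request might be sketched as follows, using Python’s sqlite3 module. Only the Recipe_Ingredients table name comes from the text; the Recipes table, every column name, and the sample rows are assumptions invented for this illustration and will differ from the real sample database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data: one row per recipe per ingredient.
conn.executescript("""
    CREATE TABLE Recipes (RecipeID INTEGER PRIMARY KEY, RecipeTitle TEXT);
    CREATE TABLE Recipe_Ingredients (RecipeID INTEGER, IngredientName TEXT);
    INSERT INTO Recipes VALUES
        (1, 'Irish Stew'), (2, 'Pork Stew'), (3, 'Lamb Stew');
    INSERT INTO Recipe_Ingredients VALUES
        (1, 'Beef'), (1, 'Onions'), (1, 'Potatoes'),
        (2, 'Pork'), (2, 'Onions'),
        (3, 'Lamb'), (3, 'Peas');
""")

# First query: recipes with a beef row. Second query: recipes with an
# onions row. UNION links them and drops the overlap.
beef_or_onions = [row[0] for row in conn.execute("""
    SELECT r.RecipeTitle AS RecipeTitle
    FROM Recipes AS r
    JOIN Recipe_Ingredients AS ri ON ri.RecipeID = r.RecipeID
    WHERE ri.IngredientName = 'Beef'
    UNION
    SELECT r.RecipeTitle
    FROM Recipes AS r
    JOIN Recipe_Ingredients AS ri ON ri.RecipeID = r.RecipeID
    WHERE ri.IngredientName = 'Onions'
    ORDER BY RecipeTitle
""")]

print(beef_or_onions)  # ['Irish Stew', 'Pork Stew']
```

In this made-up data, Irish Stew has both a beef row and an onions row, so it is returned by both queries but listed only once in the final answer.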
Problems You Can Solve with Union
A union lets you “mush together” rows from two similar sets—with the added advantage of no duplicate rows. Here’s a sample of the problems you can solve using a union technique with data from the sample databases:
- “Show me all the customer and employee names and addresses.”
- “List all the customers who ordered a bicycle combined with all the customers who ordered a helmet.”
- “List the entertainers who played engagements for customer Bonnicksen combined with all the entertainers who played engagements for customer Rosales.”
- “Show me the students who have an average score of 85 or better in Art together with the students who have an average score of 85 or better in Computer Science.”
- “Find the bowlers who had a raw score of 155 or better at Thunderbird Lanes combined with bowlers who had a raw score of 140 or better at Bolero Lanes.”
- “Show me the recipes that have beef together with the recipes that have garlic.”
As with other “pure” set operations, one of the limitations is that the columns must match across the result sets: each set must return the same number of columns, with compatible data types in corresponding positions. This works well if you’re unioning two or more sets from the same table, for example, customers who ordered bicycles and customers who ordered helmets. It also works well when you’re performing a union on sets from tables that have like columns, for example, customer names and addresses and employee names and addresses. We’ll explore the uses of the SQL UNION operator in detail in Chapter 10, “UNIONs.”
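The customer and employee case can be illustrated with a short sketch, again using SQLite from Python. The table layouts, column names, and people below are hypothetical and do not come from the sample databases. Note that SQLite checks only the number of columns; other database systems may also check the data types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables with "like" name and address columns.
conn.executescript("""
    CREATE TABLE Customers (CustName TEXT, CustAddress TEXT, CustPhone TEXT);
    CREATE TABLE Employees (EmpName TEXT, EmpAddress TEXT);
    INSERT INTO Customers VALUES
        ('Ann Lee', '12 Elm St', '555-0101'),
        ('Raj Patel', '9 Oak Ave', '555-0102');
    INSERT INTO Employees VALUES
        ('Raj Patel', '9 Oak Ave'),
        ('Mia Cruz', '40 Pine Rd');
""")

# Both SELECTs return two columns, so the union is allowed.
people = conn.execute(
    "SELECT CustName AS Name, CustAddress AS Address FROM Customers "
    "UNION "
    "SELECT EmpName, EmpAddress FROM Employees "
    "ORDER BY Name"
).fetchall()
print(people)

# Three columns on one side and two on the other: the database rejects it.
try:
    conn.execute(
        "SELECT CustName, CustAddress, CustPhone FROM Customers "
        "UNION "
        "SELECT EmpName, EmpAddress FROM Employees"
    )
    mismatch_rejected = False
except sqlite3.Error as exc:
    mismatch_rejected = True
    print("Rejected:", exc)
```

Raj Patel is both a customer and an employee in this sample, yet appears once in the combined list because both rows match in every selected column.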
In many cases where you would otherwise union rows from the same table, you’ll find that using DISTINCT (to eliminate the duplicate rows) with complex criteria on joined tables will serve as well. We’ll show you all about solving problems this way using JOINs in Chapter 8, “INNER JOINs.”
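As a preview of that alternative, the sketch below runs the beef-or-onions request two ways on the same data: once with UNION and once with DISTINCT on a join. Apart from the Recipe_Ingredients table name, the schema, column names, and rows are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, for illustration only.
conn.executescript("""
    CREATE TABLE Recipes (RecipeID INTEGER PRIMARY KEY, RecipeTitle TEXT);
    CREATE TABLE Recipe_Ingredients (RecipeID INTEGER, IngredientName TEXT);
    INSERT INTO Recipes VALUES
        (1, 'Irish Stew'), (2, 'Pork Stew'), (3, 'Lamb Stew');
    INSERT INTO Recipe_Ingredients VALUES
        (1, 'Beef'), (1, 'Onions'),
        (2, 'Pork'), (2, 'Onions'),
        (3, 'Lamb'), (3, 'Peas');
""")

join_clause = (
    "FROM Recipes AS r "
    "JOIN Recipe_Ingredients AS ri ON ri.RecipeID = r.RecipeID "
)

# Approach 1: two queries linked with UNION.
union_titles = [row[0] for row in conn.execute(
    "SELECT r.RecipeTitle AS RecipeTitle " + join_clause +
    "WHERE ri.IngredientName = 'Beef' "
    "UNION "
    "SELECT r.RecipeTitle " + join_clause +
    "WHERE ri.IngredientName = 'Onions' "
    "ORDER BY RecipeTitle"
)]

# Approach 2: one query, broader criteria, DISTINCT to drop duplicates.
distinct_titles = [row[0] for row in conn.execute(
    "SELECT DISTINCT r.RecipeTitle AS RecipeTitle " + join_clause +
    "WHERE ri.IngredientName IN ('Beef', 'Onions') "
    "ORDER BY RecipeTitle"
)]

# Without DISTINCT, Irish Stew shows up once per matching ingredient row.
raw_titles = [row[0] for row in conn.execute(
    "SELECT r.RecipeTitle " + join_clause +
    "WHERE ri.IngredientName IN ('Beef', 'Onions')"
)]

print(union_titles == distinct_titles, len(raw_titles))  # True 3
```

Both approaches give the same two recipes here; which one reads better depends on how complex the criteria for each set are.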