Merging Multiple Data Sets
The MERGE command is a very handy tool (long awaited and frequently requested in developer forums) that allows us to add, update, or delete rows in a single transaction instead of having to execute separate INSERT, UPDATE, and DELETE statements. MERGE statements minimize the number of times the data in tables must be examined, thereby improving performance.
To illustrate this point, first I'll create a couple of temporary tables (notice that the values of the key_column match for some rows):
CREATE TABLE #dropme_merge_sample1 ( key_column INT, name_column VARCHAR(30) ) INSERT #dropme_merge_sample1 (key_column, name_column) VALUES (1, 'Abby'), (2, 'Becky'), (3, 'Chloe') CREATE TABLE #dropme_merge_sample2 ( key_column INT, name_column VARCHAR(30) ) INSERT #dropme_merge_sample2 (key_column, name_column) VALUES (1, 'Abby'), (4, 'Debby'), (5, 'Ella'), (2, 'Heather'), (3, 'Nicole')
Now let's see what happens when we merge the two tables, using the following command:
MERGE #dropme_merge_sample1 AS t USING #dropme_merge_sample2 AS s ON (t.key_column = s.key_column) WHEN NOT MATCHED BY TARGET THEN INSERT (key_column, name_column) VALUES (s.key_column, s.name_column) WHEN MATCHED THEN UPDATE SET t.name_column = s.name_column; SELECT * FROM #dropme_merge_sample1
Result:
key_column |
name_column |
1 |
Abby |
2 |
Heather |
3 |
Nicole |
4 |
Debby |
5 |
Ella |
As you can see, the MERGE command replaced the value of Becky with Heather, replaced Chloe with Nicole, and added Ella and Debby to the target table. In previous versions of SQL Server, we would have to write an INSERT statement to add rows and a separate UPDATE statement to replace existing values.
The MERGE command also allows specifying further conditions in addition to the match in the source and target tables. For example, the following MERGE command replaces Chloe, but not Becky, because I added the condition (bold in the example) that the key column of the source table must be greater than two:
MERGE #dropme_merge_sample1 AS t USING #dropme_merge_sample2 AS s ON (t.key_column = s.key_column) WHEN NOT MATCHED BY TARGET THEN INSERT (key_column, name_column) VALUES (s.key_column, s.name_column) WHEN MATCHED AND (s.key_column > 2) THEN UPDATE SET t.name_column = s.name_column;
It's important to note that the MERGE statement will fail if you attempt to update or delete more than one row based on the join (match) on an existing value. This safety net is designed to prevent us from shooting ourselves in the foot. For example, suppose we mistakenly added two rows having a key_column value of 2 in the source table, like this:
INSERT #dropme_merge_sample2 (key_column, name_column) VALUES (1, 'Abby'), (4, 'Debby'), (5, 'Ella'), (2, 'Heather'), (2, 'Nicole')
What is the MERGE command supposed to do? Should it replace Becky with Heather or Nicole? The MERGE command shown earlier won't change any rows; instead, it will fail and report the following error:
Msg 8672, Level 16, State 1, Line 1 The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.