Set Operations
Start course

In this Course, we go beyond the basics to empower you with the skills, how-to knowledge, and hands-on expertise to work through and solve complex business issues with SQL. We go deep and hands-on with transactions, subqueries, inline views, summarised queries, the exists predicate and set operations.

This Course is built from content created and delivered by Ian Wallington of QA.


Hi, let's look at SET Operations. This is more about Unions and Intersections. It's set theory, but it's a different kind of set theory from that which we've discovered already. It's not joining tables together, it's kind of sticking results together. And it can be used for all kinds of data. A typical use case is when you're retrieving similar data from different tables, not tables that are related together in some kind of hierarchy or parent child or one too many relationships. Say as I have in my database, I have two sources of employee information. We have sales people, and then we have managers of departments. They are all people in the company, but they're coming from different places in terms of the database and where they're stored. You might have a table of products and a table of services. You have costings for your products and your services. And what you want is a total price list or a catalog. So you can union your products and your services. 

I'm going to get a list of all the people in my database, managers and sales targets of course, I want to stick these two together. It's good to remember the rectangular nature of search results in set theory. All queries combined using Union, Intersect or Except must have an equal number of expressions in their target lists, not the same columns, but the same number of columns and the same basic data type again. Now if I want to modify, I don't want to reuse any information, but I want to modify a row, so those can be stuck together. You can't delete them. I still have a department with a no manager. I could perhaps either delete that or I could put a WHERE manager IS NOT NULL statement in there. I do have no manager. If there's a couple of things I need to change in here, this column is now heated with manager. It's taking the column headings, the column names from the first Select clause. Staff, staff name or role. So I've alias that. Now the other thing I would like to do within this results set is be able to see which of the results is a manager and which of them is a salesperson. Now that can be achieved by looking at my select one from earlier. 

If I do this, if I select literally in there, I directly create 'sp'. Up here, I probably want to write the status something like role or something along those lines. But now I have my managers and my sales people easily distinguished. At the moment the managers are coming out first and the salespeople last. If I put an Order By clause over the whole thing, and Order By sales under_target, and then I get a mix of managers and sales people. When I'm using Union, I'm ultimately doing what is called a Union All. This is kind of like the select statement, but it's kind of backwards as it were. The Select clause and the Select distinct is another form of a Select. Well, here we have Union. Union is Union distinct effectively. So the union will remove duplicates if there are duplicates but in doing so you'll have to go through every result set and remove all. In this particular case, I know there aren't any duplicates. So I have chosen the UNION ALL statement to simplify things. It does a distinct operation. Some very simple examples. These counties are represented in our database like counties. The companies are placed in the counties and the sales people are placed in the counties. Let's Union them and see what we get. Notice that the Union only get represented once in the results set. So keep thinking of Union as distinct. 

Here again is using a Union All because there will be no duplicates. This is companies and sales people. But that's Union All. The other function we have is Intersect, which again is part of set theory. If you have two Selects the Intersect is an intermediate result set of both. It finds those rows that are in both results set. You take both London customers and salesperson, Oracle names this as MINUS rather than Except, whereas a Microsoft sequel server MINUS code is EXCEPT. And what this function does is it removes from the first results set anything which appears in the second results set so it's one minus the other and here we have three London's and a Devon. Here we have Surreys, Hampshires and a London so it takes London out of this because it's also in the second result set. That's the set theory. Those things are in one, but not in the other. To find at least the company number and contact code or contacts that haven't been sold to, you can get all of the company number, contact code combinations from the contact table, i.e all the contact IDs. And then from the sale table, you get company number and contact code with the foreign key values. So you subtract that from all of the contacts, and you are left with those contacts that haven't been sold to. So that's how we can create that results set. 

Okay, that brings to the end our lecture. Thanks for your attention. I hope you've enjoyed this course. We welcome any feedback so please contact us at If you have any questions, queries or comments.

About the Author
Learning Paths

Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built  70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+  years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.

Covered Topics