Working with SQL
In this course, we first learn how to use SQL to alter and delete tables and how to use primary and foreign keys. We examine common joins between tables in a relational database. We then delve into inner, outer, left and right joins before learning to manipulate data using the INSERT, UPDATE and DELETE functions.
This course is built from content created and delivered by Ian Wallington of QA.
Hello and welcome. In this lecture, we are going to discuss other types of joins.
If you simply say join this table to that table you're going to do what's known as an inner join. What this means is you get a department for salesperson and sales people who are in the department. That is an inner join. There is another form of join known as a cross join.
I have created two tables I want to cross join. This is a diagram of a pair of Cartesian Xs X and Y coordinates. The idea is that you can write a query that puts one table on one axis and another table on another axis. And then, you get a row with your result set from every point of intersection, like a matrix.
Now, this is a cross join. This generates a Cartesian product.
So, what are my tables? In my first table, suits. I have the suits of playing cards. In my second table, I have the face values of cards. I'm not showing you the complete results here. How many rows in there do you think? 13 rows, 13 different face values in a suit. So, if I cross join this it's going to put one table on the X-axis and the other table on the Y-axis. So, I get a 4 by 13 matrix as a result set. The number of rows in it the number of cards in a cards pack. Now, that's a cross join. It generates a matrix. The number of rows you get is the number of rows in one table multiplied by the number of rows in the other.
Okay, so let's get back to our salesperson and department tables. I have departments 4 and 5, I also have department 11. But, in my results here, I only have departments 1, 2, and 3. That is because all of the sales people are either in department 1, department 2, or department 3. Because this is an inner join, departments that do not have sales people are not included in the result set.
Here, it is described in a Venn diagram. A set of sales people and a set of departments. I put it to you that departments 1, 2, and 3 are in that part of the diagram. They are departments with sales people. I also have departments 4 and 5 and I've put 11 in brackets. So, these are departments without sales people. This is currently an empty set. I do not have any sales people who are outside the departmental structure. I use green to highlight that as what I'm getting from my inner join. In tech terminology, it's an intersection.
Looking at my report here, somebody wants to know what happened to department 4? Department 4 isn't in that because it doesn't have any sales people. And then that person could ask "Could you just change that? "Just change it so that it includes all departments please." And the answer is "Yes, we can do that." What I need to do is make that not an inner join, but opposite of inner, an outer join. It is a little bit more complicated than that however because there are various forms of outer join.
Microsoft SQL Server and Oracle support a full join MySQL doesn't, it only supports left and right joins. Given that you have to stipulate left, right or in Microsoft SQL Server and Oracle, full the word outer becomes quite optional. But, if you just say join, the default is that's an inner join. If you say left or right, it's an outer join. But, if you want to know the difference between the left join and the left outer join it's the word outer and it is that simple. You do of course have to know it's that simple. What I'm going to show you is actually not difficult. The tricky bit here is in recognising that it isn't difficult.
In our diagram, one of those tables is on the left and the other is on the right of the diagram. Salesperson is the left hand set department is the right hand set not at all by coincidence. This diagram matches the tables that I put into my FROM clause. Salesperson, department with an outer join. You have to indicate to the engine which of the tables is the outer table. Or, in simpler terms, which table do you want to start with? Which table do you want all the rows from regardless of whether there is a matching row in the other table. And in this case, we want the department table to be the outer table. We want all the rows from the department table. Then, we have a left, right thing going on. The department table is on the right. It's as simple as that is a right outer join. Even if I put that in two lines, it is still a right outer join. Because logically that's on the left, that's on the right. The OUTER keyword is optional. What I'm now getting is what I have drawn in my diagram. Departments 1, 2, and 3 are appearing potentially multiple times. 2 and 3 are anyway. There's only one salesperson in department 1, but I'm also getting rows for departments 4, 5, and 11.
Okay, what if I were to change this? So, I've changed that to a left join. That will be the same as an inner join because I don't have any sales people outside the departmental structure. I get the same results as the inner join. That is now giving me the same results as the right join did. It is as simple as which way around you have the tables.
Now, what I'm going to stick in here is sp.emp_no. The reason why I'm including sp.emp_no is because that's the primary key on the salesperson table. The primary key column is mandatory and you could not have a null in your primary key column or columns. Having a null as an identifier won't work. If I were simply to query the salesperson table there isn't a null in there. However, in my result set there are a number of nulls. Those have been generated by the database engine in response to the fact that I haven't asked for the employee number column, but for those rows there, there is no employee number. Those nulls have been generated by the database engine in response to the fact that I've asked for the employee number column, but for those rows there, there is no employee number. And there are no sales people in department 4 to give me an employee number.
All of these rows here would be generated as a result. The inner part of the join, as it were, these are those that have been included because of the outer join. How can I change that query so it only returns the last 3 rows? We could test the employee number column. If I put emp_no in there IS NULL is exactly what I'm after. Emp_no in this particular case is unique, but is that joining the salesperson and department tables? The salesperson only occurs in one. This is those nulls because there's no match. If I do this without the WHERE clause those nulls are not coming from a table. They're not coming from a row in a table, they're coming from the fact that there is no row. There is no salesperson in department 4. It might not be literally represented like that, but it is a null. That's representing a hole in the result set. Then, you could use a full outer join and you'd get those in table A, but not in table B. Those in table B, but not in table A. And those in both table A.
So, a full outer join will give you the complete set. I could do a full outer join here because I don't have any salespeople that aren't in a department.
Okay, in summary, let's just review outer joins. So, the inner join finds all sales people, but not necessarily all managers. So, it omits managers of departments that have no people. The outer join includes rows that have no match, i.e. it has a literal meaning including every row from the table on the right is included. Left, right, and full outer joins are supported by most of the platforms. The substituted rows will contain nulls in all columns that's something that database management services do.
So, we could use Coalesce to replace any null with an alternative value, like using a database or column default value to replace that null. That would be one idea.
Okay, so that's the end of our other joins lecture. Close this one, I'll see you in the next one.
Head of Content
Andrew is an AWS certified professional who is passionate about helping others learn how to use and gain benefit from AWS technologies. Andrew has worked for AWS and for AWS technology partners Ooyala and Adobe. His favorite Amazon leadership principle is "Customer Obsession" as everything AWS starts with the customer. Passions around work are cycling and surfing, and having a laugh about the lessons learnt trying to launch two daughters and a few start ups.