SQL is the language of RDBMS’ses (Relational DataBase Management Systems) and the first word, relational, is the key concept behind these joins. We can create atomic tables but most often we create tables which use information from other tables to prevent data duplication and ease of access.
For example let me write a hypothetical user table:
When I want to search for all the admins, for example, this is a possible query I can write. Take note of the WHERE part:
SELECT * FROM user WHERE Role = 'Admin';
As you can see because the data is written in string, I need to write it exactly. What happens if I write “Admi” or “admin”. Then this query will happily return nothing. Or from the perspective of a programmer, these kinds of string data often complemented with textfields, rather than dropdown menus. Do we expect the user to type everything correctly?
That’s why we often separate these types of data into multiple tables and join them together to give the illusion of they are together in one table. In this second scenario we’ll have an User table and a Role table:
To access those tables together we can use an Inner Join. Inner join should be used if you want all the data corresponding to each other. I’ll discuss what I mean with another example, and this reason is the basis of outer joins. To get the same result as the previous structure of the table we should write this query:
SELECT User.Id, User.Name, Role.Name FROM User INNER JOIN Role ON User.Role_Id = Role.Id
This will glue two tables from the role_id and Role table’s Id column and will give the exact same result previously given. They are divided into two but can be “viewed” as one table. Also usually these kinds of value tables are complemented with dropdown elements which carry a value which is used in backend, and this value is often the id column of the datasource used, and a string label to show what it means to the user and this part is often populated from the textual column of the said datasource.
But what happens when data is mismatched? Like, for some weird reason, a user doesn’t have a role id? Then what?
In this case the inner join will not show Darren. Because it’ll go and try to match the empty data with the glued table, and when it doesn’t find a corresponding data, it decides to hide it.
To get the Darren into our data selection, we need to use Outer Joins!
Outer joins join the two tables and they also show data when a databridge, so to speak, cannot be formed between the glued columns. Usually the first table written in the query (which is in the left if you think graphically and raised in a culture where people read from left to right) is used as a reference. So when we write a query like that with our Darren in the mix, it’ll show every user. Role data however won’t be shown, and filled with NULLs for Darren:
SELECT * FROM User LEFT JOIN Role ON User.Role_Id = Role.Id
This basically says, “okay give me the User table and corresponding role data if you can find it. But most importantly I want the user table in its entirety!”
We can use it to find those orphaned data like the Darren record easily:
SELECT * FROM User LEFT JOIN Role ON User.Role_Id = Role.Id WHERE Role.Id = NULL
As another choice we can easily say RIGHT JOIN but this will effectively give us the same result as INNER JOIN. Because in this case we are saying, “take the Role table as reference and I don’t care any value that’s not connected to it in any way”.
Most commonly Right Outer Join is used rather than Left Outer Join but if a scenario arises for you to use the added table as reference, you have a way to do it.
Thanks for reading, happy SQL-ing!