Demystifying SQL Joins: A Comprehensive Guide — Codes With Pankaj

Codes With Pankaj
9 min readSep 19, 2023

--

Structured Query Language (SQL) is a powerful tool for managing and extracting data from relational databases. One of its fundamental capabilities is joining tables to combine and retrieve data from multiple sources. In this blog post, we will delve into the world of SQL joins, exploring five common types: INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN, and SELF JOIN.

INNER JOIN: Combining Data

SQL INNER JOIN is used to combine rows from two or more tables based on a related column between them. Only the rows with matching values in the specified columns will be included in the result set. Here’s an example of how to use an INNER JOIN:

Suppose we have two tables: customers and orders, and we want to retrieve a list of customers who have placed orders. The customers table contains information about customers, and the orders table contains information about orders, including a customer ID that links each order to a customer.

Here are the two tables:

customers table:

+----+-----------+-------------+
| id | name | email |
+----+-----------+-------------+
| 1 | John Doe | john@p4n.in |
| 2 | Jane Smith| jane@p4n.in |
| 3 | Bob Brown | bob@p4n.in |
+----+-----------+-------------+

orders table:

+-----+------------+--------+
| id | customer_id| amount |
+-----+------------+--------+
| 101 | 1 | 50.00 |
| 102 | 2 | 75.00 |
| 103 | 1 | 30.00 |
| 104 | 3 | 40.00 |
+-----+------------+--------+

Now, let’s perform an INNER JOIN to get a list of customers who have placed orders:

SELECT customers.name, orders.amount
FROM customers
INNER JOIN orders ON customers.id = orders.customer_id;

In this SQL query:

  • We specify the tables we want to join (customers and orders).
  • We use INNER JOIN to indicate that we only want to include rows with matching customer IDs in both tables.
  • We specify the join condition using the ON clause, which matches rows where the id column in the customers table matches the customer_id column in the orders table.

The result of this query will be :

+-----------+--------+
| name | amount |
+-----------+--------+
| John Doe | 50.00 |
| Jane Smith| 75.00 |
| John Doe | 30.00 |
| Bob Brown | 40.00 |
+-----------+--------+

This result set shows the names of customers who have placed orders along with the corresponding order amounts. The INNER JOIN ensures that only customers with matching orders are included in the result.

LEFT JOIN: Inclusive Extraction

SQL LEFT JOIN, also known as a LEFT OUTER JOIN, is used to retrieve all rows from the left table (the first table mentioned in the query) and the matched rows from the right table (the second table mentioned) based on a specified condition. If there are no matches in the right table, the result will still contain all the rows from the left table, with NULL values in the columns from the right table.

Here’s an example of how to use a LEFT JOIN:

Suppose we have two tables: employees and departments, and we want to retrieve a list of all employees and the department they belong to, if they are assigned to a department. The employees table contains information about employees, and the departments table contains information about departments, including a department ID that links employees to their respective departments.

Here are the two tables:

employees table:

+------+----------+-----------+
| emp_id | emp_name | dept_id |
+------+----------+-----------+
| 1 | John Doe | 101 |
| 2 | Jane Smith | 102 |
| 3 | Bob Brown | 101 |
| 4 | Alice Lee | NULL |
+------+----------+-----------+

departments table:

+--------+-------------+
| dept_id | dept_name |
+--------+-------------+
| 101 | HR |
| 102 | IT |
| 103 | Sales |
+--------+-------------+

Now, let’s perform a LEFT JOIN to get a list of all employees and their respective departments (if assigned to one):

SELECT employees.emp_name, departments.dept_name
FROM employees
LEFT JOIN departments ON employees.dept_id = departments.dept_id;

In this SQL query:

  • We specify the tables we want to join (employees and departments).
  • We use LEFT JOIN to indicate that we want all rows from the employees table and only the matching rows from the departments table.
  • We specify the join condition using the ON clause, which matches rows where the dept_id column in the employees table matches the dept_id column in the departments table.

The result of this query will be:

+----------+-------------+
| emp_name | dept_name |
+----------+-------------+
| John Doe | HR |
| Jane Smith| IT |
| Bob Brown | HR |
| Alice Lee | NULL |
+----------+-------------+

In the result set, you can see that all employees are included in the output, even if they are not assigned to a department (e.g., Alice Lee). When there is no match in the departments table for an employee, the dept_name column contains NULL.

RIGHT JOIN: The Mirror Image.

SQL RIGHT JOIN, also known as a RIGHT OUTER JOIN, is used to retrieve all rows from the right table (the second table mentioned in the query) and the matched rows from the left table (the first table mentioned) based on a specified condition. If there are no matches in the left table, the result will still contain all the rows from the right table, with NULL values in the columns from the left table.

Here’s an example of how to use a RIGHT JOIN:

Suppose we have two tables: orders and customers, and we want to retrieve a list of all orders and the customer information for customers who have placed those orders. The orders table contains information about orders, and the customers table contains information about customers, including a customer ID that links each order to a customer.

Here are the two tables:

orders table:

+-----+------------+--------+
| id | customer_id| amount |
+-----+------------+--------+
| 101 | 1 | 50.00 |
| 102 | 2 | 75.00 |
| 103 | 1 | 30.00 |
| 104 | 3 | 40.00 |
+-----+------------+--------+

customers table:

+----+-----------+------------+
| id | name | email |
+----+-----------+------------+
| 1 | John Doe | john@p4n.in|
| 2 | Jane Smith| jane@p4n.in|
| 3 | Bob Brown | bob@p4n.in |
+----+-----------+------------+

Now, let’s perform a RIGHT JOIN to get a list of all orders and the corresponding customer information (if available):

SELECT customers.name, orders.amount
FROM orders
RIGHT JOIN customers ON orders.customer_id = customers.id;

In this SQL query:

  • We specify the tables we want to join (orders and customers).
  • We use RIGHT JOIN to indicate that we want all rows from the customers table and only the matching rows from the orders table.
  • We specify the join condition using the ON clause, which matches rows where the customer_id column in the orders table matches the id column in the customers table.

The result of this query will be:

+-----------+--------+
| name | amount |
+-----------+--------+
| John Doe | 50.00 |
| Jane Smith| 75.00 |
| John Doe | 30.00 |
| Bob Brown | 40.00 |
+-----------+--------+

In the result set, you can see that all orders are included in the output, even if they do not have a matching customer (e.g., the customer information for order 104 is missing). When there is no match in the customers table for an order, the name column contains NULL.

FULL JOIN: Comprehensive Data Retrieval

FULL JOIN, also known as a FULL OUTER JOIN, combines the results of both a LEFT JOIN and a RIGHT JOIN. It returns all rows from both tables, matching rows where the specified condition is met, and filling in NULL values for columns where there is no match in either the left or the right table.

Here’s an example of how to use a FULL JOIN:

Suppose we have two tables: employees and departments, and we want to retrieve a list of all employees and their respective departments (if assigned to one), including employees who are not assigned to any department. The employees table contains information about employees, and the departments table contains information about departments, including a department ID that links employees to their respective departments.

Here are the two tables:

employees table:

+------+----------+-----------+
| emp_id | emp_name | dept_id |
+------+----------+-----------+
| 1 | John Doe | 101 |
| 2 | Jane Smith | 102 |
| 3 | Bob Brown | 101 |
| 4 | Alice Lee | NULL |
+------+----------+-----------+

departments table:

+--------+-------------+
| dept_id | dept_name |
+--------+-------------+
| 101 | HR |
| 102 | IT |
| 103 | Sales |
+--------+-------------+

Now, let’s perform a FULL JOIN to get a list of all employees and their respective departments (if available):

SELECT employees.emp_name, departments.dept_name
FROM employees
FULL JOIN departments ON employees.dept_id = departments.dept_id;

In this SQL query:

  • We specify the tables we want to join (employees and departments).
  • We use FULL JOIN to indicate that we want all rows from both the employees and departments tables.
  • We specify the join condition using the ON clause, which matches rows where the dept_id column in the employees table matches the dept_id column in the departments table.

The result of this query will be:

+------------+-------------+
| emp_name | dept_name |
+------------+-------------+
| John Doe | HR |
| Jane Smith | IT |
| Bob Brown | HR |
| Alice Lee | NULL |
| NULL | Sales |
+------------+-------------+

In the result set, you can see that all employees and departments are included in the output, and NULL values are filled in where there are no matches (e.g., Alice Lee is not assigned to any department, and there is a department with no employees in Sales).

SELF JOIN: Linking Data Within.

SQL SELF JOIN is a type of join where you join a table with itself. This can be useful when you have hierarchical data or when you want to compare rows within the same table. Here’s an example of how to use a SELF JOIN:

Suppose you have a table called employees with the following structure:

CREATE TABLE employees (
emp_id INT PRIMARY KEY,
emp_name VARCHAR(50),
manager_id INT
);

In this table, each row represents an employee, and the manager_id column refers to the employee's manager. The manager_id is a foreign key that points to the emp_id of another employee in the same table.

Here’s some sample data:

+-------+------------+------------+
| emp_id| emp_name | manager_id |
+-------+------------+------------+
| 1 | John Doe | NULL |
| 2 | Jane Smith | 1 |
| 3 | Bob Brown | 1 |
| 4 | Alice Lee | 2 |
| 5 | Carol White| 3 |
+-------+------------+------------+

In this example, John Doe is the top-level manager (manager_id is NULL), and employees report to their respective managers.

Now, let’s say we want to retrieve a list of employees along with the names of their managers. We can use a SELF JOIN to achieve this:

SELECT e1.emp_name AS employee_name, e2.emp_name AS manager_name
FROM employees e1
LEFT JOIN employees e2 ON e1.manager_id = e2.emp_id;

In this SQL query:

  • We give aliases to the same table to distinguish between the employee and manager roles: e1 represents the employee, and e2 represents the manager.
  • We perform a LEFT JOIN on the employees table, joining e1.manager_id with e2.emp_id. This establishes the relationship between employees and their managers.
  • We select the emp_name column from e1 as employee_name and the emp_name column from e2 as manager_name.

The result of this query will be:

+---------------+---------------+
| employee_name | manager_name |
+---------------+---------------+
| John Doe | NULL |
| Jane Smith | John Doe |
| Bob Brown | John Doe |
| Alice Lee | Jane Smith |
| Carol White | Bob Brown |
+---------------+---------------+

In the result set, you can see the list of employees along with their respective managers. John Doe, being a top-level manager, has NULL as the manager’s name. Other employees have their managers’ names listed.

Conclusion

SQL joins are powerful tools for extracting and combining data from relational databases. By understanding the five common types of joins — INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN, and SELF JOIN — you can manipulate and query your data effectively. Whether you’re retrieving data from different tables or establishing relationships within the same table, SQL joins are essential in your data manipulation toolkit.

--

--