SQL: JOIN and Table Alias

created:

updated:

tags: database sql

JOIN

JOIN allows us to combine data across two separate tables by unique key.

SELECT column, another_column, ...
FROM tableA
INNER/LEFT/RIGHT/FULL JOIN tableB
ON tableA.id = tableB.id
WHERE condition(s)
ORDER BY column, ... ASC/DESC
LIMIT num_limit OFFSET num_offset;

INNER JOIN

The INNER JOIN is a process that matches rows from the first table and the second table which have the same key (as defined by the ON constraint) to create a result row with the combined columns from both tables.

INNER JOIN and JOIN are equivalent terms.

USING syntax

When the column names for multiple table are the same, we can use USING syntax:

SELECT column, another_column, ...
FROM tableA
INNER JOIN tableB USING(id)
ORDER BY column, ... ASC/DESC
LIMIT num_limit OFFSET num_offset;

OUTER JOIN

OUTER JOIN is used when the two tables have asymmetric data (some rows may not have common key between the two tables).

LEFT JOIN

When joining table A to table B, a left table is table A, and a LEFT JOIN simply includes rows from A regardless of whether a matching row is found in B.

When there is no matching column found in table B, it will fill the colums of the right table (B) with null values.

Example of finding rows from the left table where it doesn’t match the right table:

SELECT a, value_a, b, value_b
FROM table_a
LEFT JOIN table_b ON value_a = value_b
WHERE b is NULL;

RIGHT JOIN

The RIGHT JOIN is the same, but reversed, keeping rows in B regardless whether a match is found in A.

FULL JOIN (FULL OUTER JOIN)

a FULL JOIN simply means that rows from both tables are kept, regardless of whether a matching row exists in the other table.

The full outer join or full join returns a result set that contains all rows from both left and right tables, with the matching rows from both sides if available. In case there is no match, the columns of the table will be filled with NULL.

NULLs

We can check if the data column is NULL or not by using constraint:

SELECT column, another_column, ...
FROM tableA
WHERE column IS NULL (or IS NOT NULL);

Queries with Aggregates

SELECT AGGREGATE_FUNC(column or expression) AS aggregate_description, ...
FROM tableA
WHERE constraint_expression;

Common Aggregate Functions

FunctionDescription
COUNT(*)A common function used to count the number of rows in the group if no column name is specified
COUNT(column)A common function used to count the number of rows in the group with non-NULL values in the specified column
MIN(column)Finds the smallest numerical value in the specified column for all rows in the group
MAX(column)Finds the largest numerical value in the specified column for all rows in the group
AVG(column)Finds the average numerical value in the specified column for all rows in the group
SUM(column)Finds the sum of all numerical values in the specified column for the rows in the group

Grouped Aggregate Functions

We can apply the aggregate functions to individual groups of data within that group by using GROUP BY clause.

SELECT AGGREGATE_FUNC(column or expression) AS aggregate_description, ...
FROM tableA
WHERE constraint_expression
GROUP BY column;

Filter Grouped Rows by HAVING Clause

SELECT group_by_column, AGGREGATE_FUNC(column_expression) as aggregate_result
FROM tableA
WHERE condition
GROUP BY column
HAVING group_condition;

HAVING clause constraint are written the same way as the WHERE clause constraints, and are applied to grouped rows.

The HAVING clause specifies a search condition for a group or an aggregate. The HAVING clause is often used with the GROUP BY clause to filter groups or aggregates based on a specified condition.

CROSS JOIN

A CROSS JOIN clause allows you to produce a Cartesian Product of rows in two or more tables. Different from other join clauses such as LEFT JOIN or INNER JOIN, the CROSS JOIN clause does not have a join predicate.

SELECT select_list
FROM T1
CROSS JOIN T2;

SELECT select_list
FROM T1, T2;

SELECT *
FROM T1
INNER JOIN T2 ON true;

NATURAL JOIN

A natural join is a join that creates an implicit join based on the same column names in the joined tables.

The convenience of the NATURAL JOIN is that it does not require you to specify the join clause because it uses an implicit join clause based on the common column. However, you should avoid using the NATURAL JOIN whenever possible because sometimes it may cause an unexpected result.

ex: When two tables have more than on common column name, using NATURAL JOIN may cause an unexpected result.

Table Alias

Similar to column alias, we can assign an alias for database table name and it has several practical advantages.

More readable table name

We can assign an alias for a long table name:

a_very_long_table_name AS alias

Table alias for JOIN clauses

Usually, we do JOIN to query data from multiple tables that have the same column name. If we use the same column name coming from multiple tables without mentioning the table name, it’ll throw an error. We need to specify the table name where the column name belongs:

table_name.column_name

In case table_name may be long, we can make it shorter using aliases:

SELECT c.customer_id, first_name, amount, payment_date
FROM customer c
INNER JOIN payment p
    ON p.customer_id = c.customer_id
ORDER BY payment_date DESC;

Table alias for self JOIN

A self-join is a regular join that joins a table to itself.

When we do self INNER JOIN or LEFT JOIN, we need to use table aliases, as we cannot reference the same table multiple names within a query.

SELF JOIN Example: Querying hierarchical data

CREATE TABLE employee (
  employee_id INT PRIMARY KEY,
  first_name VARCHAR (255) NOT NULL,
  last_name VARCHAR (255) NOT NULL,
  manager_id INT,
  FOREIGN KEY (manager_id) REFERENCES employee (employee_id)
  ON DELETE CASCADE
);

SELECT e.first_name employee, m.first_name manager
FROM employee e
INNER JOIN employee m 
    ON m.employee_id = e.manager_id
ORDER BY manager;

SELF JOIN Example: Comparing the rows with the same table

Finding all pair of films that have the same length:

SELECT f1.title, f2.title, f1.length
FROM film f1
INNER JOIN film f2 
    ON f1.film_id <> f2.film_id AND
       f1.length = f2.length;

An Example on How to JOIN three tables

We can have two INNER JOIN, one after another:

SELECT c.customer_id, c.first_name customer_first_name, s.first_name staff_first_name, amount, payment_date
FROM customer c
INNER JOIN payment p ON p.customer_id = c.customer_id
INNER JOIN staff s ON p.staff_id = s.staff_id
ORDER BY payment_date;

Joins

Reference