JOIN
JOIN
allows us to combine data across two separate tables by unique key.
SELECT column, another_column, ...
FROM tableA
INNER/LEFT/RIGHT/FULL JOIN tableB
ON tableA.id = tableB.id
WHERE condition(s)
ORDER BY column, ... ASC/DESC
LIMIT num_limit OFFSET num_offset;
INNER JOIN
The
INNER JOIN
is a process that matches rows from the first table and the second table which have the same key (as defined by theON
constraint) to create a result row with the combined columns from both tables.
INNER JOIN
and JOIN
are equivalent terms.
USING
syntax
When the column names for multiple table are the same, we can use USING
syntax:
SELECT column, another_column, ...
FROM tableA
INNER JOIN tableB USING(id)
ORDER BY column, ... ASC/DESC
LIMIT num_limit OFFSET num_offset;
OUTER JOIN
OUTER JOIN
is used when the two tables have asymmetric data (some rows may not have common key between the two tables).
LEFT JOIN
When joining table A to table B, a left table is table A, and a
LEFT JOIN
simply includes rows from A regardless of whether a matching row is found in B.
When there is no matching column found in table B, it will fill the colums of the right table (B) with null values.
Example of finding rows from the left table where it doesn’t match the right table:
SELECT a, value_a, b, value_b
FROM table_a
LEFT JOIN table_b ON value_a = value_b
WHERE b is NULL;
RIGHT JOIN
The
RIGHT JOIN
is the same, but reversed, keeping rows in B regardless whether a match is found in A.
FULL JOIN (FULL OUTER JOIN)
a
FULL JOIN
simply means that rows from both tables are kept, regardless of whether a matching row exists in the other table.
The full outer join or full join returns a result set that contains all rows from both left and right tables, with the matching rows from both sides if available. In case there is no match, the columns of the table will be filled with NULL.
NULLs
We can check if the data column is NULL
or not by using constraint:
SELECT column, another_column, ...
FROM tableA
WHERE column IS NULL (or IS NOT NULL);
Queries with Aggregates
SELECT AGGREGATE_FUNC(column or expression) AS aggregate_description, ...
FROM tableA
WHERE constraint_expression;
Common Aggregate Functions
Function | Description |
---|---|
COUNT(*) | A common function used to count the number of rows in the group if no column name is specified |
COUNT(column) | A common function used to count the number of rows in the group with non-NULL values in the specified column |
MIN(column) | Finds the smallest numerical value in the specified column for all rows in the group |
MAX(column) | Finds the largest numerical value in the specified column for all rows in the group |
AVG(column) | Finds the average numerical value in the specified column for all rows in the group |
SUM(column) | Finds the sum of all numerical values in the specified column for the rows in the group |
Grouped Aggregate Functions
We can apply the aggregate functions to individual groups of data within that group by using GROUP BY
clause.
SELECT AGGREGATE_FUNC(column or expression) AS aggregate_description, ...
FROM tableA
WHERE constraint_expression
GROUP BY column;
Filter Grouped Rows by HAVING
Clause
SELECT group_by_column, AGGREGATE_FUNC(column_expression) as aggregate_result
FROM tableA
WHERE condition
GROUP BY column
HAVING group_condition;
HAVING
clause constraint are written the same way as theWHERE
clause constraints, and are applied to grouped rows.
The
HAVING
clause specifies a search condition for a group or an aggregate. TheHAVING
clause is often used with theGROUP BY
clause to filter groups or aggregates based on a specified condition.
CROSS JOIN
A
CROSS JOIN
clause allows you to produce a Cartesian Product of rows in two or more tables. Different from other join clauses such as LEFT JOIN or INNER JOIN, theCROSS JOIN
clause does not have a join predicate.
SELECT select_list
FROM T1
CROSS JOIN T2;
SELECT select_list
FROM T1, T2;
SELECT *
FROM T1
INNER JOIN T2 ON true;
NATURAL JOIN
A natural join is a join that creates an implicit join based on the same column names in the joined tables.
The convenience of the
NATURAL JOIN
is that it does not require you to specify the join clause because it uses an implicit join clause based on the common column. However, you should avoid using theNATURAL JOIN
whenever possible because sometimes it may cause an unexpected result.
ex: When two tables have more than on common column name, using NATURAL JOIN
may cause an unexpected result.
Table Alias
Similar to column alias, we can assign an alias for database table name and it has several practical advantages.
More readable table name
We can assign an alias for a long table name:
a_very_long_table_name AS alias
Table alias for JOIN
clauses
Usually, we do JOIN
to query data from multiple tables that have the same column name.
If we use the same column name coming from multiple tables without mentioning the table name, it’ll throw an error.
We need to specify the table name where the column name belongs:
table_name.column_name
In case table_name
may be long, we can make it shorter using aliases:
SELECT c.customer_id, first_name, amount, payment_date
FROM customer c
INNER JOIN payment p
ON p.customer_id = c.customer_id
ORDER BY payment_date DESC;
Table alias for self JOIN
A self-join is a regular join that joins a table to itself.
When we do self INNER JOIN
or LEFT JOIN
, we need to use table aliases, as we cannot reference the same table multiple names within a query.
SELF JOIN Example: Querying hierarchical data
CREATE TABLE employee (
employee_id INT PRIMARY KEY,
first_name VARCHAR (255) NOT NULL,
last_name VARCHAR (255) NOT NULL,
manager_id INT,
FOREIGN KEY (manager_id) REFERENCES employee (employee_id)
ON DELETE CASCADE
);
SELECT e.first_name employee, m.first_name manager
FROM employee e
INNER JOIN employee m
ON m.employee_id = e.manager_id
ORDER BY manager;
SELF JOIN Example: Comparing the rows with the same table
Finding all pair of films that have the same length:
SELECT f1.title, f2.title, f1.length
FROM film f1
INNER JOIN film f2
ON f1.film_id <> f2.film_id AND
f1.length = f2.length;
An Example on How to JOIN three tables
We can have two INNER JOIN
, one after another:
SELECT c.customer_id, c.first_name customer_first_name, s.first_name staff_first_name, amount, payment_date
FROM customer c
INNER JOIN payment p ON p.customer_id = c.customer_id
INNER JOIN staff s ON p.staff_id = s.staff_id
ORDER BY payment_date;