JOIN
It lets you combine multiple data sources (subqueries or tables) by equality of values in the specified columns or expressions (the JOIN
keys).
Syntax
SELECT ... FROM table_1
-- first JOIN step:
<Join_Type> JOIN table_2 <Join_Condition>
-- left subquery -- entries in table_1
-- right subquery -- entries in table_2
-- next JOIN step:
<Join_Type> JOIN table_n <Join_Condition>
-- left subquery -- JOIN result in the previous step
-- right subquery -- entries in table_n
-- JOIN can include the following steps
...
WHERE ...
At each JOIN step, rules are used to establish correspondences between rows in the left and right data subqueries, creating a new subquery that includes every combination of rows that meet the JOIN conditions.
Attention!
Since columns in YQL are identified by their names, and you can't have two columns with the same name in the subquery, SELECT * FROM ... JOIN ...
can't be executed if there are columns with identical names in the joined tables.
Types of join
INNER
(default): Rows from joined subqueries that don't match any rows on the other side won't be included in the result.LEFT
: If there's no value in the right subquery, it adds a row to the result with column values from the left subquery, usingNULL
in columns from the right subqueryRIGHT
: If there's no value in the left subquery, it adds the row to the result, including column values from the right subquery, but usingNULL
in columns from the left subqueryFULL
=LEFT
+RIGHT
LEFT/RIGHT SEMI
: One side of the subquery is a whitelist of keys, its values are not available. The result includes columns from one table only, no Cartesian product is created.LEFT/RIGHT ONLY
: Subtracting the sets by keys (blacklist). It's almost the same as addingIS NULL
to the key on the opposite side in the regularLEFT/RIGHT
JOIN, but with no access to values: the same asSEMI
JOIN.CROSS
: A full Cartesian product of two tables without specifying key columns and no explicitON/USING
.EXCLUSION
: Both sides minus the intersection.
Note
NULL
is a special value to denote nothing. Hence, NULL
values on both sides are not treated as equal to each other. This eliminates ambiguity in some types of JOIN
and avoids a giant Cartesian product otherwise created.
Conditions for joining
For CROSS JOIN
, no join condition is specified. The result includes the Cartesian product of the left and right subquery, meaning it combines everything with everything. The number of rows in the resulting subquery is the product of the number of rows in the left and right subqueries.
For any other JOIN types, specify the condition using one of the two methods:
USING (column_name)
. Used if both the left and right subqueries share a column whose equality of values is a join condition.ON (equality_conditions)
. Lets you set a condition of equality for column values or expressions over columns in the left and right subqueries or use several such conditions combined byand
.
Examples
SELECT a.value as a_value, b.value as b_value
FROM a_table AS a
FULL JOIN b_table AS b USING (key);
SELECT a.value as a_value, b.value as b_value
FROM a_table AS a
FULL JOIN b_table AS b ON a.key = b.key;
SELECT a.value as a_value, b.value as b_value, c.column2
FROM a_table AS a
CROSS JOIN b_table AS b
LEFT JOIN c_table AS c ON c.ref = a.key and c.column1 = b.value;
To make sure no full scan of the right joined table is required, a secondary index can be applied to the columns included in the Join condition. Accessing a secondary index should be specified explicitly in JOIN table_name VIEW index_name AS table_alias
format.
For example, creating an index to use in the Join condition:
ALTER TABLE b_table ADD INDEX b_index_ref GLOBAL ON(ref);
Using the created index:
SELECT a.value as a_value, b.value as b_value
FROM a_table AS a
INNER JOIN b_table VIEW b_index_ref AS b ON a.ref = b.ref;