Joins & relationships
Status: 🟩 COMPLETE Last updated: 2026-06-19 Plain-English tagline: How tables connect to each other. One-to-many, many-to-many, foreign keys, junction tables, the four JOIN types. The piece of SQL that confuses people the most — and rewards understanding the most.
In plain English
Real data lives across multiple tables. Users in one. Posts in another. Comments in a third. To answer “what comments did George leave on Sarah’s post?” you need to combine information from all three.
Joins are how SQL combines tables. Relationships are the conceptual connections (a user has many posts; a post has many comments) that joins navigate.
A few core ideas:
- Foreign keys are the connection points — a column in one table that references the primary key of another (
posts.author_id→users.id). - One-to-many means one row in table A relates to many rows in table B. (One user, many posts.)
- Many-to-many means rows on both sides can have many on the other. (One user in many teams; one team has many users.) Requires a junction table.
- Joins are how queries traverse these relationships.
Modeling relationships well is half of database design. Joining tables fluently is half of writing queries. Most real-world SQL questions are “how do I join these tables?”
Why it matters
- Real data is relational. Almost every webapp’s data has relationships. You can’t model anything substantial without them.
- Joins are where queries go wrong. Wrong join type → missing rows or duplicates. Wrong join condition → cross product disaster. Understanding joins prevents these.
- ORMs hide joins but don’t eliminate them. Knowing what’s happening underneath lets you debug and optimize.
- Performance depends on indexes that support joins. Joins on indexed columns: fast. On unindexed: slow.
Foreign keys
A foreign key is a column whose value must match a primary key value in another table.
CREATE TABLE users (
id uuid PRIMARY KEY,
email text UNIQUE
);
CREATE TABLE posts (
id uuid PRIMARY KEY,
author_id uuid REFERENCES users(id), -- foreign key
title text
);The REFERENCES users(id) says: “Every value of posts.author_id must be a real user’s id.” The database enforces this:
- Insert a post with non-existent author_id → error
- Delete a user with existing posts → error (or cascade, depending on config)
Foreign keys are how the database enforces referential integrity — your data can’t get into inconsistent states.
ON DELETE actions
When the referenced row is deleted, what happens to the referencing row?
| Action | Behavior |
|---|---|
CASCADE | Delete the referencing rows too |
SET NULL | Set the foreign key column to NULL |
RESTRICT | Prevent the deletion (default in some DBs) |
NO ACTION | Like RESTRICT but checked later in the transaction |
-- If a user is deleted, delete all their posts
author_id uuid REFERENCES users(id) ON DELETE CASCADE
-- If a user is deleted, leave the post but null the author
author_id uuid REFERENCES users(id) ON DELETE SET NULLChoose deliberately. CASCADE is convenient but can cause unintended mass deletions. SET NULL preserves the row but loses the link. RESTRICT makes deletion explicit.
Relationship cardinalities
One-to-one
Each row in A has at most one corresponding row in B (and vice versa).
CREATE TABLE users (id uuid PRIMARY KEY, ...);
CREATE TABLE user_settings (
user_id uuid PRIMARY KEY REFERENCES users(id),
theme text
);The user_id is both the PK of user_settings AND a FK to users — enforcing one-to-one.
Use when: settings or profile data that’s optional / has its own access patterns. Often could just be more columns on the main table; split when there’s a reason (different access frequency, different RLS rules, optional).
One-to-many (the most common)
One row in A has many corresponding rows in B.
-- One user has many posts
CREATE TABLE posts (
id uuid PRIMARY KEY,
author_id uuid REFERENCES users(id)
);The “many” side carries the foreign key. Add an index for fast lookup.
Many-to-many
Rows on both sides can have many on the other. Implemented via a junction table:
-- Users can be in many teams; teams have many users
CREATE TABLE users (id uuid PRIMARY KEY, ...);
CREATE TABLE teams (id uuid PRIMARY KEY, ...);
CREATE TABLE team_members (
team_id uuid REFERENCES teams(id) ON DELETE CASCADE,
user_id uuid REFERENCES users(id) ON DELETE CASCADE,
role text,
joined_at timestamptz DEFAULT now(),
PRIMARY KEY (team_id, user_id)
);Junction tables often carry their own data — here, role and joined_at describe the membership relationship itself.
The composite primary key (team_id, user_id) prevents duplicate memberships.
Self-referential
Tables that reference themselves.
-- Categories can have parent categories
CREATE TABLE categories (
id uuid PRIMARY KEY,
name text,
parent_id uuid REFERENCES categories(id)
);
-- Employees and managers
CREATE TABLE employees (
id uuid PRIMARY KEY,
name text,
manager_id uuid REFERENCES employees(id)
);Useful for trees, hierarchies, social follows.
The four JOIN types
INNER JOIN — only matching rows
SELECT posts.title, users.name
FROM posts
INNER JOIN users ON users.id = posts.author_id;Returns: every post that has an author + every author that has a post. Either side missing a match → row dropped.
INNER JOIN is the default; you can write just JOIN.
LEFT JOIN — all from left, matching from right
SELECT users.name, posts.title
FROM users
LEFT JOIN posts ON posts.author_id = users.id;Returns: every user, with their posts (if any). Users with no posts show NULL for post columns.
Useful for “show me each user with their data, including users who have none.”
RIGHT JOIN — mirror of LEFT
Same as LEFT JOIN with tables swapped. Rarely used in practice; just swap which table is “left” in your query.
FULL OUTER JOIN — all rows from both sides
SELECT users.name, posts.title
FROM users
FULL OUTER JOIN posts ON posts.author_id = users.id;Returns: every row from both tables. NULLs where no match exists on either side. Used in data-quality checks (“what’s orphaned?”) more than business queries.
A concrete example: querying related data
Schema:
users: id, name, email
posts: id, author_id (FK), title, body, created_at
comments: id, post_id (FK), author_id (FK), body, created_atQ1: All posts with their author’s name
SELECT posts.title, posts.created_at, users.name AS author
FROM posts
JOIN users ON users.id = posts.author_id;Q2: All users with their post count (including users with 0 posts)
SELECT users.name, COUNT(posts.id) AS post_count
FROM users
LEFT JOIN posts ON posts.author_id = users.id
GROUP BY users.id, users.name
ORDER BY post_count DESC;LEFT JOIN ensures users with 0 posts are still included with post_count = 0.
Q3: Posts with comment counts
SELECT posts.title, COUNT(comments.id) AS comment_count
FROM posts
LEFT JOIN comments ON comments.post_id = posts.id
GROUP BY posts.id, posts.title;Q4: George’s posts with their comments
SELECT posts.title, comments.body, comment_authors.name AS commenter
FROM posts
LEFT JOIN comments ON comments.post_id = posts.id
LEFT JOIN users AS comment_authors ON comment_authors.id = comments.author_id
WHERE posts.author_id = (SELECT id FROM users WHERE name = 'George');Note users AS comment_authors — joining the same table twice requires aliases.
Q5: Most commented posts
SELECT posts.title, COUNT(comments.id) AS comment_count
FROM posts
JOIN comments ON comments.post_id = posts.id
GROUP BY posts.id, posts.title
ORDER BY comment_count DESC
LIMIT 10;The N+1 problem (again)
// Bad: N+1
const posts = await db.posts.findMany();
for (const post of posts) {
post.author = await db.users.find(post.author_id); // separate query each loop
}
// Good: one query with JOIN
const posts = await db.posts.findMany({ include: { author: true } });
// Generates SQL with JOINORMs help with include / with syntax. Raw SQL has JOIN. App-code loops are slow.
For the Supabase JS SDK:
const { data } = await supabase
.from("posts")
.select(`
*,
author:users (id, name, email)
`);Supabase translates this to a JOIN internally.
Indexes for joins
Joins are fast when join columns are indexed.
-- Joining on posts.author_id frequently? Index it.
CREATE INDEX posts_author_idx ON posts(author_id);
-- Joining comments to posts? Index comments.post_id.
CREATE INDEX comments_post_idx ON comments(post_id);For composite indexes: if you frequently filter AND join, index on (filter_column, join_column).
Foreign keys do NOT auto-index in Postgres. Add manually for every FK you join on.
Common gotchas
-
Forgetting indexes on foreign keys. Postgres doesn’t auto-index FKs. JOIN performance suffers.
-
Cartesian product from missing ON.
JOIN users ... ON ...without the ON returns every-row Ă— every-row. Disaster. -
Wrong JOIN type.
INNER JOINdrops rows with no match.LEFT JOINkeeps them. Using INNER when you wanted LEFT silently loses data. -
Joining on wrong column.
posts.author_id = users.author_idinstead ofusers.id— looks similar, totally wrong. -
Joining many-to-many without dedup.
SELECT users.* FROM users JOIN posts ...returns duplicate users (one per post). UseDISTINCTor aggregate. -
GROUP BY mistakes after JOIN. Selecting non-aggregated columns that aren’t in GROUP BY — error or surprising result.
-
ON DELETE CASCADEyou didn’t think through. Deleting a user might cascade to delete posts, comments, votes, profile, sessions, everything. Sometimes right, sometimes catastrophic. -
Junction table without composite PK. Allows duplicate “George is in Team A” rows. Use composite PK.
-
COUNT(*)vsCOUNT(table.column)after LEFT JOIN. With LEFT JOIN, unmatched rows have NULLs.COUNT(*)counts all rows (including the NULL ones);COUNT(comments.id)counts only non-NULL. -
Multiple JOINs creating row explosion.
users JOIN posts JOIN comments— a user with 10 posts and 100 comments per post gives 1000 rows for that one user. Aggregate or use subqueries. -
Filter in WHERE vs ON.
LEFT JOIN ... ON ... AND posts.published = truekeeps users with no published posts (showing NULL).WHERE posts.published = trueafter LEFT JOIN turns it back into an INNER JOIN. Subtle. -
SELECT *after JOIN. Pulls all columns from all tables. Ambiguous if columns share names. Be explicit. -
Joining huge tables without limits. Returning a billion-row result kills your app and your DB. Always LIMIT.
-
Self-join without alias. Joining a table to itself requires aliases; otherwise column references are ambiguous.
-
Wrong direction of relationship. “Each post has many authors” vs. “each author has many posts.” Schema and queries depend on which.
-
Many-to-many via array column.
tags text[]makes “posts tagged X” awkward. Junction table is better. -
Joining on non-indexed text columns. Joins on
lower(email)without an expression index = full table scan. -
Implicit type casts in JOIN.
JOIN ... ON a.id = b.id::textwhen types differ — slow and surprising. -
JOIN order changing results. It shouldn’t for INNER joins. For outer joins, order matters semantically.
-
Not using LATERAL JOINs when needed. For correlated subqueries that should logically be joins. Advanced but powerful.
-
GROUP BY with selected columns that aren’t grouped. Strict SQL errors; lenient SQL gives arbitrary results.
-
Window functions vs aggregations confusion.
COUNT(*) OVER (PARTITION BY ...)is different fromCOUNT(*)withGROUP BY. Both have their uses. -
Junction table updates. Adding a row to a junction can fail if FK references invalid. Insert parent rows first.
-
Cascade chains. Cascading deletes through 5 tables. Hard to reason about; surprises in production.
See also
- What is a database? đźź©
- SQL — the language 🟩 — JOIN syntax in depth
- Postgres 🟩 🟦
- Schema design 🟩 — designs relationships
- Indexes & performance 🟩 — fast joins need indexes
- Transactions & ACID đźź©
- Migrations đźź©
- Supabase 🟩 🟦
- Row-Level Security 🟩 — RLS can use joins in policies
- Glossary: Joins (SQL), Key (database — primary, foreign)