Understanding Correlated Subqueries in PostgreSQL (with Practical Examples)

Correlation (aka correlated subquery) means: a subquery that references columns from the outer query, so it must be re-evaluated for each outer row.

It’s super common for:

  • “Pick rows where something exists / doesn’t exist”
  • “Compare a row to an aggregate of its group”
  • “Top-N per group” (with some variations)

Example tables (with sample data)

1) Correlated subquery with EXISTS (classic “has related rows”)

Goal

Return customers who have at least one order.

What happens (row-by-row intuition)

✅ Output: Amina, Omar


2) Correlated subquery with NOT EXISTS (“has no related rows”)

Goal

Return customers with no orders.

✅ Output: Sara

Tip: NOT EXISTS is usually safer than NOT IN (...) when NULLs could appear.


3) Correlated subquery in SELECT (computed per row)

Goal

Show each customer with their total spent.

Result

4) Correlated subquery for “greater than group average”

Goal

List orders whose amount is above that customer’s average order amount.

Intuition table

✅ Output: order 101


5) “Latest order per customer” using a correlated subquery

Goal

Return each customer’s most recent order.

If two orders tie on the same max date, you’ll get both (often OK, sometimes not).


Performance notes (quick but important) ⚙️

  • A correlated subquery can run “per row”, but PostgreSQL often optimizes it well (especially EXISTS).
  • Indexes matter a lot:
    • orders(customer_id)
    • orders(customer_id, order_date) for latest-per-customer patterns


Leave a Reply

Your email address will not be published. Required fields are marked *

A Backend Engineer who enjoys building scalable systems, clean architectures, and high-performance applications.

About Me

Contact

Copyright: © 2026. All Rights Reserved.