Window functions perform calculations across a set of rows that are related to the current row — without collapsing those rows into a single result the way GROUP BY does.
Why this appears in interviews
Window functions appear in virtually every data engineering interview. Knowing GROUP BY is table stakes. Window functions separate candidates who can write complex analytical queries from those who cannot.
The mental model — a sliding frame
You want a column showing each salesperson's running total without losing row-level detail. GROUP BY destroys the detail. A window function slides a frame over the rows, computes within that frame for each row, without collapsing them. Every row keeps its identity; a new computed column is added.
The window is defined by: PARTITION BY (which rows belong together), ORDER BY (order within each partition), and optional frame clause (which rows to include).
The four essential window functions
-- ROW_NUMBER(): unique sequential integer per row within partition
SELECT employee_id, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;
-- WHERE rank = 1 gets highest-paid per department
RANK(): Same rank for ties, then skips numbers (1, 1, 3). DENSE_RANK(): Same rank for ties, no skips (1, 1, 2).
-- LAG() and LEAD(): access previous/subsequent row values
SELECT date, revenue,
LAG(revenue, 1) OVER (ORDER BY date) AS prev_day_revenue,
revenue - LAG(revenue, 1) OVER (ORDER BY date) AS daily_change
FROM daily_revenue;
-- SUM() as window function: running totals and moving averages
SELECT date, revenue,
SUM(revenue) OVER (ORDER BY date) AS running_total,
AVG(revenue) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS avg_7d
FROM daily_revenue;
PARTITION BY vs GROUP BY
-- GROUP BY: One row per department, employee rows lost
SELECT department, AVG(salary)
FROM employees GROUP BY department;
-- PARTITION BY: All rows preserved, dept avg added as new column
SELECT employee_id, department, salary,
AVG(salary) OVER (PARTITION BY department) AS dept_avg,
salary - AVG(salary) OVER (PARTITION BY department) AS diff_from_avg
FROM employees;
Common interview mistakes
Mistake 1: Using a self-join when LAG() is the right tool. "I would join the table to itself to get the previous day's revenue." LAG() does this in one line.
Mistake 2: Confusing ROW_NUMBER and RANK. ROW_NUMBER always gives unique numbers. RANK gives same number to ties but skips the next number. DENSE_RANK gives same number without skipping.
Mistake 3: Forgetting window functions run after WHERE. You cannot filter on a window function result in a WHERE clause — wrap it in a subquery or CTE.
Key vocabulary
- Window function — A function that operates on a set of related rows without collapsing them into a single row.
- PARTITION BY — Divides rows into groups for the window function. Like GROUP BY but preserves individual rows.
- Frame clause — Specifies which rows within the partition are included for each row (
ROWS BETWEEN X PRECEDING AND Y FOLLOWING).