0

Window functions — ROW_NUMBER, RANK, LAG, LEAD, SUM OVER

Advanced5 min read·eng-06-009
sqlinterview

Concept

Window functions perform calculations across a set of rows that are related to the current row — without collapsing rows the way GROUP BY does. A window function keeps all rows in the result, attaching a computed value to each.

Anatomy: function_name() OVER (PARTITION BY ... ORDER BY ... ROWS/RANGE ...)

  • PARTITION BY: Like GROUP BY within the window — resets the function's state for each partition. Optional.
  • ORDER BY: Determines the order within each partition. Required for ranking and cumulative functions.
  • ROWS/RANGE: Defines the window frame within the partition.

Ranking functions:

  • ROW_NUMBER(): Unique sequential number (1, 2, 3...). Ties get different numbers.
  • RANK(): Like ROW_NUMBER but ties get the same rank. Gaps after ties: 1, 2, 2, 4.
  • DENSE_RANK(): Like RANK but no gaps: 1, 2, 2, 3.

Offset functions:

  • LAG(col, n): Value of col from n rows BEFORE the current row. Default n=1.
  • LEAD(col, n): Value of col from n rows AFTER the current row.

Aggregate window functions: SUM(), AVG(), COUNT(), MIN(), MAX() used as window functions. With ORDER BY, they compute running totals/averages.

Window functions vs GROUP BY: GROUP BY collapses rows to one per group. Window functions keep all rows and add a computed column.

SQL support: Standard in PostgreSQL, MySQL 8.0+, SQLite 3.25+. Not available in MySQL 5.x.

Code Example

sql
-- Ranking employees by salary within each department
SELECT
    name,
    department,
    salary,
    ROW_NUMBER()  OVER (PARTITION BY department ORDER BY salary DESC) AS row_num,
    RANK()        OVER (PARTITION BY department ORDER BY salary DESC) AS rank,
    DENSE_RANK()  OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank
FROM employees;

-- Result:
-- name  | dept | salary | row_num | rank | dense_rank
-- Alice | Eng  | 120000 | 1       | 1    | 1
-- Bob   | Eng  | 100000 | 2       | 2    | 2
-- Carol | Eng  | 100000 | 3       | 2    | 2  (tied)
-- Dave  | Eng  |  90000 | 4       | 4    | 3  (rank has gap, dense_rank doesn't)
-- Eve   | HR   |  80000 | 1       | 1    | 1  (resets for new department)

-- Running total (cumulative SUM)
SELECT
    order_date,
    amount,
    SUM(amount) OVER (ORDER BY order_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM orders;

-- Month-over-month comparison with LAG
SELECT
    month,
    revenue,
    LAG(revenue, 1) OVER (ORDER BY month)                          AS prev_month_revenue,
    revenue - LAG(revenue, 1) OVER (ORDER BY month)               AS revenue_change,
    ROUND(
        (revenue - LAG(revenue, 1) OVER (ORDER BY month))
        / LAG(revenue, 1) OVER (ORDER BY month) * 100, 2
    ) AS pct_change
FROM monthly_revenue;

-- Get the top 1 employee per department (using ROW_NUMBER in a subquery)
SELECT name, department, salary
FROM (
    SELECT name, department, salary,
           ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
) ranked
WHERE rn = 1;

-- Moving average (last 3 months)
SELECT
    month,
    revenue,
    AVG(revenue) OVER (ORDER BY month ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS avg_3months
FROM monthly_revenue;