0

WHERE clause compilation — conditions and bindings

Advanced5 min read·fw-07-003
sql

Concept

The WHERE clause compiler is the most complex part of a query builder because it must handle three concerns simultaneously: boolean logic (AND/OR), grouping (parentheses around OR sub-expressions), and the full variety of condition types (basic comparison, IN, NOT IN, NULL, BETWEEN, EXISTS). Getting the grouping wrong produces queries that are syntactically valid but semantically incorrect — WHERE a = 1 OR b = 2 AND c = 3 is not the same as WHERE (a = 1 OR b = 2) AND c = 3.

The rule for when to wrap in parentheses: any group of conditions joined by OR that appears alongside AND conditions must be parenthesised as a unit. Laravel solves this in Grammar::compileWheresToArray() by collecting SQL fragments and then joining them — the first element gets no boolean prefix, subsequent elements get their boolean value prepended. For nested groups, the builder supports a closure-based where(function($q) { ... }) syntax that creates a sub-builder, compiles it into a parenthesised fragment, and embeds that fragment as a single where record of type nested.

Named bindings require discipline when conditions are built incrementally. Each call to where() must generate a unique placeholder name, even if the same column or value appears multiple times. The counter-based approach from the previous lesson (:where_0`, `:where_1) ensures uniqueness. PDO will reject a statement where the same named placeholder appears twice and is bound to different values — the second binding silently overwrites the first in some drivers.

The BETWEEN condition is worth calling out explicitly. BETWEEN :where_0 AND :where_1 uses two bindings for one condition. The WHERE compiler must register both values in the bindings map and produce placeholders for each. This is why conditions are stored as structured arrays rather than partially-compiled strings — only at compile time do we generate the final SQL with the correct number of placeholders.

Condition typeSQL outputBinding count
where('col', 'val')col = :where_N1
whereIn('col', [1,2,3])col IN (:w_N, :w_N+1, :w_N+2)3
whereNull('col')col IS NULL0
whereBetween('col', [1,10])col BETWEEN :w_N AND :w_N+12
where(fn($q) => ...)(nested SQL)varies

Code Example

php
<?php
declare(strict_types=1);

namespace Lumen\Database;

/**
 * Extended SelectQuery with nested where groups and BETWEEN support.
 *
 * Builds on fw-07-001. Only the new/changed methods are shown here;
 * in the real framework file this merges into SelectQuery.
 */
class SelectQuery
{
    // ... all state from fw-07-001 ...

    protected string $table = '';
    protected array $columns = ['*'];
    protected array $wheres = [];
    protected array $joins = [];
    protected array $orderBys = [];
    protected array $groups = [];
    protected ?int $limitValue  = null;
    protected ?int $offsetValue = null;
    protected array $bindings = [];
    private int $bindingCounter = 0;

    public function __construct(protected readonly Connection $connection) {}

    // ----------------------------------------------------------------
    // New: nested where group (closure-based)
    // ----------------------------------------------------------------

    /**
     * Add a nested where group enclosed in parentheses.
     *
     *   $query->where('active', 1)
     *         ->where(function(SelectQuery $q) {
     *             $q->where('role', 'admin')->orWhere('role', 'editor');
     *         });
     *
     * Compiles to: `active` = :w_0 AND (`role` = :w_1 OR `role` = :w_2)
     */
    public function where(string|callable $column, mixed $operatorOrValue = null, mixed $value = null, string $boolean = 'AND'): static
    {
        if (is_callable($column)) {
            return $this->whereNested($column, $boolean);
        }

        [$operator, $resolvedValue] = $this->prepareOperatorAndValue($operatorOrValue, $value);
        $placeholder = $this->addBinding($resolvedValue);

        $this->wheres[] = [
            'type'     => 'basic',
            'column'   => $column,
            'operator' => $operator,
            'value'    => $placeholder,
            'boolean'  => $boolean,
        ];

        return $this;
    }

    protected function whereNested(callable $callback, string $boolean = 'AND'): static
    {
        // Create a child builder that shares our binding counter so
        // placeholder names remain globally unique across parent + child.
        $nested = $this->newQuery();
        $callback($nested);

        // Merge the child's bindings into our own
        foreach ($nested->getBindings() as $key => $val) {
            $this->bindings[$key] = $val;
        }

        $this->wheres[] = [
            'type'    => 'nested',
            'query'   => $nested,
            'boolean' => $boolean,
        ];

        return $this;
    }

    protected function newQuery(): static
    {
        // Share the binding counter by passing by reference is not clean;
        // instead, start the child counter after ours.
        $child = new static($this->connection);
        $child->bindingCounter = $this->bindingCounter;
        // After child is built we sync the counter back — handled by
        // merging bindings above, so child counter state doesn't matter.
        return $child;
    }

    // ----------------------------------------------------------------
    // New: BETWEEN
    // ----------------------------------------------------------------

    /**
     * @param array{0: mixed, 1: mixed} $values
     */
    public function whereBetween(string $column, array $values, string $boolean = 'AND', bool $not = false): static
    {
        [$from, $to] = $values;
        $pFrom = $this->addBinding($from);
        $pTo   = $this->addBinding($to);

        $this->wheres[] = [
            'type'    => $not ? 'not_between' : 'between',
            'column'  => $column,
            'from'    => $pFrom,
            'to'      => $pTo,
            'boolean' => $boolean,
        ];

        return $this;
    }

    // ----------------------------------------------------------------
    // Existing methods (unchanged from fw-07-001)
    // ----------------------------------------------------------------

    public function orWhere(string|callable $column, mixed $operatorOrValue = null, mixed $value = null): static
    {
        return $this->where($column, $operatorOrValue, $value, 'OR');
    }

    public function whereIn(string $column, array $values, string $boolean = 'AND', bool $not = false): static
    {
        $placeholders = array_map(fn($v) => $this->addBinding($v), $values);

        $this->wheres[] = [
            'type'    => $not ? 'not_in' : 'in',
            'column'  => $column,
            'values'  => $placeholders,
            'boolean' => $boolean,
        ];

        return $this;
    }

    public function whereNull(string $column, string $boolean = 'AND', bool $not = false): static
    {
        $this->wheres[] = [
            'type'    => $not ? 'not_null' : 'null',
            'column'  => $column,
            'boolean' => $boolean,
        ];
        return $this;
    }

    public function whereNotNull(string $column): static
    {
        return $this->whereNull($column, 'AND', true);
    }

    protected function addBinding(mixed $value): string
    {
        $key = ':where_' . $this->bindingCounter++;
        $this->bindings[$key] = $value;
        return $key;
    }

    protected function prepareOperatorAndValue(mixed $operatorOrValue, mixed $value): array
    {
        $ops = ['=', '<', '>', '<=', '>=', '<>', '!=', 'LIKE', 'NOT LIKE'];
        if ($value === null && !in_array($operatorOrValue, $ops, true)) {
            return ['=', $operatorOrValue];
        }
        return [$operatorOrValue, $value];
    }

    // ----------------------------------------------------------------
    // Compiler integration: compile the WHERE clause
    // ----------------------------------------------------------------

    public function compileWheres(): string
    {
        if (empty($this->wheres)) {
            return '';
        }

        $parts = [];

        foreach ($this->wheres as $i => $where) {
            $sql = match ($where['type']) {
                'basic'       => $this->compileWhereBasic($where),
                'in'          => $this->compileWhereIn($where, false),
                'not_in'      => $this->compileWhereIn($where, true),
                'null'        => $this->compileWhereNull($where, false),
                'not_null'    => $this->compileWhereNull($where, true),
                'between'     => $this->compileWhereBetween($where, false),
                'not_between' => $this->compileWhereBetween($where, true),
                'nested'      => $this->compileWhereNested($where),
                default       => throw new \LogicException("Unknown where type: {$where['type']}"),
            };

            $prefix = $i === 0 ? '' : $where['boolean'] . ' ';
            $parts[] = $prefix . $sql;
        }

        return 'WHERE ' . implode(' ', $parts);
    }

    private function compileWhereBasic(array $where): string
    {
        return "`{$where['column']}` {$where['operator']} {$where['value']}";
    }

    private function compileWhereIn(array $where, bool $not): string
    {
        $kw = $not ? 'NOT IN' : 'IN';
        return sprintf('`%s` %s (%s)', $where['column'], $kw, implode(', ', $where['values']));
    }

    private function compileWhereNull(array $where, bool $not): string
    {
        $kw = $not ? 'IS NOT NULL' : 'IS NULL';
        return "`{$where['column']}` {$kw}";
    }

    private function compileWhereBetween(array $where, bool $not): string
    {
        $kw = $not ? 'NOT BETWEEN' : 'BETWEEN';
        return sprintf('`%s` %s %s AND %s', $where['column'], $kw, $where['from'], $where['to']);
    }

    private function compileWhereNested(array $where): string
    {
        // Recursively compile the child query's where clause (without leading "WHERE")
        $inner = $where['query']->compileWheres();
        $inner = preg_replace('/^WHERE /', '', $inner);
        return '(' . $inner . ')';
    }

    // Stubs for compilation (full versions in fw-07-002)
    public function from(string $t): static { $this->table = $t; return $this; }
    public function select(string|array $c = ['*']): static { $this->columns = (array)$c; return $this; }
    public function getTable(): string  { return $this->table; }
    public function getColumns(): array { return $this->columns; }
    public function getWheres(): array  { return $this->wheres; }
    public function getJoins(): array   { return $this->joins; }
    public function getOrderBys(): array { return $this->orderBys; }
    public function getGroups(): array  { return $this->groups; }
    public function getLimitValue(): ?int  { return $this->limitValue; }
    public function getOffsetValue(): ?int { return $this->offsetValue; }
    public function getBindings(): array { return $this->bindings; }
}

Interview Q&A

Q: How do you ensure that WHERE a = 1 OR b = 2 doesn't accidentally override WHERE c = 3 AND (a = 1 OR b = 2) when OR conditions are mixed with AND?

The caller is responsible for grouping via nested closures: $q->where('c', 3)->where(fn($n) => $n->where('a', 1)->orWhere('b', 2)). The nested closure approach mirrors Laravel's where(function($query) {...}) API. The compiler wraps the nested sub-builder's output in parentheses, producing WHERE c = :w_0 AND (a = :w_1 OR b = :w_2). Without the nesting, ->where('a', 1)->orWhere('b', 2)->where('c', 3) compiles to WHERE a = :w_0 OR b = :w_1 AND c = :w_2, where SQL's standard operator precedence (AND before OR) may produce unexpected results.


Q: How does Illuminate handle the whereRaw() case where the developer provides raw SQL?

Illuminate\Database\Query\Builder::whereRaw() stores the condition as type raw with the raw SQL string and a separate bindings array of positional or named values. The compiler emits the raw string verbatim. This creates a security escape hatch: the developer takes responsibility for using ? or named placeholders rather than interpolating values. Laravel wraps whereRaw calls with a deprecation hint in static analysis tools like PHPStan/Larastan to flag potentially unsafe usage.


Q: Why must the nested WHERE query's binding counter start from the parent's current count?

If the nested builder starts its counter at zero, it will generate :where_0 which collides with the parent's :where_0. PDO named bindings must be unique within a statement. By initialising the child's counter to $this->bindingCounter before the closure runs, every binding generated in the nested context gets a unique name. After the closure completes, the parent merges the child's bindings into its own map — at this point uniqueness is guaranteed because no two keys can be the same.