0

Builder state — storing wheres, selects, joins as arrays

Intermediate5 min read·fw-07-001
sql

Concept

A query builder's internal state is the foundation of everything else. Before we can compile SQL, we need a principled data structure that records every clause the caller has requested. The SelectQuery class holds this state as typed arrays and scalar values: $table (string), $columns (array), $wheres (array of condition records), $joins (array of join records), $orderBys (array), $groups (array), $limit (?int), $offset (?int), and $bindings (flat array of values for PDO). Each fluent method mutates a copy of the builder (or the builder itself, depending on the design choice) and returns $this, enabling chaining.

The key design decision is whether conditions are stored as raw SQL fragments or as structured records. Laravel's Illuminate\Database\Query\Builder stores wheres as arrays of type, column, operator, value, and boolean (AND/OR). This structured approach means the compiler can make intelligent decisions — for example, grouping OR conditions inside parentheses without the caller needing to manage that. A raw-string approach is simpler to implement but makes it impossible to later optimise or rewrite conditions.

Named bindings (:email syntax) are cleaner than positional ? for complex queries because they can appear in any order and be reused. However, PDO's named binding system requires that every named placeholder be unique within a statement. Our builder resolves this by auto-generating unique names: :where_0, :where_1, etc., keyed to a counter that increments with each condition. This keeps the binding map unambiguous without requiring callers to manage names.

The $wheres array stores each condition as an associative array: ['type' => 'basic', 'column' => 'email', 'operator' => '=', 'value' => ':where_0', 'boolean' => 'AND']. The actual PDO binding value goes into $bindings[':where_0']. When the compiler assembles the WHERE clause, it reads from $wheres to build the SQL string and passes $bindings to PDO's execute(). This separation keeps SQL construction and data transport in distinct, testable layers.

State propertyPHP typePurpose
$tablestringTarget table name
$columnsstring[]SELECT columns
$wheresarray[]WHERE condition records
$joinsarray[]JOIN clause records
$orderBysarray[]ORDER BY direction records
$limit?intLIMIT value
$offset?intOFFSET value
$bindingsarray<string,mixed>Named PDO bindings

Code Example

php
<?php
declare(strict_types=1);

namespace Lumen\Database;

/**
 * Fluent Query Builder — state container.
 *
 * Holds the query state accumulated by fluent calls.
 * The compile*() methods (next lesson) read this state
 * and produce SQL strings + a flat bindings array.
 *
 * Laravel equivalent: Illuminate\Database\Query\Builder
 */
class SelectQuery
{
    protected string $table = '';

    /** @var string[] */
    protected array $columns = ['*'];

    /**
     * Each element: [
     *   'type'    => 'basic'|'nested'|'raw',
     *   'column'  => string,
     *   'operator'=> string,
     *   'value'   => string,   // named placeholder e.g. :where_0
     *   'boolean' => 'AND'|'OR',
     * ]
     *
     * @var array<int, array<string, mixed>>
     */
    protected array $wheres = [];

    /**
     * Each element: [
     *   'type'  => 'inner'|'left'|'right'|'cross',
     *   'table' => string,
     *   'first' => string,
     *   'operator' => string,
     *   'second'=> string,
     * ]
     *
     * @var array<int, array<string, mixed>>
     */
    protected array $joins = [];

    /**
     * Each element: ['column' => string, 'direction' => 'ASC'|'DESC']
     *
     * @var array<int, array<string, string>>
     */
    protected array $orderBys = [];

    /** @var string[] */
    protected array $groups = [];

    protected ?int $limitValue  = null;
    protected ?int $offsetValue = null;

    /**
     * Named PDO bindings accumulated by where*() calls.
     * Key: ':where_N', value: mixed (scalar)
     *
     * @var array<string, mixed>
     */
    protected array $bindings = [];

    /** Counter used to generate unique named placeholders. */
    private int $bindingCounter = 0;

    // ----------------------------------------------------------------
    // Connection reference
    // ----------------------------------------------------------------

    public function __construct(
        protected readonly Connection $connection
    ) {}

    // ----------------------------------------------------------------
    // Clause builders — return $this for chaining
    // ----------------------------------------------------------------

    public function from(string $table): static
    {
        $this->table = $table;
        return $this;
    }

    /**
     * @param string|string[] $columns
     */
    public function select(string|array $columns = ['*']): static
    {
        $this->columns = is_array($columns) ? $columns : func_get_args();
        return $this;
    }

    /**
     * Add a basic WHERE condition.
     *
     * Supports two call signatures:
     *   where('column', 'value')          → implicitly uses '='
     *   where('column', '>=', 'value')
     */
    public function where(string $column, mixed $operatorOrValue, mixed $value = null, string $boolean = 'AND'): static
    {
        [$operator, $value] = $this->prepareOperatorAndValue($operatorOrValue, $value);

        $placeholder = $this->addBinding($value);

        $this->wheres[] = [
            'type'     => 'basic',
            'column'   => $column,
            'operator' => $operator,
            'value'    => $placeholder,
            'boolean'  => $boolean,
        ];

        return $this;
    }

    public function orWhere(string $column, mixed $operatorOrValue, mixed $value = null): static
    {
        return $this->where($column, $operatorOrValue, $value, 'OR');
    }

    /**
     * @param mixed[] $values
     */
    public function whereIn(string $column, array $values, string $boolean = 'AND', bool $not = false): static
    {
        $placeholders = [];
        foreach ($values as $value) {
            $placeholders[] = $this->addBinding($value);
        }

        $this->wheres[] = [
            'type'    => $not ? 'not_in' : 'in',
            'column'  => $column,
            'values'  => $placeholders, // store placeholder names
            'boolean' => $boolean,
        ];

        return $this;
    }

    public function whereNull(string $column, string $boolean = 'AND', bool $not = false): static
    {
        $this->wheres[] = [
            'type'    => $not ? 'not_null' : 'null',
            'column'  => $column,
            'boolean' => $boolean,
        ];

        return $this;
    }

    public function whereNotNull(string $column): static
    {
        return $this->whereNull($column, 'AND', true);
    }

    public function orderBy(string $column, string $direction = 'ASC'): static
    {
        $this->orderBys[] = [
            'column'    => $column,
            'direction' => strtoupper($direction) === 'DESC' ? 'DESC' : 'ASC',
        ];
        return $this;
    }

    public function orderByDesc(string $column): static
    {
        return $this->orderBy($column, 'DESC');
    }

    public function groupBy(string ...$columns): static
    {
        array_push($this->groups, ...$columns);
        return $this;
    }

    public function limit(int $value): static
    {
        $this->limitValue = max(0, $value);
        return $this;
    }

    public function offset(int $value): static
    {
        $this->offsetValue = max(0, $value);
        return $this;
    }

    // ----------------------------------------------------------------
    // Join builders
    // ----------------------------------------------------------------

    public function join(string $table, string $first, string $operator, string $second, string $type = 'inner'): static
    {
        $this->joins[] = compact('type', 'table', 'first', 'operator', 'second');
        return $this;
    }

    public function leftJoin(string $table, string $first, string $operator, string $second): static
    {
        return $this->join($table, $first, $operator, $second, 'left');
    }

    // ----------------------------------------------------------------
    // Internal helpers
    // ----------------------------------------------------------------

    /**
     * Normalise the two-arg and three-arg where() signatures.
     *
     * @return array{0: string, 1: mixed}
     */
    private function prepareOperatorAndValue(mixed $operatorOrValue, mixed $value): array
    {
        $validOperators = ['=', '<', '>', '<=', '>=', '<>', '!=', 'LIKE', 'NOT LIKE'];

        if ($value === null && !in_array($operatorOrValue, $validOperators, true)) {
            // Two-arg form: where('col', 'val')
            return ['=', $operatorOrValue];
        }

        return [$operatorOrValue, $value];
    }

    /**
     * Register a value in the bindings map and return its named placeholder.
     */
    protected function addBinding(mixed $value): string
    {
        $key = ':where_' . $this->bindingCounter++;
        $this->bindings[$key] = $value;
        return $key;
    }

    // ----------------------------------------------------------------
    // State accessors (used by the compiler and tests)
    // ----------------------------------------------------------------

    public function getTable(): string        { return $this->table; }
    public function getColumns(): array       { return $this->columns; }
    public function getWheres(): array        { return $this->wheres; }
    public function getJoins(): array         { return $this->joins; }
    public function getOrderBys(): array      { return $this->orderBys; }
    public function getGroups(): array        { return $this->groups; }
    public function getLimitValue(): ?int     { return $this->limitValue; }
    public function getOffsetValue(): ?int    { return $this->offsetValue; }
    public function getBindings(): array      { return $this->bindings; }
}

Interview Q&A

Q: Why store WHERE conditions as structured arrays instead of raw SQL fragments?

Structured storage gives the compiler full control over how conditions are rendered. With raw SQL fragments, you cannot distinguish an AND condition from an OR condition without parsing the SQL back — meaning you cannot group OR conditions inside parentheses, cannot re-order conditions for optimisation, and cannot support nested condition groups (like WHERE (a = 1 OR b = 2) AND c = 3). Structured storage also makes the builder fully testable without executing a database query: you can assert on getWheres() directly. Laravel's Illuminate\Database\Query\Builder uses this exact structured-array approach for every clause type.


Q: How does Illuminate's Query Builder handle the same where() state?

Illuminate\Database\Query\Builder stores conditions in $this->wheres as arrays with a type key that dispatches to different compiler methods. The compileWheres() method in Illuminate\Database\Query\Grammars\Grammar iterates the wheres array and calls compileWhereBasic(), compileWhereIn(), compileWhereNull(), etc. The Grammar class is a separate object — the builder holds state, the grammar handles SQL dialect differences. Our single-class approach merges these responsibilities for simplicity.


Q: What is the risk of using positional ? bindings instead of named bindings in a query builder?

Positional bindings require that the bindings array be in exactly the same order as the ? placeholders in the compiled SQL. As the builder adds more clause types (WHERE, HAVING, JOIN ON conditions), maintaining positional order becomes fragile. Named bindings allow the compiler to assemble the SQL string in any order (e.g., compiling JOINs before WHEREs) without worrying about placeholder position. The trade-off is that named placeholders must be unique across the entire statement — which is why the builder uses an incrementing counter.