0

Arrays as hash maps — O(1) lookup, collision handling

Intermediate5 min read·eng-07-001
interviewcompare

Concept

PHP arrays are hash maps — the most versatile data structure in PHP. Under the hood, PHP's array is implemented as an ordered hash table (a HashTable struct in the Zend engine). It combines the fast lookup of a hash map with the ordering of a linked list.

Hash map fundamentals:

  • A hash function converts a key (string or integer) into an integer index (the bucket position).
  • The value is stored at that bucket.
  • Lookup: hash the key → find the bucket → O(1) average.

Collision handling: Two different keys can hash to the same bucket. PHP handles collisions with chaining — each bucket has a linked list of entries. Multiple keys that hash to the same bucket form a chain; the lookup follows the chain until the matching key is found.

PHP array as hash map:

php
$map = [];
$map['user:42'] = $user;
isset($map['user:42']); // O(1) — hash lookup

Integer keys: PHP uses them directly as the bucket index (with modulo for the table size), bypassing the string hash. $arr[0], $arr[42] are particularly fast.

Memory layout: PHP arrays store (key, value, hash, next-in-chain) per entry. They maintain insertion order via an internal linked list through the entries. This is why PHP arrays are ordered AND have O(1) lookup — unlike typical hash maps.

Time complexity:

  • Insert: O(1) amortized (occasional resize doubles the table).
  • Lookup (isset): O(1) average.
  • Iteration: O(n) in insertion order.
  • Merge (array_merge): O(n).

Resize cost: When the hash table is full, PHP allocates a new table (2×), rehashes all entries — O(n) for that operation. Amortized over all insertions, cost is O(1) per insert.

Code Example

php
<?php
// PHP array as hash map — O(1) lookup
$cache = [];
$cache['user:42']    = ['id' => 42, 'name' => 'Alice'];
$cache['user:99']    = ['id' => 99, 'name' => 'Bob'];

// O(1) check and retrieval
if (isset($cache['user:42'])) {
    $user = $cache['user:42']; // O(1)
}

// Counting word frequencies — classic hash map problem
function wordFrequency(string $text): array
{
    $words = str_word_count(strtolower($text), 1);
    $freq  = [];
    foreach ($words as $word) {
        $freq[$word] = ($freq[$word] ?? 0) + 1; // O(1) per word
    }
    arsort($freq); // sort by frequency
    return $freq;
}

// Deduplication — O(n) using hash map keys
function unique(array $items): array
{
    $seen   = [];
    $result = [];
    foreach ($items as $item) {
        if (!isset($seen[$item])) {
            $seen[$item] = true;
            $result[]    = $item;
        }
    }
    return $result;
    // Note: array_unique() does the same thing but uses sorting internally
}

// Two-sum problem — O(n) with hash map
// "Find two numbers in $nums that add up to $target"
function twoSum(array $nums, int $target): ?array
{
    $seen = [];
    foreach ($nums as $i => $num) {
        $complement = $target - $num;
        if (isset($seen[$complement])) {
            return [$seen[$complement], $i]; // found pair
        }
        $seen[$num] = $i; // store index
    }
    return null; // no pair found
}

$result = twoSum([2, 7, 11, 15], 9); // [0, 1] — indices of 2 and 7

// Grouping — group users by role O(n)
$users  = [['name' => 'Alice', 'role' => 'admin'], ['name' => 'Bob', 'role' => 'user'], ['name' => 'Carol', 'role' => 'admin']];
$byRole = [];
foreach ($users as $user) {
    $byRole[$user['role']][] = $user;
}
// $byRole['admin'] → [Alice, Carol], $byRole['user'] → [Bob]