Set operations: array_unique, array_diff, array_intersect
Concept
PHP's set operation functions treat arrays as sets and perform intersection, difference, and deduplication — operations rooted in set theory. They compare by value, not by position.
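array_unique($array, $flags) removes duplicate values, keeping the first occurrence and preserving its key. $flags controls how values are compared: SORT_STRING (the default, compares string representations), SORT_NUMERIC (compares numerically), SORT_REGULAR (loose comparison with type juggling), or SORT_LOCALE_STRING (locale-aware string comparison). Because keys are preserved, the result can have gaps in its integer keys; call array_values() to re-index if needed.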
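To make the effect of $flags concrete, here is a small sketch comparing the default SORT_STRING with SORT_NUMERIC on the same input. The variable names are illustrative only.

```php
<?php
declare(strict_types=1);

// Three different strings that all represent the number 1.
$values = ['1', '01', '1.0'];

// Default SORT_STRING: the strings differ, so nothing is removed.
$asStrings = array_unique($values);
// [0 => '1', 1 => '01', 2 => '1.0']

// SORT_NUMERIC: all three compare equal to 1, so only the first survives.
$asNumbers = array_unique($values, SORT_NUMERIC);
// [0 => '1']
```

Which flag is right depends on whether the data is meant to be text (codes, identifiers) or numbers that happen to arrive as strings.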
array_diff($array, $arrays...) returns elements from the first array that are NOT present in any of the subsequent arrays. Two values are considered equal when their string representations are identical, i.e. (string) $a === (string) $b, so 1 and '1' match. Keys are preserved. array_diff_key compares by key instead. array_diff_assoc compares both key and value.
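The three diff variants can give different answers on the same pair of arrays. The sketch below uses two made-up configuration arrays ($old and $new are illustrative names) to show each one side by side.

```php
<?php
declare(strict_types=1);

$old = ['host' => 'localhost', 'port' => '3306', 'debug' => 'on'];
$new = ['host' => 'localhost', 'port' => '5432', 'cache' => 'on'];

// Values only: 'on' also occurs in $new (under 'cache'), so 'debug' is dropped.
$byValue = array_diff($old, $new);
// ['port' => '3306']

// Keys only: 'host' and 'port' exist in $new, whatever their values are.
$byKey = array_diff_key($old, $new);
// ['debug' => 'on']

// Key and value together: only the identical 'host' pair is removed.
$byBoth = array_diff_assoc($old, $new);
// ['port' => '3306', 'debug' => 'on']
```

For detecting changed settings, array_diff_assoc is usually the variant that matches intuition, because a value that merely reappears under another key does not hide a change.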
array_intersect($array, $arrays...) returns elements present in ALL arrays. Keys from the first array are preserved. array_intersect_key and array_intersect_assoc provide key-based and key+value-based variants.
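The intersect family mirrors the diff family. A minimal sketch with two illustrative arrays ($a and $b) shows how the value-based, key-based, and key+value-based variants differ:

```php
<?php
declare(strict_types=1);

$a = ['x' => 1, 'y' => 2, 'z' => 3];
$b = ['x' => 1, 'y' => 9, 'w' => 3];

// Values only: 1 and 3 both occur somewhere in $b.
$byValue = array_intersect($a, $b);
// ['x' => 1, 'z' => 3]

// Keys only: 'x' and 'y' exist in $b; values are taken from $a.
$byKey = array_intersect_key($a, $b);
// ['x' => 1, 'y' => 2]

// Key and value together: only 'x' => 1 matches on both.
$byBoth = array_intersect_assoc($a, $b);
// ['x' => 1]
```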
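Performance: Do not assume a single complexity for the whole family. Depending on the function, the flags, and the PHP version, the implementation either uses hash lookups (roughly O(n)) or sorts the inputs first (O(n log n)); neither is the O(n²) of a naive nested loop. The value-based functions compare string representations by casting values to strings (not by serializing them), which can cause surprises with mixed types: nested arrays are cast to "Array" with a warning, and under the default string comparison objects without __toString() cannot be cast at all. For large datasets, verify performance with profiling; sometimes a custom loop with a $seen = [] map is faster and more explicit.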
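The custom loop with a $seen map mentioned above could look like the following sketch. uniqueScalars is a hypothetical helper, not a built-in, and it assumes every value is a scalar. Unlike array_unique, it deliberately keeps 1 and '1' apart and returns a re-indexed list.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: deduplicate scalar values with an explicit lookup map.
 * Assumption: every element is a scalar (int, float, string, or bool).
 */
function uniqueScalars(array $values): array
{
    $seen = [];   // lookup map: "type:value" => true
    $result = [];

    foreach ($values as $value) {
        // Prefix the type so that 1 and '1' produce different map keys.
        $key = gettype($value) . ':' . $value;

        if (!isset($seen[$key])) {
            $seen[$key] = true;
            $result[] = $value; // appended, so keys stay sequential
        }
    }

    return $result;
}

$deduped = uniqueScalars([1, '1', 2, 2, 'a', 'a']);
// [1, '1', 2, 'a']
```

Each element costs one isset() lookup, and the equality rule is visible in the code rather than hidden behind a flag. Whether it is actually faster than array_unique for a given dataset is something to measure, not assume.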
Code Example
<?php
declare(strict_types=1);
// array_unique — remove duplicates
$tags = ['php', 'laravel', 'php', 'api', 'laravel', 'backend'];
$unique = array_unique($tags);
// [0 => 'php', 1 => 'laravel', 3 => 'api', 5 => 'backend'] — gaps in keys!
$clean = array_values(array_unique($tags)); // re-index
// ['php', 'laravel', 'api', 'backend']
// Numeric deduplication
$nums = [1, '1', 1.0, 2, 2];
print_r(array_unique($nums, SORT_NUMERIC));
// [0 => 1, 3 => 2] — 1, '1', 1.0 all treated as 1
// array_diff — elements in first but not in others
$all = ['apple', 'banana', 'cherry', 'date'];
$remove = ['banana', 'date'];
$result = array_diff($all, $remove);
// [0 => 'apple', 2 => 'cherry'] — keys preserved!
// array_diff_key — find keys present in first but not in second
$full = ['name' => 'Alice', 'age' => 30, 'email' => 'alice@ex.com'];
$allowed = ['name' => '', 'email' => ''];
$extra = array_diff_key($full, $allowed); // ['age' => 30] — not in whitelist
$filtered = array_intersect_key($full, $allowed); // ['name' => 'Alice', 'email' => '...']
// array_intersect — elements present in all arrays
$admins = [1, 5, 10, 15];
$editors = [5, 10, 20, 30];
$both = array_intersect($admins, $editors); // [1 => 5, 2 => 10]
// Practical: whitelist filtering with array_intersect_key
function filterAllowed(array $data, array $allowedKeys): array
{
    return array_intersect_key($data, array_flip($allowedKeys));
}
$input = ['name' => 'Bob', 'password' => 'secret', 'role' => 'admin'];
$safe = filterAllowed($input, ['name', 'role']);
// ['name' => 'Bob', 'role' => 'admin'] (password excluded)
Interview Q&A
Q: What is the difference between array_diff and array_diff_key?
array_diff compares array values: it returns elements from the first array whose values don't appear in any of the other arrays. Keys are preserved but irrelevant to the comparison. array_diff_key compares array keys: it returns elements from the first array whose keys don't appear in any of the other arrays, regardless of values. array_diff_assoc removes an element only when both its key AND its value match an entry in another array. A common idiom: array_diff_key($data, array_flip($blacklist)) to remove specific keys from an array (note the array_flip to turn the value list into a key list).
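A short sketch of the blacklist idiom, using made-up data ($user and $blacklist are illustrative names):

```php
<?php
declare(strict_types=1);

$user = ['id' => 7, 'name' => 'Alice', 'password' => 'secret', 'token' => 'abc'];
$blacklist = ['password', 'token'];

// array_flip turns ['password', 'token'] into ['password' => 0, 'token' => 1],
// so array_diff_key can match the names as keys.
$public = array_diff_key($user, array_flip($blacklist));
// ['id' => 7, 'name' => 'Alice']
```

For untrusted input, a whitelist built with array_intersect_key is generally the safer choice, since any key not explicitly allowed is dropped.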
Q: Why does array_unique preserve the original keys, and when is that a problem?
array_unique keeps the key of each first occurrence, so the result still maps back to positions in the original array and you can see which entries were dropped. The problem is that the result may no longer be a sequential 0-indexed list, which breaks code that assumes one: index access such as $arr[1] or $arr[count($arr) - 1], for loops that count from 0 to count($arr) - 1, and json_encode, which emits an object instead of an array. Call array_values(array_unique($arr)) whenever you need a clean sequential array.
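The json_encode effect is easy to reproduce. A minimal sketch with illustrative variable names:

```php
<?php
declare(strict_types=1);

$ids = [3, 3, 5];

$unique = array_unique($ids);
// [0 => 3, 2 => 5] (key 1 is missing)

// Non-sequential keys are encoded as a JSON object.
$asObject = json_encode($unique);
// {"0":3,"2":5}

// Re-indexing first restores a JSON array.
$asList = json_encode(array_values($unique));
// [3,5]
```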