Locale-aware string comparison and sorting
Concept
String comparison in PHP is byte-by-byte by default, which breaks for internationalized text. For example, byte comparison places "Müller" after "Muster" because ü is encoded in UTF-8 as bytes above the ASCII range, whereas German alphabetical order treats ü like u and places "Müller" first. The Collator class from the intl extension provides locale-aware comparison and sorting.
strcmp / strcasecmp: byte-by-byte comparison, case-sensitive and case-insensitive respectively. Both return a negative, zero, or positive integer. PHP 8.2 and later return exactly -1, 0, or 1, but older versions may return other values, so test the sign rather than the exact number. Used with usort as usort($arr, 'strcmp').
strnatcmp / strnatcasecmp: "natural order" comparison, which sorts file10.txt after file9.txt rather than before. Essential for user-facing file listings.
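The note on strcmp return values can be illustrated with a minimal sketch. The helper name compareSign is hypothetical and not part of PHP. It only shows one way to collapse any comparison result to -1, 0, or 1 so that calling code does not depend on the exact integer returned.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not a PHP built-in): reduce any strcmp result to -1, 0, or 1.
function compareSign(string $a, string $b): int
{
    // The spaceship operator against 0 keeps only the sign of the result.
    return strcmp($a, $b) <=> 0;
}

// Test the sign of the result, never a specific value such as -1.
$cmp = strcmp('apple', 'banana');
var_dump($cmp < 0); // bool(true)

var_dump(compareSign('apple', 'banana')); // int(-1)
var_dump(compareSign('same', 'same'));    // int(0)
var_dump(compareSign('pear', 'fig'));     // int(1)
```

Comparing with `< 0`, `=== 0`, or `> 0` behaves the same on every PHP version, which is why usort callbacks only need the sign.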
Collator (intl extension): locale-aware comparison that respects language-specific rules. In Swedish, ä sorts after z, while in German, ü sorts as u in most contexts. Collator::compare() returns -1, 0, or 1, or false on failure. Pass it to usort via [$collator, 'compare']. Collator::sort() sorts an array in place using locale rules.
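As a sketch of how Collator::compare() might be combined with usort, the example below sorts records by one field. The $items array and its field names are invented for illustration, and the error handling is just one possible choice. The code does nothing if the intl extension is missing.

```php
<?php
declare(strict_types=1);

// Hypothetical records; only the 'name' field decides the order.
$items = [
    ['name' => 'Zwiebel', 'price' => 2],
    ['name' => 'Äpfel',   'price' => 3],
    ['name' => 'Brot',    'price' => 4],
];

if (extension_loaded('intl')) {
    $collator = new Collator('de_DE');

    usort($items, function (array $a, array $b) use ($collator): int {
        $result = $collator->compare($a['name'], $b['name']);
        if ($result === false) {
            // compare() signals failure with false instead of an integer.
            throw new RuntimeException(intl_get_error_message());
        }
        return $result;
    });

    print_r(array_column($items, 'name'));
    // ['Äpfel', 'Brot', 'Zwiebel']
}
```

A closure is used here instead of [$collator, 'compare'] because the strings sit inside nested arrays. For a flat list of strings, the callable array form or Collator::sort() is shorter.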
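setlocale: sets the locale for C library functions such as strcoll and strftime (strftime itself is deprecated as of PHP 8.1). Fragile: the locale is process-wide global state, so a change affects every locale-sensitive function for the rest of the request and, on multithreaded servers, can leak into other requests. Locale availability is also OS-dependent, and setlocale returns false when the requested locale is not installed. Avoid in modern PHP; use intl instead.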
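The fragility described above can be seen in a short sketch. The locale names 'de_DE.UTF-8' and 'de_DE' are assumptions, since whether either exists depends on the operating system. The sorted order is therefore left unstated.

```php
<?php
declare(strict_types=1);

$names = ['Zebra', 'Äpfel', 'Brot'];

// Passing '0' only reads the current collation locale without changing it.
$previous = setlocale(LC_COLLATE, '0');

// Try several spellings; setlocale returns the first one that works, or false.
$active = setlocale(LC_COLLATE, 'de_DE.UTF-8', 'de_DE');

if ($active === false) {
    // Nothing was changed, so strcoll would keep using the previous locale.
    echo "No German locale is installed on this system\n";
} else {
    usort($names, 'strcoll');
    print_r($names); // order depends on the locale data the OS provides
}

// Restore the global state so later code is unaffected.
if ($previous !== false) {
    setlocale(LC_COLLATE, $previous);
}
```

The save and restore steps are exactly the bookkeeping that a Collator instance avoids, because its locale lives in the object rather than in process-wide state.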
Code Example
<?php
declare(strict_types=1);
// Byte comparison — fast but locale-unaware
$names = ['Müller', 'Mueller', 'Maier', 'Ångström'];
usort($names, 'strcmp');
print_r($names);
// Sorted by byte value — 'Ångström' (0xC3) sorts after all ASCII
// Natural order comparison
$files = ['file10.txt', 'file2.txt', 'file1.txt', 'file20.txt'];
usort($files, 'strnatcmp');
print_r($files);
// ['file1.txt', 'file2.txt', 'file10.txt', 'file20.txt'] — correct!
// Byte sort gives wrong order:
usort($files, 'strcmp');
print_r($files);
// ['file1.txt', 'file10.txt', 'file2.txt', 'file20.txt'] — wrong!
// Locale-aware sorting with Collator (requires intl extension)
if (class_exists('Collator')) {
    $names = ['Zürich', 'Äpfel', 'Brot', 'ångström'];
    // German locale
    $collator = new Collator('de_DE');
    $collator->sort($names);
    print_r($names);
    // ['ångström', 'Äpfel', 'Brot', 'Zürich']: å and Ä both sort as a, so "an" precedes "ap"
    // Swedish locale: different rules
    $collatorSv = new Collator('sv_SE');
    $namesSv = ['Ångström', 'Boken', 'Äpple', 'Zebra'];
    $collatorSv->sort($namesSv);
    // In Swedish, Å and Ä are separate letters that sort AFTER Z
    print_r($namesSv);
    // ['Boken', 'Zebra', 'Ångström', 'Äpple']
    // Case- and accent-insensitive comparison
    $collator->setStrength(Collator::PRIMARY); // ignores case and accents
    $result = $collator->compare('Müller', 'muller'); // 0: treated as equal
    // Collator::SECONDARY ignores case only, so 'Müller' and 'muller' would differ
}
// strnatcasecmp — natural + case-insensitive
$mixed = ['Item10', 'item2', 'ITEM1', 'Item20'];
usort($mixed, 'strnatcasecmp');
print_r($mixed); // ['ITEM1', 'item2', 'Item10', 'Item20']