String concatenation performance — the . operator vs sprintf vs interpolation
Intermediate · 5 min read · php-15-005
performance
Concept
PHP string operations differ noticeably in cost. Knowing when a built-in string function can replace a regex, and which functions do extra work per call, helps avoid common performance bottlenecks.
Key performance principles:
- Built-in functions beat regex: str_contains(), str_starts_with(), str_ends_with() (all PHP 8.0+) and strpos() are plain byte searches implemented in C and are typically much faster than preg_match() for simple checks. Use them when you don't need pattern matching.
- Be careful with concatenation in loops: $str .= "piece" in a 10,000-iteration loop may have to grow and reallocate the buffer many times. Modern PHP appends in place when it can, so the cost is often small. When the pieces are already in an array, a single implode() call is simpler and avoids the loop entirely. Measure before restructuring working code.
- sprintf vs concatenation: For simple cases, concatenation and interpolation are marginally faster. For complex formatting, sprintf() is cleaner and comparable in speed.
- Regular expression compilation: preg_match() compiles a pattern the first time it sees it and keeps the compiled form in a per-process PCRE cache (4096 entries). Generating many unique patterns at runtime can thrash that cache, so reuse a small set of patterns.
- mb_ functions are slower: mb_strlen(), mb_substr(), etc. have to decode multibyte characters instead of counting bytes. Use them only when you actually need multibyte support.
- str_split() vs mb_str_split(): str_split() splits by bytes; mb_str_split() (PHP 7.4+) splits by characters. For data known to be ASCII, str_split() is sufficient and faster.
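These principles are easy to check on your own PHP version rather than take on faith. The following is a minimal timing sketch, not a rigorous benchmark: it assumes hrtime() is available (PHP 7.3+), and the helper name timeIt() is hypothetical. Results vary with PHP version, OPcache and JIT settings, and input size, so run it several times before drawing conclusions.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: runs $fn once and returns [result, elapsed milliseconds].
function timeIt(callable $fn): array
{
    $start = hrtime(true);                 // monotonic clock, in nanoseconds
    $result = $fn();
    return [$result, (hrtime(true) - $start) / 1e6];
}

$items = array_map('strval', range(1, 10000));

// Variant 1: append to a string inside the loop.
[$a, $concatMs] = timeIt(function () use ($items): string {
    $out = '';
    foreach ($items as $item) {
        $out .= $item . ',';
    }
    return $out;
});

// Variant 2: collect the pieces in an array, then join once.
[$b, $implodeMs] = timeIt(function () use ($items): string {
    $parts = [];
    foreach ($items as $item) {
        $parts[] = $item . ',';
    }
    return implode('', $parts);
});

// Both variants must build the same string for the comparison to be fair.
printf("concat: %.3f ms, implode: %.3f ms, same output: %s\n",
    $concatMs, $implodeMs, $a === $b ? 'yes' : 'no');
```

A single run over 10,000 items finishes quickly enough that noise can dominate; wrapping each variant in an outer repeat loop gives steadier numbers.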
Code Example
php
<?php
declare(strict_types=1);
$haystack = "Hello, World! This is a test string.";
$needle = "World";
// FAST: built-in byte-level functions
$found = str_contains($haystack, $needle); // PHP 8.0+
$found = strpos($haystack, $needle) !== false; // equivalent, slightly more work to write
// SLOW: regex for simple containment
$found = (bool) preg_match('/World/', $haystack); // overkill — don't do this
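// SLOWER in many cases: string building in a loop
function buildSlow(array $items): string
{
    $result = '';
    foreach ($items as $item) {
        $result .= "$item, "; // the buffer may be reallocated as it grows
    }
    // Drop only the final ", "; rtrim() would also eat commas or spaces that end the last item
    return $result === '' ? '' : substr($result, 0, -2);
}
// FAST: join the existing array directly
function buildFast(array $items): string
{
    return implode(', ', $items); // single pass, no trailing separator to trim
}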
// Regex compilation cache
function classifyEmailDomains(array $emails): array
{
    $pattern = '/^[^@]+@(gmail|yahoo|outlook)\.com$/i'; // same pattern on every call
    // preg_match() returns 1 on a match; array_filter() preserves the original keys
    return array_filter($emails, fn(string $e): bool => preg_match($pattern, $e) === 1);
}
// The pattern is compiled on first use and cached for the rest of the process; later calls reuse the compiled form
// str_split vs explode for tokenization
$csv = "Alice,Bob,Charlie";
$parts = explode(',', $csv); // FAST: delimiter-based
$chars = str_split("Hello", 2); // ["He", "ll", "o"] — byte chunks
// sprintf vs concatenation
$name = "Alice";
$score = 95.5;
$msg1 = "User: $name, Score: " . number_format($score, 1); // interpolation + concat
$msg2 = sprintf("User: %s, Score: %.1f", $name, $score); // sprintf
// Both are fine — sprintf is cleaner for multiple substitutions
// mb_ only when needed
$utf8 = "Héllo"; // 5 characters, 6 bytes in UTF-8
echo strlen($utf8), PHP_EOL; // 6 (bytes, wrong as a character count)
echo mb_strlen($utf8), PHP_EOL; // 5 (characters, assuming the internal encoding is UTF-8)
echo bin2hex(str_split($utf8, 1)[1]), PHP_EOL; // c3 (only the first byte of "é", not a valid character on its own)
echo mb_substr($utf8, 1, 1), PHP_EOL; // "é" (correct)
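Returning to the first principle (built-in functions over regex), the domain check done by classifyEmailDomains() can also be written without a pattern. The sketch below is one possible approach, not the original author's: the function name hasKnownProvider() and its default domain list are hypothetical, chosen to mirror the regex above. It only checks the shape "local@domain" and is not a full email validator.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: true when the address has a non-empty local part,
// exactly one "@", and a domain from the allowed list (case-insensitive).
function hasKnownProvider(
    string $email,
    array $domains = ['gmail.com', 'yahoo.com', 'outlook.com']
): bool {
    $at = strpos($email, '@');
    if ($at === false || $at === 0) {
        return false; // no "@", or nothing before it
    }
    // Everything after the first "@"; a second "@" ends up in $domain
    // and therefore fails the list lookup, as it would fail the regex.
    $domain = strtolower(substr($email, $at + 1));
    return in_array($domain, $domains, true);
}

$emails = ['a@gmail.com', 'b@example.org', 'C@Outlook.COM', '@yahoo.com'];

// array_values() reindexes the result, since array_filter() preserves keys.
$known = array_values(array_filter(
    $emails,
    fn(string $e): bool => hasKnownProvider($e)
));

print_r($known);
```

Whether this is measurably faster than the cached regex depends on the workload; the main gain is that the allowed domains become plain data that can be changed without editing a pattern.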