String immutability and copy-on-write internals
Concept
PHP strings behave like immutable value types: an operation that appears to modify a string never affects any other variable holding the same value, and string functions return a new string rather than changing their argument. Internally this is managed through a combination of zval reference counting and copy-on-write (COW), the same mechanism that governs arrays.
When you write $b = $a where $a is a string, PHP does not allocate a new zend_string. Instead, both $a and $b's zvals point to the same zend_string object, and its refcount increments to 2. Only when you write to $b (e.g., $b .= '!') does PHP "separate" — it decrements the original zend_string's refcount, allocates a fresh one, writes the modification there, and assigns it to $b. The original $a is completely unaffected.
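The separation described above applies to value copies. A PHP reference (&) is a different thing: it binds two names to one shared variable, so a write through either name is visible through both and nothing is separated. A minimal sketch of the difference, with variable names chosen only for illustration:

```php
<?php
declare(strict_types=1);

$a = 'abc';
$copy = $a;      // value copy: shares the zend_string until one side writes
$copy .= '!';    // the write separates $copy from $a

$alias = &$a;    // reference: $alias and $a now name the same variable
$alias .= '?';   // the write through the reference is visible via $a as well

echo $a, PHP_EOL;     // abc?
echo $copy, PHP_EOL;  // abc!
echo $alias, PHP_EOL; // abc?
```

The value copy keeps its own contents after the write, while the reference and the original always agree.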
Interned strings are a special case: PHP interns (permanently de-duplicates) string literals and identifiers that appear in source code. An interned string has the IS_STR_INTERNED flag set, which disables reference counting entirely: it lives for the full request (or the lifetime of the opcache entry) and is never freed during that time. This is why PHP is memory-efficient for repeated string constants.
The performance implications:
- String functions like strtolower($s) return a new zend_string: $s is unchanged, a new allocation happens
- $s .= $chunk in a loop: if $s has refcount 1 (only one variable points to it), PHP 7+ can sometimes resize in-place using realloc, a significant optimization for string building
- Passing a string to a function by value: only incurs cost if the function actually writes to it (COW), so large strings are cheap to pass by value
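The pass-by-value point can be sketched as follows. The two helper functions, countXs and capitalizeFirst, are hypothetical names introduced only for this illustration. The memory deltas are printed without expected values, because exact figures depend on the PHP build and allocator; on a typical build the read-only call should report roughly zero and the writing call roughly the size of the string.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: only reads $text, so caller and callee
// keep sharing a single zend_string for the whole call.
function countXs(string $text): int
{
    return substr_count($text, 'x');
}

// Hypothetical helper: writes to $text, which forces a separation
// inside the callee; the caller's string is left untouched.
function capitalizeFirst(string $text): string
{
    $text[0] = 'X';
    return $text;
}

$big = str_repeat('x', 1_000_000);

$m0 = memory_get_usage();
$count = countXs($big);
$m1 = memory_get_usage();
$changed = capitalizeFirst($big);
$m2 = memory_get_usage();

echo $count, PHP_EOL;      // 1000000
echo $big[0], PHP_EOL;     // x
echo $changed[0], PHP_EOL; // X
echo 'read-only call: ', $m1 - $m0, ' bytes', PHP_EOL;
echo 'writing call:   ', $m2 - $m1, ' bytes', PHP_EOL;
```

Passing by reference would not make the read-only case any cheaper, so plain by-value parameters are a reasonable default for large strings.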
Code Example
<?php
declare(strict_types=1);
// Immutability — string functions return new strings
$original = 'Hello World';
$upper = strtoupper($original);
echo $original, PHP_EOL; // "Hello World" (unchanged)
echo $upper, PHP_EOL; // "HELLO WORLD" (new string)
// COW demonstration with memory tracking
$big = str_repeat('x', 1_000_000); // 1MB string
$before = memory_get_usage();
$copy = $big; // No allocation — just refcount bump
$after_assign = memory_get_usage();
echo ($after_assign - $before), PHP_EOL; // ~0 bytes
$copy[0] = 'Y'; // Write triggers separation
$after_write = memory_get_usage();
echo ($after_write - $before), PHP_EOL; // ~1MB: actual copy made
// Efficient string building — refcount-1 in-place optimization
$s = '';
for ($i = 0; $i < 100_000; $i++) {
    $s .= 'x'; // PHP can realloc in-place when refcount == 1
}
// $s is the only variable pointing to the string, so PHP avoids extra allocs
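// Contrast with this (a second variable shares the string, which defeats the optimization):
$other = $s; // value copy: the zend_string's refcount is now 2
$s .= 'y'; // must separate: can't resize in-place when refcount > 1
unset($other);
// Note: a reference ($ref = &$s) would not have this effect. It wraps the
// variable in a shared reference and leaves the string's own refcount at 1.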
// String interning — identical literals share storage
$a = 'hello';
$b = 'hello';
// Both point to the same interned zend_string; this can be inspected with
// xdebug_debug_zval() when the Xdebug extension is loaded
// substr() returns a new string (copy on return)
$sub = substr($s, 0, 10);
// $sub is a new zend_string; $s is unmodified
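The interning claim can also be checked without Xdebug, using the built-in debug_zval_dump(). The sketch below assumes PHP 8.1 or later, where, as far as I know, that function prints "interned" in place of a refcount for interned strings; older versions print a placeholder refcount instead. The helper dumpZval is a hypothetical name introduced for this example, and the runtime string is built with random_int() so that it cannot be folded into a compile-time constant.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: capture what debug_zval_dump() prints for a value.
function dumpZval($value): string
{
    ob_start();
    debug_zval_dump($value);
    return trim((string) ob_get_clean());
}

$literal = 'hello';                    // literal from source code: interned
$runtime = 'hello' . random_int(0, 9); // built at runtime: refcounted

echo dumpZval($literal), PHP_EOL; // on PHP 8.1+ the line should end in "interned"
echo dumpZval($runtime), PHP_EOL; // reports a refcount(...) instead
```

The refcount printed for the runtime string includes the temporary references created by passing it through the helper, so it is higher than the number of variables that hold it.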