preg_match, preg_match_all, preg_replace, preg_split
Concept
PHP provides four core regex execution functions, each serving a distinct purpose. Choosing the right one avoids common bugs, such as treating "no match" as an error or calling preg_match repeatedly when preg_match_all would return every match in one call.
preg_match($pattern, $subject, &$matches) tests whether the pattern matches and optionally captures the first match. Returns 1 on match, 0 on no match, false on regex error. The $matches array is indexed: $matches[0] is the full match, $matches[1] is the first capture group.
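The three return values can be told apart with strict comparisons. The following is a minimal sketch; describeMatch() is a hypothetical helper invented for illustration, not part of PHP.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: maps preg_match's three return values to labels.
function describeMatch(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject, $matches);
    if ($result === false) {
        return 'error';      // the regex could not be compiled or executed
    }
    if ($result === 0) {
        return 'no match';   // valid pattern, nothing found
    }
    // $matches[0] is the full match; $matches[1] is the first capture group.
    $group = $matches[1] ?? '';
    return "matched '{$matches[0]}', group 1 = '{$group}'";
}

echo describeMatch('/(\d+) apples/', 'I have 12 apples'), "\n"; // matched '12 apples', group 1 = '12'
echo describeMatch('/(\d+) apples/', 'I have no pears'), "\n";  // no match
```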
preg_match_all($pattern, $subject, &$matches) finds all non-overlapping matches and returns how many it found (0 if none, false on error). By default (PREG_PATTERN_ORDER), $matches[0] is an array of all full matches and $matches[1] is an array of the first capture group's value from each match. Pass the PREG_SET_ORDER flag to index by match number first and group second, which is often more ergonomic when iterating.
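To make the two layouts concrete, here is a small sketch that runs the same pattern in both orders. The input string and variable names are made up for illustration. The default column-wise layout pairs naturally with array_combine(), while PREG_SET_ORDER gives one row per match.

```php
<?php
declare(strict_types=1);

$input = 'retries=3 timeout=30';

// Default PREG_PATTERN_ORDER: one array per group ("columns").
$count = preg_match_all('/(\w+)=(\d+)/', $input, $byGroup);
// $count   === 2
// $byGroup === [['retries=3', 'timeout=30'], ['retries', 'timeout'], ['3', '30']]

// Column layout makes it easy to zip keys with values.
$config = array_combine($byGroup[1], $byGroup[2]);
// $config === ['retries' => '3', 'timeout' => '30']

// PREG_SET_ORDER: one array per match ("rows").
preg_match_all('/(\w+)=(\d+)/', $input, $bySet, PREG_SET_ORDER);
// $bySet === [['retries=3', 'retries', '3'], ['timeout=30', 'timeout', '30']]
```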
preg_replace($pattern, $replacement, $subject, $limit) performs search-and-replace. Returns the modified string, the original string unchanged if nothing matched, or null on error. The $limit parameter caps the number of replacements; its default of -1 means unlimited.
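A short sketch of preg_replace in use, assuming a made-up log string. It shows group references in the replacement ($1, $2, $3), the effect of $limit, and the unchanged-on-no-match behavior.

```php
<?php
declare(strict_types=1);

$log = '2024-01-15 ERROR, 2024-01-16 ERROR';

// $1, $2, $3 in the replacement refer to the pattern's capture groups.
$european = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3/$2/$1', $log);
// "15/01/2024 ERROR, 16/01/2024 ERROR"

// $limit = 1: only the first occurrence is replaced.
$first = preg_replace('/ERROR/', 'WARN', $log, 1);
// "2024-01-15 WARN, 2024-01-16 ERROR"

// No match: the subject comes back unchanged.
$same = preg_replace('/FATAL/', 'WARN', $log);

// null (not an empty string) signals an error, so compare strictly.
if ($european === null) {
    echo 'preg_replace failed', "\n";
}
```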
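preg_split($pattern, $subject, $limit, $flags) splits a string by a regex delimiter and returns an array of the pieces (false on error). PREG_SPLIT_DELIM_CAPTURE includes captured groups from the delimiter in the result. PREG_SPLIT_NO_EMPTY filters out empty strings. Flags can be combined with the | operator.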
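The following sketch illustrates $limit and the two flags on made-up inputs. Note that without PREG_SPLIT_NO_EMPTY, delimiters at the start or end of the subject produce empty strings in the result.

```php
<?php
declare(strict_types=1);

// Leading and trailing delimiters yield empty pieces by default.
$raw = preg_split('/[\s,]+/', ' a, b,,c ');
// ['', 'a', 'b', 'c', '']

// PREG_SPLIT_NO_EMPTY drops them.
$clean = preg_split('/[\s,]+/', ' a, b,,c ', -1, PREG_SPLIT_NO_EMPTY);
// ['a', 'b', 'c']

// $limit = 2: at most two pieces; the remainder stays in the last one.
$head = preg_split('/,/', 'a,b,c,d', 2);
// ['a', 'b,c,d']

// Flags combined: keep the captured operators, discard empty pieces.
$tokens = preg_split(
    '/\s*([+-])\s*/',
    '1 + 2 - 3',
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
// ['1', '+', '2', '-', '3']
```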
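Error handling: Compare the return value of preg_match with === false rather than testing for falsiness, because 0 (no match) is also falsy. A malformed pattern returns false and triggers an E_WARNING that carries the compilation details. Runtime failures, such as exceeding the backtrack limit or feeding malformed UTF-8 to a pattern with the /u modifier, are reported through preg_last_error(). Since PHP 8.0, preg_last_error_msg() returns a human-readable description of that last error.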
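One way to make the strict check hard to forget is to wrap it once. This is a sketch only: matchOrFail() is a hypothetical helper, and it assumes PHP 8.0+ for preg_last_error_msg(). The failure is provoked with an invalid UTF-8 byte under the /u modifier, which fails at runtime.

```php
<?php
declare(strict_types=1);

// Hypothetical wrapper: turns preg_match's false return into an exception.
// Returns the $matches array on a match, or null when nothing matched.
function matchOrFail(string $pattern, string $subject): ?array
{
    $result = preg_match($pattern, $subject, $matches);
    if ($result === false) {
        throw new RuntimeException('Regex failed: ' . preg_last_error_msg());
    }
    return $result === 1 ? $matches : null;
}

try {
    // "\xFF" is not valid UTF-8, so matching with /u fails at runtime.
    matchOrFail('/./u', "\xFF");
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

Callers then only need to distinguish an array (match) from null (no match); errors can no longer be mistaken for "no match".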
Code Example
<?php
declare(strict_types=1);
$log = '2024-01-15 ERROR: Connection failed
2024-01-15 INFO: Retry attempt 1
2024-01-16 ERROR: Timeout after 30s';
// preg_match — first match only
if (preg_match('/(\d{4}-\d{2}-\d{2}) (ERROR|INFO): (.+)/', $log, $m)) {
    echo $m[1]; // "2024-01-15"
    echo $m[2]; // "ERROR"
    echo $m[3]; // "Connection failed"
}
// preg_match_all: all matches, default PREG_PATTERN_ORDER
// The m modifier makes ^ and $ match at each line boundary
preg_match_all('/^(\d{4}-\d{2}-\d{2}) (ERROR): (.+)$/m', $log, $matches);
// $matches[0] = all full matches
// $matches[1] = all dates
// $matches[2] = all levels
// $matches[3] = all messages
print_r($matches[3]); // ["Connection failed", "Timeout after 30s"]
// PREG_SET_ORDER: one array per match, easier to iterate
preg_match_all('/^(\d{4}-\d{2}-\d{2}) (\w+): (.+)$/m', $log, $sets, PREG_SET_ORDER);
foreach ($sets as $set) {
    // $set[0]=full, $set[1]=date, $set[2]=level, $set[3]=message
    echo "{$set[2]}: {$set[3]}\n";
}
// preg_split — split on multiple delimiters
$csv = 'one,two;three|four';
$parts = preg_split('/[,;|]/', $csv);
// ["one", "two", "three", "four"]
// Split and keep delimiters with PREG_SPLIT_DELIM_CAPTURE
$parts = preg_split('/(,)/', 'a,b,c', flags: PREG_SPLIT_DELIM_CAPTURE);
// ["a", ",", "b", ",", "c"]
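// Error detection: a malformed pattern emits an E_WARNING and returns false
$result = preg_match('/[invalid/', 'test');
if ($result === false) {
    // The compilation details ("missing terminating ] for character class") are in
    // the warning; preg_last_error_msg() (PHP 8.0+) typically reports only "Internal error"
    echo preg_last_error_msg();
}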