Reading large files with generators — memory-efficient processing
Advanced · 5 min read · php-13-009
performance
Concept
PHP provides path manipulation functions for extracting components from file paths and for resolving relative paths to canonical absolute ones. They help keep code portable, because it no longer has to assume a specific directory layout or working directory.
Path component functions:
dirname(string $path, int $levels = 1): Returns the parent directory. dirname('/var/www/app/config.php') → '/var/www/app'. Pass $levels = 2 to go up two levels.
basename(string $path, string $suffix = ''): Returns the final component. basename('/var/www/app/config.php') → 'config.php'. With $suffix: basename('/var/www/app/config.php', '.php') → 'config'.
pathinfo(string $path, int $flags = PATHINFO_ALL): Returns an array of all components, or a single component as a string when a flag is passed. Array keys: dirname, basename, filename (without extension), extension. Pass PATHINFO_EXTENSION to get just the extension.
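A few edge cases of these functions are easy to miss. The sketch below uses made-up paths; nothing needs to exist on disk, because these functions only operate on the path string. The variable names are illustrative, and the expected values assume a Unix-style system.

```php
<?php
declare(strict_types=1);

// Only the last dot separates the extension from the filename.
$archive = pathinfo('/backups/site.tar.gz');
// $archive['extension'] is 'gz', $archive['filename'] is 'site.tar'

// No extension: the 'extension' key is absent from the array, not empty.
$hosts = pathinfo('/etc/hosts');
$hasExtension = array_key_exists('extension', $hosts); // false

// Asking for a single component returns a string; a missing extension gives ''.
$ext = pathinfo('/etc/hosts', PATHINFO_EXTENSION); // ''

// basename() ignores a trailing slash.
$last = basename('/var/www/app/'); // 'app'

// dirname() with $levels climbs several directories at once.
$root = dirname('/var/www/app/config.php', 3); // '/var'
```

Because the extension key can be missing, code that reads $info['extension'] directly should guard with array_key_exists() or the ?? operator.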
Path resolution:
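realpath(string $path): Resolves .., ., and symlinks to an absolute canonical path. Returns false if the path doesn't exist. Use it to validate that a user-supplied path doesn't escape a sandboxed directory: resolve both paths, confirm neither call returned false, then check that the resolved user path starts with the resolved base directory followed by a directory separator, for example str_starts_with(realpath($userPath), realpath($allowedBase) . DIRECTORY_SEPARATOR). A bare strpos() call is not enough, since its result must be compared with === 0 and the base must end in a separator. This is the usual path traversal defense.
__DIR__: Magic constant holding the directory of the file it appears in. Always absolute, never affected by the CWD. Use __DIR__ . '/config.php' instead of a bare 'config.php'.
__FILE__: Magic constant holding the absolute path of the file it appears in.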
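To make the containment check concrete, here is a minimal sketch. The helper name isInsideDirectory() and the temporary directory layout are assumptions made for illustration, not part of PHP. It requires PHP 8 for str_starts_with().

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true if $candidate resolves to a location inside $base.
 * Both paths must exist, because realpath() returns false otherwise.
 */
function isInsideDirectory(string $base, string $candidate): bool
{
    $realBase = realpath($base);
    $realCandidate = realpath($candidate);
    if ($realBase === false || $realCandidate === false) {
        return false;
    }
    // Compare against the base plus a trailing separator so that a sibling
    // such as ".../uploads_private" is not mistaken for ".../uploads".
    $prefix = rtrim($realBase, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return str_starts_with($realCandidate, $prefix);
}

// Demo in a throwaway directory under the system temp dir.
$root = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'path_demo_' . uniqid();
mkdir($root . '/uploads', 0777, true);
mkdir($root . '/uploads_private', 0777, true);
file_put_contents($root . '/uploads/a.txt', 'ok');
file_put_contents($root . '/uploads_private/secret.txt', 'no');

$inside  = isInsideDirectory($root . '/uploads', $root . '/uploads/a.txt');
$sibling = isInsideDirectory($root . '/uploads', $root . '/uploads/../uploads_private/secret.txt');
$missing = isInsideDirectory($root . '/uploads', $root . '/uploads/nope.txt');
// $inside is true; $sibling and $missing are false
```

Note that this sketch treats the base directory itself as not inside, since the resolved base has no trailing separator. Whether that is the desired behavior depends on the caller.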
Code Example
php
<?php
declare(strict_types=1);
$path = '/var/www/app/public/../config/database.php';
echo dirname($path); // '/var/www/app/public/../config'
echo dirname($path, 2); // '/var/www/app/public/..'
echo basename($path); // 'database.php'
echo basename($path, '.php'); // 'database'
$info = pathinfo($path);
// [
// 'dirname' => '/var/www/app/public/../config',
// 'basename' => 'database.php',
// 'extension' => 'php',
// 'filename' => 'database',
// ]
echo pathinfo($path, PATHINFO_EXTENSION); // 'php'
echo pathinfo($path, PATHINFO_FILENAME); // 'database'
// realpath — resolves .. and symlinks, requires path to exist
$real = realpath('/var/www/app/public/../config/database.php');
// '/var/www/app/config/database.php'
// Path traversal defense
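function safeReadFile(string $baseDir, string $userInput): string
{
    // realpath() returns false for a missing path; check before using the result.
    $realBase = realpath($baseDir);
    if ($realBase === false) {
        throw new \RuntimeException("Base directory not found: $baseDir");
    }
    $realBase = rtrim($realBase, DIRECTORY_SEPARATOR);
    $fullPath = realpath($realBase . DIRECTORY_SEPARATOR . $userInput);
    // The trailing separator stops '/var/www/uploads_evil' from matching '/var/www/uploads'.
    if ($fullPath === false || !str_starts_with($fullPath, $realBase . DIRECTORY_SEPARATOR)) {
        throw new \InvalidArgumentException("Invalid or out-of-bounds path: $userInput");
    }
    $contents = file_get_contents($fullPath);
    if ($contents === false) {
        throw new \RuntimeException("Unable to read file: $fullPath");
    }
    return $contents;
}
// '../../etc/passwd' is rejected either way: realpath() returns false if the
// target does not exist, and otherwise the str_starts_with() check fails
// safeReadFile('/var/www/uploads', '../../etc/passwd'); // throws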
// __DIR__ — robust relative path resolution
$configPath = __DIR__ . '/../config/app.php'; // relative to this file, not CWD
$config = require $configPath;
// Build paths portably (avoid hardcoding /)
$paths = [
    __DIR__,
    'config',
    'database.php',
];
$joined = implode(DIRECTORY_SEPARATOR, $paths);
// On Windows: C:\var\www\config\database.php
// On Linux: /var/www/config/database.php
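The implode() approach above breaks down when a segment already carries a leading or trailing slash, since the separators double up. One way to handle that is a small joining helper. The function joinPath() below is hypothetical, not a PHP built-in, and it assumes that backslashes in segments are separators, which is not necessarily true on Unix, where a backslash is a legal filename character.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: joins path segments with exactly one separator
 * between them. Assumes the first segment is not a bare root ('/') on its own.
 */
function joinPath(string $first, string ...$rest): string
{
    // Keep any leading separator on the first segment so absolute paths stay absolute.
    $path = rtrim($first, '/\\');
    foreach ($rest as $segment) {
        $segment = trim($segment, '/\\');
        if ($segment !== '') {
            $path .= DIRECTORY_SEPARATOR . $segment;
        }
    }
    return $path;
}

$relative   = joinPath('app/', '/config/', 'database.php');
$configFile = joinPath(__DIR__, 'config/', '/database.php');
```

Empty segments are skipped, so joinPath('app', '', 'x.php') yields the same result as joinPath('app', 'x.php').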