CAP theorem — what it means for a PHP app developer
Advanced · 5 min read · eng-11-007
interview
Concept
The CAP theorem states that a distributed data system can guarantee at most two of the following three properties at the same time:
- C — Consistency: Every read receives the most recent write or an error. All nodes see the same data at the same time.
- A — Availability: Every request receives a response (not an error), though it may not be the most recent data.
- P — Partition Tolerance: The system continues operating even when network partitions occur (nodes can't communicate with each other).
The key insight: Network partitions (P) WILL happen in any distributed system. You can't avoid them. Therefore, you must choose between C and A:
- CP systems: Sacrifice availability. When a partition occurs, refuse requests that might return inconsistent data. Systems: HBase, MongoDB (write concern = majority), ZooKeeper.
- AP systems: Sacrifice consistency. When a partition occurs, continue serving requests, possibly with stale data. Systems: DynamoDB (eventually consistent reads by default), Cassandra (consistency is tunable per query), CouchDB.
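The difference between the two choices can be sketched with a toy in-memory model. This is only an illustration: `Replica`, `TinyCluster`, `readCp()` and `readAp()` are hypothetical names invented for this sketch, not part of any real database client, and a real system detects partitions far less cleanly than a boolean flag.

```php
<?php
// Toy model (hypothetical classes, not a real database client):
// one key stored on a leader and a follower; a partition stops replication.

final class Replica
{
    public int $version = 0;
    public string $value = '';
}

final class TinyCluster
{
    public Replica $leader;
    public Replica $follower;
    public bool $partitioned = false;

    public function __construct()
    {
        $this->leader = new Replica();
        $this->follower = new Replica();
    }

    public function write(string $value): void
    {
        $this->leader->value = $value;
        $this->leader->version++;
        if (!$this->partitioned) {
            // Replication only succeeds while the link is up
            $this->follower->value = $value;
            $this->follower->version = $this->leader->version;
        }
    }

    // CP-style read from the follower: refuse if it cannot confirm it is current
    public function readCp(): string
    {
        if ($this->partitioned) {
            throw new RuntimeException('unavailable: cannot confirm latest write');
        }
        return $this->follower->value;
    }

    // AP-style read from the follower: always answer, possibly with stale data
    public function readAp(): string
    {
        return $this->follower->value;
    }
}

$cluster = new TinyCluster();
$cluster->write('price=10');
$cluster->partitioned = true;  // network split
$cluster->write('price=12');   // only the leader sees this write

echo $cluster->readAp(), PHP_EOL;  // prints "price=10" (stale but available)
try {
    $cluster->readCp();
} catch (RuntimeException $e) {
    echo $e->getMessage(), PHP_EOL; // prints "unavailable: cannot confirm latest write"
}
```

The AP read keeps answering with the last value the follower received, while the CP read turns the same situation into an error the caller has to handle.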
What this means for PHP developers:
- MySQL (single node): Not distributed — CAP doesn't strictly apply. Provides ACID guarantees.
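- MySQL with replication: Often treated as CP-ish, but the real choice is made at failover. Replication is asynchronous by default, so if the primary fails and a replica isn't fully synced, you choose: block reads until a consistent node is available (consistent but unavailable) or serve the replica's stale data (available but inconsistent).
- Redis: Typically AP. Replication is asynchronous, so Redis Sentinel/Cluster can briefly serve stale data, or lose recently acknowledged writes, during failover.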
- Read replicas: An AP compromise — reads from replicas may be slightly stale (available, not always consistent).
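One common way to live with stale replicas is to pin a user's reads to the primary for a short window after that user writes. The sketch below is a minimal illustration under assumptions: `ReadRouter` is a hypothetical helper, the connection names `primary` and `replica` are placeholders, and the 2-second window is an arbitrary value that would need to exceed your observed replication lag.

```php
<?php
// Hypothetical helper: decides which connection a read should use.
final class ReadRouter
{
    private float $pinSeconds;

    /** @var array<int, float> user id => timestamp of that user's last write */
    private array $lastWriteAt = [];

    public function __construct(float $pinSeconds = 2.0)
    {
        $this->pinSeconds = $pinSeconds;
    }

    public function recordWrite(int $userId, float $now): void
    {
        $this->lastWriteAt[$userId] = $now;
    }

    public function connectionFor(int $userId, float $now): string
    {
        $last = $this->lastWriteAt[$userId] ?? null;
        if ($last !== null && ($now - $last) < $this->pinSeconds) {
            return 'primary'; // recent writer: must see their own write
        }
        return 'replica';     // everyone else tolerates slight staleness
    }
}

$router = new ReadRouter(2.0);
$router->recordWrite(42, 100.0);

echo $router->connectionFor(42, 100.5), PHP_EOL; // primary (inside the pin window)
echo $router->connectionFor(42, 103.0), PHP_EOL; // replica (window expired)
echo $router->connectionFor(7, 100.5), PHP_EOL;  // replica (this user never wrote)
```

Timestamps are passed in explicitly so the behavior is easy to test; in a real request you would pass `microtime(true)` and keep the last-write time somewhere shared, such as the session.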
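PACELC: A refinement of CAP. If there is a Partition (P), you trade off Availability against Consistency (A/C); Else (E), even when there's NO partition, you still trade off Latency against Consistency (L/C). A useful model for everyday decisions, because most of the time there is no partition.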
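The latency side of that trade-off can be made concrete with a little arithmetic. The sketch below uses a deliberately simplified model, and everything in it is an assumption: `writeLatencyMs()` is a hypothetical function, replicas are assumed to be contacted in parallel, and the round-trip times are made-up numbers.

```php
<?php
/**
 * Simplified model: a write commits locally, then waits for $acksRequired
 * replicas to acknowledge. Replicas are contacted in parallel, so the wait
 * equals the round-trip time of the slowest replica we must hear from.
 *
 * @param float[] $replicaRttMs round-trip time to each replica, in ms
 */
function writeLatencyMs(float $localCommitMs, array $replicaRttMs, int $acksRequired): float
{
    if ($acksRequired < 0 || $acksRequired > count($replicaRttMs)) {
        throw new InvalidArgumentException('acksRequired out of range');
    }
    if ($acksRequired === 0) {
        return $localCommitMs; // asynchronous replication: do not wait at all
    }
    sort($replicaRttMs); // the fastest replicas acknowledge first
    return $localCommitMs + $replicaRttMs[$acksRequired - 1];
}

$rtts = [1.5, 40.0, 85.0]; // same rack, same region, other continent (made-up values)

printf("async:    %.1f ms\n", writeLatencyMs(2.0, $rtts, 0)); // async:    2.0 ms
printf("one ack:  %.1f ms\n", writeLatencyMs(2.0, $rtts, 1)); // one ack:  3.5 ms
printf("all acks: %.1f ms\n", writeLatencyMs(2.0, $rtts, 3)); // all acks: 87.0 ms
```

Waiting for more acknowledgements means more replicas are guaranteed to hold the write (more consistency), and every write pays for it in latency, with or without a partition.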
Practical takeaway: For most PHP apps, use a single-region relational DB (ACID compliant), Redis for caching (accept eventual consistency), and design your API to tolerate stale cache reads.
Code Example
php
<?php
// CAP in practice — choosing consistency vs availability for cache reads
// ============================================================
// AVAILABILITY PREFERRED — serve stale data rather than fail
// ============================================================
class ProductService
{
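    public function getProduct(int $id): array
    {
        $key = "product:{$id}";

        try {
            // Try cache first; compare against null so an empty array still counts as a hit
            $cached = Cache::get($key);
            if ($cached !== null) {
                return $cached; // might be slightly stale, which is acceptable here
            }
        } catch (\Throwable $e) {
            // Cache unreachable: log it and fall through to the DB instead of giving up
            \Log::warning("Product cache unavailable: {$e->getMessage()}");
        }

        try {
            // Cache miss (or cache down) → try DB
            $product = Product::findOrFail($id)->toArray();
        } catch (\Illuminate\Database\Eloquent\ModelNotFoundException $e) {
            throw $e; // a missing product is a genuine "not found", not an outage
        } catch (\Throwable $e) {
            // DB unreachable (e.g. partition): return a degraded response
            \Log::error("Product service unavailable: {$e->getMessage()}");
            return ['id' => $id, 'name' => 'Product temporarily unavailable', 'available' => false];
        }

        try {
            Cache::put($key, $product, 3600); // best effort: a failed cache write must not fail the read
        } catch (\Throwable $e) {
            \Log::warning("Product cache write failed: {$e->getMessage()}");
        }

        // Chose AVAILABILITY: always return something, even if stale or degraded
        return $product;
    }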
}
// ============================================================
// CONSISTENCY PREFERRED — fail rather than return stale data
// ============================================================
class InventoryService
{
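    public function getStock(int $productId): int
    {
        // For stock levels, staleness could cause overselling, so prefer consistency.
        // Skip the cache entirely for this critical data.
        // In Laravel, lockForUpdate() also sends the query to the write (primary) connection,
        // so a lagging replica is never consulted. The row lock itself only protects a later
        // update when this call runs inside DB::transaction(); otherwise it is released at once.
        // If the DB is unavailable: the exception propagates → show an error to the user.
        // Chose CONSISTENCY: fail rather than show a wrong stock count.
        return (int) Product::lockForUpdate()->findOrFail($productId)->stock;
    }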
}
// ============================================================
// EVENTUAL CONSISTENCY — accept lag, design around it
// ============================================================
class OrderAnalytics
{
    public function getTotalRevenue(): float
    {
        // Analytics don't need to be real-time, so eventual consistency is fine.
        // The user sees revenue that is up to 1 hour stale, which is acceptable for a dashboard.
        return (float) Cache::remember('analytics:total_revenue', 3600, function () {
            return Order::where('status', 'completed')->sum('total'); // recomputed at most once per hour
        });
    }
}
// ============================================================
// Replication lag — the real-world AP compromise
// ============================================================
// In config/database.php with 'sticky' => true:
// After a write in the current request, all subsequent reads in that request use the
// write connection (primary) instead of a replica, so the writer sees their own writes.
// With 'sticky' => false: you might write to the primary, then read from a replica
// that hasn't received the replication update yet → inconsistency!
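
For reference, the read/write split that the `sticky` option belongs to looks roughly like this in a Laravel `config/database.php`. This is a fragment, not a complete configuration: the host addresses are placeholder values chosen for illustration, and other connection options (charset, port, and so on) are omitted.

```php
<?php
// config/database.php (fragment; the host addresses are placeholders)
return [
    'connections' => [
        'mysql' => [
            'driver' => 'mysql',
            'read' => [
                'host' => ['10.0.0.2', '10.0.0.3'], // replicas: may lag behind the primary
            ],
            'write' => [
                'host' => ['10.0.0.1'],             // primary: receives all writes
            ],
            'sticky' => true, // after a write, reads in the same request use the write connection
            'database' => env('DB_DATABASE', 'forge'),
            'username' => env('DB_USERNAME', 'forge'),
            'password' => env('DB_PASSWORD', ''),
        ],
    ],
];
```

Note that `sticky` only covers the current request cycle. A read in the user's next request can still land on a lagging replica, so flows such as "save, then redirect to the detail page" may need an explicit read from the write connection.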