Backend Development · 10 min read

Scaling Laravel APIs: From 500 to 10,000 Concurrent Users

Practical techniques for scaling Laravel beyond its reputation — queue-based processing, Redis caching strategies, and database optimization patterns from the Avidnote project.

Laravel · Scaling · Redis · PostgreSQL · Performance

Laravel's Scaling Bottleneck Is Not Laravel

The common myth is that PHP/Laravel can't scale. The reality: Laravel's bottleneck is almost always the database, not the framework. A well-optimized Laravel API on PHP 8.3 with OPcache handles thousands of requests per second. The real challenge is what happens behind each request — N+1 queries, missing indexes, and synchronous processing of slow operations. Here's exactly how I took the Avidnote API from choking at 500 concurrent users to comfortably serving 10,000+.

Step 1: Eliminate N+1 Queries with Eager Loading

The single highest-impact change. I audited every endpoint with Laravel Debugbar and found 47 N+1 query patterns. The literature review endpoint alone was executing 200+ queries per request. Adding eager loading with constrained relationships cut the total query count by 85%.

// Before: N+1 disaster (200+ queries)
$papers = Paper::all();
foreach ($papers as $paper) {
    $paper->authors; // Lazy load per iteration
    $paper->citations; // Another lazy load
}

// After: 4 queries total (papers page + pagination count + 2 eager loads)
$papers = Paper::with([
    'authors:id,name,paper_id',
    // Per-parent limit() on an eager load requires Laravel 11+;
    // on older versions the limit applies to the combined result.
    'citations' => fn ($q) => $q->select('id', 'title', 'paper_id')->limit(10),
])->paginate(25);

Step 2: Queue Long-Running Operations

AI inference calls took 2-8 seconds each. Running them synchronously meant the PHP worker was blocked, and under load, the worker pool was exhausted in seconds. Moving these to Laravel Horizon queues with auto-scaling workers (4-16 based on queue depth) eliminated all timeout errors. The API now returns a 202 Accepted with a job ID, and the client polls for results.
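The shape of that pattern looks roughly like this. It's a minimal sketch, assuming a hypothetical `RunAiInference` job, `InferenceController`, and `AiClient` service; the names and cache-key scheme are illustrative, not the actual Avidnote code.

```php
<?php

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Http\Request;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Str;

class RunAiInference implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $timeout = 30; // hard cap; the 2-8s inference calls fit comfortably
    public int $tries = 3;    // retry transient upstream failures

    public function __construct(public string $jobId, public string $prompt) {}

    public function handle(): void
    {
        // The slow inference call now runs on a Horizon worker,
        // never on a web-facing PHP worker.
        $result = app(AiClient::class)->complete($this->prompt); // hypothetical service
        Cache::put("inference:{$this->jobId}", $result, now()->addHour());
    }
}

class InferenceController
{
    public function store(Request $request)
    {
        $jobId = (string) Str::uuid();
        RunAiInference::dispatch($jobId, $request->input('prompt'));

        // 202 Accepted: the client polls a status endpoint with this ID
        // until the cached result appears.
        return response()->json(['job_id' => $jobId], 202);
    }
}
```

The job ID doubles as the cache key, so the polling endpoint is just a `Cache::get()` lookup with no database involvement.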

Step 3: Multi-Layer Caching with Event-Driven Invalidation

Time-based cache expiry serves stale data. No caching hammers the database. The sweet spot: event-driven invalidation using Laravel model observers. When a model updates, only its specific cache keys are purged. This gave us a 92% cache hit rate in production while serving fresh data.
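A sketch of the observer half of this setup, with an assumed `PaperObserver` and cache-key scheme (illustrative, not the production code):

```php
<?php

use Illuminate\Support\Facades\Cache;

class PaperObserver
{
    // `saved` fires after both create and update.
    public function saved(Paper $paper): void
    {
        $this->purge($paper);
    }

    public function deleted(Paper $paper): void
    {
        $this->purge($paper);
    }

    private function purge(Paper $paper): void
    {
        // Purge only this model's keys; everything else stays warm.
        Cache::forget("paper:{$paper->id}");
        Cache::forget("paper:{$paper->id}:citations");
    }
}

// Registered once in a service provider's boot():
// Paper::observe(PaperObserver::class);

// Reads can then cache indefinitely, because every write purges its exact keys:
// $paper = Cache::rememberForever(
//     "paper:{$id}",
//     fn () => Paper::with('authors')->findOrFail($id)
// );
```

Because invalidation is tied to the write path rather than a TTL, `rememberForever` is safe here: a key only lives until the underlying row changes.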

The Results

Average API response time dropped from 1.8s to 72ms. Database query volume reduced by 78%. The platform now handles 10,000+ concurrent users with 99.9% uptime. Total cost of infrastructure: ~$400/month on AWS. Laravel scales just fine — you just need to know where to look.