Scaling live video without losing your weekends

Lessons from running a streaming platform at hundreds of thousands of users.

Apr 14, 2026 · 11 min read · Infrastructure

Live video punishes every shortcut. A delay you could ignore in a request/response API shows up here as a stutter that thousands of people feel at once. The work is mostly about protecting a latency budget you decided on up front.

The latency budget

Pick a glass-to-glass latency target, measured from the camera lens to the viewer's screen, and treat it as a contract. Ours was under two seconds. Every component (ingest, transcode, delivery, player) gets a slice of that budget, and any change that overruns its slice is a regression, full stop.

If you cannot say where the milliseconds go, you cannot say why the stream is slow.
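One way to make that contract checkable is to write the slices down and compare measurements against them. The sketch below is illustrative only: the per-stage numbers and the names `BUDGET_MS` and `overruns` are assumptions, since the only figure given here is the overall target of under two seconds.

```python
# Hypothetical per-stage slices in milliseconds. Only the overall target
# (under two seconds) comes from the text; the split is an assumption.
TARGET_MS = 2000
BUDGET_MS = {"ingest": 300, "transcode": 600, "delivery": 600, "player": 400}

# The slices must sum to less than the target, leaving some headroom.
assert sum(BUDGET_MS.values()) < TARGET_MS


def overruns(measured_ms, budget_ms=BUDGET_MS):
    """Return {stage: ms over its slice} for every stage that overran."""
    return {
        stage: measured_ms[stage] - limit
        for stage, limit in budget_ms.items()
        if measured_ms.get(stage, 0) > limit
    }


if __name__ == "__main__":
    # Made-up measurements: transcode is over its slice, the rest are within.
    sample = {"ingest": 250, "transcode": 720, "delivery": 500, "player": 400}
    print(overruns(sample))
```

A check like this could run against per-stage timings in CI or on a dashboard, so an overrun is attributed to a stage instead of showing up only as a slow stream.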

Independent scaling

Ingest, transcode, and delivery have wildly different load profiles. Coupling them means scaling all three for the worst case of any one. Splitting them let transcode ride an autoscaling pool while delivery leaned on a CDN edge that got cheaper per viewer as the audience grew.
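The cost of coupling is easy to see with rough arithmetic. The sketch below compares sizing each tier on its own against deploying bundled units that contain one replica of every tier. All load and capacity figures are invented for illustration, and the function names are hypothetical.

```python
import math

# Hypothetical demand per tier: (current load, capacity of one replica).
# None of these numbers come from a real deployment.
DEMAND = {
    "ingest": (200, 50),            # streams, streams per replica
    "transcode": (200, 5),          # streams, streams per replica
    "delivery": (100_000, 20_000),  # viewers, viewers per replica
}


def replicas(load, capacity_per_replica):
    """Replicas needed to carry the load, rounded up."""
    return math.ceil(load / capacity_per_replica)


def independent_total(demand):
    """Each tier is sized from its own load."""
    return sum(replicas(load, cap) for load, cap in demand.values())


def coupled_total(demand):
    """Bundled units: every tier is sized for the hungriest tier."""
    worst = max(replicas(load, cap) for load, cap in demand.values())
    return worst * len(demand)


if __name__ == "__main__":
    print("independent:", independent_total(DEMAND))
    print("coupled:", coupled_total(DEMAND))
```

With these made-up figures, transcode dominates, so the coupled layout pays for ingest and delivery replicas it does not need.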

What broke first

It is never the part you fortified. For us it was connection churn during traffic spikes — thundering reconnects that the control plane could not absorb. The fix was boring: backoff, jitter, and capacity headroom you only appreciate at 2am.
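For readers who have not implemented it before, here is a minimal sketch of exponential backoff with full jitter for client reconnects. It is one common variant, not necessarily the one we shipped; the function name, base delay, and cap are assumptions chosen for illustration.

```python
import random


def reconnect_delay(attempt, base=0.5, cap=30.0, rng=None):
    """Seconds to wait before reconnect number `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap`, and the actual delay
    is drawn uniformly from [0, ceiling] so that clients which dropped at
    the same moment do not all come back at the same moment.
    """
    rng = rng if rng is not None else random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    for attempt in range(6):
        print(attempt, round(reconnect_delay(attempt), 2))
```

The jitter does the real work here: backoff alone only delays the herd, while randomizing the delay spreads reconnects across the window so the control plane sees a ramp instead of a wall.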

Roy van Kaathoven

Technical founder energy, freelance availability