AI Engineer YouTube · May 31, 2026

Engineering voice agents: Latency, quality, and scale — Rishabh Bhargava, Together AI

Why it matters

Users notice latency above 500ms and hang up above one second. In an already optimized pipeline, 75ms of network latency from models sitting in a different data center adds 30% overhead. Colocating everything in the same building drops that to around 5ms. Rishabh Bhargava from Together AI walks through the full speech
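The figures above can be sanity-checked with a little arithmetic. The sketch below assumes the 30% overhead is measured against the pipeline's own processing time, which the summary does not state explicitly. Under that reading the optimized baseline works out to roughly 250 ms. The function names and the simple additive latency model are illustrative assumptions, not anything taken from the talk.

```python
# Back-of-the-envelope latency budget using the figures quoted in the summary.
# Assumption: "30% overhead" means network latency / baseline pipeline latency.

NOTICE_MS = 500     # users start to notice latency above this
HANG_UP_MS = 1000   # users hang up above this


def implied_baseline_ms(network_ms: float, overhead_percent: float) -> float:
    """Baseline pipeline latency implied by 'network_ms adds overhead_percent'."""
    return network_ms * 100 / overhead_percent


def total_ms(baseline_ms: float, network_ms: float) -> float:
    """Total response latency under a simple additive model (an assumption)."""
    return baseline_ms + network_ms


def overhead_percent(baseline_ms: float, network_ms: float) -> float:
    """Network latency expressed as a percentage of the baseline."""
    return network_ms * 100 / baseline_ms


baseline = implied_baseline_ms(network_ms=75, overhead_percent=30)
cross_dc = total_ms(baseline, network_ms=75)   # models in another data center
colocated = total_ms(baseline, network_ms=5)   # everything in one building

if __name__ == "__main__":
    print(f"implied baseline: {baseline:.0f} ms")
    print(f"cross-data-center total: {cross_dc:.0f} ms")
    print(f"colocated total: {colocated:.0f} ms")
    print(f"colocated overhead: {overhead_percent(baseline, 5):.0f}%")
    print(f"headroom before users notice: {NOTICE_MS - cross_dc:.0f} ms")
```

Under these assumptions, colocation recovers about 70 ms per response, which is a meaningful share of the remaining budget below the 500 ms threshold.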
