AI Engineer YouTube · June 5, 2026

Beyond Transcription: Building Voice AI That Understands Conversations — Hervé Bredin, pyannoteAI

Why it matters

The Open ASR Leaderboard reports NVIDIA Parakeet at an 11.4% word error rate on AMI meeting data. Hervé Bredin runs the same model on the same dataset and gets 26%. Same model, same meetings, different microphone: the leaderboard uses headset audio, he uses the table mic. Most voice AI benchmarks are measuring single sp

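For readers unfamiliar with the metric behind the 11.4% and 26% figures, here is a minimal sketch of how word error rate is usually computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The function name and the example sentences are illustrative assumptions, not taken from the talk, and this is not the scoring code of the leaderboard or of pyannoteAI. Real evaluations typically also normalize casing and punctuation before scoring, which this sketch skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name): splits on whitespace only and
    applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("ship" -> "chip") and one deletion ("on")
# against a 7-word reference.
wer = word_error_rate(
    "we should ship the release on friday",
    "we should chip the release friday",
)
print(f"{wer:.1%}")  # 28.6%
```

Because the denominator is the reference length, insertions can push WER above 100%, and the score depends heavily on which audio the model hears, which is the gap the headset and table-mic numbers illustrate.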