AI Engineer YouTube · May 27, 2026

The maturity phases of running evals — Phil Hetzel, Braintrust

Why it matters

Most teams approach evals like unit tests and try to cover every possible failure. Phil Hetzel from Braintrust argues that is the wrong frame: enumerate your known failure modes, cover those specifically, and ship. The goal is a flywheel where production traces surface what is going wrong, feed back into offline experi
