AI Engineer YouTube · May 12, 2026

Dark Factory: How OpenClaw Ships Faster Than You Can Read the Diff — Vincent Koc

Dark Factory: How OpenClaw Ships Faster Than You Can Read the Diff — Vincent Koc video thumbnail
Why it matters

Static benchmarks made sense for static software. Agents that adapt to users, rewrite their own harnesses, and shift behavior over time break that assumption. This talk is about what evaluation looks like when the system you're measuring keeps changing underneath you. Vincent Koc traces the arc from prompt engineering

My takeaway: Static benchmarks made sense for static software. Agents that adapt to users, rewrite their own harnesses, and shift behavior over time break that assumption. This talk is about what evaluation looks like when the system you're measuring keeps changing underneath you. Vincent Koc traces the arc from prompt engineering