AI Engineer YouTube · May 31, 2026

Spec-Driven Testing for Agents With A Brain the Size of A Planet — Steven Willmott, SafeIntelligence

Spec-Driven Testing for Agents With A Brain the Size of A Planet — Steven Willmott, SafeIntelligence video thumbnail
Why it matters

Wrapping a malicious instruction in a poem is an effective jailbreak against large models and not against small ones. Small models don't understand the poem. Large models do and execute the instruction. Steven Willmott from Safe Intelligence argues this is one reason bigger is not straightforwardly safer: a larger mode

My takeaway: Wrapping a malicious instruction in a poem is an effective jailbreak against large models and not against small ones. Small models don't understand the poem. Large models do and execute the instruction. Steven Willmott from Safe Intelligence argues this is one reason bigger is not straightforwardly safer: a larger mode