Why it matters
Ask a chat model which Pokémon names end in "aw" and it fails, even though it knows every Pokémon by heart. Ask Claude Code and it writes a script, fetches the list, and filters for the answer in seconds. Thariq Shihipar, who works on Claude Code at Anthropic, calls that gap capability overhang: models get smarter in spi
My takeaway: "Field Guide to Fable" by Thariq Shihipar (Anthropic) is a model-evaluation signal. The practical read: tie capability claims to evidence, launch criteria, and regression tests rather than to demos or benchmark headlines.
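
For a sense of how small the script in the Pokémon example can be, here is a minimal sketch in Python. It is an illustration, not the code Claude Code actually produces: the function name names_ending_with and the SAMPLE_NAMES list are hypothetical, the list is a tiny hard-coded sample rather than the full set of names, and the fetching step is left out so the block runs without a network connection.

```python
def names_ending_with(names, suffix):
    """Return the names that end with `suffix`, ignoring case and surrounding whitespace."""
    suffix = suffix.strip().lower()
    return [name for name in names if name.strip().lower().endswith(suffix)]


# Hypothetical, tiny illustrative sample. A real run would load the complete
# list of names from a file or an API of your choosing.
SAMPLE_NAMES = ["Bulbasaur", "Ivysaur", "Venusaur", "Charmander", "Pikachu"]

if __name__ == "__main__":
    # Filter by suffix; swap in "aw" once the full list is loaded.
    print(names_ending_with(SAMPLE_NAMES, "saur"))
```

The same function doubles as a regression check: pin the expected result for a known input in an assertion, and a capability claim becomes something you can rerun instead of something you saw once in a demo.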