Why it matters
AI Engineer session "How fast are LLM inference engines anyway?", presented by Charles Frye of Modal. It adds practical context on how teams build and operate AI systems in production.
My takeaway: Useful for model evaluation because it ties capability claims to benchmarks, training decisions, and deployment tradeoffs instead of relying on surface-level demos.