AI Engineer · April 21, 2026

Running LLMs on your iPhone: 40 tok/s Gemma 4 with MLX — Adrien Grondin, Locally AI

Why it matters

In this AI Engineer session, Adrien Grondin of Locally AI shows how to run Gemma 4 on an iPhone at 40 tokens per second using Apple's MLX framework. The talk adds practical context for how teams build and operate on-device AI systems in production.

My takeaway: useful for model evaluation because it ties capability claims to concrete benchmarks and deployment tradeoffs rather than surface-level demos.