AI Engineer YouTube · May 26, 2026

Run Frontier AI at Home — Alex Cheema, EXO Labs

Why it matters

Running GLM 5.1, a trillion-parameter model released the day before this workshop, across four Mac Studios costs around $40,000 in hardware and tops out at roughly 20 tokens per second. Alex Cheema from EXO Labs thinks both numbers have about 100x left in them. The workshop covers what that 100x looks like across the s
