AI Engineer · April 10, 2026

Judge the Judge: Building LLM Evaluators That Actually Work with GEPA — Mahmoud Mabrouk, Agenta AI

Why it matters

This AI Engineer session, "Judge the Judge: Building LLM Evaluators That Actually Work with GEPA," is presented by Mahmoud Mabrouk of Agenta AI. It adds practical context for how teams are building and operating AI systems in production.

My takeaway: Useful for agent design because it shows how orchestration, tool use, and system boundaries affect reliability and production behavior.