Prompt engineering is not just about better outputs. In practice it shapes reliability, scope, fallback behavior, and how well an AI system resists misuse when instructions, tools, and untrusted content collide.
Prompt Engineering
Prompt design patterns, instruction hierarchy, and defensive prompt construction.
Key topics
- Instruction hierarchy and role separation
- Clear task boundaries, fallback behavior, and refusal handling
- Prompt structures that support monitoring and repeatable evaluation
Common pitfalls
- Overloading prompts with too many responsibilities
- Relying on wording instead of system controls
- Treating prompts as static text instead of part of application design
Who this is for
- Teams operating prompt-heavy workflows
- Builders refining assistant and agent behavior
- Reviewers trying to connect prompt design to safety and risk
Current notes, events, and source material
These items are included because they add useful evidence, framing, implementation detail, or upcoming context for teams working in this area.
How I deleted 95% of my agent skills and got better results — Nick Nisi, WorkOS
Claude would fake running tests by touching the expected output file. Nick Nisi, DX engineer at WorkOS, fixed it by SHA-256 hashing the actual test output and verifying it cryptographically. His principle: make it easier to do the real work than to lie about it, and enforce that through code and state machines, not promp
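The hashing idea in this summary can be sketched in a few lines. This is an illustration only, not the implementation from the talk: it assumes the harness (not the agent) runs the tests and records a digest, and the names `digest_output` and `verify_claim` are hypothetical.

```python
import hashlib
import hmac


def digest_output(test_output: bytes) -> str:
    """Return the SHA-256 hex digest of the raw test output."""
    return hashlib.sha256(test_output).hexdigest()


def verify_claim(recorded_digest: str, claimed_output: bytes) -> bool:
    """Check that the output the agent reports matches what the harness recorded."""
    return hmac.compare_digest(recorded_digest, digest_output(claimed_output))


# Assumption: the harness captures the real test output and stores its digest
# somewhere the agent cannot write to.
real_output = b"3 passed, 0 failed\n"
recorded = digest_output(real_output)

genuine_ok = verify_claim(recorded, real_output)  # real output verifies
touched_ok = verify_claim(recorded, b"")          # an empty, touched file does not
```

Because the digest depends on every byte of the real output, creating or touching an output file is not enough to pass the check.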
How We Built Zeta2: Training an Edit Prediction Model in Production — Ben Kunkle, Zed
To validate settled data, Zed ran 10 frontier model predictions per example and measured Levenshtein distance to the final state. For 100,000 training examples that is a million frontier model requests, which is prohibitively expensive. The fix: Zeta 2's student model now approaches teacher quality, so they run it 50 t
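Levenshtein distance, the metric named in this summary, counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below is a standard dynamic-programming version and is not Zed's code; the constants simply restate the figures quoted above.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


# Cost arithmetic from the summary: 10 predictions for each of 100,000 examples.
PREDICTIONS_PER_EXAMPLE = 10
TRAINING_EXAMPLES = 100_000
total_requests = PREDICTIONS_PER_EXAMPLE * TRAINING_EXAMPLES  # 1,000,000
```

A prediction that exactly matches the final state scores 0; larger values mean the prediction is further from what the user ended up with.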
Why (Senior) Engineers Struggle to Build AI Agents — Philipp Schmid, Google DeepMind
A `deleteItem` endpoint is obvious to the developer who built it. An agent only sees the function schema and docstring. Philipp Schmid from Google DeepMind argues this is why senior engineers struggle most: they carry years of implicit context that agents do not, and design tools assuming it. He names four other shifts
Reachy Mini: the $300 open source robot you can actually hack — Andres Marafioti, Hugging Face
Qwen3-TTS shipped at 0.8x real time: one second of audio took 1.2 seconds to generate. Andres Marafioti from Hugging Face spent two weeks fixing it. The culprits were no streaming, 500 autoregressive steps per audio packet with a CPU-GPU round trip on each, and a dynamic KV cache that blocked compilation. Static KV cac
Why your agents need decision traces, not just documents — Zach Blumenfeld, Neo4j
A knowledge base tells a financial analyst agent the risk factors. A context graph tells it whether to reject or accept, because it also carries past decision traces, the reasoning behind them, and how similar cases resolved. Zach from Neo4j walks through how context graphs extend a standard RAG setup with three layers
Reverse engineering a Viking VOIP phone protocol with Claude Code — Boris Starkov, Eleven Labs
A Viking VoIP phone sat in the ElevenLabs San Francisco office for a year. Three senior engineers and ChatGPT could not get it working. Boris from ElevenLabs cracked the undocumented protocol with Claude Code in a couple of days: brute forced all 676 possible two letter command combinations, found 80 valid ones, then s
How agent o11y differs from traditional o11y — Phil Hetzel, Braintrust
Traditional observability answers one question: is the system up? Phil Hetzel from Braintrust argues that question is not the right one for agents. An individual agent trace can exceed a gigabyte. A single span can hit 20 megabytes. The data is semistructured, packed with unstructured text, and still arrives in real ti
Most Enterprise Agentic Projects Are Doomed, Here's Why — Jess Grogan-Avignon & Jack Wang, Accenture
Jess Grogan-Avignon and Jack Wang at Accenture built an agentic application in two weeks. Getting it to production took another 12 months. Not because the code was wrong. Because the infrastructure team, the security team, the AI gateway team, the data governance team, and the application team all had to align before a
Context Graphs for Explainable, Decision-Aware AI Agents — Andreas Kollegger & Zaid Zaim, Neo4j
Prescribing drug X is correct 99% of the time for symptom Y. For the 1% where it is fatal, statistical reasoning does not help you. Andreas Kollegger calls this reference class validation: before the agent acts, it has to know which group it is in. Context graphs give agents the why. Not just knowledge and tools but th
Comprehend First, Code Later: The AI Skill I Rely On Daily — Priscila Andre de Oliveira, Sentry
Priscila Andre de Oliveira analyzed 116 of her own Claude sessions from daily work at Sentry. 67% were comprehension. 2% were code generation. Working in a codebase with 15 years of history, around 100 PRs merged per day, and 100,000 organizations depending on it, the unlock is not generation but understanding. She bui
Why Rust is the Ideal Language for Vibe-Coding — Daniel Szoke, Sentry
TypeScript is easy for models to write because it imposes few constraints. Those same missing constraints let models introduce data races that compile, run, and only fail intermittently. A thread safety bug in Rust does not compile. The compiler names the unsound type, explains why it cannot be sent between threads, an
The maturity phases of running evals — Phil Hetzel, Braintrust
Most teams approach evals like unit tests and try to cover every possible failure. Phil Hetzel from Braintrust argues that is the wrong frame: enumerate your known failure modes, cover those specifically, and ship. The goal is a flywheel where production traces surface what is going wrong, feed back into offline experi
Run Frontier AI at Home — Alex Cheema, EXO Labs
Running GLM 5.1, a trillion parameter model released the day before this workshop, across four Mac Studios costs around $40,000 in hardware and tops out at roughly 20 tokens per second. Alex Cheema from EXO Labs thinks both numbers have about 100x left in them. The workshop covers what that 100x looks like across the s
What the Best Agents Share — Mardu Swanepoel, Flinn AI
Harvey, Cursor, Manus, and Claude operate in completely different domains but share four patterns: focus modes that constrain the action space to improve output quality, transparent execution that surfaces tool calls and reasoning to build user trust, personalization that optimizes for speed to understanding rather tha
Stop babysitting your agents... — Brandon Walsenuk, Unblocked
Same prompt. Same agent. Same model. Without a context engine: 2.5 hours, 20.9 million tokens, multiple rounds of human correction, and code that compiled but would have broken the entire system if it shipped. With one: 25 minutes, 10.8 million tokens, and a senior engineer who gave one nitpick and approved the merge.
Agentic Evaluations at Scale, For Everybody — Nicholas Kang & Michael Aaron, Google DeepMind
On SWE-Bench Pro, six frontier models land within a couple of percentage points of each other. The harness they run inside shifts performance by 22%. A competing lab once took a Kaggle benchmark, reran it with their own compaction settings, and published much better results. Neither number was wrong. Both were useless.
Does GenAI "belong" to data scientists? — Phil Hetzel, Braintrust
At most traditional enterprises, GenAI got handed to the ML platform team because it had AI in the name. Phil Hetzel from Braintrust argues that was the wrong move, not because data scientists lack value, but because Anthropic and OpenAI already ran the data pipeline. What is left is prompt and context engineering, dis
Bounded Autonomy: Between Free Will and Determinism — Angus J. McLean, Oliver
Angus McLean spent time building a complex agent application to generate his CV. Four letters beat it: HTML. He puts the improvement at 100x. The talk is from Oliver's AI Director, where agents generate around 4,000 creative assets a day for 200 plus brands, assets you have probably seen and had no idea were AI. The co
How Google DeepMind Runs Agents at Scale — KP Sawhney & Ian Ballantyne, Google DeepMind
Google DeepMind employees have worse token quotas than paying customers. That is not a mistake. KP Sawhney explains: customers get priority, and if an internal team spikes usage on a cluster someone monitoring 24/7 will just call and ask them to stop. This panel covers how DeepMind thinks about agents at scale from the
Scaling the Next Paradigm of Heterogeneous Intelligence — Adrian Bertagnoli, Callosum
A mixture of Qwen3-VL 8B and Kimi K2.5 beat the state of the art on Video Web Arena, outperforming the leading GPT and Gemini models by 18 and 25 percent while costing 3.7 times less and running 3 times faster. The reason it worked is that visual web navigation decomposes into subtasks that do not all need a frontier m
Your Agent Is an Infinite Canvas — RL Nabors, Dressed for Space
RL Nabors built a comic reader that renders inside Claude. Full panels, navigation, transcript mode, design matched to the original site. No browser tabs. She is reading her own web comic archive entirely through an agent, and it looks like the website. The talk is a case against chat as the permanent UI of agentic sof
The Missing Primitive for Agent Swarms — Lou Bichard, Ona
Stripe called theirs Minions. Ramp called theirs Inspect. Both are internal infrastructure for running fleets of background agents, and both teams built it from scratch. Lou Bichard's argument is that this shouldn't keep happening. The talk breaks down what agent swarm infrastructure actually needs: a runtime (largely
Prompt to Pipeline: Building with Google's Gen Media Stack — Paige & Guillaume, Google DeepMind
A public domain book, a notebook, and three gen media models. Guillaume from Google DeepMind fed The Wind in the Willows into Gemini, generated character portraits with Nano Banana, animated chapter scenes with Veo, and scored each chapter with Lyria, all live in the workshop. The full three hour session covers more ground. Paige
Fast Models Need Slow Developers — Sarah Chieng, Cerebras
Codex Spark, a model Cerebras built with OpenAI, generates code at 1,200 tokens per second. The Sonnet and Opus families run at 40 to 60. At that 20x difference, a context window that used to take ten minutes to fill now takes 30 seconds, and every habit built around slow generation starts producing technical debt at a
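The figures in this summary are consistent with one another if the slower models are taken at the top of their quoted range. The short calculation below makes that explicit; the choice of 60 tokens per second is an assumption made to reproduce the 20x figure, and the implied context size is derived, not stated in the talk description.

```python
SLOW_TOKENS_PER_S = 60     # assumption: upper end of the quoted 40 to 60 range
FAST_TOKENS_PER_S = 1_200  # Codex Spark figure quoted above


def fill_time_s(context_tokens: int, tokens_per_s: float) -> float:
    """Seconds needed to generate context_tokens at a given speed."""
    return context_tokens / tokens_per_s


# Tokens a slow model produces in ten minutes (derived, not from the source).
context_tokens = SLOW_TOKENS_PER_S * 10 * 60

speedup = FAST_TOKENS_PER_S / SLOW_TOKENS_PER_S              # 20.0
fast_fill = fill_time_s(context_tokens, FAST_TOKENS_PER_S)   # 30.0 seconds
```

At the lower end of the range (40 tokens per second) the same arithmetic gives a 30x difference, so the 20x in the summary reads as a conservative figure.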
Lobster Trap: OpenClaw in Containers from Local to K8s and Back — Sally Ann O'Malley, Red Hat
Sharing a good agent setup usually means handing someone a pile of markdown, config files, and YAML and hoping they reproduce what you have. The answer in this demo is a container image: spin up a sub agent in two seconds from a Podman command, flip a flag for Kubernetes, and your personal setup becomes the team baseli
AI on Android: Ask me Anything — Florina Muntenescu & Oli Gaymond, Google DeepMind
Gemini Nano on device weighs three to four gigabytes. Shipping that per app is not realistic, which is why AICore puts it in the system once and every app shares it. Foreground apps get top priority. Background batch jobs queue and run overnight on charge. The developer never manages any of that. The tradeoff is reach
Cooking with Agents in VS Code — Liam Hampton, Microsoft
One codebase, three problems, three agents running at the same time. Liam Hampton from Microsoft demos the full loop in VS Code: a local agent with Claude Opus writing and fixing unit tests with him in the loop, a background agent using a git worktree to build a front end from a GitHub issue without him touching it, a
Scaling Agents on Kubernetes with acpx and ACP — Onur Solmaz, OpenClaw
OpenClaw receives 300 to 500 pull requests per day. Most arrive AI generated, most are not mergeable, and every one of them is signal about something broken in the codebase. Onur Solmaz built acpx to process them without him in the loop. acpx is a headless CLI for the Agent Client Protocol. It replaces PTY scraping wit
Your Coding Agent Should Do AI System Engineering — Ben Burtenshaw, Hugging Face
An agent-written RMSNorm kernel hit 1.88x speedups on H100s. A fine-tuned Qwen3 0.6B hit 35% on LiveCodeBench. Neither result required a systems engineer, just coding agents with the right skills loaded. Ben Burtenshaw from Hugging Face walks through three levels: using Claude Code interactively to write and benchmark C
Any-to-Any: Building Native Multimodal Agents - Patrick Löber, Google DeepMind
Draw arrows on a map and ask Gemini to generate a picture of what you see. It produces the Golden Gate Bridge. Not because it matched pixels, but because the image generation model is built on top of Gemini's world understanding and knows what those arrows are pointing at. Patrick Löber walks through the full any-to-an
Skill issue: Lessons from skilling up coding agents to use Langfuse - Marc Klingen, Clickhouse
Without a skill, Claude Code adds Langfuse using stale pre-training context, ships broken instrumentation, then catches the failure and fetches current docs to fix it. The resulting trace captures two LLM calls with no visibility into what the agent actually did. Marc Klingen covers the six learnings from building a sk
From 46% to 90%: Fine-Tuning Tiny LLMs for On-Device Agents — Cormac Brick, Google
FunctionGemma ships at 270 million parameters and processes nearly 2,000 tokens per second prefill on a Pixel 7. Out of the box, on a fixed set of app intents, it hits 46% accuracy. Fine-tuned on a synthetically generated dataset, it clears 90% on eight of ten functions. Cormac Brick covers the two options developers
What Breaks When You Build AI Under Sovereignty Constraints - Bilge Yücel, deepset GmbH
If you send EU citizen data to an embedding API hosted in Virginia, you have already violated GDPR. That is one hidden assumption. Most production AI systems have dozens more, baked into the architecture long before anyone asked whether the system was sovereign. Bilge Yücel walks through the four sovereignty pillars (d
Don't Build Slop (4 Levels of AI Agent Maturity) - Ara Khan, Cline
The prompt for GPT-5.3 is one-third the size of the one written for GPT-5. Frontier models are so capable that longer system prompts cause sensory overload and degrade performance. The rule Ara Khan keeps returning to: every single thing you add to an agent risks making it worse. The talk breaks agent-building into fou
Personalization in the Era of LLMs - Shivam Verma, Spotify
Spotify represents Ariana Grande and Bruno Mars as sequences of six tokens. The first two are shared because both are pop artists. The remaining tokens diverge to capture what makes each distinct. That is a Semantic ID, and it is how Spotify teaches open-weight LLMs to reason over a catalog of 100 million tracks the sa
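The shared-prefix property described in this summary can be shown with a toy example. The token values below are invented purely for illustration and are not Spotify's; only the shape (six tokens, first two shared) comes from the description above.

```python
from typing import Sequence

# Hypothetical six-token Semantic IDs. The numbers are made up; only the
# structure (a shared two-token prefix, then divergence) follows the summary.
artist_a = (12, 7, 301, 44, 9, 150)
artist_b = (12, 7, 88, 203, 61, 5)


def shared_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Count how many leading tokens two Semantic IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


common = shared_prefix_len(artist_a, artist_b)  # 2 for the toy IDs above
```

A longer shared prefix means two items sit closer together in the hierarchy, which is what lets a model treat related catalog entries as related tokens.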
Rewiring the State — Eoin Mulgrew, 10 Downing Street
The cabinet office was about to spend one and a half million pounds on an outside law firm to analyze the UK statute book. One engineer embedded with the in-house legal team for two weeks instead. The tool now lives with that team and can be run whenever they want. Eoin Mulgrew from the Number 10 data science team uses
Let's go Bananas with GenMedia — Guillaume Vernade, Google DeepMind
Guillaume Vernade from Google DeepMind takes a public domain book and runs it through the full gen media stack live. Gemini reads the whole text and writes image prompts for each character and chapter. Imagen generates the portraits. Veo animates them into video clips using those images as first frames. Lyria composes
Build Agents That Run for Hours (Without Losing the Plot) — Ash Prabaker & Andrew Wilson, Anthropic
Why self-evaluation is a trap and adversarial evaluator agents work better; why context compaction doesn't cure coherence drift but structured handoffs do; how to decompose work into testable sprint contracts; how to grade subjective output with rubrics an LLM can actually apply; and how to read traces as your primary
Harnesses in AI: A Deep Dive — Tejas Kumar, IBM
The agent hit a login page, panicked, reported success anyway, and the upvote never happened. Tejas Kumar's diagnosis: not a prompt problem. A harness problem. The demo builds a browser agent on GPT-3.5 Turbo (deliberately a very old model, to show how much good harness engineering can improve it) against Hacker News a
Fighting AI with AI — Lawrence Jones, Incident
Incident's AI SRE runs hundreds of prompts per investigation across logs, metrics, traces, and code. When it produces a wrong root cause analysis, there is no tractable way for a human to read through the full trace and find where the reasoning went sideways. Lawrence Jones, founding engineer at Incident.io, describes
Why Your AI UX Is Broken (and It's Not the Model's Fault) — Mike Christensen, Ably
SSE ties a response stream to a single connection. The user refreshes the page, walks out of WiFi range, or opens a second tab and the in-progress response is gone. Abort and resume are mutually exclusive for the same reason: the only signal a client can send over a one-way pipe is closing it, so the agent cannot tell
AIE Singapore Day 2 ft. Google DeepMind, OpenClaw, Adaption, Arize, Cloudflare, Robot Company & more
May 17, 2026 - all times in SGT -- 9am - kickoff https://www.ai.engineer/singapore#schedule join us in person and on all side events https://luma.com/1eofvp02?tk=kN58jG
Beyond Code Coverage: Functionality Testing with Playwright — Marlene Mhangami, Microsoft
When an LLM writes your tests, it tends to write tests that confirm what the code does rather than tests that verify what the user experiences. Your test suite goes green. The app still breaks in ways none of those tests would catch. Marlene Mhangami from Microsoft makes the case for flipping the order: get the agent t
How to Leverage Domain Expertise — Chris Lovejoy, Notius Labs
Granola's first employee was a writer who still reviews meeting note outputs and tweaks prompts directly. Chris Lovejoy says that is not a gap in the org chart. There is no objectively perfect meeting note, so you need someone with taste doing both the assessment and the improvement. He frames this as one of three patt
Connecting the Dots with Context Graphs — Stephen Chin, Neo4j
Ask a vector RAG system about a patient's emphysema care plan and it returns generic advice: respiratory therapy, deep breathing. Give it a graph grounded in that patient's actual history and it knows they smoke, knows they've had an operation, and gives recommendations that reflect it. The information existed in both
Agents Don't Do Standups: Building the Post-Engineer Engineering Org — Mike Spitz, PFF
PFF ran a three-month case study: two engineers against a team of ten, same codebase, same customers. The two shipped five times a day. The ten shipped once every five days. Output measured by ticket complexity came out at 10x. Customer satisfaction went up, not down. Mike Spitz, their CTO, started with one reframe: st
Combine Skills and MCP to Close the Context Gap — Pedro Rodrigues, Supabase
Agents working with Postgres will confidently create a view over a table with row-level security enabled and silently bypass that security in the process. Not because they can't reason. Because they don't know about the security_invoker flag, and nobody told them. Pedro Rodrigues from Supabase ran this exact test: same
How Building with AI Can Double the Throughput of Your Engineering Team — Brian Scanlan, Intercom
Intercom hit 2x engineering throughput in under a year. Not by prompting better. By treating Claude Code like a new hire: onboarding it to a Rails monolith built over 15 years, writing skills for every recurring task, connecting it to production systems and internal tooling, and going all in on one platform instead of
AIE Singapore Day 1 ft. Minister, NanoClaw, OpenAI, Google, Vercel, Cursor & more
May 16, 2026 - all times in SGT -- 8.30am - kickoff https://www.ai.engineer/singapore#schedule join us in person and on all side events https://luma.com/1eofvp02?tk=kN58jG
Ship Real Agents: Hands-On Evals for Agentic Applications — Laurie Voss, Arize
Most agents get tested by running a few queries and checking if it looks right. Laurie calls this the vibes problem: it doesn't catch regressions, doesn't run in CI, and doesn't tell you whether a prompt fix broke three other things. This workshop builds a complete eval pipeline from scratch on a financial analysis age
Mind the Gap (In your Agent Observability) — Amy Boyd & Nitya Narasimhan, Microsoft
Agents drift. Models change, prompts get tweaked, edge cases accumulate, and the gap between what your agent does and what you need it to do widens without you noticing. Amy and Nitya walk through Microsoft Foundry's observability stack: tracing built on OpenTelemetry, built-in evaluators for quality, safety, and agent
Make your own event-sourced agent harness using stream processors — Jonas Templestein, Iterate
The abstraction is three things: state, a synchronous reducer that derives state from events, and an after-append hook for side effects. The split matters: when your program restarts after 100 events, you want to catch up state without replaying LLM requests. Everything that happens (streaming chunks, tool calls, error
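The three-part abstraction in this summary (state, a synchronous reducer, an after-append hook) can be sketched as follows. This is a minimal illustration under stated assumptions, not the speaker's code: the `Event`, `State`, and `Stream` names and the event kinds are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical kinds: "user_message", "tool_call", "error"
    payload: str


@dataclass
class State:
    messages: list[str] = field(default_factory=list)
    errors: int = 0


def reducer(state: State, event: Event) -> State:
    """Synchronous and side-effect free: derive the next state from one event."""
    if event.kind == "error":
        return State(list(state.messages), state.errors + 1)
    return State(state.messages + [event.payload], state.errors)


class Stream:
    def __init__(self, after_append: Callable[[State, Event], None]):
        self.events: list[Event] = []
        self.state = State()
        self.after_append = after_append

    def append(self, event: Event) -> None:
        """Record the event, update state, then run side effects."""
        self.events.append(event)
        self.state = reducer(self.state, event)
        self.after_append(self.state, event)  # e.g. where an LLM request would fire

    def catch_up(self, events: list[Event]) -> None:
        """Rebuild state after a restart without firing the hook again."""
        self.events = list(events)
        self.state = State()
        for event in self.events:
            self.state = reducer(self.state, event)


# Usage: side effects fire on append, but not when replaying after a restart.
fired: list[str] = []
live = Stream(after_append=lambda state, event: fired.append(event.kind))
live.append(Event("user_message", "hi"))
live.append(Event("error", "timeout"))

replayed: list[str] = []
restarted = Stream(after_append=lambda state, event: replayed.append(event.kind))
restarted.catch_up(live.events)
```

Keeping the reducer pure is what makes the restart cheap: `catch_up` reproduces the same state from the log while the hook, and any LLM request it would trigger, stays silent.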
Your Agent Can Now Train Models — Merve Noyan, Hugging Face
Open-source models have caught up. GLM 5.1 is leading the Artificial Analysis intelligence index over closed models, and the gap is closing fast with each release cycle. The practical upside beyond benchmarks: full weight access means you can quantize, fine-tune, and deploy to edge devices or browsers without data leav
Building a Chess Coach — Anant Dole and Asbjorn Steinskog, Take Take Take
LLMs can explain things clearly but can't play chess reliably. Take Take Take (Magnus Carlsen's app) solved this by separating concerns: Stockfish handles position evaluation, tactical and positional detectors extract concepts like forks, pins, and structural weaknesses, and the LLM's only job is translating those stru
CI/CD Is Dead, Agents Need Continuous Compute and Computers — Hugo Santos and Madison Faulkner
Traditional CI/CD was built for humans pushing one or two diffs a week. Scale to thousands of autonomous agents opening PRs continuously and you get runner saturation, cold Docker builds on every branch, cache thrash, and a merge queue that starts behaving like a serialized database lock where time-to-commit becomes th
Build & deploy AI-powered apps — Paige Bailey, Google DeepMind
Got a massive idea but stuck in the "just talking about it" phase? This session cuts the fluff and dives straight into how to build and prototype at lightning speed using AI Studio Build and Antigravity for free. It breaks down Google DeepMind's AI tech stack so viewers know exactly which tools to use, when to reach fo
Everything I Learned Training Frontier Small Models — Maxime Labonne, Liquid AI
A new class of small models is emerging with the ability to reliably follow instructions and call tools while running on-device under 1 GB of memory. In this talk, we'll break down how to post-train frontier small models using the LFM2.5 recipe: on-policy preference alignment, agentic reinforcement learning, and curric
Building your own software factory — Eric Zakariasson, Cursor
Most of us are pair-programming with one agent and stopping there. There's a lot more on the table. This workshop is about going from one agent to many. We'll start with codebase setup, the foundational work that makes agents effective on their own. Then we'll scale up to running agents in parallel, kicking off async w
Why building eval platforms is hard — Phil Hetzel, Braintrust
An eval platform is not just a test runner. You are building shared definitions of "good," reliable data pipelines, labelling workflows, versioning, and trust in results across many teams and model changes. This session breaks down the hidden complexity, the common failure modes, and the design principles that make eva
One Login to Rule Them All: Cross-App Access for MCP — Garrett Galow, WorkOS
Connecting a coding agent to multiple services often means facing a dozen OAuth consent screens, a dozen token lifecycles, and a dozen chances for something to break. Despite having Single Sign-On, users still find themselves signing in repeatedly. This talk explores how Cross-App Access leverages a three-way trust bet
Open Models at Google DeepMind — Cassidy Hardin, Google DeepMind
Open models are getting smaller, faster, and far more capable. In this talk, Cassidy Hardin walks through the latest advances in the Gemma family, with a focus on Gemma 4 and what it enables for developers building on-device and open-weight AI systems. She covers the architecture behind Gemma’s dense, effective, and mi
Lessons from Scaling GitHub's Remote MCP Server — Sam Morrow, GitHub
GitHub operates one of the most heavily-utilised MCP servers in the ecosystem, with over 4 million downloads of the stdio server alone. Discover the architectural decisions, technical challenges and lessons learned while building and scaling a remote MCP server on production infrastructure. The session walks through th
Bringing MCPs to the Enterprise — Karan Sampath, Anthropic
MCPs are often flaky, face multiple security vulnerabilities, and are generally hard to scale. Most enterprises struggle to use more than single digit numbers of MCPs due to issues with security, observability, and access control. In this talk, we'll explore the approaches and learnings we at Anthropic have been taking
Collaborative AI Engineering — Maggie Appleton, GitHub Next
Agentic engineering so far has been a solo story: one developer and a dozen agents moving at warp speed. But speed without thoughtful planning and team alignment is just wasting tokens. When everyone on a team is directing agents alone in their personal CLI tools with no shared context, you get duplicate work, conflict
MCP = Mega Context Problem - Matt Carey
The best MCP server is the one you didn't have to build. At Cloudflare we have a lot of products. Our REST OpenAPI spec is over 2.3 million tokens. When teams started building MCP servers, they did what everyone does: cherry-picked important endpoints for their product, wrote some tool definitions and shipped a separat
Full Walkthrough: Workflow for AI Coding from Planning to Production — Matt Pocock (@mattpocockuk)
A hands-on workshop covering the full lifecycle of AI-assisted development, from turning ambiguous requirements into agent-ready plans to running autonomous coding agents that ship production features. You'll learn to stress-test vague briefs into structured PRDs, slice work into thin "tracer bullet" vertical slices, a
The End of Apps — Kitze, Sizzy.co
AI Engineer session on The End of Apps, presented by Kitze, Sizzy.co. It adds practical context for how teams are building and operating AI systems in production.
Building Generative Image & Video models at Scale - Sander Dieleman (Veo and Nano Banana)
AI Engineer session on Building Generative Image & Video models at Scale - Sander Dieleman (Veo and Nano Banana). It adds practical context for how teams are building and operating AI systems in production.
How AI is changing Software Engineering: A Conversation with Gergely Orosz, @The Pragmatic Engineer
AI Engineer session on How AI is changing Software Engineering: A Conversation with Gergely Orosz, @The Pragmatic Engineer. It adds practical context for how teams are building and operating AI systems in production.
AIE Miami Day 2 ft. Cerebras, OpenCode, Cursor, Arize AI, and more!
April 21, 2026 - all times in EST -- 9:00am - Welcome to Day 2 -- 9:10am - David House, G2i Transforming Programming Mindsets: Case Studies in Agentic Coding Adoption -- 9:35am - Sarah Chieng, Cerebras Help! We're DEEP in (latency) Debt -- 10:00am - Lech Kalinowski, CallStack Ambient Generative AI: Deploying Latent Dif
Full Workshop: Build Your Own Deep Research Agents - Louis-François Bouchard, Paul Iusztin, Samridhi
AI Engineer session on Full Workshop: Build Your Own Deep Research Agents - Louis-François Bouchard, Paul Iusztin, Samridhi. It adds practical context for how teams are building and operating AI systems in production.
Running LLMs on your iPhone: 40 tok/s Gemma 4 with MLX — Adrien Grondin, Locally AI
AI Engineer session on Running LLMs on your iPhone: 40 tok/s Gemma 4 with MLX, presented by Adrien Grondin, Locally AI. It adds practical context for how teams are building and operating AI systems in production.
Taste & Craft: A Conversation with Tuomas Artman, CTO Linear & Gergely Orosz, @The Pragmatic Engineer
AI Engineer session on Taste & Craft: A Conversation with Tuomas Artman, CTO Linear & Gergely Orosz, @The Pragmatic Engineer. It adds practical context for how teams are building and operating AI systems in production.
The New Application Layer - Malte Ubl, CTO Vercel
AI Engineer session on The New Application Layer - Malte Ubl, CTO Vercel. It adds practical context for how teams are building and operating AI systems in production.
AIE Miami Keynote & Talks ft. OpenCode, Google DeepMind, OpenAI, and more!
April 20, 2026 - all times in EST -- 9:00am - Welcome to AI Engineer Miami -- 9:10am - Gabe Greenberg, G2i Opening Remarks -- 9:15am - Dax Raad, OpenCode Keynote -- 9:40am - Dexter Horthy, HumanLayer Everything We got Wrong About RPI -- 10:05am - Max Stoiber, OpenAI Coming Soon -- 10:30am - Morning Break -- 11:00am - B
Code Mode: Let the Code do the Talking - Sunil Pai, Cloudflare
AI Engineer session on Code Mode: Let the Code do the Talking - Sunil Pai, Cloudflare. It adds practical context for how teams are building and operating AI systems in production.
Building pi in a World of Slop — Mario Zechner
AI Engineer session on Building pi in a World of Slop, presented by Mario Zechner. It adds practical context for how teams are building and operating AI systems in production.
Harness Engineering: How to Build Software When Humans Steer, Agents Execute — Ryan Lopopolo, OpenAI
AI Engineer session on Harness Engineering: How to Build Software When Humans Steer, Agents Execute, presented by Ryan Lopopolo, OpenAI. It adds practical context for how teams are building and operating AI systems in production.
Agentic Engineering: Working With AI, Not Just Using It — Brendan O'Leary
AI Engineer session on Agentic Engineering: Working With AI, Not Just Using It, presented by Brendan O'Leary. It adds practical context for how teams are building and operating AI systems in production.
Judge the Judge: Building LLM Evaluators That Actually Work with GEPA — Mahmoud Mabrouk, Agenta AI
AI Engineer session on Judge the Judge: Building LLM Evaluators That Actually Work with GEPA, presented by Mahmoud Mabrouk, Agenta AI. It adds practical context for how teams are building and operating AI systems in production.
Let LLMs Wander: Engineering RL Environments — Stefano Fiorucci
AI Engineer session on Let LLMs Wander: Engineering RL Environments, presented by Stefano Fiorucci. It adds practical context for how teams are building and operating AI systems in production.
Platforms for Humans and Machines: Engineering for the Age of Agents — Juan Herreros Elorza
AI Engineer session on Platforms for Humans and Machines: Engineering for the Age of Agents, presented by Juan Herreros Elorza. It adds practical context for how teams are building and operating AI systems in production.
Why, and how you need to sandbox AI-Generated Code? — Harshil Agrawal, Cloudflare
AI Engineer session on "Why, and how you need to sandbox AI-Generated Code?", presented by Harshil Agrawal, Cloudflare. It adds practical context for how teams are building and operating AI systems in production.
OpenAI to acquire Promptfoo
OpenAI announced plans to acquire Promptfoo, highlighting automated AI security testing, red teaming, and evaluation as core enterprise requirements.
Build a Prompt Learning Loop - SallyAnn DeLucia & Fuad Ali, Arize
AI Engineer session on Build a Prompt Learning Loop - SallyAnn DeLucia & Fuad Ali, Arize. It adds practical context for how teams are building and operating AI systems in production.
Building durable Agents with Workflow DevKit & AI SDK - Peter Wielander, Vercel
AI Engineer session on Building durable Agents with Workflow DevKit & AI SDK - Peter Wielander, Vercel. It adds practical context for how teams are building and operating AI systems in production.
Building Intelligent Research Agents with Manus - Ivan Leo, Manus AI (now Meta Superintelligence)
AI Engineer session on Building Intelligent Research Agents with Manus - Ivan Leo, Manus AI (now Meta Superintelligence). It adds practical context for how teams are building and operating AI systems in production.
DSPy: The End of Prompt Engineering - Kevin Madura, AlixPartners
AI Engineer session on DSPy: The End of Prompt Engineering - Kevin Madura, AlixPartners. It adds practical context for how teams are building and operating AI systems in production.
How Claude Code Works - Jared Zoneraich, PromptLayer
AI Engineer session on How Claude Code Works - Jared Zoneraich, PromptLayer. It adds practical context for how teams are building and operating AI systems in production.
OpenAI + @Temporalio : Building Durable, Production Ready Agents - Cornelia Davis, Temporal
AI Engineer session on OpenAI + @Temporalio : Building Durable, Production Ready Agents - Cornelia Davis, Temporal. It adds practical context for how teams are building and operating AI systems in production.
Agents are Robots Too: What Self-Driving Taught Me About Building Agents — Jesse Hu, Abundant
AI Engineer session on Agents are Robots Too: What Self-Driving Taught Me About Building Agents, presented by Jesse Hu, Abundant. It adds practical context for how teams are building and operating AI systems in production.
AI Copilots for Tech Architecture: The Highest-ROI Use Case You’re Not Building — Boris B., Catio
AI Engineer session on AI Copilots for Tech Architecture: The Highest-ROI Use Case You’re Not Building, presented by Boris B., Catio. It adds practical context for how teams are building and operating AI systems in production.
Building Cursor Composer — Lee Robinson, Cursor
AI Engineer session on Building Cursor Composer, presented by Lee Robinson, Cursor. It adds practical context for how teams are building and operating AI systems in production.
Building in the Gemini Era — Kat Kampf & Ammaar Reshi, Google DeepMind
AI Engineer session on Building in the Gemini Era, presented by Kat Kampf & Ammaar Reshi, Google DeepMind. It adds practical context for how teams are building and operating AI systems in production.
Code World Model: Building World Models for Computation — Jacob Kahn, FAIR Meta
AI Engineer session on Code World Model: Building World Models for Computation, presented by Jacob Kahn, FAIR Meta. It adds practical context for how teams are building and operating AI systems in production.
Context Engineering: Connecting the Dots with Graphs — Stephen Chin, Neo4j
AI Engineer session on Context Engineering: Connecting the Dots with Graphs, presented by Stephen Chin, Neo4j. It adds practical context for how teams are building and operating AI systems in production.
Context Platform Engineering to Reduce Token Anxiety — Val Bercovici, WEKA
AI Engineer session on Context Platform Engineering to Reduce Token Anxiety, presented by Val Bercovici, WEKA. It adds practical context for how teams are building and operating AI systems in production.
Developer Experience in the Age of AI Coding Agents — Max Kanat-Alexander, Capital One
AI Engineer session on Developer Experience in the Age of AI Coding Agents, presented by Max Kanat-Alexander, Capital One. It adds practical context for how teams are building and operating AI systems in production.
Dispatch from the Future: building an AI-native Company — Dan Shipper, Every, AI & I
AI Engineer session on Dispatch from the Future: building an AI-native Company, presented by Dan Shipper, Every, AI & I. It adds practical context for how teams are building and operating AI systems in production.
Don't Build Agents, Build Skills Instead — Barry Zhang & Mahesh Murag, Anthropic
AI Engineer session on Don't Build Agents, Build Skills Instead, presented by Barry Zhang & Mahesh Murag, Anthropic. It adds practical context for how teams are building and operating AI systems in production.
From Arc to Dia: Lessons learned building AI Browsers — Samir Mody, The Browser Company of New York
AI Engineer session on From Arc to Dia: Lessons learned building AI Browsers, presented by Samir Mody, The Browser Company of New York. It adds practical context for how teams are building and operating AI systems in production.
From Vibe Coding To Vibe Engineering — Kitze, Sizzy
AI Engineer session on From Vibe Coding To Vibe Engineering, presented by Kitze, Sizzy. It adds practical context for how teams are building and operating AI systems in production.
Hard Won Lessons from Building Effective AI Coding Agents — Nik Pash, Cline
AI Engineer session on Hard Won Lessons from Building Effective AI Coding Agents, presented by Nik Pash, Cline. It adds practical context for how teams are building and operating AI systems in production.
Leadership in AI Assisted Engineering — Justin Reock, DX (acq. Atlassian)
AI Engineer session on Leadership in AI Assisted Engineering, presented by Justin Reock, DX (acq. Atlassian). It adds practical context for how teams are building and operating AI systems in production.
Minimax M2: Building the #1 Open Model — Olive Song, MiniMax
AI Engineer session on Minimax M2: Building the #1 Open Model, presented by Olive Song, MiniMax. It adds practical context for how teams are building and operating AI systems in production.
Small Bets, Big Impact Building GenBI at a Fortune 100 — Asaf Bord, Northwestern Mutual
AI Engineer session on Small Bets, Big Impact Building GenBI at a Fortune 100, presented by Asaf Bord, Northwestern Mutual. It adds practical context for how teams are building and operating AI systems in production.
The Unreasonable Effectiveness of Prompt Learning — Aparna Dhinakaran, Arize
AI Engineer session on The Unreasonable Effectiveness of Prompt Learning, presented by Aparna Dhinakaran, Arize. It adds practical context for how teams are building and operating AI systems in production.
What We Learned Deploying AI within Bloomberg’s Engineering Organization — Lei Zhang, Bloomberg
AI Engineer session on What We Learned Deploying AI within Bloomberg’s Engineering Organization, presented by Lei Zhang, Bloomberg. It adds practical context for how teams are building and operating AI systems in production.
Nano Banana Pro: But Did You Catch These 10 Details?
This AI Explained video reviews a major AI development through the lens of benchmarks and evaluation evidence. It is useful context for AI engineering, evaluation, governance, and operational risk.
Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that
This AI Explained video reviews a major AI development through the lens of agentic workflows and tool-use risk. It is useful context for AI engineering, evaluation, governance, and operational risk.
Understanding prompt injections: a frontier security challenge
An accessible explanation of prompt injection risk in real AI products, including how third-party content can redirect or manipulate agent behavior.
Building an Agentic Platform — Ben Kus, CTO Box
AI Engineer session on Building an Agentic Platform, presented by Ben Kus, CTO Box. It adds practical context for how teams are building and operating AI systems in production.
[Full Workshop] Building Conversational AI Agents - Thor Schaeff, ElevenLabs
AI Engineer session on [Full Workshop] Building Conversational AI Agents - Thor Schaeff, ElevenLabs. It adds practical context for how teams are building and operating AI systems in production.
[Full Workshop] Building Metrics that actually work — David Karam, Pi Labs (fmr Google Search)
AI Engineer session on [Full Workshop] Building Metrics that actually work, presented by David Karam, Pi Labs (fmr Google Search). It adds practical context for how teams are building and operating AI systems in production.
Building a Smarter AI Agent with Neural RAG - Will Bryk, Exa.ai
AI Engineer session on Building a Smarter AI Agent with Neural RAG - Will Bryk, Exa.ai. It adds practical context for how teams are building and operating AI systems in production.
Building Agents at Cloud Scale — Antje Barth, AWS
AI Engineer session on Building Agents at Cloud Scale, presented by Antje Barth, AWS. It adds practical context for how teams are building and operating AI systems in production.
Building AI Products That Actually Work — Ben Hylak (Raindrop), Sid Bendre (Oleve)
AI Engineer session on Building AI Products That Actually Work, presented by Ben Hylak (Raindrop), Sid Bendre (Oleve). It adds practical context for how teams are building and operating AI systems in production.
Building Alice’s Brain: an AI Sales Rep that Learns Like a Human - Sherwood & Satwik, 11x
AI Engineer session on Building Alice’s Brain: an AI Sales Rep that Learns Like a Human - Sherwood & Satwik, 11x. It adds practical context for how teams are building and operating AI systems in production.
Building Applications with AI Agents — Michael Albada, Microsoft
AI Engineer session on Building Applications with AI Agents, presented by Michael Albada, Microsoft. It adds practical context for how teams are building and operating AI systems in production.
Building the platform for agent coordination — Tom Moor, Linear
AI Engineer session on Building the platform for agent coordination, presented by Tom Moor, Linear. It adds practical context for how teams are building and operating AI systems in production.
Evals Are Not Unit Tests — Ido Pesok, Vercel v0
AI Engineer session on Evals Are Not Unit Tests, presented by Ido Pesok, Vercel v0. It adds practical context for how teams are building and operating AI systems in production.
Everything is ugly, so go build something that isn't — Raiza Martin, Huxe (ex NotebookLM)
AI Engineer session on Everything is ugly, so go build something that isn't, presented by Raiza Martin, Huxe (ex NotebookLM). It adds practical context for how teams are building and operating AI systems in production.
How BlackRock Builds Custom Knowledge Apps at Scale — Vaibhav Page & Infant Vasanth, BlackRock
AI Engineer session on How BlackRock Builds Custom Knowledge Apps at Scale, presented by Vaibhav Page & Infant Vasanth, BlackRock. It adds practical context for how teams are building and operating AI systems in production.
Make your LLM app a Domain Expert: How to Build an Expert System — Christopher Lovejoy, Anterior
AI Engineer session on Make your LLM app a Domain Expert: How to Build an Expert System, presented by Christopher Lovejoy, Anterior. It adds practical context for how teams are building and operating AI systems in production.
On Engineering AI Systems that Endure The Bitter Lesson - Omar Khattab, DSPy & Databricks
AI Engineer session on On Engineering AI Systems that Endure The Bitter Lesson - Omar Khattab, DSPy & Databricks. It adds practical context for how teams are building and operating AI systems in production.
Practical tactics to build reliable AI apps — Dmitry Kuchin, Multinear
AI Engineer session on Practical tactics to build reliable AI apps, presented by Dmitry Kuchin, Multinear. It adds practical context for how teams are building and operating AI systems in production.
The 2025 AI Engineering Report — Barr Yaron, Amplify
AI Engineer session on The 2025 AI Engineering Report, presented by Barr Yaron, Amplify. It adds practical context for how teams are building and operating AI systems in production.
"Data readiness" is a Myth: Reliable AI with an Agentic Semantic Layer — Anushrut Gupta, PromptQL
AI Engineer session on "Data readiness" is a Myth: Reliable AI with an Agentic Semantic Layer, presented by Anushrut Gupta, PromptQL. It adds practical context for how teams are building and operating AI systems in production.
3 ingredients for building reliable enterprise agents - Harrison Chase, LangChain/LangGraph
AI Engineer session on 3 ingredients for building reliable enterprise agents - Harrison Chase, LangChain/LangGraph. It adds practical context for how teams are building and operating AI systems in production.
Agents, Access, and the Future of Machine Identity — Nick Nisi (WorkOS) + Lizzie Siegle (Cloudflare)
AI Engineer session on Agents, Access, and the Future of Machine Identity, presented by Nick Nisi (WorkOS) + Lizzie Siegle (Cloudflare). It adds practical context for how teams are building and operating AI systems in production.
AI Engineering with the Google Gemini 2.5 Model Family - Philipp Schmid, Google DeepMind
AI Engineer session on AI Engineering with the Google Gemini 2.5 Model Family - Philipp Schmid, Google DeepMind. It adds practical context for how teams are building and operating AI systems in production.
Build Dynamic Products, and Stop the AI Sideshow — Eliza Cabrera (Workday) + Jeremy Silva (Freeplay)
AI Engineer session on Build Dynamic Products, and Stop the AI Sideshow, presented by Eliza Cabrera (Workday) + Jeremy Silva (Freeplay). It adds practical context for how teams are building and operating AI systems in production.
Building a 10 person unicorn - Max Brodeur-Urbas, Gumloop
AI Engineer session on Building a 10 person unicorn - Max Brodeur-Urbas, Gumloop. It adds practical context for how teams are building and operating AI systems in production.
Building agent fleet architectures your CISO doesn't hate — Lou Bichard, Gitpod
AI Engineer session on Building agent fleet architectures your CISO doesn't hate, presented by Lou Bichard, Gitpod. It adds practical context for how teams are building and operating AI systems in production.
Building Agentic Applications w/ Heroku Managed Inference and Agents — Julián Duque & Anush Dsouza
AI Engineer session on Building Agentic Applications w/ Heroku Managed Inference and Agents, presented by Julián Duque & Anush Dsouza. It adds practical context for how teams are building and operating AI systems in production.
Building Agents (the hard parts!) - Rita Kozlov, Cloudflare
AI Engineer session on Building Agents (the hard parts!) - Rita Kozlov, Cloudflare. It adds practical context for how teams are building and operating AI systems in production.
Building Code First AI Agents with Azure AI Agent Service — Cedric Vidal, Microsoft
AI Engineer session on Building Code First AI Agents with Azure AI Agent Service, presented by Cedric Vidal, Microsoft. It adds practical context for how teams are building and operating AI systems in production.
Building Effective Voice Agents — Toki Sherbakov + Anoop Kotha, OpenAI
AI Engineer session on Building Effective Voice Agents, presented by Toki Sherbakov + Anoop Kotha, OpenAI. It adds practical context for how teams are building and operating AI systems in production.
Building Multimodal AI Agents From Scratch — Apoorva Joshi, MongoDB
AI Engineer session on Building Multimodal AI Agents From Scratch, presented by Apoorva Joshi, MongoDB. It adds practical context for how teams are building and operating AI systems in production.
Building voice agents with OpenAI — Dominik Kundel, OpenAI
AI Engineer session on Building voice agents with OpenAI, presented by Dominik Kundel, OpenAI. It adds practical context for how teams are building and operating AI systems in production.
Data is Your Differentiator: Building Secure and Tailored AI Systems — Mani Khanuja, AWS
AI Engineer session on Data is Your Differentiator: Building Secure and Tailored AI Systems, presented by Mani Khanuja, AWS. It adds practical context for how teams are building and operating AI systems in production.
Does AI Actually Boost Developer Productivity? (100k Devs Study) - Yegor Denisov-Blanch, Stanford
AI Engineer session on Does AI Actually Boost Developer Productivity? (100k Devs Study) - Yegor Denisov-Blanch, Stanford. It adds practical context for how teams are building and operating AI systems in production.
Engineering Better Evals: Scalable LLM Evaluation Pipelines That Work — Dat Ngo, Aman Khan, Arize
AI Engineer session on Engineering Better Evals: Scalable LLM Evaluation Pipelines That Work, presented by Dat Ngo, Aman Khan, Arize. It adds practical context for how teams are building and operating AI systems in production.
Forget RAG Pipelines — Build Production Ready Agents in 15 Mins: Nina Lopatina, Rajiv Shah, Contextual
AI Engineer session on Forget RAG Pipelines: Build Production Ready Agents in 15 Mins, presented by Nina Lopatina and Rajiv Shah, Contextual. It adds practical context for how teams are building and operating AI systems in production.
From Hype to Habit: How We’re Building an AI-First SaaS Company — While Still Shipping the Roadmap
AI Engineer session on From Hype to Habit: How We’re Building an AI-First SaaS Company While Still Shipping the Roadmap. It adds practical context for how teams are building and operating AI systems in production.
Fun stories from building OpenRouter and where all this is going - Alex Atallah, OpenRouter
AI Engineer session on Fun stories from building OpenRouter and where all this is going - Alex Atallah, OpenRouter. It adds practical context for how teams are building and operating AI systems in production.
How to build Enterprise Aware Agents - Chau Tran, Glean
AI Engineer session on How to build Enterprise Aware Agents - Chau Tran, Glean. It adds practical context for how teams are building and operating AI systems in production.
How to Build Planning Agents without losing control - Yogendra Miraje, Factset
AI Engineer session on How to Build Planning Agents without losing control - Yogendra Miraje, Factset. It adds practical context for how teams are building and operating AI systems in production.
How to build world-class AI products — Sarah Sachs (AI lead @ Notion) & Carlos Esteban (Braintrust)
AI Engineer session on How to build world-class AI products, presented by Sarah Sachs (AI lead @ Notion) & Carlos Esteban (Braintrust). It adds practical context for how teams are building and operating AI systems in production.
How to Train Your Agent: Building Reliable Agents with RL — Kyle Corbitt, OpenPipe
AI Engineer session on How to Train Your Agent: Building Reliable Agents with RL, presented by Kyle Corbitt, OpenPipe. It adds practical context for how teams are building and operating AI systems in production.
Mastering Engineering Flow with Windsurf - Eashan Sinha, Windsurf
AI Engineer session on Mastering Engineering Flow with Windsurf - Eashan Sinha, Windsurf. It adds practical context for how teams are building and operating AI systems in production.
Prompt Engineering and AI Red Teaming — Sander Schulhoff, HackAPrompt/LearnPrompting
AI Engineer session on Prompt Engineering and AI Red Teaming, presented by Sander Schulhoff, HackAPrompt/LearnPrompting. It adds practical context for how teams are building and operating AI systems in production.
Prompt Engineering is Dead — Nir Gazit, Traceloop
AI Engineer session on Prompt Engineering is Dead, presented by Nir Gazit, Traceloop. It adds practical context for how teams are building and operating AI systems in production.
Rethinking Team Building: how a 30-person Startup serves 50 Million Users — Grant Lee, Gamma
AI Engineer session on Rethinking Team Building: how a 30-person Startup serves 50 Million Users, presented by Grant Lee, Gamma. It adds practical context for how teams are building and operating AI systems in production.
Revenue Engineering: How to Price (and Reprice) Your AI Product — Kshitij Grover, Orb
AI Engineer session on Revenue Engineering: How to Price (and Reprice) Your AI Product, presented by Kshitij Grover, Orb. It adds practical context for how teams are building and operating AI systems in production.
Ship it! Building Production Ready Agents — Mike Chambers, AWS
AI Engineer session on Ship it! Building Production Ready Agents, presented by Mike Chambers, AWS. It adds practical context for how teams are building and operating AI systems in production.
Survive the AI Knife Fight: Building Products That Win — Brian Balfour, Reforge
AI Engineer session on Survive the AI Knife Fight: Building Products That Win, presented by Brian Balfour, Reforge. It adds practical context for how teams are building and operating AI systems in production.
The Build-Operate Divide: Bridging Product Vision and AI Operational Reality
AI Engineer session on The Build-Operate Divide: Bridging Product Vision and AI Operational Reality. It adds practical context for how teams are building and operating AI systems in production.
Using OSS models to build AI apps with millions of users — Hassan El Mghari
AI Engineer session on Using OSS models to build AI apps with millions of users, presented by Hassan El Mghari. It adds practical context for how teams are building and operating AI systems in production.
Grok 4 - 10 New Things to Know
This AI Explained video reviews a major AI development through the lens of benchmarks and evaluation evidence. It is useful context for AI engineering, evaluation, governance, and operational risk.
Arrakis: How To Build An AI Sandbox From Scratch - Abhishek Bhardwaj, OpenAI
AI Engineer session on Arrakis: How To Build An AI Sandbox From Scratch - Abhishek Bhardwaj, OpenAI. It adds practical context for how teams are building and operating AI systems in production.
Break It 'Til You Make It: Building the Self-Improving Stack for AI Agents - Aparna Dhinakaran
AI Engineer session on Break It 'Til You Make It: Building the Self-Improving Stack for AI Agents - Aparna Dhinakaran. It adds practical context for how teams are building and operating AI systems in production.
Building Agents with Amazon Nova Act and MCP - Du'An Lightfoot, Amazon (Full Workshop)
AI Engineer session on Building Agents with Amazon Nova Act and MCP - Du'An Lightfoot, Amazon (Full Workshop). It adds practical context for how teams are building and operating AI systems in production.
Building AI Agents that actually automate Knowledge Work - Jerry Liu, LlamaIndex
AI Engineer session on Building AI Agents that actually automate Knowledge Work - Jerry Liu, LlamaIndex. It adds practical context for how teams are building and operating AI systems in production.
Building Protected MCP Servers — Den Delimarsky and Julia Kasper, MCP Steering Committee & Microsoft
AI Engineer session on Building Protected MCP Servers, presented by Den Delimarsky and Julia Kasper, MCP Steering Committee & Microsoft. It adds practical context for how teams are building and operating AI systems in production.
Building Reliable Support Agents Using the Effect Typescript Library - Michael Fester
AI Engineer session on Building Reliable Support Agents Using the Effect Typescript Library - Michael Fester. It adds practical context for how teams are building and operating AI systems in production.
Buy Now, Maybe Pay Later: Dealing with Prompt-Tax While Staying at the Frontier - Andrew Thompson
AI Engineer session on Buy Now, Maybe Pay Later: Dealing with Prompt-Tax While Staying at the Frontier - Andrew Thompson. It adds practical context for how teams are building and operating AI systems in production.
From PM at Stripe to Building an AI startup, a recent founder's journey - Mounir Mouawad
AI Engineer session on From PM at Stripe to Building an AI startup, a recent founder's journey - Mounir Mouawad. It adds practical context for how teams are building and operating AI systems in production.
How to Build Trustworthy AI — Allie Howe
AI Engineer session on How to Build Trustworthy AI, presented by Allie Howe. It adds practical context for how teams are building and operating AI systems in production.
Real AI Agents Need Planning, Not Just Prompting - Yuval Belfer
AI Engineer session on Real AI Agents Need Planning, Not Just Prompting - Yuval Belfer. It adds practical context for how teams are building and operating AI systems in production.
Stop Ordering AI Takeout A Cookbook for Winning When You Build In House - Jan Siml
AI Engineer session on Stop Ordering AI Takeout A Cookbook for Winning When You Build In House - Jan Siml. It adds practical context for how teams are building and operating AI systems in production.
Supercharging developer workflow with Amazon Q Developer - Vikash Agrawal
AI Engineer session on Supercharging developer workflow with Amazon Q Developer - Vikash Agrawal. It adds practical context for how teams are building and operating AI systems in production.
Veo 3 for Developers — Paige Bailey, Google DeepMind
AI Engineer session on Veo 3 for Developers, presented by Paige Bailey, Google DeepMind. It adds practical context for how teams are building and operating AI systems in production.
When Will AI Models Blackmail You, and Why?
This AI Explained video reviews a major AI development through the lens of agentic workflows and tool-use risk. It is useful context for AI engineering, evaluation, governance, and operational risk.
How to Build Your Own AI Data Center in 2025 — Paul Gilbert, Arista Networks
AI Engineer session on How to Build Your Own AI Data Center in 2025, presented by Paul Gilbert, Arista Networks. It adds practical context for how teams are building and operating AI systems in production.
AI Improves at Self-improving
This AI Explained video reviews a major AI development through the lens of agentic workflows and tool-use risk. It is useful context for AI engineering, evaluation, governance, and operational risk.
[Full Workshop from Microsoft] Github Copilot - The World's Most Widely Adopted AI Developer Tool
AI Engineer session on [Full Workshop from Microsoft] Github Copilot - The World's Most Widely Adopted AI Developer Tool. It adds practical context for how teams are building and operating AI systems in production.
AI Engineering at Jane Street - John Crepezzi
AI Engineer session on AI Engineering at Jane Street - John Crepezzi. It adds practical context for how teams are building and operating AI systems in production.
AI Engineering Without Borders — swyx
AI Engineer session on AI Engineering Without Borders, presented by swyx. It adds practical context for how teams are building and operating AI systems in production.
AI Music Generation, From Prompt to Production: Phlo Young
AI Engineer session on AI Music Generation, From Prompt to Production: Phlo Young. It adds practical context for how teams are building and operating AI systems in production.
AI Platform Engineering: Patrick Debois
AI Engineer session on AI Platform Engineering: Patrick Debois. It adds practical context for how teams are building and operating AI systems in production.
Build an AI Research Agent: Apoorva Joshi
AI Engineer session on Build an AI Research Agent: Apoorva Joshi. It adds practical context for how teams are building and operating AI systems in production.
Build enterprise generative AI apps using Llama 3 at 1,000 tokens/s on the SambaNova AI platform
AI Engineer session on Build enterprise generative AI apps using Llama 3 at 1,000 tokens/s on the SambaNova AI platform. It adds practical context for how teams are building and operating AI systems in production.
Build, Evaluate and Deploy a RAG-Based Retail Copilot with Azure AI: Cedric Vidal and David Smith
AI Engineer session on Build, Evaluate and Deploy a RAG-Based Retail Copilot with Azure AI: Cedric Vidal and David Smith. It adds practical context for how teams are building and operating AI systems in production.
Building Agents with Model Context Protocol - Full Workshop with Mahesh Murag of Anthropic
AI Engineer session on Building Agents with Model Context Protocol - Full Workshop with Mahesh Murag of Anthropic. It adds practical context for how teams are building and operating AI systems in production.
Building AI Agents with Real ROI in the Enterprise SDLC: Bruno (Booking.com) & Beyang (Sourcegraph)
AI Engineer session on Building AI Agents with Real ROI in the Enterprise SDLC: Bruno (Booking.com) & Beyang (Sourcegraph). It adds practical context for how teams are building and operating AI systems in production.
Building an AI assistant that makes phone calls [Convex Workshop]
AI Engineer session on Building an AI assistant that makes phone calls [Convex Workshop]. It adds practical context for how teams are building and operating AI systems in production.
Building and evaluating AI Agents — Sayash Kapoor, AI Snake Oil
AI Engineer session on Building and evaluating AI Agents, presented by Sayash Kapoor, AI Snake Oil. It adds practical context for how teams are building and operating AI systems in production.
Building and Scaling an AI Agent Swarm of low latency real time voice bots: Damien Murphy
AI Engineer session on Building and Scaling an AI Agent Swarm of low latency real time voice bots: Damien Murphy. It adds practical context for how teams are building and operating AI systems in production.
Building efficient hybrid context query for LLM grounding: Simrat Hanspal
AI Engineer session on Building efficient hybrid context query for LLM grounding: Simrat Hanspal. It adds practical context for how teams are building and operating AI systems in production.
Building LinkedIn's GenAI Platform — Xiaofeng Wang
AI Engineer session on Building LinkedIn's GenAI Platform, presented by Xiaofeng Wang. It adds practical context for how teams are building and operating AI systems in production.
Building Multi agent Systems with Finite State Machines
AI Engineer session on Building Multi agent Systems with Finite State Machines. It adds practical context for how teams are building and operating AI systems in production.
Building Reliable Agentic Systems: Eno Reyes
AI Engineer session on Building Reliable Agentic Systems: Eno Reyes. It adds practical context for how teams are building and operating AI systems in production.
Building security around ML: Dr. Andrew Davis
AI Engineer session on Building security around ML: Dr. Andrew Davis. It adds practical context for how teams are building and operating AI systems in production.
Building State of the Art Open Weights Tool Use: The Command R Family: Sandra Kublik
AI Engineer session on Building State of the Art Open Weights Tool Use: The Command R Family: Sandra Kublik. It adds practical context for how teams are building and operating AI systems in production.
Building with Anthropic Claude: Prompt Workshop with Zack Witten
AI Engineer session on Building with Anthropic Claude: Prompt Workshop with Zack Witten. It adds practical context for how teams are building and operating AI systems in production.
Cohere: Building enterprise LLM agents that work (Shaan Desai)
AI Engineer session on Cohere: Building enterprise LLM agents that work (Shaan Desai). It adds practical context for how teams are building and operating AI systems in production.
Don't just slap on a chatbot: building AI that works before you ask
AI Engineer session on Don't just slap on a chatbot: building AI that works before you ask. It adds practical context for how teams are building and operating AI systems in production.
From Software Developer to AI Engineer: Antje Barth
AI Engineer session on From Software Developer to AI Engineer: Antje Barth. It adds practical context for how teams are building and operating AI systems in production.
GitHub Copilot: The World's Most Widely Adopted AI Developer Tool
AI Engineer session on GitHub Copilot: The World's Most Widely Adopted AI Developer Tool. It adds practical context for how teams are building and operating AI systems in production.
Hiring & Building an AI Engineering Team: Dr. Bryan Bischof
AI Engineer session on Hiring & Building an AI Engineering Team: Dr. Bryan Bischof. It adds practical context for how teams are building and operating AI systems in production.
How to build the world's fastest voice bot: Kwindla Hultman Kramer
AI Engineer session on How to build the world's fastest voice bot: Kwindla Hultman Kramer. It adds practical context for how teams are building and operating AI systems in production.
How We Build Effective Agents: Barry Zhang, Anthropic
AI Engineer session on How We Build Effective Agents: Barry Zhang, Anthropic. It adds practical context for how teams are building and operating AI systems in production.
How Zapier Builds AI Products and Features with the Help of Braintrust: Ankur Goyal & Olmo Maldonado
AI Engineer session on How Zapier Builds AI Products and Features with the Help of Braintrust: Ankur Goyal & Olmo Maldonado. It adds practical context for how teams are building and operating AI systems in production.
Insights on Building AI Teams — Heath Black, SignalFire
AI Engineer session on Insights on Building AI Teams, presented by Heath Black, SignalFire. It adds practical context for how teams are building and operating AI systems in production.
Iterating on LLM apps at scale Learnings from Discord: Ian Webster
AI Engineer session on Iterating on LLM apps at scale Learnings from Discord: Ian Webster. It adds practical context for how teams are building and operating AI systems in production.
Keynote: The AI developer experience doesn't have to suck — why and how we built Modal
AI Engineer keynote on why the AI developer experience doesn't have to suck, and why and how Modal was built. It adds practical context for how teams are building and operating AI systems in production.
Knowledge Graphs & GraphRAG: Techniques for Building Effective GenAI Applications: Zach Blumenthal
AI Engineer session on Knowledge Graphs & GraphRAG: Techniques for Building Effective GenAI Applications: Zach Blumenthal. It adds practical context for how teams are building and operating AI systems in production.
Lessons From A Year Building With LLMs
AI Engineer session on Lessons From A Year Building With LLMs. It adds practical context for how teams are building and operating AI systems in production.
Lessons from building GenAI based applications — Juan Peredo
AI Engineer session on Lessons from building GenAI based applications, presented by Juan Peredo. It adds practical context for how teams are building and operating AI systems in production.
Lessons from the Trenches: Building LLM Evals That Work IRL: Aparna Dhinkaran
AI Engineer session on Lessons from the Trenches: Building LLM Evals That Work IRL: Aparna Dhinkaran. It adds practical context for how teams are building and operating AI systems in production.
Lets Build An Agent from Scratch
AI Engineer session on Lets Build An Agent from Scratch. It adds practical context for how teams are building and operating AI systems in production.
Open Challenges for AI Engineering: Simon Willison
AI Engineer session on Open Challenges for AI Engineering: Simon Willison. It adds practical context for how teams are building and operating AI systems in production.
OpenAI for VP's of AI + Advice for Building Agents
AI Engineer session on OpenAI for VP's of AI + Advice for Building Agents. It adds practical context for how teams are building and operating AI systems in production.
Patrick Dougherty: How to Build AI Agents that Actually Work
AI Engineer session on Patrick Dougherty: How to Build AI Agents that Actually Work. It adds practical context for how teams are building and operating AI systems in production.
Privacy First Enterprise AI: Building AI Agents that Never Leave Your Security Boundary
AI Engineer session on Privacy First Enterprise AI: Building AI Agents that Never Leave Your Security Boundary. It adds practical context for how teams are building and operating AI systems in production.
Prompt Engineering Tactics: Dan Cleary
AI Engineer session on Prompt Engineering Tactics: Dan Cleary. It adds practical context for how teams are building and operating AI systems in production.
RAG at scale: production ready GenAI apps with Azure AI Search
AI Engineer session on RAG at scale: production ready GenAI apps with Azure AI Search. It adds practical context for how teams are building and operating AI systems in production.
Scaling Agents for Gen AI Products - Anju Kambadur, Bloomberg Head of AI Engineering
AI Engineer session on Scaling Agents for Gen AI Products - Anju Kambadur, Bloomberg Head of AI Engineering. It adds practical context for how teams are building and operating AI systems in production.
Stop Guessing: Build Robust AI with Layered CoT
AI Engineer session on Stop Guessing: Build Robust AI with Layered CoT. It adds practical context for how teams are building and operating AI systems in production.
The Hidden Costs of Building Your Own RAG Stack — Ofer Vectara
AI Engineer session on The Hidden Costs of Building Your Own RAG Stack, presented by Ofer of Vectara. It adds practical context for how teams are building and operating AI systems in production.
The LLM Triangle: Engineering Principles for Robust AI Applications - Almog Baku
AI Engineer session on The LLM Triangle: Engineering Principles for Robust AI Applications, presented by Almog Baku. It adds practical context for how teams are building and operating AI systems in production.
The Model Isn’t Wrong — You’re Just Bad at Prompting
AI Engineer session on The Model Isn’t Wrong: You’re Just Bad at Prompting. It adds practical context for how teams are building and operating AI systems in production.
Unlocking Developer Productivity across CPU and GPU with MAX: Chris Lattner
AI Engineer session on Unlocking Developer Productivity across CPU and GPU with MAX: Chris Lattner. It adds practical context for how teams are building and operating AI systems in production.
Using agents to build an agent company: Joao Moura
AI Engineer session on Using agents to build an agent company: Joao Moura. It adds practical context for how teams are building and operating AI systems in production.
Vercel AI SDK Masterclass: From Fundamentals to Deep Research
AI Engineer session on Vercel AI SDK Masterclass: From Fundamentals to Deep Research. It adds practical context for how teams are building and operating AI systems in production.
Voice Agent Engineering — Nik Caryotakis, SuperDial
AI Engineer session on Voice Agent Engineering, presented by Nik Caryotakis, SuperDial. It adds practical context for how teams are building and operating AI systems in production.
Why Agent Engineering — swyx
AI Engineer session on Why Agent Engineering, presented by swyx. It adds practical context for how teams are building and operating AI systems in production.
AI CEO: ‘Stock Crash Could Stop AI Progress’, Llama 4 Anti-climax + ‘Superintelligence in 2027’ ...
This AI Explained video reviews a major AI development through the lens of benchmarks and evaluation evidence. It is useful context for AI engineering, evaluation, governance, and operational risk.
Gemini 2.5 Pro - It’s a Darn Smart Chatbot … (New Simple High Score)
This AI Explained video reviews a major AI development through the lens of benchmarks and evaluation evidence. It is useful context for AI engineering, evaluation, governance, and operational risk.
OpenAI’s New ImageGen is Unexpectedly Epic … (ft. Reve, Imagen 3, Midjourney etc)
This AI Explained video reviews a major AI development through the lens of multimodal generation and provenance. It is useful context for AI engineering, evaluation, governance, and operational risk.
Claude 3.7 is More Significant than its Name Implies (ft DeepSeek R2 + GPT 4.5 coming soon)
This AI Explained video reviews a major AI development through the lens of governance and responsible deployment. It is useful context for AI engineering, evaluation, governance, and operational risk.
Nothing Much Happens in AI, Then Everything Does All At Once
This AI Explained video reviews a major AI development through the lens of governance and responsible deployment. It is useful context for AI engineering, evaluation, governance, and operational risk.
AI - 2024AD: 212-page Report (from this morning) Fully Read w/ Highlights
This AI Explained video reviews a major AI development through the lens of governance and responsible deployment. It is useful context for AI engineering, evaluation, governance, and operational risk.
o1 - What is Going On? Why o1 is a 3rd Paradigm of Model + 10 Things You Might Not Know
This AI Explained video reviews a major AI development through the lens of benchmarks and evaluation evidence. It is useful context for AI engineering, evaluation, governance, and operational risk.
Grok-2 Actually Out, But What If It Were 10,000x the Size?
This AI Explained video reviews a major AI development through the lens of benchmarks and evaluation evidence. It is useful context for AI engineering, evaluation, governance, and operational risk.
How Far Can We Scale AI? Gen 3, Claude 3.5 Sonnet and AI Hype
This AI Explained video reviews a major AI development through the lens of AI safety and model behavior. It is useful context for AI engineering, evaluation, governance, and operational risk.
New OpenAI Model 'Imminent' and AI Stakes Get Raised (plus Med Gemini, GPT 2 Chatbot and Scale AI)
This AI Explained video reviews a major AI development through the lens of agentic workflows and tool-use risk. It is useful context for AI engineering, evaluation, governance, and operational risk.
[Workshop] AI Engineering 101
AI Engineer session on [Workshop] AI Engineering 101. It adds practical context for how teams are building and operating AI systems in production.
[Workshop] AI Engineering 201: Inference
AI Engineer session on [Workshop] AI Engineering 201: Inference. It adds practical context for how teams are building and operating AI systems in production.
AI Engineering 201: The Rest of the Owl
AI Engineer session on AI Engineering 201: The Rest of the Owl. It adds practical context for how teams are building and operating AI systems in production.
Building AI For All: Amjad Masad & Michele Catasta
AI Engineer session on Building AI For All: Amjad Masad & Michele Catasta. It adds practical context for how teams are building and operating AI systems in production.
Building Blocks for LLM Systems & Products: Eugene Yan
AI Engineer session on Building Blocks for LLM Systems & Products: Eugene Yan. It adds practical context for how teams are building and operating AI systems in production.
Building Context-Aware Reasoning Applications with LangChain and LangSmith: Harrison Chase
AI Engineer session on Building Context-Aware Reasoning Applications with LangChain and LangSmith: Harrison Chase. It adds practical context for how teams are building and operating AI systems in production.
Building Production-Ready RAG Applications: Jerry Liu
AI Engineer session on Building Production-Ready RAG Applications: Jerry Liu. It adds practical context for how teams are building and operating AI systems in production.
Building Reactive AI Apps: Matt Welsh
AI Engineer session on Building Reactive AI Apps: Matt Welsh. It adds practical context for how teams are building and operating AI systems in production.
GPT Web App Generator - 10,000 apps created in a month: Matija Sosic
AI Engineer session on GPT Web App Generator - 10,000 apps created in a month: Matija Sosic. It adds practical context for how teams are building and operating AI systems in production.
Open Questions for AI Engineering: Simon Willison
AI Engineer session on Open Questions for AI Engineering: Simon Willison. It adds practical context for how teams are building and operating AI systems in production.
Principles for Prompt Engineering - Karina Nguyen (Claude Instant @ Anthropic)
AI Engineer session on Principles for Prompt Engineering - Karina Nguyen (Claude Instant @ Anthropic). It adds practical context for how teams are building and operating AI systems in production.
Storyteller: Building Multi-modal Apps with TS & ModelFusion - Lars Grammel, PhD
AI Engineer session on Storyteller: Building Multi-modal Apps with TS & ModelFusion - Lars Grammel, PhD. It adds practical context for how teams are building and operating AI systems in production.
Using AI to Build an Infinite Game: Jeff Schomay
AI Engineer session on Using AI to Build an Infinite Game: Jeff Schomay. It adds practical context for how teams are building and operating AI systems in production.
Gemini Ultra - Full Review
This AI Explained video reviews a major AI development through the lens of scaling and compute economics. It is useful context for AI engineering, evaluation, governance, and operational risk.
AI On An Exponential? Data, Mamba, and More
This AI Explained video reviews a major AI development through the lens of scaling and compute economics. It is useful context for AI engineering, evaluation, governance, and operational risk.
Phi-2, Imagen-2, Optimus-Gen-2: Small New Models to Change the World?
This AI Explained video reviews a major AI development through the lens of benchmarks and evaluation evidence. It is useful context for AI engineering, evaluation, governance, and operational risk.
OpenAI Insights and Training Data Shenanigans - 7 'Complicated' Developments + Guest Star
This AI Explained video reviews a major AI development through the lens of model capability and AI systems in practice. It is useful context for AI engineering, evaluation, governance, and operational risk.
AI Declarations and AGI Timelines – Looking More Optimistic?
This AI Explained video reviews a major AI development through the lens of governance and responsible deployment. It is useful context for AI engineering, evaluation, governance, and operational risk.
RT-X and the Dawn of Large Multimodal Models: Google Breakthrough and 160-page Report Highlights
This AI Explained video reviews a major AI development through the lens of multimodal generation and provenance. It is useful context for AI engineering, evaluation, governance, and operational risk.
ChatGPT Fails Basic Logic but Now Has Vision, Wins at Chess and Prompts a Masterpiece
This AI Explained video reviews a major AI development through the lens of governance and responsible deployment. It is useful context for AI engineering, evaluation, governance, and operational risk.
11 Major AI Developments: RT-2 to '100X GPT-4'
This AI Explained video reviews a major AI development through the lens of AI safety and model behavior. It is useful context for AI engineering, evaluation, governance, and operational risk.
ChatGPT's Achilles' Heel
This AI Explained video reviews a major AI development through the lens of scaling and compute economics. It is useful context for AI engineering, evaluation, governance, and operational risk.
'Show Your Working': ChatGPT Performance Doubled w/ Process Rewards (+Synthetic Data Event Horizon)
This AI Explained video reviews a major AI development through the lens of benchmarks and evaluation evidence. It is useful context for AI engineering, evaluation, governance, and operational risk.
GPT 4 is Smarter than You Think: Introducing SmartGPT
This AI Explained video reviews a major AI development through the lens of agentic workflows and tool-use risk. It is useful context for AI engineering, evaluation, governance, and operational risk.
Can GPT 4 Prompt Itself? MemoryGPT, AutoGPT, Jarvis, Claude-Next [10x GPT 4!] and more...
This AI Explained video reviews a major AI development through the lens of agentic workflows and tool-use risk. It is useful context for AI engineering, evaluation, governance, and operational risk.
Google Bard - The Full Review. Bard vs Bing [LaMDA vs GPT 4]
This AI Explained video reviews a major AI development through the lens of multimodal generation and provenance. It is useful context for AI engineering, evaluation, governance, and operational risk.
8 New Ways to Use Bing's Upgraded 8 [now 20] Message Limit (ft. pdfs, quizzes, tables, scenarios...)
This AI Explained video reviews a major AI development through the lens of model capability and AI systems in practice. It is useful context for AI engineering, evaluation, governance, and operational risk.
9 of the Best Bing (GPT 4) Prompts
This AI Explained video reviews a major AI development through the lens of model capability and AI systems in practice. It is useful context for AI engineering, evaluation, governance, and operational risk.