AI Engineering // Security // Governance

AI systems in practice

Engineering notes and curated references on AI systems, prompt injection, agent behavior, responsible AI, governance, and compliance.

Calendar

Upcoming events

Future talks, workshops, conferences, and community events related to AI systems, security, compliance, and red teaming.

AI.Engineer June 29, 2026 - July 2, 2026 event upcoming

AI Engineer World's Fair - JUNE 29 - JULY 2, 2026 • SAN FRANCISCO, CA

AI Engineer runs the most viewed technical conferences in AI for engineers, with over 10M views of our talks online. We are back in SF for the 4th year in a row! This is the one place you can meet with every major frontier lab, leading AI clouds, and AI native/transformed companies, from disruptive AI startups to Fortune 500 AI leaders, and every notable building block in the LLM OS ecosystem.

Featured Reading

Current material worth reading

Curated research, system cards, and technical write-ups that are useful for understanding how AI systems are being evaluated, attacked, governed, and deployed in practice.

Anthropic Frontier Red Team April 7, 2026 news

Assessing Claude Mythos Preview’s cybersecurity capabilities

Claude Mythos Preview is a new general-purpose language model that is strikingly capable at computer security tasks. This post provides technical details for researchers and practitioners who want to understand exactly how we have been testing this model, and what we have found over the past month. We hope this will sh

Latest Notes

New additions to the research library

Recent notes and references across prompt injection, agent security, evaluations, responsible AI, and adjacent AI work.

AI Engineer YouTube May 30, 2026 video

How I deleted 95% of my agent skills and got better results — Nick Nisi, WorkOS

Claude would fake running tests by touching the expected output file. Nick Nisi, DX engineer at WorkOS, fixed it by SHA-256 hashing the actual test output and verifying it cryptographically. His principle: make it easier to do the real work than to lie about it, and enforce that through code and state machines, not promp
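As a rough illustration of the idea in this summary, the sketch below runs a test command itself, hashes the bytes it actually printed with SHA-256, and accepts a "tests passed" claim only when the claimed digest matches. It is not the implementation from the talk: `run_and_hash`, `claim_is_valid`, and `TEST_CMD` are hypothetical names, and the one-line test command is a stand-in for a real test runner.

```python
import hashlib
import hmac
import subprocess
import sys


def run_and_hash(cmd: list[str]) -> tuple[int, str]:
    """Run the test command ourselves and hash what it actually printed."""
    proc = subprocess.run(cmd, capture_output=True, check=False)
    digest = hashlib.sha256(proc.stdout + proc.stderr).hexdigest()
    return proc.returncode, digest


def claim_is_valid(claimed_digest: str, cmd: list[str]) -> bool:
    """Accept a "tests passed" claim only if it matches a real, passing run."""
    returncode, actual = run_and_hash(cmd)
    # compare_digest avoids leaking how many leading characters matched.
    return returncode == 0 and hmac.compare_digest(claimed_digest, actual)


# Hypothetical stand-in for a real test runner invocation.
TEST_CMD = [sys.executable, "-c", "print('3 passed')"]
```

An agent that merely touches an output file has nothing to hash, so the digest of an empty file fails the check while the digest of a real run passes. This only works when the test output is deterministic; timestamps or durations in the output would need to be stripped before hashing.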

AI Engineer YouTube May 30, 2026 video

How We Built Zeta2: Training an Edit Prediction Model in Production — Ben Kunkle, Zed

To validate settled data, Zed ran 10 frontier model predictions per example and measured Levenshtein distance to the final state. For 100,000 training examples that is a million frontier model requests, which is prohibitively expensive. The fix: Zeta2's student model now approaches teacher quality, so they run it 50 t
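The measurement described here can be sketched as follows, assuming predictions and the final state are plain strings. This is a minimal illustration, not Zed's pipeline: `levenshtein` is the textbook dynamic-programming edit distance, and `mean_distance` is a hypothetical helper that averages it over a set of predictions for one example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def mean_distance(predictions: list[str], final_state: str) -> float:
    """Average edit distance from each prediction to the settled final state."""
    return sum(levenshtein(p, final_state) for p in predictions) / len(predictions)
```

With 10 predictions per example, `mean_distance` would be called once per example, which is where the 10 x 100,000 = 1,000,000 model requests in the summary come from.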

Microsoft Security Blog May 30, 2026 news

Malicious npm packages abuse dependency confusion to profile developer environments

A dependency confusion campaign leveraged 33 malicious npm packages to collect reconnaissance data from developer and build environments. This report details the attack chain, observed tradecraft, and detection opportunities to help organizations identify and disrupt related activity.

AI Engineer YouTube May 30, 2026 video

Why (Senior) Engineers Struggle to Build AI Agents — Philipp Schmid, Google DeepMind

A `deleteItem` endpoint is obvious to the developer who built it. An agent only sees the function schema and docstring. Philipp Schmid from Google DeepMind argues this is why senior engineers struggle most: they carry years of implicit context that agents do not, and design tools assuming it. He names four other shifts
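To make the point about schemas and docstrings concrete, here is a small sketch of what an agent might actually see for a delete tool. Everything in it is hypothetical and not taken from the talk: the `delete_item` semantics (trash versus permanent deletion) are invented for illustration, and `tool_view` is a made-up helper that collects the name, parameter types, and docstring.

```python
import inspect


def delete_item(item_id: str, permanent: bool = False) -> str:
    """Delete one item from the current user's list.

    By default the item moves to trash and can be restored.
    Set permanent=True only when the user explicitly asks for
    irreversible deletion.
    """
    action = "permanently deleted" if permanent else "moved to trash"
    return f"item {item_id} {action}"


def tool_view(fn) -> dict:
    """Return what an agent would see: name, parameters, and docstring."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "parameters": {
            name: getattr(p.annotation, "__name__", str(p.annotation))
            for name, p in sig.parameters.items()
        },
        "description": inspect.getdoc(fn),
    }
```

Printing `tool_view(delete_item)` shows that anything the developer knows but did not write into the signature or docstring is simply absent from the agent's view.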

Topic Coverage

Prompt engineering, AI compliance, agent security, and more

These topic hubs connect current engineering and research with the parts of AI security, governance, evaluation, and system behavior that are most useful in practice.

AI Red Teaming

Methods, case studies, and tooling for red teaming AI systems end to end.

Prompt Engineering

Prompt design patterns, instruction hierarchy, and defensive prompt construction.

Prompt Injection

Prompt injection attacks, mitigations, detection, and design patterns for safer AI applications.

Agent Security

Controls and attack paths for browsing, tool use, memory, identity, and action-taking agents.

Model Evaluation

Safety evaluations, system cards, preparedness, and security measurement for frontier models.

AI Compliance

Responsible AI, governance, standards, and regulatory reference material for teams mapping AI systems to policy and operational controls.

Adversarial ML

Adversarial machine learning attacks, taxonomies, and mitigations across the ML lifecycle.

AI Engineering

Application architecture, developer workflow, tooling, and production patterns for building AI systems.

Profile

Profile and contact

Focused on AI engineering, responsible AI, compliance, model behavior, and operational AI systems. Current work includes building AI-driven operational software for compliance and financial tracking.