Topic

Prompt Injection

Prompt injection attacks, mitigations, detection, and design patterns for safer AI applications.

prompt injectionindirect prompt injectionjailbreakagent hijackprompt abuse

Evergreen Overview

Prompt injection is the core attack pattern in modern AI applications. It happens when a model treats malicious or conflicting instructions from users, retrieved content, documents, tools, or pages as trusted guidance and changes its behavior in response.

What this page helps explain

Direct, indirect, and cross-context prompt injection
How documents, web content, and tool output become attack carriers
Why prompt injection is a workflow problem as much as a model problem

What secure teams focus on

Trust boundaries between instructions, content, tools, and actions
Approvals, isolation, and scoped permissions for agent behavior
Detection and monitoring patterns when prompt controls fail

Who this page is for

Agent builders and platform engineers
Readers studying retrieval or tool-enabled products
Leaders who need practical language for why this risk matters

References

Current notes, events, and source material

These items are included because they add useful evidence, framing, implementation detail, or upcoming context for teams working in this area.

DEF CON August 6, 2026 - August 9, 2026 event upcoming

DEF CON 34 / AI Village 2026

DEF CON 34 takes place in Las Vegas and is expected to include AI security activity through villages, workshops, contests, and community-led research tracks as schedules firm up.

View details Open event page

Google Cloud Security Blog July 13, 2026 news

Securing the AI supply chain on GKE: Introducing k8s-aibom for automated AI BOMs

We’re open-sourcing k8s-aibom, a Kubernetes controller that continuously monitors environments to detect AI runtimes and generate standard ML-BOMs.

Agent Security Prompt Injection Model Evaluation

Read summary Source link

The Hacker News AI Security July 13, 2026 news

New MemGhost Attack Plants Persistent False Memories in AI Agents Through One Email

Give an AI assistant a memory and access to your inbox, and you hand an attacker a way to rewrite what it thinks it knows about you. A single email can trick that agent into saving a false "fact" about the user, hide the change, and quietly steer its answers in later sessions. When it works, the person reads an ordinar

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

Google Cloud Security Blog July 11, 2026 news

Contributing to U.K. financial sector resilience as a critical third party

The U.K. Treasury has designated Google Cloud EMEA as a critical third party (CTP) to the U.K. financial sector under the CTP regime. Here’s how that helps you.

Agent Security Prompt Injection Model Evaluation

Read summary Source link

The Hacker News AI Security July 10, 2026 news

Researcher Details WhatsApp-to-Host Attack Chain Using Three OpenClaw Flaws

Details have emerged about three now-patched security flaws in the OpenClaw personal artificial intelligence (AI) assistant that, if successfully exploited, could enable credential theft, privilege escalation, and arbitrary code execution on the host. A brief description of the high-severity vulnerabilities is as follo

Agent Security Prompt Injection AI Red Teaming

Read summary Source link

Microsoft Security Blog July 10, 2026 news

Securing our future: July 2026 progress report on Microsoft’s Secure Future Initiative

Microsoft’s latest Secure Future Initiative report outlines progress on secure foundations, AI-powered defense, and future-ready cybersecurity. The post Securing our future: July 2026 progress report on Microsoft’s Secure Future Initiative appeared first on Microsoft Security Blog .

AI Red Teaming Prompt Injection Agent Security

Read summary Source link

The Hacker News AI Security July 9, 2026 news

Top AI Agents Built to Catch Malicious Code Can Be Tricked Into Running It

Ask an AI coding agent to scan open-source code for security holes, and it might run the attacker's code on your own machine instead. That is the finding in a proof-of-concept published Wednesday by the AI Now Institute, an attack it calls "Friendly Fire." It works against Anthropic's Claude Code and OpenAI's Codex whe

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

The Hacker News AI Security July 9, 2026 news

GhostApproval Symlink Flaws Could Let Malicious Repos Run Code in AI Coding Agents

Researchers at Wiz found that a flaw in six popular AI coding assistants lets a booby-trapped code project quietly take control of a developer's computer. The assistant asks permission to edit one harmless-looking file, but the write lands on a sensitive one instead. The affected tools are Amazon Q Developer, Anthropic

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

AWS Security Blog July 8, 2026 analysis

Designing for the inevitable: System prompt leakage and mitigations in generative AI applications

System prompts form the foundation of generative AI applications. A system prompt is a collection of instructions and operational context provided to a large language model (LLM) that shapes how the model behaves and interacts with users and tools. System prompts often contain proprietary information, including role de

Prompt Injection Prompt Engineering Agent Security

Read summary Source link

Google Cloud Security Blog July 8, 2026 news

Meet the 33 cybersecurity startups joining the Gemini Startup Forum

Our flagship Google for Startups program, Gemini Startup Forum: Cybersecurity, has selected its first 33 trailblazing startups.

Agent Security Prompt Injection Model Evaluation

Read summary Source link

The Hacker News AI Security July 8, 2026 news

AI Coding Agents Found Triggering Endpoint Security Rules Built to Catch Attackers

Sophos looked at a week of its own endpoint data and found that AI coding agents such as Claude Code, Cursor, and OpenAI Codex are setting off detection rules written to catch human intruders. The agents are not malicious. They just do a lot of things that, to a behavioral engine, look exactly like an attack. Decryptin

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

The Hacker News AI Security July 8, 2026 news

CISA Adds 4 Actively Exploited Adobe, Joomla, and Langflow Flaws to KEV

The U.S. Cybersecurity and Infrastructure Security Agency (CISA) on Tuesday added four security flaws to its Known Exploited Vulnerabilities (KEV) catalog, citing evidence of active exploitation. The vulnerabilities are listed below - CVE-2026-48282 (CVSS score: 10.0) - A path traversal vulnerability in Adobe ColdFusio

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

The Hacker News AI Security July 8, 2026 news

GitHub Copilot Refuses Harmful Requests in Chat, Then Writes Them in Code

An AI coding assistant that refuses to answer a dangerous request in its chat box can answer it anyway if the same request is broken into small, ordinary-looking steps inside a code editor. That is the finding of a new study of GitHub Copilot by researchers Abhishek Kumar and Carsten Maple. The models they tested throu

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

Google Cloud Security Blog July 7, 2026 news

Drive proactive security, prioritize risks with Google Threat Intelligence and Wiz ASM

To help you match your real-world exposures with real-time adversary activity, we’ve begun integrating Google Threat Intelligence with Wiz Attack Surface Management.

Agent Security Prompt Injection Model Evaluation

Read summary Source link

The Hacker News AI Security July 7, 2026 news

Public GitHub Issue Could Trick GitHub Agentic Workflows Into Leaking Private Repo Data

A public issue can trick GitHub Agentic Workflows into leaking the contents of an organization's private repositories, researchers at Noma Security have shown. The attacker needs only to open a normal-looking issue on a public repository, with no stolen credentials and no access to the organization. If that organizatio

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

The Hacker News AI Security July 7, 2026 news

Writer AI Flaw Could Let Agent Previews Leak Session Tokens Across Tenants

Cybersecurity researchers have disclosed details of a now-patched critical session isolation vulnerability in Writer, an enterprise generative artificial intelligence (AI) platform, that could result in cross-tenant compromise. The one-click vulnerability has been codenamed WriteOut by the Sand Security Research team.

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

Adversa AI Trusted AI Blog July 6, 2026 analysis

Top MCP security resources — July 2026

July's security digest covers the critical MCP vulnerabilities, real-world MCP exploitation, NSA's official MCP hardening guidelines. Explore these essential resources and secure our MCP servers. The post Top MCP security resources — July 2026 first appeared on Adversa AI .

Agent Security Prompt Injection AI Red Teaming

Read summary Source link

Microsoft Security Blog July 6, 2026 news

5 insights from Frost & Sullivan’s 2025 Frost Radar™ for Cloud Security Posture Management

Read five key learnings from the Frost & Sullivan 2025 Frost Radar™ for CSPM to learn how CSPM is evolving from point-in-time compliance to continuous risk management. The post 5 insights from Frost & Sullivan’s 2025 Frost Radar™ for Cloud Security Posture Management appeared first on Microsoft Security Blog .

AI Red Teaming Prompt Injection Agent Security

Read summary Source link

Unit 42 AI Security July 1, 2026 news

Phantom Squatting: AI-Hallucinated Domains as a Software Supply Chain Vector

Unit 42 analysis of phantom squatting: attackers register domains that LLMs may hallucinate in recommendations, code, documentation, or support answers. That turns model error into a supply-chain path, especially when users or agents follow generated links without independent validation.

AI Red Teaming Agent Security Prompt Injection

Read summary Source link

Microsoft Security Blog July 1, 2026 news

Microsoft named a leader in the Frost Radar for cloud and application runtime security

Frost & Sullivan names Microsoft a leader as cloud and application security converge into unified, runtime risk reduction. The post Microsoft named a leader in the Frost Radar for cloud and application runtime security appeared first on Microsoft Security Blog .

AI Red Teaming Prompt Injection Agent Security

Read summary Source link

Unit 42 AI Security June 23, 2026 news

OpenClaw’s Skill Marketplace and the Emerging AI Supply Chain Threat

Agent skill marketplaces introduce supply-chain risk when third-party skills can execute actions or collect data. Relevant to vetting, provenance, and containment controls.

AI Red Teaming Agent Security Prompt Injection

Read summary Source link

The Hacker News AI Security June 22, 2026 news

Stop Your Legacy Infrastructure from Hijacking Your AI Agents

Earlier this month, I spoke at the Gartner Security & Risk Management Summit about a blind spot most security programs are still not accounting for - how attackers are circumventing AI security programs by using legacy infrastructure to hijack AI agents. AI adoption is moving faster than security programs can account f

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

OpenAI News June 11, 2026 news

Supporting Europe’s work in ensuring a trustworthy AI ecosystem

OpenAI update on European trustworthy-AI work and governance engagement. Relevant to standards, assurance, and regulatory coordination for deployed AI systems.

Model Evaluation Prompt Injection Agent Security

Read summary Source link

The Hacker News AI Security June 11, 2026 news

New Attacks Trick OpenClaw AI Agent Into Running Code and Leaking Secrets

Two security teams have shown, in separate research published this week, that OpenClaw, the popular self-hosted AI agent, can be driven to run attacker-controlled code or hand over sensitive data through ordinary-looking inputs. Imperva buried instructions inside shared contacts, vCards, and location pins that the agen

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

Cloudflare AI Security June 9, 2026 analysis

Defend against frontier cyber models: Cloudflare's architecture as customer zero

Cloudflare describes architecture and operational lessons for defending against frontier cyber models. Relevant to AI-enabled threat modeling, defensive controls, and internal security readiness.

Agent Security Prompt Injection AI Red Teaming

Read summary Source link

Cloudflare AI Security May 18, 2026 analysis

Project Glasswing: what Mythos showed us

Cloudflare report on testing security-focused frontier models against real infrastructure code. Relevant to evaluating AI-assisted vulnerability discovery and production security workflows.

Agent Security Prompt Injection AI Red Teaming

Read summary Source link

OWASP GenAI Security Project May 14, 2026 analysis

Memory Is a Feature. It Is Also an Attack Surface

OWASP analysis of memory and context poisoning as an agent attack surface. Relevant to persistent state, trust boundaries, and regression tests for agent memory.

AI Red Teaming Prompt Injection Agent Security

Read summary Source link

NVIDIA AI Red Team May 8, 2026 analysis

Improving Bash Generation in Small Language Models with Grammar-Constrained Decoding

NVIDIA AI Red Team post on grammar-constrained decoding for Bash generation in small language models. Relevant to safer command generation and executable-output controls.

AI Red Teaming Agent Security Prompt Injection

Read summary Source link

Cloudflare AI Security April 21, 2026 analysis

Moving past bots vs. humans

Cloudflare article on accountability models as AI assistants and privacy proxies blur bot and human distinctions. Relevant to agent identity, abuse prevention, and web access controls.

Agent Security Prompt Injection AI Red Teaming

Read summary Source link

NVIDIA AI Red Team April 20, 2026 analysis

Mitigating Indirect AGENTS.md Injection Attacks in Agentic Environments

NVIDIA guidance on mitigating indirect AGENTS.md injection in agentic coding environments. Relevant to instruction provenance, repository trust, and sandboxed automation.

AI Red Teaming Agent Security Prompt Injection

Read summary Source link

Attacking AI - Jason Haddix - NDC Security 2026 video thumbnail

Play video

NDC Conferences YouTube April 16, 2026 video

Attacking AI - Jason Haddix - NDC Security 2026

Attacking AI is a one of a kind session releasing case studies, tactics, and methodology from Arcanum’s AI assessments in 2024 and 2025. While most AI assessment material focuses on academic AI red team content, “Attacking AI” is focused on the task of assessing AI enabled systems.

Prompt Injection AI Red Teaming Agent Security

Open notes Watch on YouTube

OWASP GenAI Security Project April 15, 2026 analysis

FinBot CTF Is Live: A Hands-On Companion to the OWASP GenAI Security Project

Announcement of a hands-on CTF for agentic AI security in a financial-services scenario. Relevant to training, scenario design, and practical red-team exercises.

AI Red Teaming Prompt Injection Agent Security

Read summary Source link

OWASP GenAI Security Project April 15, 2026 analysis

OWASP GenAI Exploit Round-up Report Q1 2026

OWASP roundup of reported GenAI incidents and exploit patterns from Q1 2026. Relevant as a threat-intelligence reference for risk tracking and test-case design.

AI Red Teaming Prompt Injection Agent Security

Read summary Source link

Hijacking Google's CI/CD Through Prompt Injection: The New Era of AI-Based Exploits - Mackenzie Jackson video thumbnail

Play video

NDC Conferences YouTube March 23, 2026 video

Hijacking Google's CI/CD Through Prompt Injection: The New Era of AI-Based Exploits - Mackenzie Jackson

NDC Security 2026 talk on prompt injection in CI/CD and automation systems, including AI agents with access to shell commands, GitHub or GitLab tokens, issue editing, build workflows, and privileged pipeline context.

Prompt Injection Agent Security AI Red Teaming

Open notes Watch on YouTube

Microsoft Security Blog March 12, 2026 guide

Detecting and analyzing prompt abuse in AI tools

Microsoft Incident Response explains how to detect prompt abuse using logging, telemetry, and incident response workflows.

Prompt Injection Agent Security

Read summary Source link

OpenAI March 11, 2026 analysis

Designing AI agents to resist prompt injection

OpenAI frames prompt injection as an agent-security problem that increasingly resembles social engineering rather than simple string matching.

Prompt Injection Agent Security

Read summary Source link

OpenAI March 9, 2026 news

OpenAI to acquire Promptfoo

OpenAI announced plans to acquire Promptfoo, highlighting automated AI security testing, red teaming, and evaluation as core enterprise requirements.

AI Red Teaming Prompt Engineering Prompt Injection

Read summary Source link

NVIDIA AI Red Team January 30, 2026 analysis

Practical Security Guidance for Sandboxing Agentic Workflows and Managing Execution Risk

NVIDIA guidance on sandboxing agentic workflows and managing execution risk. Relevant to tool isolation, approvals, filesystem boundaries, and operational controls for coding agents.

AI Red Teaming Agent Security Prompt Injection

Read summary Source link

Doors of (AI)pportunity: The Front and Backdoors of LLMs - Kasimir Schulz & Kenneth Yeung video thumbnail

Play video

NDC Conferences YouTube January 29, 2026 video

Doors of (AI)pportunity: The Front and Backdoors of LLMs - Kasimir Schulz & Kenneth Yeung

NDC AI 2025 talk on LLM frontdoors and backdoors, jailbreak techniques, control-token abuse, local model compromise, and how attackers or insiders can manipulate model behavior.

AI Red Teaming Adversarial ML Prompt Injection

Open notes Watch on YouTube

How to Break AI Systems (Before Someone Else Does) - Gary Lopez - NDC AI 2025 video thumbnail

Play video

NDC Conferences YouTube January 28, 2026 video

How to Break AI Systems (Before Someone Else Does) - Gary Lopez - NDC AI 2025

NDC AI 2025 talk on breaking AI systems in production, covering prompt injection, hidden prompts in documents, agent goal manipulation, privacy exposure, and practical AI red-team testing methods.

AI Red Teaming Prompt Injection Agent Security

Open notes Watch on YouTube

Introduction to AI Security - Jim Manico - NDC AI 2025 video thumbnail

Play video

NDC Conferences YouTube January 28, 2026 video

Introduction to AI Security - Jim Manico - NDC AI 2025

NDC AI 2025 talk introducing AI security for developers, including model lifecycle, training data, secure integration, data leakage, prompt injection, adversarial inputs, and model bias.

AI Compliance Prompt Injection Adversarial ML

Open notes Watch on YouTube

Prompt-Jacking: The Rise of a New Supply Chain Risk - Kasimir Schulz & Kenneth Yeung video thumbnail

Play video

NDC Conferences YouTube January 28, 2026 video

Prompt-Jacking: The Rise of a New Supply Chain Risk - Kasimir Schulz & Kenneth Yeung

NDC AI 2025 talk on prompt-jacking in AI coding assistants, using Cursor vulnerability examples, hidden text in codebases, agentic behavior shaping, data exfiltration, and supply-chain style propagation.

Prompt Injection Agent Security AI Red Teaming

Open notes Watch on YouTube

OpenAI December 22, 2025 analysis

Continuously hardening ChatGPT Atlas against prompt injection attacks

OpenAI describes using automated red teaming and reinforcement learning to discover agent prompt injection attacks before they appear in the wild.

Prompt Injection Agent Security AI Red Teaming

Read summary Source link

Google Cloud Blog December 4, 2025 guide

Building a Production-Ready AI Security Foundation

Google Cloud outlines a defense-in-depth view of AI security spanning application controls, data protections, and infrastructure isolation.

Agent Security Prompt Injection Adversarial ML

Read summary Source link

Beyond the Prompt: Evaluating, Testing, and Securing LLM Applications - Mete Atamel video thumbnail

Play video

NDC Conferences YouTube November 25, 2025 video

Beyond the Prompt: Evaluating, Testing, and Securing LLM Applications - Mete Atamel

NDC Copenhagen talk on evaluating, testing, and securing LLM applications, including RAG changes, prompt-injection resilience, harmful-response guardrails, Promptfoo, DeepEval, Vertex AI Evaluation, and LLM Guard.

Model Evaluation Prompt Injection AI Engineering

Open notes Watch on YouTube

Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that video thumbnail

Play video

AI Explained November 14, 2025 video

Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that

This AI Explained video reviews a major AI development through the lens of agentic workflows and tool-use risk. It is useful context for AI engineering, evaluation, governance, and operational risk.

Model Evaluation Agent Security Prompt Engineering Prompt Injection AI Red Teaming Adversarial ML

Open notes Watch on YouTube

OpenAI November 7, 2025 guide

Understanding prompt injections: a frontier security challenge

An accessible explanation of prompt injection risk in real AI products, including how third-party content can redirect or manipulate agent behavior.

Prompt Injection Prompt Engineering

Read summary Source link

Google Cloud Blog March 5, 2025 news

Announcing AI Protection: Security for the AI era

Google introduced AI Protection and Model Armor to address prompt injection, jailbreaks, data loss, and multicloud AI workload security.

Prompt Injection Agent Security

Read summary Source link

OpenAI February 25, 2025 framework

Deep research System Card

OpenAI’s system card for deep research covers prompt injection, privacy, code execution, and external red teaming prior to release.

Model Evaluation Prompt Injection AI Compliance

Read summary Source link

OpenAI January 23, 2025 framework

Operator System Card

The Operator system card documents red teaming and mitigation choices for a computer-using agent, with prompt injections listed as a central risk area.

Agent Security Model Evaluation Prompt Injection AI Compliance

Read summary Source link

Microsoft Cloud Blog January 14, 2025 analysis

Enhancing AI safety: Insights and lessons from red teaming

Microsoft summarizes lessons from red teaming more than one hundred generative AI products, emphasizing system-level testing, human expertise, and automation.

AI Red Teaming Prompt Injection Agent Security

Read summary Source link

Microsoft Security Blog January 13, 2025 guide

3 takeaways from red teaming 100 generative AI products

Microsoft Security distills lessons from red teaming more than 100 generative AI products, including multimodal prompt injection and core cyber hygiene.

AI Red Teaming Prompt Injection

Read summary Source link

OWASP January 1, 2025 framework

OWASP Top 10 for Large Language Model Applications

OWASP’s GenAI security project remains a practical baseline for teams building or assessing LLM applications and agentic systems.

Prompt Injection Agent Security Adversarial ML

Read summary Source link

AI - 2024AD: 212-page Report (from this morning) Fully Read w/ Highlights video thumbnail

Play video

AI Explained October 10, 2024 video

AI - 2024AD: 212-page Report (from this morning) Fully Read w/ Highlights

This AI Explained video reviews a major AI development through the lens of governance and responsible deployment. It is useful context for AI engineering, evaluation, governance, and operational risk.

Model Evaluation AI Compliance Prompt Engineering Prompt Injection

Open notes Watch on YouTube

Gemini Ultra - Full Review video thumbnail

Play video

AI Explained February 8, 2024 video

Gemini Ultra - Full Review

This AI Explained video reviews a major AI development through the lens of scaling and compute economics. It is useful context for AI engineering, evaluation, governance, and operational risk.

Model Evaluation Prompt Engineering Prompt Injection

Open notes Watch on YouTube

OpenAI Insights and Training Data Shenanigans - 7 'Complicated' Developments + Guest Star video thumbnail

Play video

AI Explained December 3, 2023 video

OpenAI Insights and Training Data Shenanigans - 7 'Complicated' Developments + Guest Star

This AI Explained video reviews a major AI development through the lens of model capability and AI systems in practice. It is useful context for AI engineering, evaluation, governance, and operational risk.

Model Evaluation Prompt Engineering Prompt Injection

Open notes Watch on YouTube

11 Major AI Developments: RT-2 to '100X GPT-4' video thumbnail

Play video

AI Explained July 30, 2023 video

11 Major AI Developments: RT-2 to '100X GPT-4'

This AI Explained video reviews a major AI development through the lens of AI safety and model behavior. It is useful context for AI engineering, evaluation, governance, and operational risk.

Model Evaluation Prompt Engineering Prompt Injection AI Red Teaming

Open notes Watch on YouTube

ChatGPT's Achilles' Heel video thumbnail

Play video

AI Explained June 25, 2023 video

ChatGPT's Achilles' Heel

This AI Explained video reviews a major AI development through the lens of scaling and compute economics. It is useful context for AI engineering, evaluation, governance, and operational risk.

Model Evaluation Prompt Engineering Prompt Injection

Open notes Watch on YouTube