Article Issue #5183

Prompt Injection

What to know

Prompt Injection is a security vulnerability in LLM-based applications where untrusted input, supplied by a user or embedded in external data the model processes, contains instructions that the model follows in place of or in addition to the developer's intended system prompt; An attacker crafts text such as 'Ignore previous instructions and instead...' or embeds covert instructions in documents, web pages, or database records that an agent retrieves and processes; Builders deploying agents that read external content, such as email processors, document analyzers, or web browsing agents, should treat all retrieved text as untrusted data

Wikiwalls Team Administrator

May 15, 2026 2 min read

« Back to Glossary Index

Prompt Injection is a security vulnerability in LLM-based applications where untrusted input, supplied by a user or embedded in external data the model processes, contains instructions that the model follows in place of or in addition to the developer’s intended system prompt. It is the AI analog of SQL injection, exploiting the fact that LLMs do not inherently separate instructions from data.

How it works

An attacker crafts text such as ‘Ignore previous instructions and instead…’ or embeds covert instructions in documents, web pages, or database records that an agent retrieves and processes. The model, unable to distinguish malicious instructions from legitimate context, may comply. Indirect prompt injection is particularly dangerous in agentic systems where the model reads external content autonomously.

Key facts

Direct injection: Attacker controls the user-facing input field and supplies adversarial instructions.
Indirect injection: Attacker plants instructions in a document, email, or webpage that the agent retrieves automatically.
Mitigations: Input sanitization, privilege separation, restricting tool permissions, and output filtering reduce but do not eliminate risk.
No complete fix: No current mitigation fully prevents prompt injection; defense-in-depth is required.

For builders

Builders deploying agents that read external content, such as email processors, document analyzers, or web browsing agents, should treat all retrieved text as untrusted data. Implementing least-privilege tool access, adding a separate safety classifier on model outputs before executing actions, and logging all tool invocations for audit are essential defensive layers in any production agentic system.

Sources

« Back to Definition Index

If this saved you an afternoon — and we will send the next one straight to your inbox.

Wikiwalls Team

Administrator · 41 published guides · Joined 2016

Welcome to wikiwalls

How it works

Key facts

For builders

Sources

More from WikiWalls

Cursor vs Copilot vs Cody vs Windsurf, after a 30-day production diary

The Cheapest Production-Grade LLM, ranked at constant output quality

Best Mini-PC for Homelab: Beelink, Minisforum, GMKtec Tested

Best AI Note Apps: Mem vs Reflect vs Tana vs Saner.ai

One careful fix in your inbox each Wednesday.