Question 1

What is prompt injection in simple terms?

Accepted Answer

Prompt injection is an attack where malicious text tricks a language model into ignoring its original instructions and following the attacker's instead. Because an LLM treats your system prompt and the user's input as the same stream of text, a cleverly worded input can override the developer's intended behaviour — for example, making a support bot reveal its hidden instructions or perform unintended actions.

Question 2

What is the difference between direct and indirect prompt injection?

Accepted Answer

Direct injection is when the attacker types malicious instructions straight into the chat (for example, 'ignore previous instructions and...'). Indirect injection hides the malicious instructions inside content the model later reads — a web page, email, PDF, or database record — so the attack triggers when the AI processes that external data. Indirect injection is more dangerous because the victim may never see it.

Question 3

Why can't prompt injection just be filtered out?

Accepted Answer

Because instructions and data share the same channel — natural language — there is no reliable way to perfectly separate 'commands' from 'content'. Filters and classifiers reduce risk but can be bypassed with paraphrasing, encoding, or novel phrasing. Prompt injection is widely considered an unsolved problem, so defence relies on layered mitigations rather than a single fix.

Question 4

What is the worst that prompt injection can do?

Accepted Answer

The impact depends on what the AI is connected to. A read-only chatbot might leak its system prompt or produce off-policy output. An agent with tools — email, code execution, database access, payments — could be manipulated into exfiltrating data, sending messages, or taking destructive actions. The more capabilities you give the model, the higher the stakes, which is why least-privilege design matters.

What Is Prompt Injection? How It Works and How to Prevent It

What prompt injection is

How it works

Direct vs indirect injection

The real-world risks

How to prevent it