Prompt injection test suite
Prompt injection is the number-one security risk for LLM applications: a user pastes text that tricks the model into ignoring its system prompt, leaking hidden instructions, or doing something the developer never intended. The prompt injection test suite fires 50 known attack strings at your chatbot endpoint and flags responses that show signs of compromise. It runs from your browser, sending requests directly to the endpoint you control.
How it works
You provide the endpoint URL and the name of the JSON field your endpoint expects the user message in. The suite contains attack strings grouped into families — direct role override (“ignore previous instructions”), instruction-ignoring, system-prompt and data extraction, and indirect injection (malicious content disguised inside data the bot is asked to process). Each selected attack is sent as a POST request with a JSON body, and the response is scanned for compromise signals: leaked instruction fragments, role-confirmation phrases, or a canary marker the attack tries to make the bot output. Responses that match are flagged for your review. All requests go directly from your browser to your endpoint.
Tips and notes
- Test the deployed prompt. Run the suite against the exact system prompt and model you ship — guardrails that hold on one model can fail on another.
- Watch indirect injection. If your bot summarises web pages or documents, the attack can hide inside that content — test that path specifically.
- A flag is a lead, not a verdict. Read the flagged response; some matches are false positives, and some real breaches are subtle.
- Re-run after every change. Prompt tweaks, model upgrades, and new tools can all reopen a hole you previously closed.