Safety system prompt builder
Every production LLM app needs a clear safety boundary, but writing one from scratch is easy to get wrong — too loose and the model helps with harmful requests, too tight and it refuses ordinary questions. This builder generates a structured safety and refusal block tailored to your application: the harm categories you select, a scope limitation that keeps the model on-task, your escalation path, and a refusal tone that matches your product.
How it works
You describe your domain, tick the risk categories relevant to your app, choose a refusal style, and optionally add an escalation contact. The tool assembles a Markdown policy with five sections: a scoped role line, an explicit refusal list, scope limitation, escalation triggers, and safe-messaging rules. Selecting fewer categories produces a tighter, app-specific policy rather than a generic catch-all. Everything is generated in your browser — nothing is sent anywhere.
Tips and notes
- Scope first. A narrow role (“a cooking recipe assistant”) prevents far more misuse than any refusal list, because it makes off-topic harmful requests out of scope by default.
- Layer your defenses. A system prompt can be jailbroken; combine it with a moderation endpoint and human review for high-risk paths.
- Use real escalation contacts. Replace placeholder hotlines with the actual support address or emergency number for your region.
- Test it. Run adversarial prompts against the block before shipping and tighten the wording where it fails.