System prompt token auditor
Your system prompt is the most expensive text you own, because it rides along on every request. A bloated preamble that nobody re-reads is still billed on every call, which at high volume can mean millions of times a month. This auditor scans your prompt for filler, redundancy, and over-specification, estimates how many tokens you could shed, and converts that into an estimated annual dollar figure based on your own call volume and pricing.
How it works
The tool estimates token count with a character-based heuristic that approximates tiktoken counts closely for English text; expect larger errors on code and non-English prompts. It then scans for known waste patterns, such as politeness padding (“please make sure to”), hedging (“if possible, try to”), redundant restatements, and long phrasings with shorter equivalents, and produces a suggested slim version. Finally, it multiplies the tokens saved by your daily call volume and your price per million input tokens to show daily, monthly, and annual savings. Everything runs locally; nothing is uploaded.
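The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the 4-characters-per-token ratio is a common rule of thumb assumed here, the pattern list is a hypothetical sample, and the function names (estimate_tokens, slim, savings) and the 30-day month are assumptions made for the example.

```python
import re

# Assumption: about 4 characters per token for English text. This is a
# common rule of thumb, not a constant documented by the tool.
CHARS_PER_TOKEN = 4

# Hypothetical sample of waste patterns mapped to shorter replacements.
WASTE_PATTERNS = {
    r"\bplease make sure to\s+": "",   # politeness padding
    r"\bif possible, try to\s+": "",   # hedging
    r"\bin order to\b": "to",          # long phrasing with a shorter equivalent
}


def estimate_tokens(text: str) -> int:
    """Character-based token estimate: ceiling of len(text) / 4."""
    return -(-len(text) // CHARS_PER_TOKEN)


def slim(prompt: str) -> str:
    """Apply each pattern, then collapse any doubled spaces left behind."""
    out = prompt
    for pattern, replacement in WASTE_PATTERNS.items():
        out = re.sub(pattern, replacement, out, flags=re.IGNORECASE)
    return re.sub(r"[ \t]{2,}", " ", out).strip()


def savings(tokens_saved: int, calls_per_day: int, price_per_million: float) -> dict:
    """Dollar savings per day, per 30-day month (assumed), and per 365-day year."""
    daily = tokens_saved * calls_per_day * price_per_million / 1_000_000
    return {"daily": daily, "monthly": daily * 30, "annual": daily * 365}
```

As a worked example with made-up inputs: shedding 100 tokens at 10,000 calls per day and $3 per million input tokens gives 100 × 10,000 × 3 / 1,000,000 = $3.00 per day, or $1,095 per year. Note that a plain regex pass like this leaves capitalization and punctuation untouched, so a slimmed prompt still needs a human read.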
Tips and notes
The biggest wins usually come from cutting entire redundant sentences, not from word-level tweaks: if two instructions say the same thing, delete one. Move rarely needed detail out of the system prompt and supply it in the user turn only when it is relevant. After trimming, run your evaluation set: a leaner prompt occasionally loosens behavior, and a regression costs far more than the tokens saved. If most of your system prompt is fixed, check whether your provider supports prompt caching, which can bill the repeated prefix at a steep discount and so shrinks the payoff from trimming it.
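One way to surface candidate redundant sentences for manual review is a surface-similarity check. The sketch below uses Python's standard difflib; the function names and the 0.8 threshold are assumptions chosen for illustration, and the check only catches sentences that are worded alike, not ones that say the same thing in different words.

```python
import re
from difflib import SequenceMatcher


def split_sentences(text: str) -> list:
    """Naive split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def near_duplicates(prompt: str, threshold: float = 0.8) -> list:
    """Return (sentence_a, sentence_b, ratio) for pairs at or above threshold."""
    sentences = split_sentences(prompt)
    pairs = []
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            # Case-insensitive character-level similarity in [0, 1].
            ratio = SequenceMatcher(
                None, sentences[i].lower(), sentences[j].lower()
            ).ratio()
            if ratio >= threshold:
                pairs.append((sentences[i], sentences[j], round(ratio, 2)))
    return pairs
```

Calling near_duplicates on a prompt returns the flagged pairs; deciding which sentence of each pair to delete is left to the author, followed by the evaluation run mentioned above.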