Prompt token trimmer
Every token in a prompt is paid for on every single call — so a bloated system prompt quietly taxes your whole application. This tool shortens a prompt toward a target token budget using deterministic rules that strip the parts models don’t need: politeness filler, verbose preambles, and redundant phrasing. A live token estimate shows how far each edit moves you toward the budget.
How it works
The trimmer applies a sequence of safe transformations:
- removes politeness filler (“please”, “kindly”, “thank you”),
- cuts common verbose preambles (“I would like you to”, “your task is to”) down to the imperative,
- collapses redundant intensifiers and duplicate whitespace,
- at higher aggressiveness, replaces wordy phrases with shorter equivalents (“in order to” → “to”, “due to the fact that” → “because”).
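The transformations above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual rule set: the word lists, the `trim` function, and its `aggressive` flag are hypothetical names, and "redundant intensifiers" is assumed here to mean an intensifier repeated back to back.

```python
import re

# Hypothetical rule tables built from the examples in the list above;
# the tool's real lists are likely longer.
FILLER = re.compile(r"\b(?:please|kindly|thank you)\b[,.!]?\s*", re.IGNORECASE)
PREAMBLES = re.compile(r"\b(?:I would like you to|your task is to)\s+", re.IGNORECASE)
# Assumption: an intensifier repeated back to back ("very very") is redundant.
REPEATED_INTENSIFIER = re.compile(r"\b(very|really|extremely)(?:\s+\1)+\b", re.IGNORECASE)
WORDY = {"in order to": "to", "due to the fact that": "because"}


def trim(prompt: str, aggressive: bool = False) -> str:
    """Apply the rules in a fixed order so the result is deterministic."""
    text = FILLER.sub("", prompt)                  # drop politeness filler
    text = PREAMBLES.sub("", text)                 # keep only the imperative
    text = REPEATED_INTENSIFIER.sub(r"\1", text)   # "very very" -> "very"
    if aggressive:
        # Phrase replacement only runs at the higher aggressiveness level.
        for long_form, short_form in WORDY.items():
            pattern = r"\b" + re.escape(long_form) + r"\b"
            text = re.sub(pattern, short_form, text, flags=re.IGNORECASE)
    text = re.sub(r"[ \t]+", " ", text)            # collapse duplicate spaces
    return text.strip()


print(trim("Please summarize the text in order to save time. Thank you!", aggressive=True))
```

Because every rule is a plain substitution applied in a fixed order, the same input always yields the same output, which is what makes the result easy to review and edit by hand.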
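Token counts are estimated with the common heuristic of roughly 4 characters per token, so you can compare before and after at a glance. The figure is approximate: it fits typical English text best, and a model's real tokenizer may count differently. Nothing leaves your browser. If a rule cuts something you need, just edit it back in the output before copying.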
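As a rough sketch, the estimate is just a character count divided by four. The function names below are hypothetical, and rounding up is an assumption; the tool may round differently.

```python
import math

CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate tokens as characters / 4, rounded up (rounding is an assumption)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def tokens_over_budget(text: str, budget: int) -> int:
    """Return how many estimated tokens the text exceeds the budget by (0 if within)."""
    return max(0, estimate_tokens(text) - budget)


before = "I would like you to please summarize the following text."
after = "summarize the following text."
print(estimate_tokens(before), estimate_tokens(after))
```

An estimate like this is good for comparing two versions of the same prompt; for an exact count, use the tokenizer of the model you are calling.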
Tips and notes
Start at the light level and only escalate if you are still over budget — aggressive collapsing can occasionally shave nuance you wanted. Trimming pays off most on system prompts and templates that run on every request, where each saved token multiplies across thousands of calls. Pair this with the LLM cost calculator to turn the token saving into a money figure, and always test the trimmed prompt on real inputs: shorter is only better if the output quality holds.
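To see how a per-call saving multiplies, the arithmetic can be sketched as below. The function names are hypothetical and the price is a placeholder, not any provider's actual rate; substitute your own figures.

```python
def total_tokens_saved(tokens_saved_per_call: int, calls: int) -> int:
    """Tokens saved across all calls that reuse the trimmed prompt."""
    return tokens_saved_per_call * calls


def money_saved(tokens_saved: int, price_per_million_tokens: float) -> float:
    """Convert a token saving into money at a given input-token price."""
    return tokens_saved / 1_000_000 * price_per_million_tokens


# Placeholder numbers for illustration only.
saved = total_tokens_saved(tokens_saved_per_call=120, calls=50_000)
print(saved, money_saved(saved, price_per_million_tokens=2.0))
```

The saving scales linearly with call volume, which is why a system prompt sent on every request is the first place to trim.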