Token Waste Analyzer

Find the biggest sources of wasted tokens in your prompts and where to compress.

Long prompts are billed on every single call, so a bloated system prompt quietly multiplies your bill at volume. This analyzer breaks your prompt into five categories — instruction, context, examples, formatting and filler — estimates the token share of each, and tells you which blocks have the most room to shrink.

How it works

The tool reads your prompt line by line and assigns tokens to a category using lexical signals:

  • Instruction — imperative verbs and directives (“you must”, “respond with”, “do not”).
  • Examples — fenced code blocks, quoted samples and few-shot markers (“Input:”, “Output:”, “Example”).
  • Formatting — markdown headers, bullets, tables and XML tags.
  • Filler — hedging and padding phrases (“please”, “as an AI”, “kindly”, “in order to”, “it is important to note”).
  • Context — everything else: the background, data and reference material.
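The categories above can be sketched as a small line classifier. This is a minimal illustration, not the tool's actual implementation: the signal lists are seeded only from the examples in the list, the precedence order (examples, instruction, filler, formatting, context) is an assumption, and assigning a whole line to one category is a simplification. The names `classify_line` and `classify_prompt` are hypothetical.

```python
import re

# Hypothetical signal lists, seeded from the examples in the list above.
EXAMPLES = ("input:", "output:", "example")
INSTRUCTION = ("you must", "respond with", "do not")
FILLER = ("please", "as an ai", "kindly", "in order to", "it is important to note")
# Markdown header, bullet, table row, or a line that is only an XML tag.
FORMATTING = re.compile(r"^(#{1,6}\s|[-*•]\s|\|.*\|$|</?[a-z][\w-]*>$)")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def classify_line(line: str) -> str:
    """Assign one non-empty line to a category; the first match wins (assumed order)."""
    text = line.strip().lower()
    if any(marker in text for marker in EXAMPLES):
        return "examples"
    if any(phrase in text for phrase in INSTRUCTION):
        return "instruction"
    if any(phrase in text for phrase in FILLER):
        return "filler"
    if FORMATTING.match(text):
        return "formatting"
    return "context"  # everything else


def classify_prompt(prompt: str) -> list[tuple[str, str]]:
    """Return (category, line) pairs, treating fenced blocks as examples."""
    labeled = []
    in_fence = False
    for line in prompt.splitlines():
        if line.strip().startswith(FENCE):
            in_fence = not in_fence
            labeled.append(("examples", line))
        elif in_fence:
            labeled.append(("examples", line))
        elif line.strip():  # skip blank lines
            labeled.append((classify_line(line), line))
    return labeled
```

Substring matching is deliberately crude here: "example" also matches "for example" inside background text, so a real classifier would likely need word boundaries and per-phrase rather than per-line attribution.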

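Token counts use the common heuristic of roughly 4 characters per token. It is an approximation for English text and drifts from real tokenizers on code and non-English input, but it is close enough to rank which blocks to trim first.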
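As a rough sketch of the estimate, assuming the lines have already been labeled as (category, text) pairs: each text is converted to tokens with the 4-characters-per-token heuristic, and the per-category totals are divided by the overall total. Rounding up with `ceil` and the names `estimate_tokens` and `token_shares` are assumptions made for illustration.

```python
import math
from collections import Counter

CHARS_PER_TOKEN = 4  # heuristic from the text above, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count as ceil(characters / 4)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def token_shares(labeled_lines):
    """Map each category to its fraction of the estimated total tokens."""
    totals = Counter()
    for category, text in labeled_lines:
        totals[category] += estimate_tokens(text)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # empty prompt: nothing to apportion
    return {category: count / grand_total for category, count in totals.items()}
```

For exact counts, the same function could call the target model's own tokenizer in place of `estimate_tokens`; the share calculation would not change.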

Tips to recover wasted tokens

  • Delete filler outright. “In order to” → “to”; drop “please” and “kindly”. These phrases rarely change model behavior.
  • Cap your examples. Two or three sharp few-shot examples usually beat ten near-duplicates that cost tokens on every call.
  • Move stable context to caching. If a large block never changes and your provider supports prompt caching, cached reads of that block are billed at a fraction of the normal input rate.
  • Flatten formatting. Decorative markdown and nested tags add tokens without improving answers — keep only the structure the model actually needs.
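The first tip lends itself to a small script. The sketch below applies only the rewrites named in that tip ("in order to" becomes "to"; "please" and "kindly" are dropped). The function name `strip_filler` is hypothetical, and the code does not restore capitalization after a deletion at the start of a sentence.

```python
import re

# Rewrites taken from the filler tip above; extend the list with care.
REPLACEMENTS = [
    (re.compile(r"\bin order to\b", re.IGNORECASE), "to"),
    # Drop the word plus any trailing spaces or commas.
    (re.compile(r"\b(?:please|kindly)\b[ ,]*", re.IGNORECASE), ""),
]


def strip_filler(prompt: str) -> str:
    """Apply each filler rewrite, then tidy leftover whitespace."""
    for pattern, replacement in REPLACEMENTS:
        prompt = pattern.sub(replacement, prompt)
    # Collapse runs of spaces or tabs left behind by deletions.
    return re.sub(r"[ \t]{2,}", " ", prompt).strip()


if __name__ == "__main__":
    before = "Please summarize the text in order to save time."
    after = strip_filler(before)
    print(after)
    print(f"characters saved: {len(before) - len(after)}")
```

It is worth diffing the output before shipping it: a blanket regex can also hit these words inside quoted examples or user data, where they may be intentional.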