Markdown Formatting Token Overhead Calculator

See how much markdown formatting inflates your LLM token count and cost.



Markdown is great for human readers, but every #, *, backtick and | adds to your billable token count, and in model output you pay for every one of them. This tool counts your text with markdown, strips the syntax to get the plain-text equivalent, and shows the estimated overhead so you can decide where formatting earns its tokens.

How it works

The calculator counts tokens for your markdown as written, then removes the formatting syntax with a markdown-stripping pass: headers (#), emphasis (*/_), list markers, links and images, inline and fenced code, blockquotes and table pipes. It then counts the cleaned text and reports the difference both in tokens and as a percentage of the formatted total. Counts use the rule of thumb of about 4 characters per token, which roughly tracks GPT and Claude tokenizers on English text, so treat the results as estimates rather than exact tokenizer counts.
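The pipeline described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function names (estimate_tokens, strip_markdown, formatting_overhead) and the regular expressions are assumptions, and the sketch keeps the body of code blocks while dropping only their delimiters, which may differ from what the tool does.

```python
import math
import re

def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token, rounded up."""
    return math.ceil(len(text) / 4)

def strip_markdown(md: str) -> str:
    """Regex-based stripping pass (a simplification, not a full markdown parser)."""
    text = re.sub(r"^[ \t]*(?:`{3}|~{3}).*$", "", md, flags=re.MULTILINE)  # fence lines (code body is kept)
    text = re.sub(r"!\[([^\]]*)\]\([^)]*\)", r"\1", text)                  # images -> alt text
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)                   # links -> link text
    text = re.sub(r"^[ \t]{0,3}#{1,6}[ \t]+", "", text, flags=re.MULTILINE)            # headers
    text = re.sub(r"^[ \t]*(?:>[ \t]?)+", "", text, flags=re.MULTILINE)                # blockquotes
    text = re.sub(r"^[ \t]*(?:[-*+]|\d+\.)[ \t]+", "", text, flags=re.MULTILINE)       # list markers
    text = re.sub(r"^[ \t]*\|?[ \t:|-]*-[ \t:|-]*\|?[ \t]*$", "", text, flags=re.MULTILINE)  # table/rule divider rows
    text = text.replace("|", " ")                                          # table pipes
    # Emphasis markers around a span; the lookarounds leave snake_case names intact.
    text = re.sub(r"(?<!\w)(\*\*|__|\*|_)(?=\S)(.+?)(?<=\S)\1(?!\w)", r"\2", text)
    text = text.replace("`", "")                                           # inline code backticks
    text = re.sub(r"[ \t]+", " ", text)                                    # collapse runs of spaces
    text = re.sub(r"\n{3,}", "\n\n", text)                                 # collapse blank lines
    return text.strip()

def formatting_overhead(md: str) -> dict:
    """Token difference between formatted and stripped text, plus percent of the formatted total."""
    formatted = estimate_tokens(md)
    plain = estimate_tokens(strip_markdown(md))
    overhead = formatted - plain
    percent = round(100 * overhead / formatted, 1) if formatted else 0.0
    return {"formatted": formatted, "plain": plain, "overhead": overhead, "percent": percent}

sample = "# Title\n\n- **one**\n- two\n"
print(strip_markdown(sample))
print(formatting_overhead(sample))
```

For numbers that match a specific model, swap estimate_tokens for that model's real tokenizer; the stripping and differencing steps stay the same.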

Most of the cost comes from repeated markers — a 50-row bullet list or a heavily nested document pays the marker tax on every line.
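To put a rough number on that marker tax, the sketch below estimates the tokens spent on list markers alone under the same 4-characters-per-token heuristic. The helper name marker_tax and its parameters are hypothetical, and real tokenizers often merge markers with neighboring whitespace, so the true figure can be lower or higher.

```python
import math

def marker_tax(rows: int, marker: str = "- ", indent: int = 0) -> int:
    """Estimated tokens spent on list markers and indentation only (~4 chars per token)."""
    chars_per_row = indent + len(marker)
    return math.ceil(rows * chars_per_row / 4)

print(marker_tax(50))            # flat 50-row list: 100 marker characters -> 25
print(marker_tax(50, indent=4))  # same list nested under a 4-space indent: 300 characters -> 75
```

Under these assumptions, one level of nesting triples the overhead of the same list without adding any content.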

Tips to control formatting cost

  • Match format to consumer. Code reading the output? Ask for plain text or JSON. Human reading rendered markdown? Keep it.
  • Flatten deep nesting. Multiple indent levels and decorative dividers add tokens without improving comprehension.
  • Prefer compact lists. A short comma-separated line is cheaper than ten one-item bullets when the structure is not load-bearing.
  • Cap output tokens. Combine max_tokens with a plain-format instruction to stop formatting from inflating long completions.
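
As a quick check on the compact-list tip, this sketch compares the same ten items written as one-item bullets and as a single comma-separated line, again using the 4-characters-per-token estimate. The helper compare_list_formats and the sample items are made up for illustration.

```python
import math

def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token, rounded up."""
    return math.ceil(len(text) / 4)

def compare_list_formats(items: list[str]) -> tuple[int, int]:
    """Return (bulleted_tokens, inline_tokens) for the same items."""
    bulleted = "\n".join(f"- {item}" for item in items)
    inline = ", ".join(items)
    return estimate_tokens(bulleted), estimate_tokens(inline)

items = ["alpha", "beta", "gamma", "delta", "epsilon",
         "zeta", "eta", "theta", "iota", "kappa"]
print(compare_list_formats(items))  # (19, 17)
```

The saving on a short flat list is small, and it grows with the number of rows and the depth of indentation.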