Markdown formatting token overhead calculator
Markdown is great for humans, but every #, *, backtick and | is a billable
token — and on output you pay for all of them. This tool counts your text
with markdown, strips the syntax to get the plain-text equivalent, and
shows the exact overhead so you can decide where formatting earns its tokens.
How it works
The calculator counts tokens for your markdown as written, then removes the
formatting syntax with a markdown-stripping pass: headers (#), emphasis
(*/_), list markers, links and images, inline and fenced code, blockquotes
and table pipes. It counts the cleaned text and reports the difference in tokens
and as a percent of the formatted total. Counts use the ≈ 4-characters-per-token
heuristic that tracks GPT and Claude tokenizers for English.
Most of the cost comes from repeated markers — a 50-row bullet list or a heavily nested document pays the marker tax on every line.
Tips to control formatting cost
- Match format to consumer. Code reading the output? Ask for plain text or JSON. Human reading rendered markdown? Keep it.
- Flatten deep nesting. Multiple indent levels and decorative dividers add tokens without improving comprehension.
- Prefer compact lists. A short comma-separated line is cheaper than ten one-item bullets when the structure is not load-bearing.
- Cap output tokens. Combine
max_tokenswith a plain-format instruction to stop formatting from inflating long completions.