Token Counting Unit Test Generator

Generate unit tests to assert correct token counts in your codebase

Pin your token counts with generated unit tests

Token counts are load-bearing: they set your cost estimates, decide when you truncate context, and determine how quickly you hit rate limits. When a tokenizer library upgrade or a model swap silently shifts those counts, your budgets drift and prompts get cut off unexpectedly. This tool generates ready-to-run unit tests that assert your countTokens function returns the exact values you expect.

How it works

You provide sample texts and the token count each one should produce. The generator emits one assertion per case in your chosen framework:

expect(countTokens("Hello, world")).toBe(3);

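The test calls your tokenizer wrapper, so it pins real behaviour. Wire countTokens to tiktoken, the Anthropic SDK’s counting endpoint, or whatever your integration uses, and the suite fails loudly the next time it runs after counts change. If your wrapper is asynchronous, as any wrapper around a network endpoint will be, await the call inside the assertion so the test compares the resolved number rather than a promise.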
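
The generation step described above can be sketched in a few lines. This is a minimal illustration, not this tool's actual implementation: the names TokenCase, emitAssertion, and emitTestFile, and the "./countTokens" import path, are all hypothetical, and the output assumes a Jest-style framework.

```typescript
// Hypothetical shape of one case: a sample text and the count to pin.
interface TokenCase {
  text: string;
  expected: number;
}

// Emit one Jest-style assertion. JSON.stringify escapes quotes, newlines,
// and backslashes so the sample text becomes a valid string literal.
function emitAssertion(c: TokenCase): string {
  return `expect(countTokens(${JSON.stringify(c.text)})).toBe(${c.expected});`;
}

// Wrap one test per case in a describe block. The default import path is a
// placeholder; point it at wherever your own countTokens wrapper lives.
function emitTestFile(cases: TokenCase[], importPath: string = "./countTokens"): string {
  const body = cases
    .map((c, i) => `  test("case ${i + 1}", () => {\n    ${emitAssertion(c)}\n  });`)
    .join("\n\n");
  return [
    `import { countTokens } from ${JSON.stringify(importPath)};`,
    "",
    `describe("countTokens", () => {`,
    body,
    "});",
    "",
  ].join("\n");
}
```

Building each literal with JSON.stringify instead of hand-quoting matters most for the awkward samples: a text containing quotes or newlines would otherwise produce a test file that does not parse.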

Tips for a useful test suite

  • Cover edge cases. Include empty strings, emoji, code blocks, and non-Latin text. These are the inputs where counts are most likely to differ between tokenizers and tokenizer versions.
  • Record counts from your current tokenizer. Run it once, capture the numbers, and pin them. The test guards against change, not against a theoretical “correct” answer.
  • Run it in CI. A token-count test is cheap and catches dependency drift before it reaches your billing dashboard or truncates a user’s prompt.
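
The first two tips can be combined into a small recording step: run the current tokenizer once over a set of edge-case samples, keep whatever it returns, and later compare a fresh run against those pinned values. The sketch below is illustrative only. Counter, PinnedCase, recordCounts, findDrift, and the sample list are hypothetical names, and standInCounter is a whitespace splitter used so the example runs on its own; it is not a real tokenizer and should be replaced with your own countTokens wrapper.

```typescript
// Any synchronous function that maps text to a token count.
type Counter = (text: string) => number;

interface PinnedCase {
  text: string;
  expected: number;
}

// Edge-case samples in the spirit of the tips above: empty string, emoji,
// a code snippet, and non-Latin text. Extend with prompts from your own app.
const edgeCaseSamples: string[] = [
  "",
  "👍🏽 done",
  "function f() {\n  return 1;\n}",
  "こんにちは世界",
];

// Run the current tokenizer once and capture whatever it returns.
function recordCounts(count: Counter, samples: string[]): PinnedCase[] {
  return samples.map((text) => ({ text, expected: count(text) }));
}

// Re-run the tokenizer and describe every case whose count has moved.
function findDrift(count: Counter, pinned: PinnedCase[]): string[] {
  return pinned
    .filter((c) => count(c.text) !== c.expected)
    .map((c) => `${JSON.stringify(c.text)}: expected ${c.expected}, got ${count(c.text)}`);
}

// Stand-in for illustration only; NOT a real tokenizer.
const standInCounter: Counter = (text) => text.split(/\s+/).filter(Boolean).length;

// Record once with today's tokenizer, then feed the result to the generator.
const pinnedCases = recordCounts(standInCounter, edgeCaseSamples);
```

The recorded numbers describe current behaviour, not correctness, which is the point: findDrift returning a non-empty list means something changed and a person should decide whether the new counts are acceptable.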