Context Window Stress Tester

Generate a test prompt that fills exactly N% of a model's context window

Modern models advertise huge context windows — 128K, 200K, even a million tokens — but capacity is not the same as quality. Recall accuracy, latency, and per-call cost all change as you fill the window. The context window stress tester generates filler text sized to land on an exact percentage of a model’s window so you can probe that behaviour deliberately instead of guessing.

How it works

You pick a model (or enter a custom window size), a target fill percentage, and a filler type. The tool multiplies the window size by the target percentage to get a target token count, then converts that count into a character budget using a per-type characters-per-token ratio: roughly 4 characters per token for prose, about 3.2 for source code, and about 2.8 for JSON, whose dense punctuation packs fewer characters into each token. It builds the text in the browser and reports the estimated token count and the character count so you can paste the filler straight into a real call. Because the ratios are approximations, the count produced by a given model's actual tokenizer may differ somewhat from the estimate.
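The sizing arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration and not the tool's actual source: the function names, the `CHARS_PER_TOKEN` table, and the repeat-and-truncate strategy are assumptions made for the example, and only the three ratios come from the description.

```python
# Approximate characters-per-token ratios quoted in the description.
# Real tokenizers vary, so treat these as estimates.
CHARS_PER_TOKEN = {"prose": 4.0, "code": 3.2, "json": 2.8}


def target_tokens(window_tokens: int, fill_percent: float) -> int:
    """Convert a fill percentage into a target token count."""
    if not 0 < fill_percent <= 100:
        raise ValueError("fill_percent must be in (0, 100]")
    return round(window_tokens * fill_percent / 100)


def build_filler(window_tokens: int, fill_percent: float,
                 filler_type: str, seed_text: str):
    """Return (filler_text, estimated_tokens) for the requested fill.

    Hypothetical helper: repeats seed_text and truncates it to the
    character budget implied by the per-type ratio.
    """
    if not seed_text:
        raise ValueError("seed_text must not be empty")
    ratio = CHARS_PER_TOKEN[filler_type]
    tokens = target_tokens(window_tokens, fill_percent)
    target_chars = round(tokens * ratio)
    repeats = target_chars // len(seed_text) + 1
    text = (seed_text * repeats)[:target_chars]
    estimated = round(len(text) / ratio)
    return text, estimated


if __name__ == "__main__":
    seed = "The quick brown fox jumps over the lazy dog. "
    text, est = build_filler(200_000, 75, "prose", seed)
    print(len(text), est)
```

For a 200K window at 75% with prose filler, this yields a 600,000-character string estimated at 150,000 tokens. If you need a tighter match, count the result with the target model's own tokenizer and trim or extend the text accordingly.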

Tips and notes

  • Test the curve, not one point. Run 50%, 75%, and 95% — degradation is rarely linear.
  • Match filler to your workload. If you feed the model JSON, test with JSON filler so token density is realistic.
  • Embed a needle. Drop a unique fact near the start, middle, and end of the filler, then ask the model to retrieve each to map positional recall.
  • Watch cost. Input tokens are billed every call; a 95%-full 200K window is expensive to probe repeatedly.
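
The needle and cost tips above can be sketched as follows. This is an illustrative sketch under stated assumptions: `embed_needles` and `probe_cost` are hypothetical helpers, not part of the tool, and the per-million-token price is a value you supply from your provider's current price list.

```python
def embed_needles(filler: str, needles: dict) -> str:
    """Insert each needle at a fractional position in the filler.

    needles maps a position (0.0 = start, 1.0 = end) to a unique fact.
    The result is slightly longer than the filler, so the achieved
    fill will overshoot the target by the length of the needles.
    """
    out = filler
    # Work from the last position backwards so that earlier offsets,
    # computed against the original filler, remain valid.
    for frac, needle in sorted(needles.items(), reverse=True):
        if not 0.0 <= frac <= 1.0:
            raise ValueError("position must be between 0.0 and 1.0")
        pos = int(len(filler) * frac)
        out = out[:pos] + " " + needle + " " + out[pos:]
    return out


def probe_cost(input_tokens: int, price_per_million: float,
               runs: int = 1) -> float:
    """Estimate input-token cost for repeated probes.

    price_per_million is an assumption supplied by the caller.
    """
    return input_tokens / 1_000_000 * price_per_million * runs


if __name__ == "__main__":
    filler = "lorem ipsum " * 1000
    prompt = embed_needles(filler, {
        0.0: "The start code is AZURE-17.",
        0.5: "The middle code is CORAL-42.",
        1.0: "The end code is EMBER-93.",
    })
    print(len(prompt))
```

Asking the model for each code in turn shows whether recall depends on where the fact sits. Running `probe_cost` for each fill level before you start makes the cost of a full 50%, 75%, and 95% sweep visible up front.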