Context Freshness vs Cost Calculator

Find the optimal context update interval to balance cost and freshness

Balance context freshness against token cost

Retrieval and agent apps often carry a large, slowly changing context block (documentation, a knowledge base, a long system prompt) that is re-sent on every request. The fresher you keep it, the more often you rebuild it, and with prompt caching each rebuild invalidates the cached copy, so more requests pay the full input price. This calculator models the cost of each refresh interval alongside the average staleness it implies, so you can choose a cadence that is cheap enough and fresh enough.

How it works

Every request that includes the context pays for its input tokens:

daily_context_cost = (context_tokens × daily_requests / 1,000,000) × input_price

Here input_price is the price per 1,000,000 input tokens.
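
As a minimal sketch, the formula translates directly into Python. The function name, parameter names, and the sample numbers are illustrative assumptions, not values taken from the calculator.

```python
def daily_context_cost(context_tokens: int,
                       daily_requests: int,
                       input_price_per_mtok: float) -> float:
    """Daily cost of sending the context block on every request.

    input_price_per_mtok is the price per 1,000,000 input tokens.
    """
    return context_tokens * daily_requests / 1_000_000 * input_price_per_mtok


# Illustrative inputs only: a 50,000-token context, 2,000 requests per day,
# and an assumed price of 3.00 per million input tokens.
cost = daily_context_cost(50_000, 2_000, 3.00)
print(f"{cost:.2f}")  # 300.00
```

With these assumed inputs, 100 million context tokens are sent per day, so the cost is 100 times the per-million price.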

Refreshing more often does not change the number of tokens sent per request, but with prompt caching on, each refresh invalidates the cached context, so a shorter interval means more cache misses (full-price reads). A longer interval lowers cost but increases staleness: content can be up to one full interval old, so the average staleness is roughly half the refresh interval. The tool sweeps common intervals and reports cost and staleness side by side so the tradeoff is explicit.
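
One way to picture the sweep is the sketch below. It is not the calculator's actual model. It assumes one full-price cache miss per refresh, a cache that survives the whole interval, requests spread evenly through the day, and a hypothetical cached-read price of 10% of the normal input price (`cached_price_ratio`). All names and numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Row:
    interval_hours: float
    daily_cost: float
    avg_staleness_hours: float


def sweep(context_tokens, daily_requests, input_price,
          cached_price_ratio=0.1,            # assumed cached-read discount
          intervals_hours=(0.25, 1, 6, 24)):
    """Estimate daily cost and average staleness for each refresh interval."""
    # Full price of sending the context once (input_price is per 1M tokens).
    full_price = context_tokens / 1_000_000 * input_price
    rows = []
    for hours in intervals_hours:
        refreshes_per_day = 24 / hours
        # Assumption: the first request after each refresh is a cache miss.
        misses = min(refreshes_per_day, daily_requests)
        hits = daily_requests - misses
        cost = misses * full_price + hits * full_price * cached_price_ratio
        # Content is between 0 and `hours` old, so about half on average.
        rows.append(Row(hours, cost, hours / 2))
    return rows


for row in sweep(50_000, 2_000, 3.00):
    print(f"{row.interval_hours:>5} h  cost/day {row.daily_cost:8.2f}  "
          f"avg staleness {row.avg_staleness_hours} h")
```

Under these assumptions the cost falls and the staleness rises as the interval grows. Real providers also expire cached prefixes after a time limit and may price cache writes differently, which this sketch ignores.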

Tips for cost-effective freshness

  • Turn on prompt caching. If your context is stable between refreshes, cached input tokens are billed at a fraction of the normal rate — often the single biggest lever here.
  • Refresh on change, not on a timer. If you can detect document updates, an event-driven refresh beats a fixed interval for both cost and freshness.
  • Split hot from cold context. Keep volatile facts small and refresh them often; keep the large stable corpus on a long interval.
  • Trim the context. The cheapest token is the one you never send — retrieve only the chunks a query actually needs instead of stuffing the whole corpus.
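
The "refresh on change" tip can be sketched with a content hash: rebuild the context only when the source documents actually differ. This is a minimal illustration, assuming the documents fit in memory as strings. `ContextCache` and the `build` callback are hypothetical names, not part of any library.

```python
import hashlib
from typing import Callable, Optional, Sequence


class ContextCache:
    """Rebuilds the context block only when the documents change."""

    def __init__(self) -> None:
        self._digest: Optional[str] = None
        self._context: Optional[str] = None
        self.rebuilds = 0  # how many times the context was regenerated

    @staticmethod
    def _fingerprint(documents: Sequence[str]) -> str:
        h = hashlib.sha256()
        for doc in documents:
            data = doc.encode("utf-8")
            # Length prefix keeps ["ab", "c"] distinct from ["a", "bc"].
            h.update(len(data).to_bytes(8, "big"))
            h.update(data)
        return h.hexdigest()

    def get(self, documents: Sequence[str],
            build: Callable[[Sequence[str]], str]) -> str:
        digest = self._fingerprint(documents)
        if digest != self._digest:
            # Documents changed (or first call): rebuild the context.
            self._context = build(documents)
            self._digest = digest
            self.rebuilds += 1
        return self._context


cache = ContextCache()
join = lambda docs: "\n\n".join(docs)
cache.get(["pricing v1", "faq"], join)
cache.get(["pricing v1", "faq"], join)   # unchanged: reuses the context
cache.get(["pricing v2", "faq"], join)   # changed: rebuilds
print(cache.rebuilds)  # 2
```

Because an unchanged context produces an identical prompt prefix, this approach also keeps a provider-side prompt cache warm between real updates.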