Hitting 429 errors means you are sending requests faster than your tier allows. This calculator
turns your RPM and TPM limits plus your typical request size into concrete, safe settings:
a sustainable request rate, a concurrency level, and a minimum delay between calls.
How it works
Providers enforce two ceilings simultaneously — requests per minute and tokens per minute — and your real throughput is bounded by whichever you hit first:
- RPM-bound rate = your RPM limit.
- TPM-bound rate = TPM limit ÷ tokens per request.
The calculator takes the smaller of the two and scales it by a safety factor (by default it targets 90% of the limit) to absorb token-count variance and clock drift. It then reports which constraint is binding, along with a safe concurrency estimate and inter-request delay. All math runs locally in your browser.
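The arithmetic above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the calculator's actual source: the function and field names are hypothetical, and the concurrency figure assumes Little's law (requests in flight = request rate × average latency) with an `avg_latency_seconds` input that the description above does not mention.

```python
import math
from dataclasses import dataclass


@dataclass
class SafeSettings:
    binding: str                # "RPM" or "TPM", whichever ceiling is hit first
    requests_per_minute: float  # sustainable rate after the safety factor
    delay_seconds: float        # minimum gap between request starts
    concurrency: int            # estimated safe number of in-flight requests


def safe_settings(rpm_limit, tpm_limit, tokens_per_request,
                  margin=0.9, avg_latency_seconds=2.0):
    """Hypothetical helper mirroring the steps described above."""
    if min(rpm_limit, tpm_limit, tokens_per_request) <= 0:
        raise ValueError("limits and tokens_per_request must be positive")

    rpm_bound = float(rpm_limit)                 # RPM-bound rate
    tpm_bound = tpm_limit / tokens_per_request   # TPM-bound rate
    binding = "RPM" if rpm_bound <= tpm_bound else "TPM"

    # Take the smaller ceiling, then target a fraction of it (default 90%).
    safe_rpm = min(rpm_bound, tpm_bound) * margin
    delay = 60.0 / safe_rpm

    # Assumption: Little's law, in-flight requests = rate per second * latency.
    concurrency = max(1, math.floor(safe_rpm / 60.0 * avg_latency_seconds))
    return SafeSettings(binding, safe_rpm, delay, concurrency)
```

For example, with a 500 RPM limit, a 200,000 TPM limit, and 1,000 tokens per request, the TPM-bound rate is 200 requests per minute, so TPM is the binding constraint. At the default 90% target that gives 180 requests per minute, or roughly one request every 0.33 seconds.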
Tips
If you are TPM-bound, shrinking prompts (trimming history, using retrieval) buys more throughput
than a higher RPM ever will. If you are RPM-bound, batching multiple items into one request
helps. Keep exponential backoff and Retry-After handling in your client regardless: token
estimates drift and quotas may be shared with other clients, so staying under the average
rate reduces 429s but never eliminates them entirely.
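A minimal sketch of that retry handling follows, assuming a hypothetical `send` callable that returns a `(status, headers, body)` tuple, with `headers` as a plain dict keyed by `"Retry-After"`. It honors a Retry-After value given in seconds and otherwise falls back to exponential backoff with full jitter. The HTTP-date form of Retry-After is not parsed here, and the base, cap, and attempt count are illustrative defaults rather than provider recommendations.

```python
import random
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retrying; `attempt` is 0 for the first retry."""
    if retry_after is not None:
        try:
            # Prefer the server's hint when it is given as a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form: not handled in this sketch, use backoff
    # Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt)].
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call the hypothetical `send()` until it stops returning 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    return status, body  # still 429 after the final attempt
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting. In production code you would likely also retry on transient 5xx responses, which this sketch leaves out.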