Daily API budget allocator
Decide how much you can spend on a model per day and see at once how many requests that buys. Enter a daily dollar cap, describe a typical request in tokens, and pick your model. The allocator rounds the result down to a whole maximum-requests-per-day figure and reports the request count at which you cross your chosen alert threshold.
How it works
Every request has a cost determined by its token counts and the model's prices, which are quoted per million tokens:
cost_per_request = (input_tokens / 1,000,000) × input_price
+ (output_tokens / 1,000,000) × output_price
max_requests = floor(daily_budget / cost_per_request)
Output tokens are usually 3–5× more expensive than input, so a chatty completion can shrink your request ceiling fast. The alert threshold simply marks a percentage of the budget — say 80% — and reports the request number at which spend reaches it, giving you a buffer to throttle or switch to a cheaper model.
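The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the allocator's actual code: the function names, the example prices ($3 and $15 per million tokens), and the rule that the alert fires on the first request whose cumulative spend reaches the threshold are all assumptions made for the example.

```python
import math


def cost_per_request(input_tokens, output_tokens, input_price, output_price):
    """Dollar cost of one request; prices are dollars per million tokens."""
    return (input_tokens / 1_000_000) * input_price \
        + (output_tokens / 1_000_000) * output_price


def max_requests(daily_budget, cost):
    """Whole number of requests that fit inside the daily budget."""
    return math.floor(daily_budget / cost)


def alert_request(daily_budget, cost, threshold=0.8):
    """First request number at which cumulative spend reaches the threshold.

    Assumption: the alert fires as soon as spend is at or above
    threshold * daily_budget, hence the ceiling.
    """
    return math.ceil(threshold * daily_budget / cost)


# Hypothetical inputs: 1,000 input tokens, 500 output tokens,
# $3 / $15 per million tokens, and a $10 daily cap.
cost = cost_per_request(1_000, 500, input_price=3.0, output_price=15.0)
print(round(cost, 4))                       # 0.0105
print(max_requests(10.0, cost))             # 952
print(alert_request(10.0, cost, 0.8))       # 762
```

With these example numbers, the 190 requests between the alert (762) and the ceiling (952) are the buffer available for throttling or switching models. Plain floats are fine for a sketch; a budget that divides exactly by the per-request cost may be better served by decimal arithmetic.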
Tips for stretching a daily budget
- Trim output, not just input. Capping max_tokens is often the single biggest lever because output is priced highest.
- Set the alert at 70–80%. That leaves headroom for traffic spikes without blowing the cap.
- Model a worst-case request. Budget against your longest typical call, not the average, so a busy hour does not silently overrun.
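To see how much these tips move the ceiling, the sketch below compares three request shapes under the same budget. The prices, token counts, and the helper name request_ceiling are hypothetical values chosen for illustration, not figures from any particular model.

```python
import math


def request_ceiling(daily_budget, input_tokens, output_tokens,
                    input_price, output_price):
    """Max whole requests per day; prices are dollars per million tokens."""
    cost = (input_tokens / 1_000_000) * input_price \
        + (output_tokens / 1_000_000) * output_price
    return math.floor(daily_budget / cost)


BUDGET = 10.0                    # hypothetical daily cap in dollars
IN_PRICE, OUT_PRICE = 3.0, 15.0  # hypothetical prices per million tokens

# Average call: 1,000 input tokens, 500 output tokens.
average = request_ceiling(BUDGET, 1_000, 500, IN_PRICE, OUT_PRICE)

# Same call with output capped at 250 tokens (for example via max_tokens).
trimmed = request_ceiling(BUDGET, 1_000, 250, IN_PRICE, OUT_PRICE)

# Longest typical call: 4,000 input tokens, 2,000 output tokens.
worst_case = request_ceiling(BUDGET, 4_000, 2_000, IN_PRICE, OUT_PRICE)

print(average)     # 952
print(trimmed)     # 1481
print(worst_case)  # 238
```

Under these assumed numbers, halving the output raises the ceiling by more than half, while budgeting against the longest call cuts it to about a quarter of the average-based figure. That gap is the overrun risk a worst-case budget is meant to absorb.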