Instructor / Pydantic Schema Token Overhead Calculator

Token cost added by Instructor or Pydantic schema injection in prompts

Structured output with Instructor or function calling is convenient, but the schema you inject is billed as input tokens on every request. This tool estimates that overhead from your actual schema and projects the monthly cost.

How it works

When you ask for structured output, the library serializes your Pydantic model to a JSON schema and places it in the prompt (or sends it as a tool/function definition). The model reads that schema to learn the output shape, and you pay the input-token cost for it on every call.

The calculator estimates tokens from the pasted schema using a character-based heuristic tuned for structured text (≈1 token per 3.6 characters, close to BPE tokenizers on JSON and code). It then computes:

overhead/request ($) = schema_tokens × input_price / 1e6    (input_price in $ per 1M input tokens)
monthly cost ($) = overhead/request × requests_per_day × 30
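The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function names are hypothetical, and rounding to the nearest whole token is an assumption.

```python
CHARS_PER_TOKEN = 3.6  # heuristic from the text above; not a real tokenizer


def estimate_tokens(schema_text: str) -> int:
    """Estimate tokens from character length (rounding is an assumption)."""
    return round(len(schema_text) / CHARS_PER_TOKEN)


def overhead_per_request(schema_tokens: int, input_price_per_million: float) -> float:
    """Dollar cost of sending the schema once."""
    return schema_tokens * input_price_per_million / 1e6


def monthly_cost(schema_tokens: int, input_price_per_million: float,
                 requests_per_day: float, days: int = 30) -> float:
    """Dollar cost of sending the schema on every request for `days` days."""
    per_request = overhead_per_request(schema_tokens, input_price_per_million)
    return per_request * requests_per_day * days


if __name__ == "__main__":
    tokens = estimate_tokens("x" * 900)  # stand-in for a 900-character schema
    cost = monthly_cost(tokens, 1.0, 1_000_000 / 30)
    print(f"{tokens} tokens, ${cost:.2f}/month")
```

Passing the price in dollars per 1M tokens mirrors how most providers publish it, which is why the division by 1e6 lives inside `overhead_per_request`.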

Worked example

A moderately rich schema of ~900 characters estimates to ~250 tokens (900 / 3.6). At 1,000,000 requests/month (about 33,333 requests/day) and $1 per 1M input tokens:

  • Overhead per request: 250 × $1 / 1e6 = $0.00025
  • Monthly cost: $0.00025 × 1,000,000 = $250/month

That is $3,000/year spent purely on describing the same output shape over and over, which makes the schema a clear candidate for prompt caching or trimming.
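To see what prompt caching could do to that figure, here is a rough sketch. The function name, the cache hit rate, and the cached-price fraction are all hypothetical inputs; check your provider's pricing for real values. The sketch also ignores cache-write surcharges and minimum cacheable prompt lengths, which some providers apply.

```python
def monthly_cost_with_caching(uncached_monthly_cost: float,
                              cache_hit_rate: float,
                              cached_price_fraction: float) -> float:
    """Blend full-price misses with discounted hits.

    cache_hit_rate: share of requests served from cache (0..1), assumed.
    cached_price_fraction: cached-token price as a share of the normal
        input price (0..1), assumed.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    if not 0.0 <= cached_price_fraction <= 1.0:
        raise ValueError("cached_price_fraction must be between 0 and 1")
    hit_cost = uncached_monthly_cost * cache_hit_rate * cached_price_fraction
    miss_cost = uncached_monthly_cost * (1.0 - cache_hit_rate)
    return hit_cost + miss_cost


if __name__ == "__main__":
    # $250/month from the worked example; 90% hits at 10% of the normal
    # price are illustrative numbers only.
    print(f"${monthly_cost_with_caching(250.0, 0.9, 0.1):.2f}/month")
```

With those illustrative inputs the $250/month overhead drops to roughly $47.50/month; the real saving depends entirely on the hit rate you actually achieve.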

Tips

  • Prompt caching is usually the biggest lever where your provider supports it; a stable schema is ideal cache content and is billed at a fraction of the normal rate on hits.
  • Drop verbose field descriptions and docstrings the model does not need.
  • Flatten unnecessary nesting — deeply nested $defs inflate token counts.
  • Confirm the exact count with a real tokenizer before optimizing aggressively, and model total spend with the LLM API Cost Calculator.
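As an illustration of the "drop verbose descriptions" tip, the sketch below strips `description` keys from a JSON schema and compares the estimated token counts before and after. The helper names and the example schema are hypothetical, and the token figures use the same 3.6 characters-per-token heuristic as above, so treat them as estimates only.

```python
import json

CHARS_PER_TOKEN = 3.6  # same heuristic as the calculator; not a real tokenizer
NAME_MAPS = {"properties", "$defs", "definitions"}  # keys here are user-chosen names


def strip_descriptions(node, in_name_map=False):
    """Return a copy of a JSON schema without "description" annotations.

    Keys inside name maps (e.g. a field literally called "description")
    are kept, because they are field names rather than annotations.
    """
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key == "description" and not in_name_map:
                continue  # drop the annotation
            child_is_name_map = key in NAME_MAPS and not in_name_map
            result[key] = strip_descriptions(value, child_is_name_map)
        return result
    if isinstance(node, list):
        return [strip_descriptions(item) for item in node]
    return node


def estimate_tokens(schema: dict) -> int:
    """Estimate tokens for the compact JSON serialization of a schema."""
    text = json.dumps(schema, separators=(",", ":"))
    return round(len(text) / CHARS_PER_TOKEN)


example_schema = {
    "type": "object",
    "description": "A customer support ticket extracted from an email.",
    "properties": {
        "description": {"type": "string",
                        "description": "Free-text summary of the issue."},
        "priority": {"type": "string", "enum": ["low", "high"],
                     "description": "How urgently the ticket needs attention."},
    },
    "required": ["description", "priority"],
}

if __name__ == "__main__":
    trimmed = strip_descriptions(example_schema)
    print(estimate_tokens(example_schema), "->", estimate_tokens(trimmed))
```

Descriptions sometimes carry instructions the model relies on, so it is worth re-checking output quality after trimming rather than removing them blindly.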