Budget a document extraction pipeline before you run it
Extracting structured fields from thousands of invoices, contracts or records with an LLM can be cheap or eye-watering depending on document size, schema overhead and how often you retry. This estimator gives you a defensible per-document and total cost so you can size a pipeline — or compare models — before processing a single file.
How it works
Each document costs ((doc_tokens + schema_tokens) × input_price) + (output_tokens × output_price), with both prices expressed per token. The schema and instruction tokens are added to every document because you resend them on each call, which is why short documents with a big schema can cost more than you expect. The estimator then applies your retry rate: an 8% retry rate raises effective calls per document to 1.08×, covering the cost of re-running failed or low-confidence extractions (each retry is counted as one more full-price call). Multiply by your document count and you have the total pipeline cost.
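The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the estimator's actual code: the class and function names are hypothetical, prices are assumed to be quoted per million tokens (so they are divided down to per-token prices), and every number in the example is a placeholder rather than a real model price.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class ExtractionJob:
    """Inputs for the estimate. All field names are illustrative assumptions."""
    doc_tokens: int                # average tokens per document
    schema_tokens: int             # schema + instruction tokens resent on every call
    output_tokens: int             # average tokens in the extracted output
    input_price_per_mtok: float    # assumed unit: dollars per million input tokens
    output_price_per_mtok: float   # assumed unit: dollars per million output tokens
    retry_rate: float              # e.g. 0.08 for an 8% retry rate
    doc_count: int


def cost_per_call(job: ExtractionJob) -> float:
    """Cost of one call: input side (document + schema) plus output side."""
    input_cost = (job.doc_tokens + job.schema_tokens) * job.input_price_per_mtok
    output_cost = job.output_tokens * job.output_price_per_mtok
    return (input_cost + output_cost) / TOKENS_PER_MILLION


def cost_per_document(job: ExtractionJob) -> float:
    """Scale by effective calls per document (1.08 for an 8% retry rate)."""
    return cost_per_call(job) * (1 + job.retry_rate)


def total_cost(job: ExtractionJob) -> float:
    """Per-document cost multiplied by the number of documents."""
    return cost_per_document(job) * job.doc_count


# Placeholder numbers, chosen only to make the arithmetic easy to follow.
example = ExtractionJob(
    doc_tokens=1500,
    schema_tokens=500,
    output_tokens=300,
    input_price_per_mtok=0.50,
    output_price_per_mtok=1.50,
    retry_rate=0.08,
    doc_count=10_000,
)
print(f"per document: ${cost_per_document(example):.6f}")  # per document: $0.001566
print(f"total:        ${total_cost(example):.2f}")         # total:        $15.66
```

In this example the 500 schema tokens make up a quarter of the input tokens on every call, which is the overhead the paragraph above warns about.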
Tips to keep extraction cheap at scale
- Shrink the schema. Every redundant field description and example is billed on every document. Keep instructions tight.
- Use a cheaper model with validation. A mini/flash model plus a schema validator often beats a premium model on cost per correct extraction.
- Escalate, don’t blanket. Run everything on the cheap model and only retry the failures on a stronger one, rather than paying premium prices everywhere.
- Batch where you can. Batch APIs are often priced at a discount, and prompts that pack several records into one call spread the schema tokens across them. Both cut per-document cost, but watch context limits and accuracy.
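To make the "escalate, don't blanket" tip concrete, here is a rough Python comparison of the two strategies. It is a sketch under simplifying assumptions: the function names are hypothetical, every escalated document is assumed to succeed on its first premium call, and the per-document costs are placeholder values rather than real prices.

```python
def blanket_cost(premium_cost_per_doc: float, doc_count: int) -> float:
    """Run every document on the premium model."""
    return premium_cost_per_doc * doc_count


def escalated_cost(
    cheap_cost_per_doc: float,
    premium_cost_per_doc: float,
    escalation_rate: float,
    doc_count: int,
) -> float:
    """Run everything on the cheap model, then re-run only the failures.

    escalation_rate is the fraction of documents the cheap model fails on
    (0.10 means 10%). Assumes one premium call fixes each failure.
    """
    per_doc = cheap_cost_per_doc + escalation_rate * premium_cost_per_doc
    return per_doc * doc_count


def break_even_escalation_rate(
    cheap_cost_per_doc: float, premium_cost_per_doc: float
) -> float:
    """Failure rate above which escalation stops being cheaper than blanket."""
    return 1 - cheap_cost_per_doc / premium_cost_per_doc


# Placeholder costs: the premium model is assumed to be 10x the cheap one.
cheap, premium, docs = 0.00145, 0.0145, 10_000

print(f"blanket:    ${blanket_cost(premium, docs):.2f}")                # blanket:    $145.00
print(f"escalated:  ${escalated_cost(cheap, premium, 0.10, docs):.2f}")  # escalated:  $29.00
print(f"break-even: {break_even_escalation_rate(cheap, premium):.0%}")  # break-even: 90%
```

Under these assumptions escalation stays cheaper until the cheap model fails on about 90% of documents. A real pipeline would also need to count the cost of the validation step that detects the failures.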