What does AI customer support actually cost?
“Pennies per conversation” is the pitch, but a real support deployment re-sends conversation history every turn, injects retrieved knowledge-base context, and still escalates a fraction of tickets to humans. This calculator models all three so you get a credible cost per ticket and monthly spend, side by side with the human-agent equivalent.
How the cost is built up
Because LLMs are stateless, every turn re-sends the whole conversation so far. For a conversation of n turns where each turn adds roughly the same number of tokens, total input tokens scale with the triangular number of turns:
turns_input = tokens_per_turn × (n × (n + 1) / 2)
ai_cost_per_ticket = (turns_input / 1e6 × in_price)
                   + (output_tokens / 1e6 × out_price)
blended = ai_cost_per_ticket
        + escalation_rate × human_cost_per_ticket
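That quadratic growth is why long conversations get expensive fast, and why trimming or summarising history matters. Going from 5 turns to 10 raises the multiplier n × (n + 1) / 2 from 15 to 55, so input tokens grow about 3.7× while the conversation only doubles in length.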
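The formulas above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual code. The function names are made up for the example, and every number in the usage section (token counts, prices per million tokens, escalation rate, human cost) is a placeholder assumption rather than a real vendor price or benchmark.

```python
def turns_input(tokens_per_turn: float, n_turns: int) -> float:
    """Total input tokens when every turn re-sends the whole history so far."""
    # Triangular number: 1 + 2 + ... + n turn-units of input.
    return tokens_per_turn * n_turns * (n_turns + 1) / 2


def ai_cost_per_ticket(
    tokens_per_turn: float,
    n_turns: int,
    output_tokens: float,
    in_price: float,   # dollars per million input tokens
    out_price: float,  # dollars per million output tokens
) -> float:
    """Model cost for one ticket: input cost plus output cost."""
    input_cost = turns_input(tokens_per_turn, n_turns) / 1e6 * in_price
    output_cost = output_tokens / 1e6 * out_price
    return input_cost + output_cost


def blended_cost_per_ticket(
    ai_cost: float, escalation_rate: float, human_cost_per_ticket: float
) -> float:
    """AI cost plus the expected human cost of escalated tickets."""
    return ai_cost + escalation_rate * human_cost_per_ticket


# Placeholder inputs, chosen only to make the arithmetic easy to follow.
ai = ai_cost_per_ticket(
    tokens_per_turn=500, n_turns=6, output_tokens=900,
    in_price=1.0, out_price=4.0,
)
blended = blended_cost_per_ticket(ai, escalation_rate=0.2, human_cost_per_ticket=5.0)
print(f"AI cost per ticket: ${ai:.4f}")
print(f"Blended cost per ticket: ${blended:.4f}")
```

With these placeholder inputs the escalation term (0.2 × $5.00) is far larger than the model cost, which shows how strongly the blended figure depends on the escalation rate you enter.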
Tips to lower cost per ticket
Cap conversation length, summarise old turns instead of re-sending them, and route trivial intents to a cheap model (GPT-4o mini, Claude Haiku, Gemini Flash) while reserving a frontier model for hard cases. Most of the bill hides in re-sent context, not in the final answer.
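The effect of capping history can be sketched the same way. The example below assumes one simple capping policy, a sliding window that re-sends only the most recent `window` turns, which is an illustrative choice and not necessarily how the calculator or any particular product trims context. It ignores the extra tokens a summary of older turns would add, and the function names and numbers are hypothetical.

```python
def full_history_input(tokens_per_turn: float, n_turns: int) -> float:
    """Input tokens when each turn re-sends every earlier turn."""
    return tokens_per_turn * sum(range(1, n_turns + 1))


def capped_history_input(tokens_per_turn: float, n_turns: int, window: int) -> float:
    """Input tokens when each turn re-sends at most the last `window` turns.

    Assumption: the window counts the current turn, and dropped turns
    are discarded rather than replaced by a summary.
    """
    return tokens_per_turn * sum(min(i, window) for i in range(1, n_turns + 1))


# Placeholder values: a 10-turn conversation at 500 tokens per turn.
full = full_history_input(500, 10)
capped = capped_history_input(500, 10, window=3)
print(f"Full history: {full:.0f} input tokens")
print(f"Window of 3:  {capped:.0f} input tokens")
print(f"Reduction:    {1 - capped / full:.0%}")
```

Under this policy input tokens grow linearly once the conversation passes the window size, instead of quadratically. The trade-off is that the model no longer sees the dropped turns, which is why summarising them is the usual alternative to discarding them.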