Conversation History Compressor (BYO-key)

Compress long chat histories to fit within context windows.

Ad placeholder (leaderboard)

Keep long conversations inside the context window

Long-running chats eventually overflow the model’s context window or simply waste tokens on stale history. This tool builds a rolling summary: it keeps your most recent turns verbatim and asks the model to compress the older portion into a dense factual block, so the conversation still fits your token budget without losing the facts, names, and decisions that matter.

How it works

You paste your conversation as a JSON array of { "role", "content" } objects and set a target token budget. The tool reserves roughly 40% of that budget for the newest turns, which it passes through unchanged, then sends everything older to your chosen model with a prompt that asks for a compact summary preserving every fact, decision, name, and open question. The output is a single context block: the summary plus the verbatim recent turns, ready to drop back into your next request. Token estimates use the standard ~4-characters-per-token heuristic.

Tips and notes

Set the budget a little below your real limit to leave room for the new user message and the model’s reply. If the summary drops a detail you needed, raise the budget or move that turn into the recent window. The summary prompt is tuned to retain unresolved questions, which is where naive compression usually fails. Your key never leaves your browser except to call the provider directly, and it is never stored.

Ad placeholder (rectangle)