Prompt complexity grader
Not every prompt needs your most expensive model. Some tasks are one-step lookups; others demand multi-hop reasoning, broad world knowledge, and a strict structured output all at once. The prompt complexity grader estimates how much cognitive load your prompt places on the model and returns a 1-10 score with a per-dimension breakdown. It runs entirely in your browser: no API key is required and nothing is uploaded.
How it works
The grader scans your prompt for measurable signals and combines them into four sub-scores:
- Reasoning steps — looks for chains, multi-part questions, “then”, “after”, numbered steps, and conditional logic (“if… else”) that imply sequential thinking.
- World knowledge — flags references to specialized domains, named entities, and breadth of topics the model must already know.
- Instruction ambiguity — penalizes vague qualifiers (“good”, “appropriate”, “etc.”) and rewards explicit, concrete constraints.
- Output complexity — detects demands for structured formats (JSON, tables, schemas), length targets, and multiple required sections.
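As a rough illustration of this kind of signal scanning, the sketch below counts regex matches for three of the four dimensions. The cue lists and the names `REASONING_CUES`, `VAGUE_QUALIFIERS`, `OUTPUT_CUES`, and `countSignals` are hypothetical, chosen from the examples in the list above; the grader's actual patterns are not described here, and world-knowledge detection is left out because it would need a domain lexicon.

```typescript
// Hypothetical cue lists based on the examples in the text;
// the grader's real patterns may differ.
const REASONING_CUES: RegExp[] = [
  /\bthen\b/gi,               // sequencing words
  /\bafter\b/gi,
  /\bif\b[^.?!]*\belse\b/gi,  // "if ... else" within one sentence
  /^\s*\d+[.)]\s/gm,          // numbered steps at the start of a line
];

const VAGUE_QUALIFIERS: RegExp[] = [
  /\bgood\b/gi,
  /\bappropriate\b/gi,
  /\betc\b/gi,
];

const OUTPUT_CUES: RegExp[] = [
  /\bjson\b/gi,
  /\btables?\b/gi,
  /\bschemas?\b/gi,
];

// Sum the number of matches of every pattern in the prompt.
function countSignals(prompt: string, patterns: RegExp[]): number {
  return patterns.reduce(
    (total, pattern) => total + (prompt.match(pattern) || []).length,
    0,
  );
}

const example =
  "First read the file, then summarize it. " +
  "If it is long, shorten it, else keep it. Output JSON.";

console.log(countSignals(example, REASONING_CUES)); // 2 ("then", "If ... else")
console.log(countSignals(example, OUTPUT_CUES));    // 1 ("JSON")
```

Plain pattern counting like this is deterministic by construction, which fits the behavior the grader promises, but it cannot tell a genuine conditional from an incidental "then".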
Each dimension is normalized and the four are blended into a single 1-10 score. The logic is deterministic: the same prompt always grades the same way.
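One way the normalize-and-blend step could look is sketched below. Everything specific in it is an assumption made for illustration: the `SubScores` shape, the `SATURATION` cap of 5 raw signals, equal weighting of the four dimensions, and the convention that a higher raw count always means more load. The text does not state the grader's actual formula.

```typescript
// Raw signal counts per dimension (hypothetical shape).
interface SubScores {
  reasoning: number;
  knowledge: number;
  ambiguity: number;
  output: number;
}

// Assumed cap: raw counts at or above this value normalize to 1.0.
const SATURATION = 5;

// Clamp a raw count to [0, SATURATION] and scale it to [0, 1].
function normalize(rawCount: number): number {
  return Math.min(Math.max(rawCount, 0), SATURATION) / SATURATION;
}

// Blend the four normalized dimensions with equal weights (an assumption)
// and map the mean from [0, 1] onto the integer range 1-10.
function blend(raw: SubScores): number {
  const values = [raw.reasoning, raw.knowledge, raw.ambiguity, raw.output].map(normalize);
  const mean = values.reduce((sum, value) => sum + value, 0) / values.length;
  return Math.round(1 + 9 * mean);
}

console.log(blend({ reasoning: 0, knowledge: 0, ambiguity: 0, output: 0 })); // 1
console.log(blend({ reasoning: 5, knowledge: 5, ambiguity: 5, output: 5 })); // 10
```

Because the function is pure and uses no randomness or network calls, the same inputs always produce the same score.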
Tips and examples
A prompt like “Summarize this paragraph in one sentence” grades low — one step, no special knowledge, simple output. A prompt like “Read these three reports, reconcile the conflicting figures, explain your reasoning step by step, then output a JSON object with a confidence score per claim” grades high across every dimension.
When a prompt grades 8 or higher, consider a frontier model with explicit chain-of-thought prompting, or split the prompt into smaller calls. When it grades 1-3, a small, cheap model with a plain instruction will usually do. Re-grade after editing to confirm that your changes actually reduced the load.
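If you route prompts programmatically, the thresholds above can be written as a small helper. This is only a sketch: `suggestRoute` and the route labels are hypothetical names, and the handling of scores 4-7 is an assumption, since the guidance above covers only the two ends of the scale.

```typescript
// Hypothetical route labels; map them to whatever models you actually use.
type Route = "small-model" | "review-manually" | "frontier-or-split";

function suggestRoute(score: number): Route {
  // Reject anything outside the grader's 1-10 range (this also catches NaN).
  if (!(score >= 1 && score <= 10)) {
    throw new RangeError("score must be between 1 and 10");
  }
  if (score >= 8) return "frontier-or-split"; // frontier model or smaller calls
  if (score <= 3) return "small-model";       // plain instruction, cheap model
  return "review-manually";                   // 4-7: no guidance given, assumed here
}

console.log(suggestRoute(9)); // "frontier-or-split"
console.log(suggestRoute(2)); // "small-model"
```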