Prompt Complexity Grader

Grade the cognitive complexity your prompt demands of the LLM

Not every prompt needs your most expensive model. Some tasks are one-step lookups; others demand multi-hop reasoning, broad world knowledge, and a strict structured output all at once. The prompt complexity grader estimates how much cognitive load your prompt places on the model and returns a 1-10 score with a per-dimension breakdown. It runs entirely in your browser: no API key is needed and nothing is uploaded.

How it works

The grader scans your prompt for measurable signals and combines them into four sub-scores:

  • Reasoning steps — looks for chains, multi-part questions, “then”, “after”, numbered steps, and conditional logic (“if… else”) that imply sequential thinking.
  • World knowledge — flags references to specialized domains, named entities, and breadth of topics the model must already know.
  • Instruction ambiguity — penalizes vague qualifiers (“good”, “appropriate”, “etc.”) and rewards explicit, concrete constraints.
  • Output complexity — detects demands for structured formats (JSON, tables, schemas), length targets, and multiple required sections.

Each dimension is normalized and the four are blended into a single 1-10 score. The logic is deterministic: the same prompt always grades the same way.
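To make the pipeline above concrete, here is a minimal sketch of a signal-based grader, written in Python for brevity even though the tool itself runs in the browser. The word lists, caps, and equal weighting are all assumptions made for illustration and are not the grader's actual rules, so absolute scores from this sketch will not match the tool. What it does show is the shape of the approach: count signals, normalize each dimension to 0-1, blend, and map to 1-10.

```python
import re

# Hypothetical signal patterns; the real grader's lists are not documented here.
REASONING = re.compile(r"\b(?:then|after|if|else|otherwise|step by step)\b", re.I)
NUMBERED = re.compile(r"^\s*\d+[.)]\s", re.M)            # "1. " or "2) " at line start
DOMAINS = re.compile(r"\b(?:legal|medical|clinical|financial|statistical|regulatory)\b", re.I)
ENTITY = re.compile(r"(?<=[a-z,;] )[A-Z][a-z]+")         # capitalized word mid-sentence
VAGUE = re.compile(r"\b(?:good|appropriate|nice|proper|suitable)\b|\betc\b", re.I)
CONCRETE = re.compile(r"\b(?:exactly|at most|at least|must|only|\d+)\b", re.I)
OUTPUT = re.compile(r"\b(?:json|table|schema|csv|yaml|sections?|per|words|bullet)\b", re.I)


def _count(pattern, text):
    """Number of non-overlapping matches of pattern in text."""
    return sum(1 for _ in pattern.finditer(text))


def _norm(count, cap):
    """Clamp count / cap into the range 0.0-1.0 (cap is an assumed saturation point)."""
    return max(0.0, min(count / cap, 1.0))


def sub_scores(prompt):
    """Return the four normalized sub-scores for a prompt."""
    vague = _count(VAGUE, prompt)
    concrete = _count(CONCRETE, prompt)
    return {
        "reasoning": _norm(_count(REASONING, prompt) + _count(NUMBERED, prompt), 4),
        "knowledge": _norm(_count(DOMAINS, prompt) + _count(ENTITY, prompt), 4),
        # Vague qualifiers raise ambiguity; concrete constraints offset them.
        "ambiguity": _norm(max(0, vague - concrete), 3),
        "output": _norm(_count(OUTPUT, prompt), 3),
    }


def grade(prompt):
    """Blend the sub-scores with equal weights (an assumption) into a 1-10 score."""
    parts = sub_scores(prompt)
    blended = sum(parts.values()) / len(parts)
    return 1 + round(9 * blended)


if __name__ == "__main__":
    prompt = "Summarize this paragraph in one sentence"
    print(grade(prompt), sub_scores(prompt))
```

Because the sketch uses only pattern counts and fixed arithmetic, it is deterministic in the same sense as the grader: calling `grade` twice on the same string returns the same score.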

Tips and examples

A prompt like “Summarize this paragraph in one sentence” grades low: it needs one step, no special knowledge, and a simple output. A prompt like “Read these three reports, reconcile the conflicting figures, explain your reasoning step by step, then output a JSON object with a confidence score per claim” grades high across every dimension.

When a prompt grades 8+, consider a frontier model with explicit chain-of-thought, or split it into smaller calls. When it grades 1-3, a small, cheap model with a plain instruction will usually do. Re-grade after editing to confirm your changes actually reduced the load.
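The routing advice above can be written as a small lookup. This is only a sketch: the function name `recommend` is hypothetical, and the wording returned for scores of 4-7 is an assumption, since the guidance here covers only the 8+ and 1-3 bands.

```python
def recommend(score):
    """Map a 1-10 complexity score to the routing advice given in the text."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    if score >= 8:
        return "frontier model with explicit chain-of-thought, or split into smaller calls"
    if score <= 3:
        return "small, cheap model with a plain instruction"
    # Assumption: the text gives no advice for 4-7, so leave the choice to the user.
    return "no fixed guidance; pick a model by cost and quality needs"


if __name__ == "__main__":
    for s in (2, 5, 9):
        print(s, "->", recommend(s))
```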
