Prompt Meta-Evaluator

Score your prompt against 10 best-practice dimensions before sending.

Ad placeholder (leaderboard)

Score your prompt before you spend tokens

A weak prompt wastes tokens, money, and time on outputs you have to re-prompt anyway. The Prompt Meta-Evaluator scores your system and user prompt against ten dimensions that consistently separate reliable prompts from flaky ones, so you can fix the obvious problems before you ever hit the API.

How it works

The tool runs a set of heuristic checks entirely in your browser. It looks at length and structure, detects whether you have defined a role, asked for a specific output format, supplied examples, stated constraints, and broken the task into steps. It also flags ambiguity (vague verbs like “handle” or “deal with”) and checks for prompt-injection resilience — whether untrusted input is clearly delimited and the model is told to treat it as data. Each dimension gets a 0, 1, or 2, and the totals roll up into a score out of 20 with a grade.

Tips for a higher score

  • Define a role and an output format. “You are a senior tax accountant. Reply only in JSON matching this schema” outscores a bare question.
  • Show, don’t just tell. One or two examples of the desired output lift the specificity and example dimensions and dramatically improve consistency.
  • Delimit user input. Wrap pasted content in triple backticks or XML tags and state that anything inside is data, never instructions. This is the single biggest win for injection safety.
  • Cut filler. Politeness padding and restating the obvious lowers the clarity score without improving results — be direct.
Ad placeholder (rectangle)