LLM temperature & parameter guide
Sampling parameters quietly decide whether your LLM feels reliable or unhinged. Set temperature too high on a coding task and you get plausible-looking nonsense; set it too low on a brainstorm and you get the same three ideas every time. This guide maps the common task types to sensible starting parameters and explains the reasoning so you can adjust with intent instead of guessing.
How it works
Pick the task type and how deterministic you need the output to be. The tool returns a recommended temperature, top-p, and max-tokens, plus a short rationale. The logic follows the well-established trade-off: deterministic, correctness-critical tasks (code, extraction, classification) sit near temperature 0, balanced tasks (Q&A, summarisation) sit in the 0.2–0.5 band, and generative, divergent tasks (storytelling, brainstorming, marketing copy) climb toward 0.8–1.2. The determinism slider nudges the recommendation within the band so you can lean more consistent or more varied.
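The mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the names `BANDS`, `TASK_FAMILY`, and `recommend` are hypothetical, the (0.0, 0.1) range standing in for "near temperature 0" is an assumption, and the linear interpolation is just one plausible way for a determinism slider to move within a band. The sketch leaves out top-p and max-tokens because the text gives no concrete values for them.

```python
from dataclasses import dataclass

# Temperature bands per task family. The balanced and generative ranges are
# the ones quoted in the text; (0.0, 0.1) for "near 0" is an assumption.
BANDS = {
    "deterministic": (0.0, 0.1),  # code, extraction, classification
    "balanced": (0.2, 0.5),       # Q&A, summarisation
    "generative": (0.8, 1.2),     # storytelling, brainstorming, marketing copy
}

# Hypothetical task keys mapped to the families named in the text.
TASK_FAMILY = {
    "code": "deterministic",
    "extraction": "deterministic",
    "classification": "deterministic",
    "qa": "balanced",
    "summarisation": "balanced",
    "storytelling": "generative",
    "brainstorming": "generative",
    "marketing_copy": "generative",
}


@dataclass(frozen=True)
class Recommendation:
    task: str
    family: str
    temperature: float


def recommend(task: str, determinism: float = 0.5) -> Recommendation:
    """Return a starting temperature for a task.

    determinism runs from 0.0 (most varied, top of the band)
    to 1.0 (most consistent, bottom of the band).
    """
    if not 0.0 <= determinism <= 1.0:
        raise ValueError("determinism must be between 0 and 1")
    family = TASK_FAMILY[task]
    low, high = BANDS[family]
    # Linear interpolation: higher determinism moves toward the low end.
    temperature = high - determinism * (high - low)
    return Recommendation(task, family, round(temperature, 2))


if __name__ == "__main__":
    for task in ("code", "summarisation", "brainstorming"):
        print(recommend(task, determinism=0.5))
```

Keeping the bands in a plain dictionary makes it easy to swap in ranges that your own evaluation supports.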
Tips and notes
Change temperature or top-p, not both: adjusting the two together makes results hard to reason about, because each one reshapes the same token distribution. For anything you will run repeatedly (tests, pipelines, eval suites), pin temperature to 0 so results are as reproducible as possible; many hosted APIs can still return slightly different outputs at temperature 0, so set a seed as well where the API supports one. If outputs feel repetitive at moderate temperature, a small bump plus a modest frequency_penalty often helps more than a large temperature jump. And remember that the “right” setting is the one that passes your own evaluation on real inputs. Treat these as informed starting points, then measure.
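To make the first tip concrete, here is a small sketch of how temperature and top-p each act on a next-token distribution. It is an illustration under stated assumptions: the logits are made up, the helper names `softmax_with_temperature` and `top_p_filter` are hypothetical, and real providers may apply these steps in a different order or with extra filters.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing by temperature (> 0)."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat 0 as greedy argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalise over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens
for t in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    nucleus = top_p_filter(probs, 0.9)
    survivors = sum(1 for p in nucleus if p > 0)
    print(f"T={t}: top prob={probs[0]:.2f}, tokens kept at top_p=0.9: {survivors}")
```

With these logits, the same top_p of 0.9 keeps a different number of candidate tokens depending on the temperature. That interaction is why moving both settings at once makes it hard to tell which change caused a difference in output.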