Question 1

What does temperature do in an LLM?

Accepted Answer

Temperature divides the model's logits (its raw per-token scores) before they are converted to probabilities, which reshapes the distribution that sampling draws from. Values below 1 sharpen the distribution so the most likely tokens dominate; values above 1 flatten it so less likely tokens get picked more often, increasing variety and randomness.
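
As a rough illustration of the scaling step, here is a minimal Python sketch of a softmax with temperature. The function name `softmax_with_temperature` and the three logit values are made up for the example; real inference stacks do this on tensors over the full vocabulary, but the arithmetic is the same idea.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide raw scores by the temperature, then normalize to probabilities."""
    if temperature <= 0:
        # Temperature 0 is usually handled as a special case (greedy pick),
        # so this sketch only accepts positive values.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]

sharp = softmax_with_temperature(logits, 0.5)  # low temperature
base = softmax_with_temperature(logits, 1.0)   # unscaled
flat = softmax_with_temperature(logits, 2.0)   # high temperature

for name, probs in (("T=0.5", sharp), ("T=1.0", base), ("T=2.0", flat)):
    print(name, [round(p, 3) for p in probs])
```

With these inputs the top token's share shrinks as the temperature rises, which is the "flattening" described above.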

Question 2

What is a typical temperature range?

Accepted Answer

Many APIs accept values from 0.0 to 2.0, though some cap the range at 1.0, so check your provider's documentation. Values around 0.0–0.3 give focused, near-deterministic output; 0.7–1.0 is a common default for general chat; above about 1.2, output becomes noticeably more random and coherence can degrade.
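
To get a feel for how the range behaves, the sketch below measures how spread out a toy distribution is at several temperatures. The four logits are hypothetical, and the helper names `token_probs` and `entropy_bits` are introduced only for this example; the exact numbers depend entirely on the logits, so treat the output as a trend rather than a rule.

```python
import math

def token_probs(logits, temperature):
    """Softmax over logits divided by a positive temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy in bits: higher means a more random pick."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical scores for four candidate tokens.
logits = [3.0, 2.0, 1.0, 0.0]

rows = []
for t in (0.2, 0.7, 1.0, 1.5, 2.0):
    probs = token_probs(logits, t)
    rows.append((t, max(probs), entropy_bits(probs)))

for t, top_p, h in rows:
    print(f"T={t:<3}  top-token p={top_p:.3f}  entropy={h:.3f} bits")
```

At the low end almost all probability sits on one token; toward 2.0 the entropy climbs toward the 2-bit maximum for four tokens, which matches the loss of focus described above.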

Question 3

What temperature should I use for factual tasks?

Accepted Answer

Use a low temperature, often 0.0 to 0.2, for factual answers, data extraction, classification, and code, where you want consistent output that sticks to the most likely tokens. Reserve higher values for brainstorming and creative writing.

Question 4

Does temperature 0 guarantee identical output?

Accepted Answer

Not quite. Temperature 0 makes sampling effectively greedy (the highest-probability token is always chosen), so output is far more consistent, but it is not always bit-for-bit identical. Hardware differences, batching, and floating-point non-determinism can still cause small variations between runs.
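
The floating-point part can be shown with a toy example. The sketch below is not how any real inference stack computes scores; the contribution values and the helper names `accumulate` and `greedy_pick` are made up. It only demonstrates that adding the same numbers in a different order (as can happen when batching or hardware kernels change) may produce a slightly different total, and that a greedy pick between two nearly tied tokens can flip as a result.

```python
def accumulate(values):
    """Add floats strictly left to right, with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total

def greedy_pick(scores):
    """Index of the highest score; ties go to the lowest index."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Same three contributions, summed in two different orders.
forward = accumulate([0.1, 0.2, 0.3])
backward = accumulate([0.3, 0.2, 0.1])
print(forward == backward)  # the two totals differ in the last bit

# Token 0 has a fixed score; token 1's score is the accumulated total.
run_a = [0.6, forward]
run_b = [0.6, backward]
print(greedy_pick(run_a), greedy_pick(run_b))
```

In practice such flips are rare because they need two candidates whose scores are almost exactly tied, but once one token differs, everything generated after it can diverge.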

Temperature (AI Glossary)

Definition

How it works under the hood

The practical range

Choosing the right value

Temperature vs top-p