Question 1

What is the difference between temperature and top_p?

Accepted Answer

Temperature reshapes the entire probability distribution over the next token: the logits are divided by the temperature before the softmax, so values below 1 sharpen the distribution toward the likeliest tokens and values above 1 flatten it so that unlikely tokens get a chance. Top_p (nucleus sampling) instead truncates the distribution, keeping only the smallest set of most-likely tokens whose cumulative probability reaches p, then renormalizes and samples from that set. Temperature controls how peaked the distribution is; top_p controls how many candidates survive.
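To make the distinction concrete, here is a minimal sketch of both operations on a toy four-token vocabulary. It assumes the common formulation in which logits are divided by the temperature before the softmax; the function names `apply_temperature` and `top_p_filter` and the example logits are made up for illustration and do not come from any particular library.

```python
import math

def apply_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("this sketch requires temperature > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

logits = [2.0, 1.0, 0.5, -1.0]           # hypothetical next-token logits
sharp = apply_temperature(logits, 0.5)   # low temperature: peakier
flat = apply_temperature(logits, 2.0)    # high temperature: flatter
nucleus = top_p_filter(apply_temperature(logits, 1.0), 0.8)

print([round(x, 3) for x in sharp])
print([round(x, 3) for x in flat])
print(sorted(nucleus))  # token indices that survive top_p = 0.8
```

Temperature changes every probability but keeps all four tokens in play, while the top_p filter leaves the relative ordering alone and simply discards the tail.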

Question 2

Should I change both temperature and top_p together?

Accepted Answer

Generally no. OpenAI's API documentation recommends altering one or the other but not both, and several other providers give similar guidance. The two parameters interact in ways that are hard to reason about: a high temperature combined with a top_p near 1 compounds randomness, while a low top_p can cancel out most of the effect of a high temperature and leave the output unexpectedly deterministic. Pick the one that matches your mental model, usually temperature, and tune just that.
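One way to see the interaction is to count how many tokens survive a fixed top_p as the temperature changes. The sketch below assumes temperature is applied before the top_p cut (a common order, but check your provider or library); `nucleus_size` and the logits are hypothetical.

```python
import math

def nucleus_size(logits, temperature, top_p):
    """Count the tokens kept by top_p after temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = sorted((e / total for e in exps), reverse=True)
    cumulative = 0.0
    for count, prob in enumerate(probs, start=1):
        cumulative += prob
        if cumulative >= top_p:
            return count
    return len(probs)

logits = [3.0, 2.0, 1.0, 0.0, -1.0]  # hypothetical next-token logits
sizes = {t: nucleus_size(logits, t, 0.9) for t in (0.2, 1.0, 2.0)}
print(sizes)  # {0.2: 1, 1.0: 3, 2.0: 4}
```

With the same top_p of 0.9, the candidate pool is a single token at temperature 0.2 and four tokens at temperature 2.0, so the effect of one setting depends on the other.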

Question 3

What are good default values?

Accepted Answer

For factual, deterministic tasks (extraction, classification, code), use a low temperature, around 0 to 0.3. For balanced general writing, 0.7 is a common default. For creative brainstorming, try 0.9 to 1.2, where your provider allows values above 1.0 (accepted ranges differ between APIs). If you tune top_p instead, leave temperature at 1.0 and lower top_p to somewhere in the 0.5–0.9 range to restrict the candidate pool.
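These defaults could be captured in a small lookup table. This is only a sketch: the preset names, the specific values picked from within each range, and the `sampling_params` helper are illustrative assumptions, not part of any provider's API.

```python
# Hypothetical presets; each value is picked from the ranges given above.
SAMPLING_PRESETS = {
    "extraction": {"temperature": 0.0},
    "classification": {"temperature": 0.0},
    "code": {"temperature": 0.2},
    "general_writing": {"temperature": 0.7},
    "brainstorming": {"temperature": 1.0},
}

def sampling_params(task, top_p=None):
    """Return sampling settings for a task.

    If top_p is given, tune top_p only and keep temperature at 1.0,
    so that just one of the two knobs is moved.
    """
    if top_p is not None:
        if not 0.0 < top_p <= 1.0:
            raise ValueError("top_p must be in (0, 1]")
        return {"temperature": 1.0, "top_p": top_p}
    return dict(SAMPLING_PRESETS[task])

print(sampling_params("general_writing"))
print(sampling_params("code", top_p=0.8))
```

The helper returns only the parameter being tuned (plus temperature 1.0 in the top_p case), leaving everything else at the provider's defaults.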

Question 4

Does temperature 0 make the model fully deterministic?

Accepted Answer

Almost, but it is not guaranteed. At temperature 0 the model greedily picks the highest-probability token at each step, which produces near-identical output for the same prompt. In practice, small non-determinism can still creep in from floating-point operations, batching, or hardware differences, so do not rely on bit-for-bit identical output even at temperature 0.
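The following sketch illustrates why greedy decoding can still vary. It treats temperature 0 as a plain argmax (an assumption about how the special case is handled) and simulates a tiny numerical difference by hand; `greedy_pick` and the logits are hypothetical, and the perturbation stands in for real effects such as a different summation order on another batch size or GPU.

```python
def greedy_pick(logits):
    """Temperature-0 decoding modeled as argmax over the logits."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Floating-point addition is not associative, so the order in which a
# kernel sums values can change the last bits of a result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Two nearly tied logits: a tiny simulated perturbation flips the winner.
logits = [1.2000001, 1.2000000, -0.3]
perturbed = [logits[0] - 2e-7, logits[1], logits[2]]
print(greedy_pick(logits), greedy_pick(perturbed))  # 0 1
```

Once a single token flips, every later token is conditioned on a different prefix, so the two outputs can diverge substantially from that point on.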

Top-p vs Temperature in LLMs: What's the Difference?

Two different ways to control randomness

How temperature works

How top_p (nucleus sampling) works

Practical guidance: pick one and set sensible defaults