Question 1

What is top-k sampling?

Accepted Answer

Top-k sampling restricts the model to choosing its next token from only the K most likely candidates. The rest of the vocabulary is discarded for that step, the probabilities of the surviving K tokens are renormalized so they sum to 1, and one token is sampled from them in proportion to those probabilities.
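The procedure above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not any particular library's implementation; the function name `top_k_sample` and the example logits are hypothetical.

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample one token index from the k highest-scoring logits."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the vocabulary size")
    # Indices of the k largest logits; everything else is discarded for this step.
    kept = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the survivors only, which renormalizes their probabilities.
    peak = logits[kept[0]]
    weights = [math.exp(logits[i] - peak) for i in kept]
    # random.choices draws one index in proportion to the weights.
    return rng.choices(kept, weights=weights, k=1)[0]

# Hypothetical scores for a 5-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0, -3.0]
token = top_k_sample(logits, k=3, rng=random.Random(0))
print(token)  # always one of 0, 1, or 2; tokens 3 and 4 can never be chosen
```

Passing a seeded `random.Random` instance makes the draw reproducible, which is convenient for testing.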

Question 2

How does top-k differ from top-p?

Accepted Answer

Top-k keeps a fixed number of candidates (always K), regardless of how confident the model is. Top-p keeps a variable number: the smallest set of most likely tokens whose cumulative probability reaches P. That set expands when the model is uncertain and shrinks when it is confident.
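To make the contrast concrete, here is a small sketch that counts the candidates each method keeps for a confident and an uncertain distribution. The helper names and the two example distributions are made up for illustration.

```python
def top_k_candidates(probs, k):
    """Indices of the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return order[:k]

def top_p_candidates(probs, p):
    """Smallest set of most probable tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = [], 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= p:
            break
    return kept

# Hypothetical next-token distributions over a 5-token vocabulary.
confident = [0.90, 0.05, 0.03, 0.01, 0.01]
uncertain = [0.25, 0.22, 0.20, 0.18, 0.15]

print(len(top_k_candidates(confident, 3)))    # 3
print(len(top_k_candidates(uncertain, 3)))    # 3
print(len(top_p_candidates(confident, 0.8)))  # 1
print(len(top_p_candidates(uncertain, 0.8)))  # 4
```

With K = 3 the candidate count never changes, while P = 0.8 keeps one token in the confident case and four in the uncertain one.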

Question 3

What is a typical top-k value?

Accepted Answer

Values such as 40 or 50 are common defaults. A small K (for example, 1 to 10) keeps output focused but can make it repetitive, while a large K approaches unrestricted sampling. K = 1 is equivalent to greedy decoding, because only the single most likely token survives.
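The two extremes are easy to check on a toy distribution. The sketch below assumes a hypothetical helper, `top_k_distribution`, that returns the renormalized probabilities after truncation; the numbers are illustrative only.

```python
def top_k_distribution(probs, k):
    """Zero out all but the k most probable tokens and renormalize the rest."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set(order[:k])
    mass = sum(probs[i] for i in kept)
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]

# Hypothetical 4-token distribution.
probs = [0.5, 0.25, 0.125, 0.125]

print(top_k_distribution(probs, 1))  # [1.0, 0.0, 0.0, 0.0]  (greedy: only the top token)
print(top_k_distribution(probs, 4))  # [0.5, 0.25, 0.125, 0.125]  (no restriction)
```

Intermediate values of K fall between these two cases: the kept tokens share all of the probability mass, and every other token gets zero.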

Question 4

Should I use top-k together with temperature?

Accepted Answer

They can be combined, and many APIs apply temperature first to reshape the distribution, then top-k to truncate it. To keep behaviour predictable, a common approach is to tune one randomness control as the primary dial rather than changing all of them aggressively at once.
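As a rough sketch of that ordering, the function below scales the logits by a temperature and then truncates to the top K before normalizing. The name `temperature_top_k_probs` and the example values are assumptions for illustration; real APIs may differ in details.

```python
import math
import random

def temperature_top_k_probs(logits, temperature, k):
    """Sampling distribution after temperature scaling, then top-k truncation."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Step 1: temperature reshapes the distribution (lower = sharper).
    scaled = [x / temperature for x in logits]
    # Step 2: top-k keeps only the k highest-scoring tokens.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = order[:k]
    # Softmax over the kept tokens; all others get probability 0.
    peak = scaled[kept[0]]
    weights = {i: math.exp(scaled[i] - peak) for i in kept}
    total = sum(weights.values())
    return [weights[i] / total if i in weights else 0.0 for i in range(len(logits))]

# Hypothetical logits for a 4-token vocabulary.
logits = [3.0, 2.0, 1.0, 0.0]
cool = temperature_top_k_probs(logits, temperature=0.5, k=2)
warm = temperature_top_k_probs(logits, temperature=2.0, k=2)

# Both settings keep tokens 0 and 1; the lower temperature puts more weight on token 0.
token = random.choices(range(len(logits)), weights=warm, k=1)[0]
```

Here K fixes which tokens are eligible, while temperature only changes how the probability is shared among them, which is one reason to treat a single control as the main dial.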

Top-K Sampling (AI Glossary)
