Question 1

What is a hyperparameter?

Accepted Answer

A hyperparameter is a configuration value that you set before training begins and that the learning algorithm does not adjust on its own. Examples include the learning rate, batch size, number of layers, and dropout rate. Hyperparameters control the model's structure and how it learns, rather than what it learns.
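To make "set before training" concrete, here is a minimal sketch, assuming a toy one-dimensional loss (w - 3) ** 2 in place of a real model. The function name minimise and the two learning rates are illustrative choices, not taken from any library. The only thing that differs between the two runs is the learning rate, which the loop never modifies.

```python
def minimise(learning_rate, steps=50, start=0.0):
    """Run plain gradient descent on the toy loss (w - 3) ** 2."""
    w = start  # the value being learned
    for _ in range(steps):
        gradient = 2 * (w - 3)         # derivative of (w - 3) ** 2
        w -= learning_rate * gradient  # learning_rate is fixed for the whole run
    return w


# Same algorithm, same starting point, two hyperparameter choices.
w_small_step = minimise(learning_rate=0.1)  # settles close to the minimum at 3
w_large_step = minimise(learning_rate=1.1)  # overshoots more on every step
```

With the smaller learning rate the result ends up near 3; with the larger one each update overshoots the minimum by more than the last, so the run diverges.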

Question 2

What is the difference between a hyperparameter and a parameter?

Accepted Answer

Parameters are the weights and biases the model learns automatically from data during training. Hyperparameters are the settings you choose beforehand that govern the training process itself. You tune hyperparameters; the model fits parameters.
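The split can be seen in a few lines of code. The following is a sketch under simple assumptions: a one-feature linear model fitted by gradient descent on mean squared error, with made-up data that follows y = 2x + 1. The function name train is hypothetical. The arguments learning_rate and epochs are hyperparameters chosen by the caller; w and b are parameters that the loop fits.

```python
def train(xs, ys, learning_rate, epochs):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # parameters: start arbitrary, get fitted to the data
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # illustrative data generated from y = 2x + 1

# Hyperparameters are passed in; parameters come back out.
w, b = train(xs, ys, learning_rate=0.05, epochs=2000)
```

Here the caller never sets w or b directly, and the training loop never changes learning_rate or epochs.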

Question 3

What are common hyperparameters?

Accepted Answer

Common hyperparameters include the learning rate, batch size, number of training epochs, the number and width of layers, dropout rate, weight decay, and the choice of optimiser. For prompting and inference, sampling settings such as temperature and top-p are sometimes loosely called hyperparameters too, even though they play no part in training.
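Temperature and top-p are easier to place once you see what they do to a model's output. The sketch below assumes the common definitions: temperature divides the logits before the softmax, and top-p keeps the smallest set of most likely tokens whose combined probability reaches the threshold. The function names and the example logits are illustrative, not from any particular library.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    scaled = [logit / temperature for logit in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalise."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
base = softmax_with_temperature(logits, temperature=1.0)
sharper = softmax_with_temperature(logits, temperature=0.5)
nucleus = top_p_filter(base, top_p=0.9)  # the least likely token is dropped
```

Neither function touches the model's weights. Both only reshape the distribution that is sampled from, which is why calling them hyperparameters is a loose usage.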

Question 4

How do you tune hyperparameters?

Accepted Answer

Typical strategies are grid search (try every combination on a defined grid), random search (sample combinations at random, which is often more efficient than grid search when only a few hyperparameters matter), and Bayesian optimisation (use past results to choose promising settings). All three evaluate candidates on a held-out validation set rather than on the test set.
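Grid search and random search are short enough to sketch directly. In the example below, validation_loss is a hypothetical stand-in for "train a model with these settings, then score it on the validation set"; its formula, the search ranges, and the grid values are all assumptions made for illustration. Bayesian optimisation is left out because it needs a surrogate model and is normally done with a dedicated library.

```python
import itertools
import math
import random


def validation_loss(learning_rate, dropout):
    """Hypothetical validation score; lowest at learning_rate=0.01, dropout=0.3."""
    return (math.log10(learning_rate) + 2) ** 2 + (dropout - 0.3) ** 2


def grid_search(grid):
    """Evaluate every combination of the listed values and return the best one."""
    names = list(grid)
    candidates = [
        dict(zip(names, values))
        for values in itertools.product(*(grid[name] for name in names))
    ]
    return min(candidates, key=lambda c: validation_loss(**c))


def random_search(n_trials, seed=0):
    """Sample settings at random (learning rate on a log scale) and return the best."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    candidates = [
        {
            "learning_rate": 10 ** rng.uniform(-4, 0),
            "dropout": rng.uniform(0.0, 0.6),
        }
        for _ in range(n_trials)
    ]
    return min(candidates, key=lambda c: validation_loss(**c))


grid = {"learning_rate": [1e-3, 1e-2, 1e-1], "dropout": [0.1, 0.3, 0.5]}
best_from_grid = grid_search(grid)              # 9 evaluations, fixed values only
best_from_random = random_search(n_trials=9)    # 9 evaluations, values anywhere in range
```

Both searches spend the same budget of nine evaluations. The grid can only ever return one of its listed values, while random search tries nine distinct values of each hyperparameter, which is the usual argument for its efficiency.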

Hyperparameter (AI Glossary)
