LLM Glossary: The Key AI and Machine Learning Terms Explained

A plain-English dictionary of the AI terms that actually matter

How to use this glossary

AI terminology is full of overlapping jargon, and the same idea often has three names. This glossary defines the terms you will actually meet when reading documentation, papers, product pages, or news about large language models. It is model-neutral — the vocabulary applies whether you use GPT, Claude, Gemini, Llama, or Mistral — and grouped by theme so related ideas sit together.

Core model and architecture terms

  • LLM (large language model): a neural network trained on vast amounts of text to predict the next token and, by repeating that prediction, generate language.
  • Transformer: the architecture behind nearly every modern LLM, built on the attention mechanism.
  • Attention / self-attention: the mechanism that lets the model weigh how strongly each token relates to every other token in the input.
  • Parameters / weights: the billions of learned numbers that store what the model knows.
  • Mixture of experts (MoE): an architecture that activates only part of the network per token to save compute.
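
To make the attention entry above more concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is a simplification: a real Transformer first passes each token through learned projection matrices to get separate query, key, and value vectors, whereas this toy version reuses the raw input vectors for all three. The function names and the three toy vectors are illustrative assumptions, not part of any library.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over small lists of vectors.

    Returns (outputs, weights). weights[i][j] is how much token i
    attends to token j.
    """
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this token's query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy "token" vectors (assumed values, not from a real model).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how one token spreads its attention across all three tokens; every row sums to 1.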

Input, output, and usage terms

  • Token: a chunk of text (about ¾ of an English word on average) that the model reads as a single unit; limits and usage are counted in tokens.
  • Context window: the maximum number of tokens the model can consider at once.
  • Prompt: the input text and instructions you give the model.
  • System prompt: persistent instructions that set the model’s role and behavior.
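  • Temperature: a sampling setting, typically from 0 to about 2, that controls randomness; low values make output more predictable (close to deterministic at 0), high values make it more varied.
  • Top-p (nucleus sampling): an alternative randomness control that samples only from the smallest set of most-likely tokens whose combined probability reaches p.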
  • Inference: the act of running the trained model to produce output.
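
The temperature and top-p entries above are easier to follow with numbers. The sketch below shows one common way these two controls are applied to a model's raw scores (logits) before a token is picked. It is a toy illustration under stated assumptions: the function names and the logits are made up, and real inference libraries may order or combine the steps differently.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Convert logits to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability on the top token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [l / temperature for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose probabilities sum to >= p."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, prob in ranked:
        kept.append((index, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    # Renormalize so the kept probabilities sum to 1 again.
    return {index: prob / total for index, prob in kept}


def sample(logits, temperature=1.0, top_p=1.0, rng=random):
    """Pick one token index using temperature scaling, then top-p filtering."""
    nucleus = top_p_filter(apply_temperature(logits, temperature), top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


# Hypothetical logits for a three-token vocabulary.
logits = [2.0, 1.0, 0.1]
print(apply_temperature(logits, 0.5))  # sharper: top token dominates
print(apply_temperature(logits, 2.0))  # flatter: choices are closer together
print(sample(logits, temperature=0.8, top_p=0.9, rng=random.Random(0)))
```

Lowering the temperature concentrates probability on the top token, while lowering top-p removes unlikely tokens from consideration entirely; the two are often tuned one at a time rather than together.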

Knowledge, accuracy, and customization terms

  • Embedding: a vector of numbers representing the meaning of text for similarity search.
  • RAG (retrieval-augmented generation): fetching relevant documents and adding them to the prompt so answers use current or private data.
  • Fine-tuning: further training a model on your own examples to change its behavior or style.
  • LoRA / QLoRA: parameter-efficient fine-tuning methods that train small adapters instead of the whole model.
  • Hallucination: confident output that is factually wrong or invented.
  • Grounding: anchoring the model’s answers to a verified source to reduce hallucination.
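
The embedding and RAG entries above describe a pipeline: turn text into vectors, find the stored text closest to the question, and put it in the prompt. The sketch below walks through those steps with a deliberately crude stand-in for an embedding, plain word counts, so it runs without any model. `toy_embed`, `retrieve`, `build_rag_prompt`, and the sample documents are all hypothetical names and data for illustration; a real system would call an embedding model and usually a vector database.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in for an embedding model: a sparse vector of word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors (1.0 means same direction)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = toy_embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, toy_embed(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(query, documents, k=1):
    """Add the retrieved text to the prompt so the answer can be grounded in it."""
    context = "\n".join(retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Made-up private documents the base model would not know about.
docs = [
    "The refund window is 30 days from delivery.",
    "Support is open on weekdays from 9 to 5.",
]
print(build_rag_prompt("How long is the refund window?", docs))
```

Word counts only match exact words, which is the gap real embeddings close: they place "refund" and "money back" near each other even though the strings differ.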

Training, alignment, and behavior terms

  • Pre-training: the initial large-scale training on raw text.
  • RLHF (reinforcement learning from human feedback): a training technique that uses human ratings of model outputs to steer behavior toward human preferences.
  • Alignment: making a model’s behavior match human intentions and values.
  • Zero-shot / few-shot: asking a model to do a task with no examples (zero-shot) or a handful of examples (few-shot) in the prompt.
  • Chain-of-thought: prompting the model to reason step by step before answering.
  • Agent: an LLM-driven system that plans, uses tools, and takes multi-step actions autonomously.
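
The zero-shot, few-shot, and chain-of-thought entries above all describe ways of writing a prompt, so they can be shown as plain string building. This is a minimal sketch: the function name `build_few_shot_prompt`, the "Input:/Output:" layout, and the wording of the step-by-step instruction are assumptions chosen for illustration, not a required format.

```python
def build_few_shot_prompt(task, query, examples=(), chain_of_thought=False):
    """Assemble a prompt: zero-shot if examples is empty, few-shot otherwise."""
    parts = [task]
    if chain_of_thought:
        # Chain-of-thought: ask for the reasoning before the final answer.
        parts.append("Think through the problem step by step before answering.")
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    # The real question goes last, with the output left for the model to fill in.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


task = "Classify the sentiment of each review as positive or negative."

zero_shot = build_few_shot_prompt(task, "The battery died in a day.")

few_shot = build_few_shot_prompt(
    task,
    "The battery died in a day.",
    examples=[
        ("Arrived early and works perfectly.", "positive"),
        ("The screen cracked within a week.", "negative"),
    ],
)

print(zero_shot)
print("---")
print(few_shot)
```

The only difference between the two prompts is the worked examples placed before the real question; the model itself is unchanged, which is what separates few-shot prompting from fine-tuning.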

Bookmark this page and pair it with the AI terminology quiz to lock the definitions into memory — recognizing a term and being able to explain it are two different skills.
