Question 1

What are the most important LLM terms a beginner should learn first?

Accepted Answer

Start with token, prompt, context window, embedding, hallucination, temperature, and fine-tuning. These seven cover most everyday conversations about LLMs: how text is chopped up and counted, how you instruct the model, how much it can read at once, how it represents meaning, why it makes things up, how random its output is, and how it gets customized.
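Of these seven, temperature is the easiest to show with numbers. The following is a minimal sketch, assuming the common formulation in which the model's raw scores (logits) are divided by the temperature before being converted to probabilities. The function name and the three example scores are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]

low = softmax_with_temperature(logits, 0.5)   # sharper: top choice dominates
high = softmax_with_temperature(logits, 2.0)  # flatter: choices look more alike

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

A lower temperature concentrates probability on the top-scoring token, so output is more predictable. A higher temperature spreads probability out, so output is more varied.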

Question 2

What is the difference between an embedding and a token?

Accepted Answer

A token is a chunk of text, roughly a word or word-piece, that the model reads and that providers count for pricing. An embedding is a list of numbers (a vector) that represents the meaning of a token, word, or whole passage in a form the model can do math on. Tokens are about splitting text; embeddings are about measuring similarity and meaning.
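The distinction can be sketched in a few lines of Python. This is a toy illustration, not how a production tokenizer or embedding model works: the whitespace-based tokenize function and the hand-written three-number vectors in TOY_EMBEDDINGS are assumptions made up for this example. Real models use subword tokenizers and learned vectors with far more dimensions.

```python
import math

def tokenize(text):
    """Toy tokenizer: lowercase and split on whitespace."""
    return text.lower().split()

# Hypothetical hand-written vectors; real embeddings are learned, not typed in.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
}

def cosine_similarity(a, b):
    """Similarity of two vectors by angle: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Tokens: splitting and counting text.
tokens = tokenize("The cat sat")
print(tokens, len(tokens))

# Embeddings: comparing meaning with math.
sim_close = cosine_similarity(TOY_EMBEDDINGS["cat"], TOY_EMBEDDINGS["kitten"])
sim_far = cosine_similarity(TOY_EMBEDDINGS["cat"], TOY_EMBEDDINGS["car"])
print(sim_close > sim_far)
```

In this toy data, "cat" and "kitten" point in nearly the same direction, so their similarity is higher than that of "cat" and "car".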

Question 3

What does RAG stand for and why does it matter?

Accepted Answer

RAG stands for retrieval-augmented generation. It means searching a knowledge base for relevant documents and inserting them into the prompt before the model answers, so the LLM responds using current, specific information instead of only its training data. RAG matters because it can reduce hallucination and lets a model answer questions about private or up-to-date content without retraining.
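The retrieve-then-insert flow can be sketched as follows. This is a simplified illustration under stated assumptions: the three-sentence KNOWLEDGE_BASE, the word-overlap scoring in retrieve, and the prompt wording in build_prompt are all hypothetical. Real systems typically retrieve with embeddings or a search index, and the finished prompt would then be sent to an LLM, a step omitted here.

```python
# Hypothetical in-memory knowledge base standing in for a real document store.
KNOWLEDGE_BASE = [
    "The office is closed on public holidays.",
    "Refunds are processed within 14 days of the request.",
    "Support is available by email on weekdays.",
]

def words(text):
    """Lowercase the text, drop basic punctuation, and return its set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q = words(question)
    ranked = sorted(documents, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    """Insert the retrieved documents into the prompt ahead of the question."""
    context = "\n".join(f"- {d}" for d in documents)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

question = "How long do refunds take?"
retrieved = retrieve(question, KNOWLEDGE_BASE)
prompt = build_prompt(question, retrieved)
print(prompt)
```

The model never has to be retrained: updating the knowledge base changes what gets retrieved and therefore what the model sees in its prompt.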

Question 4

Is this glossary specific to one AI model or company?

Accepted Answer

No. The terms here apply across all major large language models — GPT, Claude, Gemini, Llama, Mistral, and others — and across the broader machine learning field. Vendor-specific features change often, but the underlying vocabulary of tokens, attention, embeddings, fine-tuning, and inference is shared, which is exactly why a model-neutral glossary stays useful.

LLM Glossary: The Key AI and Machine Learning Terms Explained

How to use this glossary

Core model and architecture terms

Input, output, and usage terms

Knowledge, accuracy, and customization terms

Training, alignment, and behavior terms