Question 1

What does autoregressive mean in AI?

Accepted Answer

It means the model generates output one token at a time, using everything it has produced so far to predict the next token. Each step's output becomes part of the input for the next step.
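The loop described above can be sketched in a few lines of Python. This is only a toy illustration: the `BIGRAMS` table, the `next_token` function, and the `<s>` / `</s>` markers are hypothetical stand-ins for a real neural network and tokenizer, and the toy "model" only looks at the last token, whereas a real LLM conditions on the whole context.

```python
# Hypothetical toy "model": a lookup table from the last token to the next one.
# A real LLM would compute a probability distribution over its vocabulary instead.
BIGRAMS = {
    "<s>": "the",
    "the": "cat",
    "cat": "sat",
    "sat": "</s>",
}


def next_token(context):
    """Predict the next token from everything generated so far.

    The full context is passed in, but this toy version only uses the last token.
    """
    return BIGRAMS[context[-1]]


def generate(prompt, max_new_tokens=10):
    """Autoregressive loop: each predicted token is appended to the input."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)
        tokens.append(tok)  # this step's output becomes part of the next step's input
        if tok == "</s>":   # stop at the (assumed) end-of-sequence marker
            break
    return tokens


print(generate(["<s>"]))
```

The key point is the `tokens.append(tok)` line: the model never sees a token it has not yet produced, so each step depends on the one before it.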

Question 2

Are all LLMs autoregressive?

Accepted Answer

No, but most modern chat LLMs (GPT, Claude, Gemini, Llama) are autoregressive decoders. Some other architectures are not: diffusion models for images, for example, refine a whole output in parallel instead of producing it piece by piece.

Question 3

Why are autoregressive models slow to generate long text?

Accepted Answer

Because tokens are produced sequentially (token N must be generated before token N+1), each new token needs its own pass through the model. Generation therefore cannot be fully parallelised, and generation time grows roughly in proportion to the number of tokens produced.
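A rough way to see this is to count the model calls. The sketch below is a simplified cost model, not a benchmark: `decode_steps` is a hypothetical helper, and the fixed `step_ms` latency per token is an assumption (in practice the cost per step depends on hardware, model size, and context length).

```python
def decode_steps(n_new_tokens, step_ms=20.0):
    """Return (model_calls, estimated_ms) for generating n_new_tokens one at a time.

    step_ms is an assumed, constant latency per generation step.
    """
    context = [0]  # placeholder prompt token
    calls = 0
    for _ in range(n_new_tokens):
        # Stand-in for one forward pass over `context`; it needs the previous
        # token, so this iteration cannot start before the last one finishes.
        next_tok = context[-1] + 1
        calls += 1
        context.append(next_tok)
    return calls, calls * step_ms


for n in (100, 200, 400):
    calls, ms = decode_steps(n)
    print(f"{n} tokens -> {calls} sequential calls, ~{ms:.0f} ms")
```

Under these assumptions, doubling the output length doubles both the number of sequential calls and the estimated time.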

Autoregressive (AI Glossary)

Autoregressive — definition

How it works

Why it matters

Contrast: non-autoregressive models