Question 1

What is a token in AI?

Accepted Answer

A token is the smallest unit of text a language model processes: usually a short whole word, a word fragment, a space, or a punctuation mark. Models represent input as a sequence of tokens and generate output one token at a time, not one word or letter at a time.
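To make the idea of subword pieces concrete, here is a minimal sketch of a greedy longest-match splitter. The vocabulary `TOY_VOCAB` and the function `toy_tokenize` are hypothetical names invented for illustration; production tokenizers learn vocabularies of tens of thousands of pieces and use more elaborate rules.

```python
# Hypothetical toy vocabulary, chosen by hand for this illustration only.
TOY_VOCAB = {"token", "iz", "ation", " is", " fun", "!"}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text greedily into the longest pieces found in vocab.

    Characters not covered by the vocabulary become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until something matches.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens


print(toy_tokenize("tokenization is fun!"))
# ['token', 'iz', 'ation', ' is', ' fun', '!']
```

Note that the long word splits into three fragments, while the short words stay whole and carry their leading space.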

Question 2

How many tokens is a word?

Accepted Answer

For typical English, one token is roughly three-quarters of a word, so 1,000 tokens is about 750 words. Common words are often a single token, while rare or long words, code, and non-English text can split into several tokens each.
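The rule of thumb above can be written as a quick estimate. This is a rough sketch for typical English prose only; the constant and function names are illustrative, and the exact count for any given text depends on the tokenizer.

```python
# Rule of thumb from the answer above: 1 token is about 0.75 English words.
WORDS_PER_TOKEN = 0.75


def estimate_words(token_count):
    """Approximate number of English words that fit in token_count tokens."""
    return round(token_count * WORDS_PER_TOKEN)


def estimate_tokens(word_count):
    """Approximate number of tokens needed for word_count English words."""
    return round(word_count / WORDS_PER_TOKEN)


print(estimate_words(1000))  # 750
print(estimate_tokens(750))  # 1000
```

Code and non-English text will usually need more tokens than this estimate suggests.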

Question 3

Why do tokens matter for cost?

Accepted Answer

Most API providers bill per token for both input and output, and context windows are measured in tokens. Fewer tokens mean lower cost and more room for content, so concise prompts and efficient formats directly reduce your spend.
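As a sketch of how per-token billing adds up, the snippet below estimates the cost of one request and checks whether it fits in a context window. The prices and window size are placeholder values, not any provider's real rates, and the function names are hypothetical.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimated cost of one request, with prices given per million tokens."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


def fits_context(prompt_tokens, max_output_tokens, context_window):
    """True if the prompt plus the reserved output fits in the window."""
    return prompt_tokens + max_output_tokens <= context_window


# Placeholder prices: 1.00 per million input tokens, 3.00 per million output tokens.
cost = estimate_cost(2000, 500, 1.00, 3.00)
print(f"{cost:.4f}")  # 0.0035

# Placeholder window of 8,000 tokens.
print(fits_context(prompt_tokens=7000, max_output_tokens=1500, context_window=8000))  # False
```

Trimming the prompt in the second example by 500 tokens would make the request fit, which is the practical payoff of concise prompts.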

Question 4

How are tokens created?

Accepted Answer

A tokenizer, typically built with an algorithm such as Byte Pair Encoding (BPE), is trained on a large text corpus to split text into frequently occurring subword chunks. Model providers ship their own tokenizers, so the same text can produce different token counts from one model to another.
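The core BPE idea can be sketched in a few lines: start from single characters, repeatedly merge the most frequent adjacent pair, and record each merge as a rule. This is a simplified toy version under stated assumptions (word-level input, no byte handling, no special tokens); the function names and the tiny corpus are made up for illustration and do not reflect any provider's implementation.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of pair in symbols with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(words, num_merges):
    """Learn up to num_merges merge rules from a list of words."""
    # Each distinct word starts as a tuple of characters, weighted by frequency.
    corpus = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pair_counts[(a, b)] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            new_corpus[merge_pair(symbols, best)] += freq
        corpus = new_corpus
    return merges


def encode(word, merges):
    """Tokenize a word by applying the learned merges in training order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Tiny made-up corpus: frequent words end up as fewer, longer tokens.
words = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
merges = train_bpe(words, num_merges=6)
print(merges)
print(encode("newest", merges))
print(encode("lowest", merges))
```

Training on a different corpus yields different merge rules, which is one reason two tokenizers can split the same text differently.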

Token (AI Glossary)

Definition

Tokens, words, and characters

How tokens are made

Why token count matters

Practical implications