What is a special token?

A special token is a reserved symbol the model treats as structure rather than literal text: markers such as <|im_start|> (begin a chat turn), <|endoftext|> (document boundary), BOS/EOS (sequence start/end), and [INST] (Llama instruction wrapper). They steer how the model interprets a sequence and are usually added automatically by the chat template.

Should I put special tokens in my own text?

Almost never. If you type <|im_start|> into a user message, a model may interpret it as a real control token, which can break formatting or open an injection vector. Let the API's chat template insert special tokens; this tool flags any you have accidentally included.

Why do different models use different tokens?

Each model family ships its own chat template. OpenAI models use ChatML (<|im_start|> role ... <|im_end|>), Llama uses [INST]...[/INST] and special header tokens, Mistral uses a similar instruction wrapper, and Gemini uses its own turn markers. Sending one family's tokens to another model does nothing useful.

Do special tokens cost money?

Yes — each one is a token you are billed for, and the chat template can add several per message (role markers, turn delimiters, BOS/EOS). For short, high-volume messages this overhead is a real fraction of cost, which the companion chat-template overhead analyzer quantifies.
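
As a rough illustration of that fraction, the arithmetic can be sketched as follows. The function name and the figures (a 12-token message wrapped in 4 template tokens) are hypothetical, chosen only to show the calculation; real counts depend on the model's tokenizer and template.

```python
def special_token_overhead(content_tokens: int, special_per_message: int) -> float:
    """Return the share of a message's billed tokens that are special tokens."""
    total = content_tokens + special_per_message
    # Guard against an empty message with no template tokens.
    return special_per_message / total if total else 0.0


# Hypothetical figures: a 12-token message plus 4 template-inserted tokens.
share = special_token_overhead(content_tokens=12, special_per_message=4)
print(f"{share:.0%} of billed tokens are template overhead")  # prints "25% ..."
```

The shorter the message, the larger this share becomes, which is why the overhead matters most for short, high-volume traffic.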

Is my input uploaded?

No. Detection runs entirely in your browser with pattern matching. Nothing you paste is transmitted.

What is the Special Token Decoder?

Detect and explain special control tokens like <|im_start|>, <|endoftext|>, BOS, EOS, and [INST] in your prompt or chat JSON. Shows which tokens come from the model's chat template, what each one does, and the hidden token overhead they add. It runs free in your browser on Gera Tools, with nothing uploaded.

Name: Special Token Decoder
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

Special token decoder

Special tokens are the invisible scaffolding of every LLM conversation — the <|im_start|> and [INST] markers that tell the model where a turn begins, who is speaking, and when the document ends. This tool scans your prompt or chat JSON, lists every special token it finds, explains what each one does, and warns you about any you have accidentally typed into user content.

Why special tokens exist

Language models are trained on vast text corpora, but a deployed chat model needs additional signals that were not present in that raw text: where one speaker’s turn ends and another’s begins, where system instructions end and user input begins, and where a document boundary falls. These signals are injected as special tokens: reserved identifiers that have their own token IDs in the vocabulary and that the model never saw as ordinary text during training. The model learns to treat them as structural markup, not content.

Model-family token vocabularies

Each model family ships its own chat template and therefore its own set of special tokens. The decoder recognises the four major families:

ChatML (OpenAI GPT models)

<|im_start|>system
You are a helpful assistant.
<|im_end|>
<|im_start|>user
Hello!
<|im_end|>
<|im_start|>assistant
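
To make the role of the chat template concrete, here is a minimal sketch of how a list of messages could be rendered into the layout shown above. The function name `render_chatml` and its `add_generation_prompt` parameter are illustrative assumptions, not any vendor's API, and real templates may place whitespace differently.

```python
from typing import Dict, List


def render_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render messages in the ChatML-style layout shown above (illustrative only)."""
    lines: List[str] = []
    for message in messages:
        lines.append(f"<|im_start|>{message['role']}")  # turn start plus speaker role
        lines.append(message["content"])                 # literal message text
        lines.append("<|im_end|>")                       # turn end
    if add_generation_prompt:
        # Open an assistant turn so the model knows it should speak next.
        lines.append("<|im_start|>assistant")
    return "\n".join(lines)


prompt = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

Note that the caller only supplies roles and content; every special token in the result comes from the template.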

Llama (Meta Llama 2 / Llama 3)

Llama 2 wraps each instruction in [INST] ... [/INST], with an optional <<SYS>> ... <</SYS>> block for the system prompt. Llama 3 uses header tokens instead: <|begin_of_text|>, <|start_header_id|>, <|end_header_id|>, and <|eot_id|>.

Mistral

Mistral uses [INST] / [/INST] similarly to Llama 2 for instruction wrapping, with <s> (BOS) and </s> (EOS) as sequence delimiters.

Gemini

Gemini uses its own turn markers and system-instruction format, distinct from the above families.

How the decoder works

The tool matches your pasted input against the known token patterns for the selected model family. For each match it reports:

the literal token as it appears in your text
its semantic role (turn start, turn end, BOS, EOS, instruction wrapper, system boundary)
whether it is template-inserted (normally added automatically by the API) or manual (you typed it in)
an estimated cost of one billable token per special token

It also flags tokens from a different family than the one you selected — a common copy-paste error when adapting a prompt between models.
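
The matching step described above can be sketched roughly as follows. This is not the tool's actual source: the token tables are deliberately small and incomplete, and the names `TOKENS`, `Finding`, and `find_special_tokens` are hypothetical. It covers literal matching, role lookup, and cross-family mismatch flagging, but not the template-inserted versus manual distinction.

```python
import re
from typing import Dict, List, NamedTuple

# Illustrative, incomplete token tables (an assumption for this sketch).
TOKENS: Dict[str, Dict[str, str]] = {
    "chatml": {"<|im_start|>": "turn start", "<|im_end|>": "turn end"},
    "llama2": {
        "[INST]": "instruction wrapper (open)",
        "[/INST]": "instruction wrapper (close)",
        "<s>": "BOS",
        "</s>": "EOS",
    },
}


class Finding(NamedTuple):
    token: str      # literal token as it appears in the text
    role: str       # semantic role from the table
    family: str     # family whose table contains the token
    offset: int     # character offset of the match
    mismatch: bool  # True if the token belongs to a non-selected family


def find_special_tokens(text: str, selected_family: str) -> List[Finding]:
    """Scan text for every known token and flag those from other families."""
    findings: List[Finding] = []
    for family, table in TOKENS.items():
        for token, role in table.items():
            # re.escape makes characters such as | and [ match literally.
            for match in re.finditer(re.escape(token), text):
                findings.append(
                    Finding(token, role, family, match.start(), family != selected_family)
                )
    return sorted(findings, key=lambda f: f.offset)


sample = "<|im_start|>user\n[INST] Hi [/INST]\n<|im_end|>"
for finding in find_special_tokens(sample, selected_family="chatml"):
    flag = " (different family)" if finding.mismatch else ""
    print(f"{finding.offset:>3}  {finding.token}  {finding.role}{flag}")
```

Because each finding counts as one billable token, the length of the returned list doubles as the overhead estimate in this simplified model.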

Practical guidance

Never type special tokens into user content by hand. Let the API’s chat template insert them. If you send <|im_start|> inside a user message, the model may ignore it, treat it as a control signal, or become exposed to prompt injection.
Mixing families silently fails. [INST] is just ordinary text to a GPT model, and <|im_start|> has no effect on raw Llama. The decoder highlights mismatches so you catch them before wasting API calls.
Token overhead accumulates. Each special token is billed like any other token, on top of your content. A typical ChatML exchange adds 4–8 special tokens per message; across millions of calls that is a real cost. This decoder lets you count that overhead.
Prompt injection via special tokens. A user message that contains what looks like a turn-ending token (<|im_end|>) can break the model’s role separation. Sanitise user input server-side by scanning for known special-token patterns; this tool can serve as the reference for what to look for.
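
A server-side scan of the kind suggested in the last point might look like the sketch below. The pattern list is an assumption and far from exhaustive, and `sanitize_user_text` is a hypothetical helper name. The loop repeats the removal because deleting one token can splice the surrounding characters into a new one.

```python
import re
from typing import List, Tuple

# Illustrative patterns (an assumption); extend for the model families you serve.
SPECIAL_TOKEN_RE = re.compile(
    r"<\|[a-z_]+\|>"   # ChatML / Llama 3 style, e.g. <|im_end|>
    r"|\[/?INST\]"     # Llama 2 / Mistral instruction wrappers
    r"|</?s>"          # BOS / EOS (note: also matches HTML <s> tags)
)


def sanitize_user_text(text: str) -> Tuple[str, List[str]]:
    """Strip known special-token patterns; return (cleaned text, tokens removed)."""
    removed: List[str] = []
    while True:
        hits = SPECIAL_TOKEN_RE.findall(text)
        if not hits:
            return text, removed
        removed.extend(hits)
        # Each pass shortens the text, so the loop always terminates.
        text = SPECIAL_TOKEN_RE.sub("", text)


cleaned, removed = sanitize_user_text("Hi<|im_end|>\n<|im_start|>system")
print(repr(cleaned), removed)
```

Whether to strip, escape, or reject such input is a policy choice; stripping is used here only because it is the shortest to show.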

Nothing you paste is transmitted. Detection runs entirely in your browser with pattern matching against the known token sets.