Vector Embeddings Explained: Meaning as Math

King − Man + Woman = Queen: the magic behind AI's grasp of meaning

From points to meaning

A vector embedding represents an item as a point in a high-dimensional space where distance and direction encode meaning. This matters because the geometry is not arbitrary: relationships between concepts show up as consistent directions you can manipulate with arithmetic. The most famous demonstration is that king − man + woman lands near queen. The analogy solver below lets you try this kind of arithmetic yourself.

Embedding arithmetic and analogies

When embeddings are trained well, semantic relationships become vector directions. The offset from “man” to “woman” points in roughly the same direction as the offset from “king” to “queen,” which you can think of as a gender axis. So if you take the vector for “king,” subtract “man,” and add “woman,” you arrive near “queen.” The same trick captures Paris − France + Italy ≈ Rome (a capital-of direction), as well as verb tenses, plurals, and comparatives. The consistency of these offsets is evidence that the vectors encode real structure rather than noise.
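
The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a real model: the three-dimensional vectors in TOY_VECTORS are invented by hand (axes loosely standing for royalty, male, and female), and solve_analogy is a hypothetical helper name. It also excludes the three input words from the candidates, which is the usual convention when evaluating analogies.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real embeddings have hundreds of dimensions and are learned, not hand-written.
TOY_VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def solve_analogy(a, b, c, vectors):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(target, vectors[word]))

print(solve_analogy("king", "man", "woman", TOY_VECTORS))  # prints: queen
```

With learned embeddings the result is a nearest neighbor rather than an exact match, which is why the text says the sum lands "near" queen.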

Cosine similarity, the workhorse metric

To compare embeddings, the most common choice is cosine similarity: the cosine of the angle between two vectors. It ignores length and focuses on direction, scoring 1 for identical orientation, 0 for perpendicular vectors, and −1 for opposite orientation. This is the single most important operation in applied embeddings: rank a list of candidates by their cosine similarity to a query vector and you have built the core of search, deduplication, clustering, and recommendation.
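
As a rough sketch of the metric and the ranking step, the following uses only the Python standard library. The function names cosine_similarity and rank_by_similarity are illustrative, and returning 0.0 for a zero-length vector is an assumed convention, since the angle is undefined in that case.

```python
import math

def cosine_similarity(a, b):
    """dot(a, b) / (|a| * |b|): depends on direction only, not on length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention: the angle is undefined for a zero vector
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Return (name, score) pairs sorted from most to least similar to query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

print(cosine_similarity([1.0, 0.0], [3.0, 0.0]))   # same direction, longer: 1.0
print(cosine_similarity([1.0, 0.0], [-2.0, 0.0]))  # opposite direction: -1.0
print(cosine_similarity([1.0, 0.0], [0.0, 5.0]))   # perpendicular: 0.0
```

Ranking is then a sort: rank_by_similarity(query, {"a": vec_a, "b": vec_b}) returns the candidates ordered best first.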

Powering search and recommendations

Semantic search embeds every document ahead of time and stores the vectors. At query time it embeds the question and returns the nearest documents by cosine similarity, finding results by meaning rather than exact words. Recommendation systems work the same way: represent users and items as vectors and suggest items whose embeddings sit closest to a user’s. Because the document embeddings are precomputed, a query costs only one embedding plus a nearest-neighbor lookup, which stays fast even over millions of items when backed by an approximate nearest-neighbor index.
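
The split between indexing time and query time might look like the sketch below. VectorIndex is a hypothetical class, the document IDs and vectors are made up, and the query vector is a hand-written stand-in for what an embedding model would return. Vectors are normalized once when added, so each comparison at query time is a plain dot product.

```python
import math

def normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]

class VectorIndex:
    """Brute-force in-memory index (illustrative, not a production design)."""

    def __init__(self):
        self._items = []  # list of (doc_id, unit_vector) pairs

    def add(self, doc_id, vector):
        # Indexing time: the normalization work is done once, up front.
        self._items.append((doc_id, normalize(vector)))

    def search(self, query_vector, k=3):
        # Query time: one normalization, then a dot product per stored vector.
        q = normalize(query_vector)
        scored = [(doc_id, sum(a * b for a, b in zip(q, vec)))
                  for doc_id, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]

index = VectorIndex()
index.add("refund-policy", [0.9, 0.1, 0.0])
index.add("shipping-times", [0.1, 0.9, 0.1])
index.add("careers", [0.0, 0.1, 0.9])

# Hand-made stand-in for the embedding of "how do I get my money back?"
query = [0.8, 0.2, 0.1]
for doc_id, score in index.search(query, k=2):
    print(doc_id, round(score, 3))
```

This version scans every stored vector, which is fine for small collections. Larger systems typically swap the scan for an approximate nearest-neighbor index while keeping the same add and search shape.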

Embeddings and retrieval-augmented generation

The most common modern use is retrieval-augmented generation (RAG). You split your documents into chunks, embed each chunk, and store them. When a user asks a question, you embed the question, retrieve the most similar chunks, and feed them into a language model’s prompt as context. This grounds the model in your own, up-to-date data and can substantially reduce hallucination, provided the retrieved chunks actually contain the answer. It is all built on the same similarity arithmetic the analogy solver demonstrates.
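
The retrieval half of that pipeline can be sketched as follows. Everything here is an assumption made for illustration: toy_embed is a word-count stand-in for a real embedding model (so it matches on shared words, not on meaning), split_into_chunks uses a naive fixed word count, and the prompt wording in build_prompt is arbitrary. No language model is called; the sketch stops at the assembled prompt.

```python
import math
import re

def split_into_chunks(text, max_words=8):
    """Naive chunking: fixed-size groups of words (an assumed simplification)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def toy_embed(text, vocabulary):
    """Stand-in for an embedding model: word counts over a fixed vocabulary."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [float(tokens.count(word)) for word in vocabulary]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, store, vocabulary, k=2):
    """Embed the question and return the k most similar stored chunks."""
    q = toy_embed(question, vocabulary)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, context_chunks):
    """Place the retrieved chunks ahead of the question as context."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

document = ("Refunds are issued within 14 days of purchase. "
            "Shipping takes three to five business days worldwide. "
            "Our office is closed on all public holidays.")

# Indexing time: chunk, embed, and store.
chunks = split_into_chunks(document)
vocabulary = sorted({tok for c in chunks for tok in re.findall(r"[a-z0-9]+", c.lower())})
store = [(chunk, toy_embed(chunk, vocabulary)) for chunk in chunks]

# Query time: embed the question, retrieve, and assemble the prompt.
question = "How many days until refunds are issued?"
print(build_prompt(question, retrieve(question, store, vocabulary, k=1)))
```

Replacing toy_embed with calls to a real embedding model is the only change needed to make retrieval work on meaning rather than word overlap; the chunk, store, retrieve, and prompt steps keep the same structure.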
