Meaning as a list of numbers
An embedding is a way of representing something — a word, a sentence, an image, even a user — as a list of numbers called a vector. The crucial property is that items with similar meaning end up with similar vectors. “Dog” and “puppy” land close together; “dog” and “spreadsheet” land far apart. This lets a machine, which can only do arithmetic, reason about meaning by measuring distances. The demo below lets you compare concepts and see their similarity scores.
Why turn meaning into geometry
Computers do not understand language; they manipulate numbers. Embeddings bridge that gap by placing every concept at a point in a high-dimensional space, where direction and distance encode meaning. Once meaning is geometry, hard problems become simple maths: find related documents by looking for nearby vectors, group customers by clustering their embeddings, or detect off-topic text by measuring how far it sits from a reference point.
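As a rough illustration of the last use case, the sketch below flags off-topic text by its distance from a reference point. The two-dimensional vectors, the 0.5 threshold, and the helper names centroid and is_off_topic are all assumptions made up for this example; real embeddings have hundreds of dimensions and a threshold would need tuning on real data.

```python
import math

# Hypothetical 2-D embeddings of texts known to be on topic (invented values).
on_topic = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]]


def centroid(vectors):
    """Average the vectors component by component to get a reference point."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def is_off_topic(vector, reference, threshold=0.5):
    """Flag a vector whose Euclidean distance from the reference exceeds the threshold.

    The default threshold is an arbitrary illustrative value.
    """
    return math.dist(vector, reference) > threshold


reference = centroid(on_topic)

print(is_off_topic([0.85, 0.15], reference))  # close to the reference point
print(is_off_topic([0.1, 0.9], reference))    # far from the reference point
```

The same distance measure could drive the other two uses: sorting documents by distance to a query vector gives related documents, and a clustering algorithm groups points that sit close together.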
How embeddings are learned
Embeddings are not hand-written; they are learned from data. The classic example is word2vec, a shallow neural network trained on huge amounts of text to predict which words tend to appear near each other. Its famous result is that the learned vectors support analogies through arithmetic: king − man + woman ≈ queen, that is, queen is among the vectors closest to the result. Modern systems use deeper neural networks to produce embeddings for whole sentences and documents, capturing context that single words miss.
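The analogy arithmetic can be sketched with toy vectors. The two-dimensional values below are hand-built assumptions (first number roughly "royalty", second roughly "gender"), so the analogy holds by construction; vectors learned by word2vec only approximate this behaviour. The names toy_vectors and analogy are hypothetical.

```python
import math

# Hand-built toy vectors, NOT learned embeddings: [royalty, gender].
toy_vectors = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "prince": [0.8, 0.9],
    "princess": [0.8, -0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Return the word closest to vectors[a] - vectors[b] + vectors[c].

    The three input words are skipped so the answer is a new word.
    """
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine_similarity(target, vectors[word]))


result = analogy("king", "man", "woman", toy_vectors)
print(result)  # queen
```

Subtracting man from king removes the "male" component and adding woman supplies the "female" one, which leaves a point whose nearest remaining neighbour is queen.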
Word, sentence, and beyond
Different embeddings suit different jobs. Word embeddings give one vector per word and are great for vocabulary-level tasks. Sentence embeddings from models like sentence transformers compress an entire sentence into one vector, ideal for search and clustering. The same idea extends to images, audio, and code, and multimodal embeddings even place text and images in a shared space so you can search images with words.
Measuring similarity
The standard way to compare two embeddings is cosine similarity: the cosine of the angle between the vectors, which depends on their direction and not on their length. It ranges from 1 (pointing the same way, very similar) through 0 (perpendicular, unrelated) to −1 (opposite). This score underlies many semantic search and recommendation systems: embed every item once, embed the query when it arrives, then rank the items by cosine similarity to the query. The interactive demo computes this score for any two concepts you choose.
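A minimal sketch of the calculation and the ranking step, using only the standard library. The three-dimensional vectors are invented for illustration and do not come from any real model; rank_by_similarity is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, computed "once" ahead of time (invented values).
embeddings = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "spreadsheet": [0.1, 0.05, 0.95],
}


def rank_by_similarity(query_vector, items):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_similarity(embeddings["dog"], embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these toy values, "puppy" scores far closer to "dog" than "spreadsheet" does. At scale, production systems usually replace the full sort with an approximate nearest-neighbour index, but the score being ranked is the same.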