What is an embedding in simple terms?

An embedding is a list of numbers — a vector — that represents the meaning of something like a word, sentence, or image. Items with similar meaning get similar vectors, so the distance between two embeddings measures how related the two items are.

Why represent meaning as numbers?

Computers work with numbers; they do not understand language directly. By turning meaning into vectors, an AI system can compare concepts with arithmetic: measuring similarity, clustering related items, and searching by meaning rather than exact keywords.

What is word2vec?

Word2vec is an early, influential method that learns word embeddings by predicting which words appear near each other in large text corpora. It famously captured analogies as vector arithmetic, such as king minus man plus woman landing near queen, which suggests the vectors encode real semantic structure.

How are sentence embeddings different from word embeddings?

Word embeddings give one vector per word. Sentence embeddings, produced by models like sentence transformers, give a single vector for a whole sentence or paragraph, capturing its overall meaning so you can compare or search across longer pieces of text.

What Are Embeddings in Machine Learning?

Meaning as a list of numbers

An embedding is a way of representing something — a word, a sentence, an image, even a user — as a list of numbers called a vector. The crucial property is that items with similar meaning end up with similar vectors. “Dog” and “puppy” land close together; “dog” and “spreadsheet” land far apart. This lets a machine, which can only do arithmetic, reason about meaning by measuring distances. The demo below lets you compare concepts and see their similarity scores.
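
To make the "close together" and "far apart" idea concrete, here is a minimal sketch. The three-dimensional vectors are made up for illustration (real embeddings are learned and usually have hundreds of dimensions), and `toy_embeddings` and `euclidean_distance` are hypothetical names, not part of any library.

```python
import math

# Hypothetical hand-picked vectors; real embeddings come from a trained model.
toy_embeddings = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "spreadsheet": [0.1, 0.05, 0.95],
}


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


dog_puppy = euclidean_distance(toy_embeddings["dog"], toy_embeddings["puppy"])
dog_sheet = euclidean_distance(toy_embeddings["dog"], toy_embeddings["spreadsheet"])

print(f"dog vs puppy:       {dog_puppy:.3f}")
print(f"dog vs spreadsheet: {dog_sheet:.3f}")
```

With these toy values the dog and puppy vectors sit much closer to each other than dog and spreadsheet do, which is the property a trained embedding model aims to produce on its own.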

Why turn meaning into geometry

Computers do not understand language; they manipulate numbers. Embeddings bridge that gap by placing every concept at a point in a high-dimensional space, where direction and distance encode meaning. Once meaning is geometry, hard problems become simple maths: find related documents by looking for nearby vectors, group customers by clustering their embeddings, or detect off-topic text by measuring how far it sits from a reference point.
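
As one example of meaning becoming simple maths, the off-topic check might be sketched like this. The two-dimensional vectors, the `threshold` value, and the helper names are all assumptions chosen for illustration; a real system would use vectors from an embedding model and tune the threshold on its own data.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise average of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def is_off_topic(vector, reference, threshold=0.5):
    """Flag a vector that sits farther than `threshold` from the reference point."""
    return distance(vector, reference) > threshold


# Hypothetical embeddings of texts known to be on topic.
on_topic = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15]]
reference = centroid(on_topic)

print(is_off_topic([0.82, 0.18], reference))  # close to the reference point
print(is_off_topic([0.1, 0.9], reference))    # far from the reference point
```

Using the centroid of known on-topic examples as the reference point is only one possible choice; a single representative vector would work with the same distance test.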

How embeddings are learned

Embeddings are not hand-written; they are learned from data. The classic example is word2vec, which trains a small neural network on huge amounts of text by predicting which words tend to appear near each other. The famous result is that the learned vectors support analogies: king − man + woman ≈ queen. Modern systems use much larger neural networks, such as transformers, to produce embeddings for whole sentences and documents, capturing context that single words miss.

Word, sentence, and beyond

Different embeddings suit different jobs. Word embeddings give one vector per word and are great for vocabulary-level tasks. Sentence embeddings from models like sentence transformers compress an entire sentence into one vector, ideal for search and clustering. The same idea extends to images, audio, and code, and multimodal embeddings even place text and images in a shared space so you can search images with words.
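
To show the step from one vector per word to one vector per sentence, here is a deliberately crude sketch that averages word vectors. This is a simple baseline, not how sentence transformers work; they use a neural network that takes context and word order into account. The word vectors and the `mean_pool` helper are hypothetical.

```python
# Hypothetical two-dimensional word vectors, made up for illustration.
word_vectors = {
    "cats": [0.9, 0.1],
    "chase": [0.5, 0.5],
    "mice": [0.8, 0.2],
}


def mean_pool(tokens, word_vectors):
    """Average the vectors of the known tokens into a single sentence vector."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        raise ValueError("no known tokens to embed")
    return [sum(component) / len(known) for component in zip(*known)]


sentence_vector = mean_pool("cats chase mice".split(), word_vectors)
print(sentence_vector)
```

Averaging ignores word order, so "cats chase mice" and "mice chase cats" get the same vector. That limitation is part of why dedicated sentence embedding models are preferred for search and clustering.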

Measuring similarity

The standard way to compare two embeddings is cosine similarity — the cosine of the angle between the vectors. It ranges from 1 (pointing the same way, very similar) through 0 (perpendicular, unrelated) to −1 (opposite). This is exactly what powers semantic search and recommendation systems: embed everything once, then rank results by cosine similarity to a query. The interactive demo computes this score for any two concepts you choose.
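
A minimal sketch of cosine similarity and similarity-based ranking follows. The document vectors and the query vector are invented for illustration, and `rank_by_similarity` is a hypothetical helper; a production system would typically rely on an embedding model plus a vector index rather than a plain loop.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical precomputed document embeddings ("embed everything once").
documents = {
    "pet care guide": [0.9, 0.1, 0.0],
    "dog training tips": [0.8, 0.3, 0.1],
    "quarterly budget": [0.0, 0.2, 0.9],
}
query = [0.7, 0.4, 0.1]  # hypothetical embedding of the search query

ranking = rank_by_similarity(query, documents)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because cosine similarity depends only on the angle between vectors, scaling a vector by a positive constant does not change its score, which is one reason it is often preferred over raw distance for comparing embeddings.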