How to Create Text Embeddings with OpenAI

Turn words into vectors — the foundation of semantic AI

What embeddings are

A text embedding turns a string into a fixed-length vector of numbers that encodes its meaning. Two sentences that say the same thing in different words produce vectors pointing in nearly the same direction, even with no shared keywords. That property is the foundation of semantic search, clustering, classification, recommendation, and retrieval-augmented generation.

How to create and use them

You call the /v1/embeddings endpoint with a model (text-embedding-3-small for a cheap, capable default, or text-embedding-3-large for higher quality) and get back one array of floats per input: 1,536 values by default for the small model and 3,072 for the large one. You store each vector alongside its source text in a vector store like pgvector, Pinecone, or Qdrant. To find related items, you embed a query and compare it to the stored vectors using cosine similarity, which is the cosine of the angle between two vectors: values near 1 mean very similar, values near 0 mean unrelated, and negative values mean the vectors point in opposing directions.
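As a rough illustration of the request described above, here is a minimal sketch that uses only the Python standard library. The helper names build_embedding_request and embed are made up for this example, and the code assumes your key lives in the OPENAI_API_KEY environment variable. In practice you would more likely use the official client library; this version just makes the shape of the HTTP call visible.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/embeddings"


def build_embedding_request(texts, api_key, model="text-embedding-3-small"):
    """Build (but do not send) a POST request for the embeddings endpoint.

    Hypothetical helper for illustration; not part of any OpenAI library.
    """
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed(texts, api_key=None, model="text-embedding-3-small"):
    """Send the request and return one list of floats per input text."""
    # Assumption: the key is stored in the OPENAI_API_KEY environment variable.
    api_key = api_key or os.environ["OPENAI_API_KEY"]
    request = build_embedding_request(texts, api_key, model)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Each returned item carries an "index"; sort so output matches input order.
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

Calling embed(["How do I reset my password?", "I forgot my login"]) with a valid key should return two vectors, one per input string, which you can then write to your vector store next to the original text.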

The calculator below lets you paste two comma-separated vectors and see their cosine similarity, dot product, and magnitudes — the exact math a vector database runs under the hood.
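The same three quantities are easy to compute yourself. The sketch below is a plain-Python version with illustrative function names (parse_vector, dot, magnitude, cosine_similarity are this example's own, not from any library). It accepts the same comma-separated input format and is meant to show the arithmetic, not to be fast.

```python
import math


def parse_vector(text):
    """Turn a comma-separated string such as "0.1, 0.2, 0.3" into floats."""
    return [float(part) for part in text.split(",") if part.strip()]


def dot(a, b):
    """Dot product: sum of element-wise products."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def magnitude(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two magnitudes."""
    denominator = magnitude(a) * magnitude(b)
    if denominator == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / denominator
```

Two vectors that point the same way, such as "1, 2, 3" and "2, 4, 6", score approximately 1 regardless of their lengths, while perpendicular vectors such as "1, 0" and "0, 1" score 0.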

Tips and pitfalls

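Pick one embedding model and stick with it; vectors from different models are not comparable, so changing models means re-embedding everything. Use the dimensions parameter to shrink vectors when storage or speed matters; the text-embedding-3 models support this with little quality loss. OpenAI returns embeddings already normalized to unit length, but re-normalize if you truncate vectors yourself or mix in vectors from elsewhere and your store expects unit vectors. Batch many texts per request to cut round trips and stay under request rate limits (billing is per token, so batching saves overhead, not token cost), and always keep the original text with each vector so you can show real results, not just numbers.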
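Several of these tips fit in a few lines of code. The following is a minimal in-memory sketch, not a substitute for a real vector store: Record, batched, normalize, and top_k are names invented for this example. It shows splitting texts into batches, normalizing vectors to unit length, and keeping the source text attached to every vector so that a search returns readable results.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    """A stored item: the original text travels with its vector."""
    text: str
    vector: list


def batched(texts, size):
    """Yield consecutive slices of at most `size` texts, one per request."""
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


def normalize(vector):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in vector]


def top_k(query_vector, records, k=3):
    """Return the k best (score, text) pairs, highest score first.

    For unit vectors the dot product equals cosine similarity, so both
    sides are normalized before scoring.
    """
    query = normalize(query_vector)
    scored = [
        (sum(q * x for q, x in zip(query, normalize(record.vector))), record.text)
        for record in records
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

A brute-force scan like top_k is fine for a few thousand records; beyond that, a vector store with an approximate nearest-neighbor index is the more realistic choice.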
