What Is An Embedding — AI Engineer Stage 1

An embedding is a list of numbers (a vector) that represents the meaning of a piece of text — texts with similar meanings have vectors that are close together in mathematical space.

Why this appears in interviews

Embeddings are the foundation of most RAG systems, semantic search applications, and recommendation systems built on LLMs. Interviewers ask about embeddings to test whether you understand why retrieval works the way it does. If you cannot explain what an embedding is and why cosine similarity matters, you will struggle to design a retrieval system.

The mental model

Imagine plotting every word in the English language on a map. Words with similar meanings are placed near each other. "Dog" and "puppy" are neighbors. "King" and "queen" are neighbors. "Paris" and "France" are closer to each other than either is to "banana."

[Figure: a 2D scatter plot of word embeddings clustered by semantic similarity]

Now expand this beyond words to entire sentences and paragraphs. "How do I reset my password?" and "I forgot my login credentials" end up very close together on this map, even though the only words they share are "I" and "my". That proximity is the power of embeddings: they capture semantic meaning rather than exact word matches.

In practice, embeddings are just long lists of numbers (1536 numbers by default for OpenAI's text-embedding-3-small model). Those numbers encode meaning in a way that allows mathematical comparisons.

How embeddings are created

An embedding model (separate from the generation model) takes text as input and outputs a fixed-length vector:

"How do I reset my password?" → embedding model → [0.021, -0.453, 0.891, ... 1536 numbers]

The numbers themselves are meaningless individually. What matters is their relationship to other vectors: texts that mean similar things produce vectors that point in similar directions.
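The text-to-vector step above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: it assumes the official openai Python package and an OPENAI_API_KEY environment variable, and the helper name embed_texts is hypothetical. The client is injectable so the function can be exercised with a stand-in object instead of a real API call.

```python
def embed_texts(texts, client=None, model="text-embedding-3-small"):
    """Return one embedding (a list of floats) per input text.

    `embed_texts` is a hypothetical helper name used for illustration.
    """
    if client is None:
        # Assumption: the `openai` package is installed and OPENAI_API_KEY is set.
        # Imported lazily so the function can be defined without the package.
        from openai import OpenAI
        client = OpenAI()

    # One request can embed several texts; the response carries one item per input.
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]
```

Calling embed_texts(["How do I reset my password?"]) against the real API should return a single vector whose length matches the model's dimensionality. The same model must be used for both documents and queries, since vectors from different models are not comparable.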

Measuring similarity — cosine similarity

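The standard way to compare two embeddings is cosine similarity, which measures the angle between two vectors. A score of 1.0 means the vectors point in the same direction (the model treats the texts as equivalent in meaning). A score of 0 means they are perpendicular (unrelated). A score of -1.0 means they point in opposite directions. Real text pairs almost never hit these extremes; what matters in retrieval is which candidates score highest relative to the others.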
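As a rough sketch, cosine similarity is the dot product of two vectors divided by the product of their lengths. The function below uses only the standard library; the tiny 2-dimensional vectors are made-up values chosen to hit the three reference cases.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors (not real embeddings) for the three reference cases.
same = cosine_similarity([1.0, 0.0], [2.0, 0.0])           # 1.0: same direction
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 3.0])  # 0.0: perpendicular
opposite = cosine_similarity([1.0, 0.0], [-4.0, 0.0])      # -1.0: opposite direction
```

In production code this is usually done with a vectorized library or inside the vector database, but the arithmetic is the same.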

Why cosine similarity and not regular distance? Because the magnitude (length) of a vector can vary without affecting meaning — what matters is the direction. Cosine similarity ignores magnitude and compares only direction.
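The difference between direction and magnitude is easy to check numerically. In the sketch below (toy vectors, standard library only), b points the same way as a but is ten times longer: Euclidean distance reports them as far apart, while the cosine, computed as the dot product of the length-normalized vectors, reports them as pointing the same way.

```python
import math

def normalize(v):
    """Scale v to length 1 so that only its direction remains."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / length for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [3.0, 4.0]
b = [30.0, 40.0]  # same direction as a, ten times the length

distance = euclidean_distance(a, b)       # 45.0, driven entirely by length
cosine = dot(normalize(a), normalize(b))  # approximately 1.0
```

One practical consequence: when vectors are already normalized to length 1, cosine similarity reduces to a plain dot product, which is cheaper to compute at search time.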

Choosing an embedding model

Key tradeoffs between common embedding models:

OpenAI text-embedding-3-small — 1536 dims, low cost, good quality
OpenAI text-embedding-3-large — 3072 dims, higher cost, better quality
Cohere embed-v3 — 1024 dims, low cost, good quality
Local (nomic-embed) — 768 dims, free (self-hosted), decent quality

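For most applications, text-embedding-3-small is the sensible default: as a rough rule of thumb (not a benchmark result), it gives about 90% of the quality of text-embedding-3-large at about 20% of the cost. Check current pricing and evaluate on your own data before committing.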

Common interview mistakes

Mistake 1: Assuming the embedding model and the generation model are the same thing. The model you use to embed documents is different from the model you use to generate answers. They serve different purposes.

Mistake 2: Using the wrong embedding for the use case. Some embedding models are optimized for sentence-level similarity, others for longer documents. Some are multilingual. Match the model to the use case.

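Mistake 3: Not thinking about dimensionality tradeoffs. Higher dimensions are generally more expressive and tend to give better retrieval quality, but they also mean higher storage costs and slower search, because both scale with the number of values per vector.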
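The storage side of the dimensionality tradeoff is simple arithmetic. The sketch below assumes vectors are stored as uncompressed 32-bit floats (4 bytes per value) and uses a hypothetical corpus of one million chunks; it counts the raw vectors only and ignores index overhead and metadata. The dimension counts are the ones listed in the model comparison above.

```python
BYTES_PER_FLOAT32 = 4  # assumption: uncompressed 32-bit floats

def index_size_bytes(num_vectors, dims, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw storage for the vectors alone (no index overhead, no metadata)."""
    return num_vectors * dims * bytes_per_value

NUM_CHUNKS = 1_000_000  # hypothetical corpus size

models = [
    ("nomic-embed", 768),
    ("Cohere embed-v3", 1024),
    ("text-embedding-3-small", 1536),
    ("text-embedding-3-large", 3072),
]

for name, dims in models:
    gb = index_size_bytes(NUM_CHUNKS, dims) / 1e9
    print(f"{name}: {dims} dims -> {gb:.2f} GB")
```

Under these assumptions, text-embedding-3-large needs exactly twice the raw storage of text-embedding-3-small (about 12.29 GB versus 6.14 GB), and a brute-force similarity scan does twice the arithmetic per comparison.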

Key vocabulary

Vector — An ordered list of numbers that represents something in mathematical space.
Cosine similarity — A measure of similarity between two vectors based on the angle between them. Values range from -1 (opposite direction) to 1 (same direction).
Embedding model — A model that converts text into vectors. Separate from the generation model.
Semantic search — Search that finds results based on meaning rather than exact keyword matching.
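The four terms above fit together in a small semantic search sketch: embed the documents, embed the query, and return the documents whose vectors have the highest cosine similarity to the query vector. The 3-dimensional vectors here are hand-made stand-ins for real embeddings, and top_k is a hypothetical helper name; a real system would get its vectors from an embedding model and usually delegate the search to a vector database.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query_vec, doc_vecs, k=2):
    """Return the k (score, text) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), text)
              for text, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return scored[:k]

# Hypothetical vectors: real embeddings have hundreds or thousands of dimensions.
docs = {
    "I forgot my login credentials": [0.9, 0.1, 0.0],
    "Steps to change your account password": [0.7, 0.5, 0.1],
    "Banana bread recipe": [0.0, 0.2, 0.9],
}
query_vec = [0.85, 0.2, 0.05]  # stand-in for "How do I reset my password?"

results = top_k(query_vec, docs, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

With these toy values, the two account-related texts rank first and the recipe is left out, even though no keyword matching is involved.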