Concept · ~8 min read

What Is An Embedding

An embedding is a list of numbers (a vector) that represents the meaning of a piece of text. Texts with similar meanings have vectors that are close together in vector space.

Why this appears in interviews

Embeddings are the foundation of every RAG system, semantic search application, and recommendation system built on LLMs. Interviewers ask about embeddings to test whether you understand why retrieval works the way it does. If you cannot explain what an embedding is and why cosine similarity matters, you cannot design a retrieval system.

The mental model

Imagine plotting every word in the English language on a map. Words with similar meanings are placed near each other. "Dog" and "puppy" are neighbors. "King" and "queen" are neighbors. "Paris" and "France" are closer to each other than either is to "banana."

[Figure: A 2D scatter plot of word embeddings clustered by semantic similarity]

Now expand this beyond words to entire sentences and paragraphs. "How do I reset my password?" and "I forgot my login credentials" end up very close together on this map, even though they share almost no words. That proximity is the power of embeddings: they capture semantic meaning rather than exact word matches.

In practice, embeddings are just long lists of numbers (1536 numbers by default for OpenAI's text-embedding-3-small model). Those numbers encode meaning in a way that allows mathematical comparisons.
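The map analogy can be made concrete with a toy example. The sketch below places a few words at hand-picked 2D coordinates (invented purely for illustration, not produced by any model) and looks up each word's nearest neighbor by straight-line distance on that map. The names TOY_MAP and nearest_neighbor are hypothetical.

```python
import math

# Hand-placed 2D coordinates, invented for illustration only.
# Real embeddings are learned by a model and have hundreds or
# thousands of dimensions rather than two.
TOY_MAP = {
    "dog": (1.0, 1.0),
    "puppy": (1.2, 0.9),
    "king": (5.0, 4.0),
    "queen": (5.1, 4.3),
    "Paris": (8.0, 1.0),
    "France": (8.4, 1.5),
    "banana": (2.0, 8.0),
}


def nearest_neighbor(word):
    """Return the other word whose point lies closest to `word` on the toy map."""
    origin = TOY_MAP[word]
    others = (w for w in TOY_MAP if w != word)
    return min(others, key=lambda w: math.dist(origin, TOY_MAP[w]))


for word in ("dog", "king", "Paris"):
    print(word, "->", nearest_neighbor(word))
```

With these made-up coordinates, "dog" maps to "puppy", "king" to "queen", and "Paris" to "France". A real embedding model learns placements like this from data instead of having them written by hand.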

How embeddings are created

An embedding model (separate from the generation model) takes text as input and outputs a fixed-length vector:

"How do I reset my password?" → embedding model → [0.021, -0.453, 0.891, ... 1536 numbers]

The numbers themselves are meaningless individually. What matters is their relationship to other vectors: texts that mean similar things produce vectors that point in similar directions.
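As a rough sketch of what that step can look like in code, the function below requests embeddings from OpenAI's text-embedding-3-small model. It assumes the official openai Python package is installed and that an API key is available in the OPENAI_API_KEY environment variable. The function name embed_texts is hypothetical.

```python
def embed_texts(texts, model="text-embedding-3-small"):
    """Return one embedding (a list of floats) per input string.

    Assumes the official `openai` package is installed and an API key
    is available in the OPENAI_API_KEY environment variable.
    """
    # Imported lazily so this file can be loaded without the package installed.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment
    response = client.embeddings.create(model=model, input=texts)
    # The response holds one item per input text, each carrying its vector.
    return [item.embedding for item in response.data]
```

Calling embed_texts(["How do I reset my password?"]) should return a list containing one vector. Every vector has the same length regardless of how long the input text is.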

Measuring similarity — cosine similarity

The standard way to compare two embeddings is cosine similarity, which is the cosine of the angle between the two vectors. A score of 1.0 means identical direction (essentially the same meaning). A score of 0 means perpendicular (unrelated). A score of -1.0 means opposite direction.
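Written as a formula, for two vectors with n components each (the symbols a, b, n, and θ are notation introduced here for illustration):

```latex
\cos\theta
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2} \; \sqrt{\sum_{i=1}^{n} b_i^2}}
```

For example, with a = (1, 0) and b = (1, 1), the dot product is 1 and the lengths are 1 and √2, so the score is 1/√2 ≈ 0.707, which corresponds to an angle of 45 degrees.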

Why cosine similarity and not regular distance? Because the magnitude (length) of a vector can vary without affecting meaning — what matters is the direction. Cosine similarity ignores magnitude and compares only direction.
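A minimal sketch of this idea in plain Python, using small made-up vectors instead of real embeddings. The function name cosine_similarity and the example vectors are illustrative. The last line shows how Euclidean distance reacts to a change in length that cosine similarity ignores.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


v = [3.0, 4.0]
scaled = [30.0, 40.0]        # same direction, ten times the length
perpendicular = [-4.0, 3.0]  # at a right angle to v
opposite = [-3.0, -4.0]      # points the other way

print(cosine_similarity(v, scaled))         # same direction
print(cosine_similarity(v, perpendicular))  # unrelated
print(cosine_similarity(v, opposite))       # opposite direction
print(math.dist(v, scaled))                 # Euclidean distance is far from zero
```

The scaled vector scores the same as an identical one under cosine similarity, while its Euclidean distance from v is 45. That is the magnitude effect cosine similarity is designed to ignore.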

Choosing an embedding model

Key tradeoffs between common models:

  • OpenAI text-embedding-3-small — 1536 dims, low cost, good quality
  • OpenAI text-embedding-3-large — 3072 dims, higher cost, better quality
  • Cohere embed-v3 — 1024 dims, low cost, good quality
  • Local (nomic-embed) — 768 dims, free (self-hosted), decent quality

For most applications, text-embedding-3-small gives roughly 90% of the quality of text-embedding-3-large at roughly 20% of the cost.

Common interview mistakes

Mistake 1: Thinking the embedding model equals the generation model. The model you use to embed documents is different from the model you use to generate answers. They serve different purposes.

Mistake 2: Using the wrong embedding model for the use case. Some models are optimized for sentence-level similarity, others for longer documents. Some are multilingual. Match the model to the use case.

Mistake 3: Not thinking about dimensionality tradeoffs. More dimensions generally mean a more expressive vector and better quality, but also higher storage costs and slower search.
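The storage side of that tradeoff is simple arithmetic. The sketch below estimates raw vector storage under stated assumptions: each number is stored as a 4-byte float, the collection holds one million vectors, and index overhead and metadata are ignored. The function name index_size_gb and the collection size are hypothetical.

```python
def index_size_gb(num_vectors, dims, bytes_per_value=4):
    """Raw storage for the vectors alone, in decimal gigabytes (10**9 bytes)."""
    return num_vectors * dims * bytes_per_value / 1e9


# Dimension counts of the models listed in this article.
for dims in (768, 1024, 1536, 3072):
    size = index_size_gb(1_000_000, dims)
    print(f"{dims} dims: {size:.2f} GB")
```

Under these assumptions, one million 1536-dimension vectors take about 6.1 GB, and doubling to 3072 dimensions doubles that to about 12.3 GB. Search cost tends to grow with dimensionality as well, since every comparison touches every number in the vector.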

Key vocabulary

  • Vector — An ordered list of numbers that represents something in mathematical space.
  • Cosine similarity — A measure of similarity between two vectors based on the angle between them. Values from -1 to 1.
  • Embedding model — A model that converts text into vectors. Separate from the generation model.
  • Semantic search — Search that finds results based on meaning rather than exact keyword matching.
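
The terms above fit together in a retrieval step, sketched here with toy data. The 3-number vectors are invented stand-ins for what an embedding model would return, and the names DOCS, retrieve, and build_prompt are hypothetical. The point is the shape of the pipeline: vectors are compared by cosine similarity to find relevant text, and only afterwards is that text handed to a separate generation model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Document text mapped to a toy vector. In a real system these vectors
# would come from an embedding model, not be written by hand.
DOCS = {
    "To reset your password, open Settings and choose Security.": [0.9, 0.1, 0.0],
    "Our office is closed on public holidays.": [0.0, 0.2, 0.9],
    "Invoices are emailed on the first of each month.": [0.1, 0.9, 0.2],
}


def retrieve(query_vector, docs, k=1):
    """Semantic search: return the k documents most similar to the query vector."""
    ranked = sorted(
        docs,
        key=lambda text: cosine_similarity(query_vector, docs[text]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages):
    """Assemble the text that a separate generation model would receive."""
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {question}"


question = "I forgot my login credentials"
query_vector = [0.8, 0.2, 0.1]  # pretend embedding of the question

top_passages = retrieve(query_vector, DOCS, k=1)
print(build_prompt(question, top_passages))
```

With these toy vectors the password document ranks first even though the question never uses the word "password". That is the behavior semantic search relies on.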