An embedding is a list of numbers (a vector) that represents the meaning of a piece of text: texts with similar meanings have vectors that are close together in mathematical space.
Why this appears in interviews
Embeddings are the foundation of most retrieval-augmented generation (RAG) systems, semantic search applications, and recommendation systems built on LLMs. Interviewers ask about embeddings to test whether you understand why retrieval works the way it does. If you cannot explain what an embedding is and why cosine similarity matters, you cannot design a retrieval system.
The mental model
Imagine plotting every word in the English language on a map. Words with similar meanings are placed near each other. "Dog" and "puppy" are neighbors. "King" and "queen" are neighbors. "Paris" and "France" are closer to each other than either is to "banana."
Now expand this beyond words to entire sentences and paragraphs. "How do I reset my password?" and "I forgot my login credentials" end up very close together on this map, even though they share no keywords. That proximity is the power of embeddings: they capture semantic meaning rather than exact word matches.
In practice, embeddings are just long lists of numbers (1536 numbers by default for OpenAI's text-embedding-3-small model). Those numbers encode meaning in a way that allows mathematical comparisons.
How embeddings are created
An embedding model (separate from the generation model) takes text as input and outputs a fixed-length vector:
"How do I reset my password?" → embedding model → [0.021, -0.453, 0.891, ... 1536 numbers]
The numbers themselves are meaningless individually. What matters is their relationship to other vectors: texts that mean similar things produce vectors that point in similar directions.
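The "text in, fixed-length vector out" contract can be sketched without calling any real model. The function below is a hypothetical toy stand-in (the name `toy_embed` and the 8-dimension size are assumptions made for illustration): it hashes words into slots, so it reflects word overlap only and not meaning. It is here only to show that inputs of any length produce vectors of the same length.

```python
import hashlib
import math

def toy_embed(text: str, dims: int = 8) -> list[float]:
    """Toy stand-in for an embedding model (NOT semantic).

    Each word is hashed into one of `dims` slots, so the output
    length is fixed no matter how long the input text is.
    """
    vector = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        index = digest[0] % dims                     # slot this word lands in
        sign = 1.0 if digest[1] % 2 == 0 else -1.0   # spread values around zero
        vector[index] += sign
    # Scale to unit length so only the direction carries information.
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else vector

short = toy_embed("reset password")
long = toy_embed("How do I reset my password if I no longer have access to my email?")
print(len(short), len(long))  # 8 8
```

A real embedding model replaces the hashing step with a trained neural network, which is what lets differently worded texts with the same meaning land near each other.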
Measuring similarity — cosine similarity
The standard way to compare two embeddings is cosine similarity, the cosine of the angle between two vectors. A score of 1.0 means identical direction (same meaning). A score of 0 means perpendicular (unrelated). A score of -1.0 means opposite direction.
Why cosine similarity and not regular distance? Because the magnitude (length) of a vector can vary without affecting meaning — what matters is the direction. Cosine similarity ignores magnitude and compares only direction.
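As a minimal sketch of the definition above, cosine similarity is the dot product of two vectors divided by the product of their lengths. The three-number vectors here are made-up illustrations, not real embeddings; the example also shows that scaling a vector changes its Euclidean distance to the original but not its cosine similarity.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]   # v scaled by 2: longer, same direction
perpendicular = [3.0, 0.0, -1.0]   # dot product with v is 0
opposite = [-1.0, -2.0, -3.0]      # v flipped

print(round(cosine_similarity(v, same_direction), 6))  # 1.0
print(round(cosine_similarity(v, perpendicular), 6))   # 0.0
print(round(cosine_similarity(v, opposite), 6))        # -1.0

# Euclidean distance does depend on magnitude: the scaled copy of v
# is not at distance 0 even though it points the same way.
print(round(math.dist(v, same_direction), 2))          # 3.74
```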
Choosing an embedding model
Key tradeoffs between common embedding models:
- OpenAI text-embedding-3-small — 1536 dims, low cost, good quality
- OpenAI text-embedding-3-large — 3072 dims, higher cost, better quality
- Cohere embed-v3 — 1024 dims, low cost, good quality
- Local (nomic-embed) — 768 dims, free (self-hosted), decent quality
As a rule of thumb, text-embedding-3-small gives roughly 90% of the quality of text-embedding-3-large at roughly 20% of the cost, which makes it a sensible default for most applications.
Common interview mistakes
Mistake 1: Thinking the embedding model is the generation model. The model you use to embed documents is different from the model you use to generate answers. They serve different purposes.
Mistake 2: Using the wrong embedding model for the use case. Some embedding models are optimized for sentence-level similarity, others for longer documents. Some are multilingual. Match the model to the use case.
Mistake 3: Not thinking about dimensionality tradeoffs. Higher-dimensional vectors are more expressive and usually give better retrieval quality, but they cost more to store and are slower to search.
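The storage side of that tradeoff is simple arithmetic. The sketch below assumes each number is stored as a 4-byte float32 and counts raw vectors only (no index structures or metadata); the corpus size of one million vectors is a hypothetical chosen for illustration, and `index_size_gb` is an illustrative helper name.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as 32-bit floats

def index_size_gb(num_vectors: int, dims: int) -> float:
    """Raw vector storage in decimal gigabytes, excluding index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9

# Dimensions taken from the model list in this section.
models = [
    ("nomic-embed", 768),
    ("Cohere embed-v3", 1024),
    ("text-embedding-3-small", 1536),
    ("text-embedding-3-large", 3072),
]

for name, dims in models:
    print(f"{name}: {index_size_gb(1_000_000, dims):.2f} GB")
# nomic-embed: 3.07 GB
# Cohere embed-v3: 4.10 GB
# text-embedding-3-small: 6.14 GB
# text-embedding-3-large: 12.29 GB
```

Doubling the dimension count doubles raw storage, and similarity computations touch every dimension, so search cost grows with it as well.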
Key vocabulary
- Vector — An ordered list of numbers that represents something in mathematical space.
- Cosine similarity — A measure of similarity between two vectors based on the angle between them. Values from -1 to 1.
- Embedding model — A model that converts text into vectors. Separate from the generation model.
- Semantic search — Search that finds results based on meaning rather than exact keyword matching.