Math foundations
Dot Product and Cosine Similarity: How Aligned Are Two Vectors?
The idea
Search, recommendations, and RAG retrieval rank candidates by similarity to a query vector. Dot product counts overlapping signal and rewards long vectors. Cosine similarity divides by length so a short query is not drowned out by a long document.
Dot product answers the question: how much do these vectors point in the same direction, weighted by their magnitudes? Cosine similarity removes the magnitude part and keeps only the direction.
Example: dot product vs cosine similarity
The query and the document are represented as word-count vectors. Dot product rewards both overlap and length; cosine measures only the angle between the two directions, which removes the length bias.
| Vector | refund | return | policy | shipping |
|---|---|---|---|---|
| Query: refund policy | 3 | 1 | 2 | 0 |
| Doc B: Returns FAQ | 2 | 4 | 3 | 1 |
| Metric | Value |
|---|---|
| Dot product | 16.00 |
| Cosine | 0.78 |
| ‖query‖ | 3.74 |
| ‖candidate‖ (Doc B) | 5.48 |
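The dot product of 16.00 grows with vector length, so a longer document scores higher even when its direction is unchanged. The cosine of 0.78 depends only on direction and corresponds to an angle of about 39° between the query and Doc B. Use cosine when document or profile length varies.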
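The numbers in the example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `dot`, `norm`, and `cosine` are illustrative and do not come from any particular library.

```python
import math

# Word counts over the vocabulary (refund, return, policy, shipping),
# copied from the table above.
query = [3, 1, 2, 0]
doc_b = [2, 4, 3, 1]

def dot(u, v):
    """Multiply matching coordinates and add the products."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))

def cosine(u, v):
    """Dot product divided by the product of the two lengths."""
    return dot(u, v) / (norm(u) * norm(v))

print(dot(query, doc_b))               # 16
print(round(norm(query), 2))           # 3.74
print(round(norm(doc_b), 2))           # 5.48
print(round(cosine(query, doc_b), 2))  # 0.78
```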
The math
Dot product
Multiply matching coordinates and add up the products. The result is high when both vectors are large on the same features.
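Written out, the rule looks as follows. The symbols are assumptions chosen for illustration: q is the query vector, d is the document vector, and n is the number of features (four in the example table).

```latex
\[
\mathbf{q} \cdot \mathbf{d} = \sum_{i=1}^{n} q_i \, d_i
\]
\[
(3, 1, 2, 0) \cdot (2, 4, 3, 1) = 3 \cdot 2 + 1 \cdot 4 + 2 \cdot 3 + 0 \cdot 1 = 16
\]
```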
Cosine similarity
Ranges from −1 to 1 for real-valued embeddings and from 0 to 1 for nonnegative vectors such as text counts. 1 means the same direction, 0 means orthogonal (no shared signal), and −1 means opposite directions.
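As a formula, cosine similarity is the dot product divided by the product of the two vector lengths. The symbols are assumptions chosen for illustration: q is the query vector, d is the document vector, and θ is the angle between them. The second line plugs in the values from the example table.

```latex
\[
\cos\theta
  = \frac{\mathbf{q} \cdot \mathbf{d}}{\lVert \mathbf{q} \rVert \, \lVert \mathbf{d} \rVert},
\qquad
\lVert \mathbf{q} \rVert = \sqrt{\sum_{i=1}^{n} q_i^2}
\]
\[
\cos\theta
  = \frac{16}{\sqrt{14} \, \sqrt{30}}
  \approx \frac{16}{3.74 \times 5.48}
  \approx 0.78
\]
```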
Unit vectors
Many embedding pipelines normalize vectors to unit length first, so that the dot product equals the cosine similarity.
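The equivalence can be checked directly. The sketch below uses only the standard library; `normalize` is a hypothetical helper, and rejecting the zero vector is one possible design choice rather than a standard convention.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length; the zero vector has no direction."""
    length = math.sqrt(dot(v, v))
    if length == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / length for x in v]

# Query and Doc B from the example table.
query = [3, 1, 2, 0]
doc_b = [2, 4, 3, 1]

q_hat = normalize(query)
d_hat = normalize(doc_b)

# Cosine computed the long way, on the raw vectors.
cosine = dot(query, doc_b) / (math.sqrt(dot(query, query)) * math.sqrt(dot(doc_b, doc_b)))

# Plain dot product of the unit vectors.
unit_dot = dot(q_hat, d_hat)

print(math.isclose(unit_dot, cosine))  # True
```

Normalizing once at indexing time means each query only needs a dot product, which is why many vector stores are set up this way.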
A simple application
Pick cosine for search and duplicate detection when document length varies. Pick dot product when scores are already calibrated probabilities on the same scale. Always state which metric ranked your top-k results.
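To see why the choice of metric matters for top-k results, here is a small ranking sketch. Only `returns_faq` comes from the example table; `short_refund_note` and `shipping_guide` are hypothetical documents invented for illustration, and `top_k` is an illustrative helper, not a library function.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def top_k(query, docs, metric, k=2):
    """Score every document with the given metric and return the k best."""
    scored = [(name, metric(query, vec)) for name, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Features: (refund, return, policy, shipping)
query = [3, 1, 2, 0]
docs = {
    "returns_faq": [2, 4, 3, 1],        # Doc B from the table
    "short_refund_note": [1, 0, 1, 0],  # hypothetical short, on-topic doc
    "shipping_guide": [0, 1, 1, 6],     # hypothetical long, off-topic doc
}

for label, metric in (("dot", dot), ("cosine", cosine)):
    ranked = top_k(query, docs, metric)
    # Report the metric name next to the ranking it produced.
    print(label, [(name, round(score, 2)) for name, score in ranked])
```

With these made-up counts, dot product ranks the longer `returns_faq` first, while cosine ranks the shorter `short_refund_note` first, so the same query yields a different top result depending on the metric.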