Math foundations
Vectors and Features: One Row Is a Point in Space
The idea
A spreadsheet row is already a vector. Each column is a dimension: spend, sessions, tenure, return rate. The cell values are coordinates. Machine learning, regression, and search all treat entities as points in this feature space.
The same customer can look very different depending on which columns you include. Feature choice is as much a part of the model as the algorithm is.
Vectors answer: How do we represent one entity as a list of numbers we can compare or predict from?
Example: one row as a feature vector
One customer is one vector, a single point in feature space. The four columns (orders, spend, sessions, days since last visit) are the dimensions, and the cell values are the coordinates.
| Entity | Orders (30d) | Spend ($) | Sessions | Days idle |
|---|---|---|---|---|
| Customer #4821 | 4 | 220 | 12 | 3 |
Dimensions: 4
L2 norm: 220.4
Customer #4821 is a 4-dimensional vector, and each column is a coordinate. In raw units its L2 norm (length) is 220.4, almost all of which comes from the $220 spend value, so scale the features before comparing rows or lengths across tables.
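The numbers in this example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names (`feature_names`, `customer_4821`, `l2_norm`) are illustrative choices, not part of the original post.

```python
import math

# Customer #4821 from the table above, in a fixed column order.
feature_names = ["orders_30d", "spend_usd", "sessions", "days_idle"]
customer_4821 = [4.0, 220.0, 12.0, 3.0]


def l2_norm(vector):
    """Euclidean length: square root of the sum of squared coordinates."""
    return math.sqrt(sum(value * value for value in vector))


norm = l2_norm(customer_4821)

# Fraction of the squared length contributed by the spend column alone.
spend_share = customer_4821[1] ** 2 / norm ** 2

print(len(customer_4821))  # number of dimensions
print(round(norm, 1))      # L2 norm in raw units
print(f"{spend_share:.1%} of the squared length comes from spend")
```

The last line shows why raw units are a problem: the spend column accounts for over 99% of the squared length, so the other three features barely register until the columns are scaled.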
The math
Feature vector
n features, n coordinates. Order matters: position i always maps to the same column.
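In symbols, one entity with n features can be written as below. The symbols $\mathbf{x}$ and $x_i$ are assumed here for illustration; the post itself does not fix a notation.

$$
\mathbf{x} = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n
$$

For Customer #4821, with the columns in table order, this gives $\mathbf{x} = (4, 220, 12, 3)$ and $n = 4$.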
L2 norm (length)
Combines all coordinates into a single number. Raw units mix dollars and counts, so the column with the largest values dominates; normalize before comparing lengths across tables.
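For a vector $\mathbf{x}$ with coordinates $x_1, \dots, x_n$ (notation assumed for illustration), the standard definition of the L2 norm is:

$$
\lVert \mathbf{x} \rVert_2 = \sqrt{\sum_{i=1}^{n} x_i^2}
$$

Applied to Customer #4821:

$$
\lVert \mathbf{x} \rVert_2 = \sqrt{4^2 + 220^2 + 12^2 + 3^2} = \sqrt{16 + 48400 + 144 + 9} = \sqrt{48569} \approx 220.4
$$

The spend term contributes 48400 of the 48569 under the square root, which is why the length is nearly identical to the spend value.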
Data table form
m rows (samples), n columns (features). Regression, clustering, and embeddings all start from this layout.
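The table layout and one common way to scale it can be sketched as follows. Two things here are assumptions: the post does not say which scaling method to use, so z-score standardization is chosen as an example, and only the first row comes from the post. The other two rows are invented purely for illustration.

```python
import statistics

columns = ["orders_30d", "spend_usd", "sessions", "days_idle"]

# Row 0 is Customer #4821 from the example above.
# Rows 1 and 2 are hypothetical customers, made up for this sketch.
table = [
    [4.0, 220.0, 12.0, 3.0],
    [1.0, 40.0, 3.0, 25.0],
    [7.0, 610.0, 30.0, 1.0],
]

m, n = len(table), len(table[0])  # m samples (rows), n features (columns)


def standardize(rows):
    """Z-score each column: subtract its mean, divide by its standard deviation.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    cols = list(zip(*rows))
    means = [statistics.fmean(col) for col in cols]
    stdevs = [statistics.pstdev(col) for col in cols]
    return [
        [(value - mu) / sd for value, mu, sd in zip(row, means, stdevs)]
        for row in rows
    ]


scaled = standardize(table)

print(m, n)
for name, col in zip(columns, zip(*scaled)):
    # After scaling, every column has mean 0 and standard deviation 1.
    print(name, [round(value, 2) for value in col])
```

After standardization every column is on the same unitless scale, so dollars no longer outweigh counts when lengths or distances are computed.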
A simple application
Before you compare customers or rank documents, write down the feature list. Changing one column changes the geometry. The dot product and regression posts assume you know what each coordinate means.
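To make "changing one column changes the geometry" concrete, here is a small sketch in which the nearest neighbor of Customer #4821 flips when one column is added to the feature list. Customers `A` and `B` are hypothetical, and the helper names `to_vector` and `nearest` are illustrative only.

```python
import math

# "#4821" uses values from the example table; "A" and "B" are hypothetical.
customers = {
    "#4821": {"orders_30d": 4, "sessions": 12, "days_idle": 3},
    "A": {"orders_30d": 4, "sessions": 11, "days_idle": 30},
    "B": {"orders_30d": 6, "sessions": 15, "days_idle": 4},
}


def to_vector(record, feature_list):
    """Turn a record into a vector; position i always maps to feature_list[i]."""
    return [record[name] for name in feature_list]


def nearest(query_name, feature_list):
    """Return the other customer closest to the query in Euclidean distance."""
    query = to_vector(customers[query_name], feature_list)
    others = [name for name in customers if name != query_name]
    return min(
        others,
        key=lambda name: math.dist(query, to_vector(customers[name], feature_list)),
    )


without_idle = nearest("#4821", ["orders_30d", "sessions"])
with_idle = nearest("#4821", ["orders_30d", "sessions", "days_idle"])

print(without_idle)  # closest customer when days_idle is left out
print(with_idle)     # closest customer once days_idle is included
```

With only orders and sessions, customer A is the closest match. Once days idle is included, A's 30 idle days push it far away and B becomes the nearest neighbor, even though no data changed.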