Sandhya Indurkar

Math foundations

Vectors and Features: One Row Is a Point in Space

Feature vector as a point in multi-dimensional space

The idea

A spreadsheet row is already a vector. Each column is a dimension: spend, sessions, tenure, return rate. The cell values are coordinates. Machine learning, regression, and search all treat entities as points in this feature space.

The same customer can look very different depending on which columns you include. Feature choice is part of the model, not just the algorithm.

Vectors answer: How do we represent one entity as a list of numbers we can compare or predict from?

Example: one row as a feature vector

Each entity is a point in feature space. Dimensions are columns; coordinates are cell values.

One customer = one vector: orders, spend, sessions, days since last visit.

EntityOrders (30d)Spend ($)SessionsDays idle
Customer #48214220123

Dimensions

4

L2 norm

220.4

Customer #4821 is a 4-dimensional vector. Each column is a coordinate. The L2 norm (length) is 220.4 in raw units, so scale features before comparing rows across tables.

The math

Feature vector

v = [x1, x2, …, xn]

n features, n coordinates. Order matters: position i always maps to the same column.

L2 norm (length)

||v|| = √(x1² + x2² + … + xn²)

Combines all coordinates into one scale. Raw units mix dollars and counts, so normalize before comparing lengths across tables.

Data table form

design matrix X: each row = one observation

m rows (samples), n columns (features). Regression, clustering, and embeddings all start from this layout.

A simple application

Before you compare customers or rank documents, write down the feature list. Changing one column changes the geometry. The dot product and regression posts assume you know what each coordinate means.