Math, Applied

Multicollinearity: When Two Drivers Share the Same Story

Two nearly parallel feature vectors in regression

The idea

Regression wants to split credit across inputs. When ad spend and branded search move together, the model cannot tell which lever caused the lift. Coefficients become unstable: small data changes flip signs or swap magnitude.

Multicollinearity answers: Are my inputs redundant enough that driver rankings are not trustworthy?

Example: overlapping features in one regression

Drag feature overlap. When two inputs move together, coefficient credit becomes unstable.

Ad spend and branded search move together. Coefficients flip sign when both are in the model.

Feature overlap: 85%

Ad spend coef

0.50

Branded search coef

1.26

Ad spend and Branded search correlate at ~78%. Coefficients swap magnitude and can change sign with small data changes. Drop one feature or combine them.

The math

Collinear features

high correlation between columns of X

Columns of the design matrix point in nearly the same direction in feature space.

Unstable β

coefficients sensitive to small data changes

Drop one row or add a week and β1, β2 can swing even when predictions stay similar.

What to do

fix: drop, combine, or regularize

Keep one of the twins, build a composite index, or use ridge penalty. Do not interpret each coefficient as a clean causal lever.

A simple application: marketing attribution

Two channels rise in the same campaign weeks. The regression still forecasts, but coefficient signs are not a budget allocation guide. Check correlation before you present driver slides to leadership.

Marketing mix: correlated drivers in one regression

Move overlap between ad spend and branded search. Coefficients become unstable when both rise together.

Feature overlap (%): 85%

Ad coef 0.50 · Search coef 1.26 · r ≈ 78%

Regression coefficients

Ad spend: 0.50 · Branded search: 1.26

Correlation

78%

Stability

Unstable

Overlap

85%

Optimize (move here)

• Plot feature correlation before driver regression
• Combine collinear channels into one index

Hold (do not over-react)

• Splitting budget by raw coefficients when r > 0.8

Escalate if

• Correlation between drivers exceeds 85%
• Coefficient sign flips across weekly refits

Driver credit is unreliable. Drop one channel or build a composite before budget slides.

The habit: plot feature correlation before trusting multi-driver regression. Pair with overfitting when you have many correlated KPIs and short history.