Sandhya Indurkar

Math, Applied

Regression for Prediction: One Real Example from Data to Decisions

Regression line and forecast visual

The idea

In business data work, regression usually means building a rule that turns inputs into a predicted number. You choose inputs you can measure or control (ad spend, email volume, price, staffing hours). The model learns weights called parameters. Then you use it to forecast, compare plans, and tune budgets.

This is different from the phrase regression to the mean, which describes how extreme scores often move back toward average over time. Same word, different idea. Here we focus on predictive regression.

Regression answers: If inputs look like X, what outcome should we expect?

One real example: weekly orders for an online brand

A growth team wants to predict weekly orders from marketing inputs. They have ten weeks of history. Eight weeks are used to learn the pattern. The last two weeks are set aside as a check, and nobody trusts the forecast until it has been scored against them.

What are train and holdout? (plain language)

Train means the weeks the model learned from (like studying past quizzes before a test). Holdout means weeks you hide on purpose and only peek at after the forecast is written (like a final exam you did not use to study). If the forecast is close on holdout weeks, you can use it for next week's budget. If not, widen your range or fix the inputs.


Predicted orders ≈ 127 + 13.9 × (ad $k) + 1.6 × (email k)

The math

Predictive regression fits a line (or plane) that turns inputs you control into a forecast for the outcome you care about.

Linear regression model

predicted orders = b₀ + b₁ × (ad spend, $k) + b₂ × (email sends, k)

b₀ is the baseline intercept. b₁ is how many extra orders each $1k of ad spend adds, holding email volume flat. b₂ is how many orders each 1k emails adds, holding ad spend flat. The explorer and Table 4 show these as business levers.
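To make the fitting step concrete, here is a minimal sketch using NumPy's least-squares solver. The eight weeks of inputs are made up for illustration and are not the team's data. The orders are generated from the rounded equation shown earlier with no noise added, so the fit should simply recover those three numbers. Real history is noisy and the recovered weights would only be approximate.

```python
import numpy as np

# Hypothetical training weeks (NOT the article's data): ad spend in $k, email sends in k.
ad_k = np.array([12, 15, 14, 18, 16, 20, 17, 19], dtype=float)
email_k = np.array([80, 85, 90, 88, 95, 92, 84, 98], dtype=float)

# Orders generated from the rounded equation above, with no noise (an assumption
# made so the example is easy to check).
orders = 127 + 13.9 * ad_k + 1.6 * email_k

# Design matrix: a column of ones for the intercept b0, then one column per input.
X = np.column_stack([np.ones_like(ad_k), ad_k, email_k])

# Ordinary least squares: choose b0, b1, b2 to minimize the squared residuals.
coef, *_ = np.linalg.lstsq(X, orders, rcond=None)
b0, b1, b2 = coef
print(f"b0={b0:.1f}, b1={b1:.1f}, b2={b2:.1f}")
```

The column of ones is what lets the model learn a baseline. Without it, the line would be forced through zero orders at zero spend and zero emails.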

Prediction error

residual = actual − predicted

Each week has a gap between what happened and what the model expected. Small, random residuals on holdout weeks mean the model generalizes. Large or patterned residuals mean you are missing a driver or overfitting history.

Why holdout matters

holdout error = average |residual| on weeks the model never trained on

Training error can almost always be pushed down by adding parameters, so a good fit on history proves little by itself. Holdout error is the honest check: can the model forecast new weeks it has not seen?

Coefficients turn inputs into forecast movement: higher b₁ means each ad dollar buys more orders. One outlier week can tilt the whole line; toggle it in the explorer to see the shift. Adding inputs can help or overfit when history is short. Trust a coefficient only if it holds on holdout weeks and the input is something you can actually change. This is not regression to the mean: here you are forecasting from levers, not watching extremes snap back over time.
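The outlier point is easy to see in a few lines. The sketch below fits an ad-spend-only line twice on made-up weeks: once clean, and once with a single week inflated by 150 orders (say, a one-off press mention). All numbers here are hypothetical, chosen only to show the direction of the effect.

```python
import numpy as np

# Hypothetical weeks (NOT the article's data): ad spend in $k.
ad_k = np.array([12, 13, 14, 15, 16, 17, 18, 20], dtype=float)

# Clean orders lie exactly on a line with slope 13.9 (assumed for illustration).
clean_orders = 271 + 13.9 * ad_k

# Same weeks, but the highest-spend week gets 150 extra orders from a one-off event.
outlier_orders = clean_orders.copy()
outlier_orders[-1] += 150

# np.polyfit with degree 1 returns (slope, intercept).
slope_clean, intercept_clean = np.polyfit(ad_k, clean_orders, 1)
slope_outlier, intercept_outlier = np.polyfit(ad_k, outlier_orders, 1)

print(f"slope without outlier: {slope_clean:.1f} orders per $1k")
print(f"slope with outlier:    {slope_outlier:.1f} orders per $1k")
```

Because the unusual week sits at the far end of the spend range, it has a lot of leverage and the slope rises sharply. A budget built on the tilted line would overstate what each extra $1k buys.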

Business planning: three budget scenarios

The explorer sliders are for play. The table below is what you would put in a leadership memo: conservative, planned, and aggressive marketing, with forecast orders and estimated revenue.

Table 1: Next-week scenarios (full model, no outlier week)
Plan | Ad ($k) | Email (k) | Forecast orders | Est. revenue | Vs planned
Conservative | 14 | 82 | 454 | $22,700 | -55 orders vs planned
Planned | 17 | 90 | 509 | $25,450 | Baseline plan
Aggressive | 21 | 96 | 574 | $28,700 | +65 orders vs planned
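A scenario table like this is just the regression equation applied to each plan. The sketch below uses the rounded coefficients displayed earlier (127, 13.9, 1.6) and assumes $50 of revenue per order, a figure inferred from Table 1 ($22,700 / 454 orders) rather than stated in the text. Because the displayed coefficients are rounded, the results land within about two orders of the table rather than matching it exactly.

```python
def predict_orders(ad_k: float, email_k: float) -> float:
    """Forecast weekly orders from ad spend ($k) and email sends (k), rounded coefficients."""
    return 127 + 13.9 * ad_k + 1.6 * email_k

# Assumption: revenue per order inferred from Table 1, not given directly.
REVENUE_PER_ORDER = 50

# Plan inputs as listed in Table 1: (ad spend in $k, email sends in k).
plans = {"Conservative": (14, 82), "Planned": (17, 90), "Aggressive": (21, 96)}

baseline = predict_orders(*plans["Planned"])
rows = []
for name, (ad_k, email_k) in plans.items():
    orders = predict_orders(ad_k, email_k)
    rows.append((name, orders, orders * REVENUE_PER_ORDER, orders - baseline))

for name, orders, revenue, delta in rows:
    print(f"{name:<13} orders={orders:6.1f}  revenue=${revenue:,.0f}  vs planned={delta:+.1f}")
```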

Reality check: did the forecast survive new weeks?

Weeks 9 and 10 were not used to draw the regression line. This table is the business read: how far off was the forecast, and should we still trust the model for planning?

Table 2: Holdout weeks (forecast written before seeing these results)
Week | Actual orders | Forecast | Gap | How to read it
Week 9 | 425 | 433 | 8 below forecast | Close enough to use for planning
Week 10 | 541 | 536 | 5 above forecast | Close enough to use for planning
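The gaps in Table 2 are the residuals defined earlier (actual minus predicted), and averaging their absolute values gives the holdout error. A short sketch with the two holdout weeks from the table:

```python
# Holdout weeks as listed in Table 2: (actual orders, forecast orders).
holdout = {"Week 9": (425, 433), "Week 10": (541, 536)}

# residual = actual - predicted; negative means the week came in below forecast.
residuals = {week: actual - forecast for week, (actual, forecast) in holdout.items()}

# holdout error = average |residual| over weeks the model never trained on.
mae = sum(abs(r) for r in residuals.values()) / len(residuals)

print(residuals)  # {'Week 9': -8, 'Week 10': 5}
print(mae)        # 6.5
```

An average miss of 6.5 orders is in line with the 7 orders listed for the ad spend + email model in Table 3, assuming that figure is rounded.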

Which inputs belong in the model?

Tuning is deciding which levers to include. Fewer inputs are simpler. More inputs can help, or can overfit when you only have a handful of weeks. The table compares average miss on history vs on the reality-check weeks.

Table 3: Which approach to trust for next week
Approach | Avg miss on history | Avg miss on reality check | Recommendation
Ad spend only | 3 orders | 5 orders | Strong for next-week forecast
Email only | 5 orders | 18 orders | Strong for next-week forecast
Ad spend + email | 2 orders | 7 orders | Strong for next-week forecast
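A comparison like Table 3 can be produced by fitting each candidate set of inputs on the training weeks and scoring it on both history and the held-out weeks. The sketch below shows one way to do that. The ten weeks of data are made up for illustration, so the printed numbers will not match the table. Only the procedure is the point.

```python
import numpy as np

# Hypothetical ten weeks (made up; NOT the article's data).
ad_k = np.array([12, 15, 14, 18, 16, 20, 17, 19, 15, 21], dtype=float)
email_k = np.array([80, 85, 90, 88, 95, 92, 84, 98, 86, 94], dtype=float)
orders = np.array([420, 470, 468, 515, 505, 552, 500, 548, 475, 570], dtype=float)

# First eight weeks train the model; the last two are the reality check.
train, holdout = slice(0, 8), slice(8, 10)

def avg_miss(columns, rows_fit, rows_score):
    """Fit least squares on rows_fit, return the mean absolute miss on rows_score."""
    X = np.column_stack([np.ones(len(orders))] + columns)
    coef, *_ = np.linalg.lstsq(X[rows_fit], orders[rows_fit], rcond=None)
    return float(np.mean(np.abs(orders[rows_score] - X[rows_score] @ coef)))

approaches = {
    "Ad spend only": [ad_k],
    "Email only": [email_k],
    "Ad spend + email": [ad_k, email_k],
}

# For each approach: (average miss on history, average miss on reality check).
results = {
    name: (avg_miss(cols, train, train), avg_miss(cols, train, holdout))
    for name, cols in approaches.items()
}

for name, (history_miss, check_miss) in results.items():
    print(f"{name}: history {history_miss:.1f}, reality check {check_miss:.1f}")
```

The warning sign to look for is a model whose miss on history is small while its miss on the reality check is several times larger.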

What one unit of spend buys you

Regression coefficients turn into business levers. From the planned week baseline ($17k ads, 90k emails), here is the incremental lift if you move one input and hold the other flat.

Table 4: Incremental lift from the regression line
If you change… | Order impact | Revenue impact
Add $1k ad spend (hold email flat) | +13 orders | $650
Add 5k email sends (hold ad flat) | +8 orders | $400

Regression connects the math you already use to daily decisions. Correlation shows movement together. Sample size tells you how much to trust a read. Regression adds a structured forecast and parameters you can explain to finance and marketing.

Strong teams show the line, the holdout check, and scenario tables. Weak teams report only a single predicted number with no sense of error, outliers, or alternatives.

Regression is a large topic, but one worked example carries most of the habit. Start with a clear outcome, pick inputs you can defend, fit on enough history, keep some data back to test, watch how outliers pull the line, then use predictions for planning. That workflow is what regression is for in real work.

A simple application: spend forecast

Move weekly ad spend. See predicted orders and holdout error before you lock budget.

Spend forecast explorer: fit history, score holdout

Example reading at $65k ad spend: predicted orders 1,580; error 4% on train, about 8% on holdout.

Optimize (move here)

  • Hold out recent weeks before trusting fit
  • Stress-test spend outside training range

Hold (do not over-react)

  • Budget from in-sample R² alone

Escalate if

  • Holdout error doubles when spend steps up

Holdout error is workable at this spend level.
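The checklist above can be turned into a small pre-budget check. This is a sketch, not a standard: the function name, the training spend range, and the error figures in the usage lines are all hypothetical, and "doubles" is read here as "at least twice the earlier holdout error".

```python
def budget_flags(planned_spend_k, train_spend_k, holdout_err_before, holdout_err_after):
    """Return warnings to review before locking a budget (hypothetical helper)."""
    flags = []
    # Stress-test: spend outside the range seen in training means extrapolating the line.
    if not (min(train_spend_k) <= planned_spend_k <= max(train_spend_k)):
        flags.append("extrapolating: planned spend is outside the training range")
    # Escalate: holdout error at least doubles when spend steps up.
    if holdout_err_after >= 2 * holdout_err_before:
        flags.append("escalate: holdout error at least doubled")
    return flags

# Hypothetical training range of $40k to $60k, with holdout error moving from 8% to 17%.
print(budget_flags(65, [40, 45, 50, 55, 60], 0.08, 0.17))

# Spend inside the range and a stable holdout error raise no flags.
print(budget_flags(55, [40, 45, 50, 55, 60], 0.08, 0.09))
```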