The rookie is on a roll and fits a regression to everything. This time, knowing the right answer means
knowing when simple linear regression (SLR) is the wrong tool.
Rookie's submission
Model: wins ~ jersey colour · n = 12 teams
x = jersey_colour: Red, Blue, White, Green, Black, ...
y = wins
fit <- lm(wins ~ jersey_colour)
"Estimated slope = 3.2"
"So switching to red would add about 3 wins."
1 · What's the fundamental problem here?
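SLR fits ŷ = β₀ + β₁x, which multiplies x by the slope β₁. That only makes sense if x is numeric, so that a one-unit increase in x means something. Jersey colour is a nominal category: "Red, Blue, White" has no numeric value to multiply and no natural order. A "slope of 3.2" is therefore meaningless as wins per unit of colour; at best it is a gap in mean wins between one colour and a baseline colour.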
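As a rough sketch of why no single slope exists, the R snippet below fits the rookie's model to a small made-up data set. The `teams` data frame, its four colours, and every win count are invented for illustration and are not the rookie's data. Assuming R's default treatment contrasts, `lm()` dummy-codes the colour and returns an intercept plus one coefficient per non-baseline colour.

```r
# Hypothetical data, invented purely for illustration: 12 teams, 4 colours.
teams <- data.frame(
  jersey_colour = rep(c("Red", "Blue", "White", "Green"), each = 3),
  wins = c(10, 12, 11, 8, 9, 7, 9, 10, 8, 6, 7, 8)
)

# Same call as the rookie's, with the data frame made explicit.
fit <- lm(wins ~ jersey_colour, data = teams)

# lm() treats the text column as a factor and dummy-codes it.
# The first level alphabetically (Blue) becomes the baseline, so the result is
# an intercept (the baseline mean) plus one coefficient per other colour.
coefs <- coef(fit)
print(coefs)
```

With these invented numbers the `jersey_colourRed` coefficient is 3, which is just the Red mean (11) minus the Blue baseline mean (8). It is a gap between two groups, not wins gained per unit of x.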
2 · So what should the rookie do instead?
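For a categorical predictor, compare the groups: run a one-way ANOVA on mean wins per colour, or equivalently encode the colours as dummy variables in a multiple regression. Even then, a difference between colours would not prove that changing colour causes wins. With only 12 teams spread over several colours, each group is also very small.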
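The two options above can be sketched in R as follows. This is a minimal illustration on invented data (the `teams` data frame and all win counts are hypothetical), not an analysis of the rookie's teams.

```r
# Hypothetical data, invented for illustration only: 12 teams, 4 colours.
teams <- data.frame(
  jersey_colour = factor(rep(c("Red", "Blue", "White", "Green"), each = 3)),
  wins = c(10, 12, 11, 8, 9, 7, 9, 10, 8, 6, 7, 8)
)

# Step 1: look at the group means being compared.
group_means <- tapply(teams$wins, teams$jersey_colour, mean)
print(group_means)

# Option A: one-way ANOVA. Do mean wins differ across colours at all?
anova_fit <- aov(wins ~ jersey_colour, data = teams)
f_anova <- summary(anova_fit)[[1]][["F value"]][1]

# Option B: regression on dummy variables (lm() builds them from the factor).
lm_fit <- lm(wins ~ jersey_colour, data = teams)
f_lm <- unname(summary(lm_fit)$fstatistic["value"])

# Both routes test the same hypothesis, so the overall F statistics match.
print(c(anova = f_anova, dummy_regression = f_lm))
```

Both F statistics agree (8.75 on 3 and 8 degrees of freedom for these invented numbers), which is the sense in which one-way ANOVA and dummy-variable regression are the same comparison. Neither result says what would happen if a team changed its colour.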
Graded — you knew when to walk away.
Knowing when not to reach for SLR is as important as knowing how to run it. SLR needs a numeric X and a roughly straight-line relationship with Y; otherwise, pick a different tool.