The rookie predicted runs from BOTH total bases and home runs, then panicked at the result.
Rookie's submission
Model: runs ~ total_bases + home_runs
total_bases: +0.9 "makes sense"
home_runs: -1.8 "?! home runs HURT scoring?"
(the two predictors correlate at r = 0.95)
1 · Why did home runs get a nonsensical negative coefficient?
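Every home run already counts as four total bases, so the two predictors carry almost the same information. When two predictors are nearly identical, the model can't separate their effects: coefficients swing wildly, can even flip sign, and their standard errors balloon, even though the model's predictions stay fine.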
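The explanation above can be checked with a small simulation. This is only a sketch on synthetic data, not the rookie's baseball data: the helper names `fit_two_predictors` and `simulate`, the sample sizes, and the assumed true coefficients of +1 are all illustrative assumptions.

```python
import random
import statistics

def fit_two_predictors(x1, x2, y):
    """Ordinary least squares for y = a + b1*x1 + b2*x2 via centered sums."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (t - my) for u, t in zip(x1, y))
    s2y = sum((v - m2) * (t - my) for v, t in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # shrinks toward 0 as x1 and x2 become duplicates
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2

def simulate(r, trials=200, n=100, seed=0):
    """Refit on fresh synthetic samples; return (spread of b2, spread of a prediction)."""
    rng = random.Random(seed)
    b2_values, predictions = [], []
    for _ in range(trials):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # x2 shares a fraction r of x1, so corr(x1, x2) is about r.
        x2 = [r * u + (1 - r ** 2) ** 0.5 * rng.gauss(0, 1) for u in x1]
        # Assumed true model (hypothetical): both coefficients are +1.
        y = [u + v + rng.gauss(0, 1) for u, v in zip(x1, x2)]
        a, b1, b2 = fit_two_predictors(x1, x2, y)
        b2_values.append(b2)
        predictions.append(a + b1 * 1.0 + b2 * 1.0)  # prediction at x1 = x2 = 1
    return statistics.stdev(b2_values), statistics.stdev(predictions)

coef_spread_low, pred_spread_low = simulate(r=0.0)
coef_spread_high, pred_spread_high = simulate(r=0.95)
print(f"r=0.00: coefficient spread {coef_spread_low:.3f}, prediction spread {pred_spread_low:.3f}")
print(f"r=0.95: coefficient spread {coef_spread_high:.3f}, prediction spread {pred_spread_high:.3f}")
```

Under these assumptions the coefficient spread at r = 0.95 should come out roughly three times larger than at r = 0, while the prediction spread stays about the same size or smaller.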
2 · What should the rookie do?
Check for multicollinearity with the variance inflation factor (VIF). With two near-duplicate predictors, keep the more meaningful one and drop the other. The predictions are fine; it is the individual coefficients that can't be trusted.
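For a model with exactly two predictors, the VIF reduces to 1 / (1 - r²), so the check can be done by hand from the reported correlation. A minimal sketch, where the function name `vif_two_predictors` is just an illustrative choice:

```python
def vif_two_predictors(r):
    """VIF when the model has exactly two predictors with correlation r."""
    return 1.0 / (1.0 - r ** 2)

r = 0.95  # correlation between total_bases and home_runs, from the submission
vif = vif_two_predictors(r)
print(f"VIF = {vif:.2f}")                                # about 10.26
print(f"standard error inflation = {vif ** 0.5:.2f}x")   # about 3.20
```

A common rule of thumb treats a VIF above 5 or 10 as a warning sign, so r = 0.95 sits right at the edge of the usual alarm threshold. With more than two predictors the shortcut no longer applies, and each VIF comes from regressing that predictor on all the others.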
Graded — spotted the overlap.
Wild, sign-flipped coefficients with big standard errors are the classic fingerprint of multicollinearity.