Chapter 2 · In R

More predictors, same one line

Adding a predictor in R is literally a +. The matrix inversion you skipped by hand? R just does it.

lm() as a machine

Never used R? Set it up in 2 minutes →


Add a predictor with +

The formula y ~ x1 + x2 reads "y explained by x1 and x2". Everything else is identical to SLR.

R
# runs explained by home runs AND walks
mlb <- data.frame(
  hr    = c(245, 221, 198, 214, 177, 162),
  walks = c(520, 540, 505, 560, 480, 500),
  runs  = c(690, 696, 652, 706, 623, 632)
)
fit <- lm(runs ~ hr + walks, data = mlb)
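If you are curious about the matrix step that lm() takes care of, here is a minimal sketch that solves the normal equations (X'X)b = X'y directly and compares the result with lm(). The names X, y, beta_hand and comparison are made up for this illustration. Note that lm() itself works through a QR decomposition rather than inverting X'X, so this shows the textbook route, not R's internals.

```r
# Same data as the lesson's example.
mlb <- data.frame(
  hr    = c(245, 221, 198, 214, 177, 162),
  walks = c(520, 540, 505, 560, 480, 500),
  runs  = c(690, 696, 652, 706, 623, 632)
)

fit <- lm(runs ~ hr + walks, data = mlb)

# Design matrix: a column of 1s for the intercept, then hr and walks.
X <- model.matrix(~ hr + walks, data = mlb)
y <- mlb$runs

# Solve the normal equations (X'X) b = X'y for b.
beta_hand <- solve(crossprod(X), crossprod(X, y))

# Put the hand-computed estimates next to lm()'s.
comparison <- cbind(by_hand = drop(beta_hand), lm = coef(fit))
print(round(comparison, 3))
```

The two columns should agree to rounding, which is the point: the + in the formula only changes which columns end up in X.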

Read it with summary()

R console
> summary(fit)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  139.676     13.782   10.13  0.00205 **
hr             0.508      0.031   16.42  0.00049 ***
walks          0.819      0.032   25.26  0.00014 ***

Residual standard error: 1.64 on 3 degrees of freedom
Multiple R-squared:  0.9987,  Adjusted R-squared:  0.9979
F-statistic: 1165 on 2 and 3 DF,  p-value: 6.1e-05
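The printed summary is handy for reading, but the same numbers can be pulled out as values. The sketch below assumes the mlb data from this lesson; the variable names (s, coef_table, walks_est, adj_r2, f_stat, f_pvalue) are illustrative only.

```r
mlb <- data.frame(
  hr    = c(245, 221, 198, 214, 177, 162),
  walks = c(520, 540, 505, 560, 480, 500),
  runs  = c(690, 696, 652, 706, 623, 632)
)
fit <- lm(runs ~ hr + walks, data = mlb)
s   <- summary(fit)

# Coefficient table as a matrix: Estimate, Std. Error, t value, Pr(>|t|).
coef_table <- coef(s)
walks_est  <- coef_table["walks", "Estimate"]

# Adjusted R-squared is stored directly on the summary object.
adj_r2 <- s$adj.r.squared

# fstatistic is a named vector: value, numdf, dendf.
# The overall p-value is not stored, so compute it from the F distribution.
f_stat   <- s$fstatistic
f_pvalue <- pf(f_stat["value"], f_stat["numdf"], f_stat["dendf"],
               lower.tail = FALSE)

print(coef_table)
print(c(adj_r2 = adj_r2, f_pvalue = unname(f_pvalue)))
```

Working with the summary object this way avoids retyping numbers from the console when you need them in a later calculation.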

Everything from the worked example is in this one block:

In the output        Value    Meaning
hr                   0.508    +0.51 runs per additional HR, holding walks fixed
walks                0.819    +0.82 runs per additional walk, holding HR fixed
Adjusted R-squared   0.9979   fit, penalised for 2 predictors
F-statistic          1165     whole model is significant (p-value 6.1e-05)
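"Holding walks fixed" can be checked directly with predict(). The sketch below invents two hypothetical teams (the values 200, 201 and 520 are made up for illustration) that differ by exactly one home run and have the same number of walks; the gap between their predicted runs should equal the hr coefficient.

```r
mlb <- data.frame(
  hr    = c(245, 221, 198, 214, 177, 162),
  walks = c(520, 540, 505, 560, 480, 500),
  runs  = c(690, 696, 652, 706, 623, 632)
)
fit <- lm(runs ~ hr + walks, data = mlb)

# Two hypothetical teams: identical walks, one extra home run.
teams <- data.frame(hr = c(200, 201), walks = c(520, 520))
pred  <- predict(fit, newdata = teams)

# Difference in predicted runs between the two teams.
one_hr_effect <- unname(diff(pred))

print(one_hr_effect)
print(coef(fit)["hr"])  # same number as the line above
```

Changing the shared walks value to anything else leaves the difference unchanged, because the model has no interaction term.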
Read the adjusted one
With more than one predictor, quote Adjusted R-squared when comparing models: plain R-squared never decreases when a predictor is added, while the adjusted version penalises each extra one.
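To see the two numbers side by side, here is a rough sketch that fits a one-predictor model (runs ~ hr, used purely as a comparison) and the two-predictor model from this lesson, then recomputes adjusted R-squared by hand from 1 - (1 - R²)(n - 1)/(n - p - 1). The names fit_hr, fit_both, comparison and adj_by_hand are illustrative.

```r
mlb <- data.frame(
  hr    = c(245, 221, 198, 214, 177, 162),
  walks = c(520, 540, 505, 560, 480, 500),
  runs  = c(690, 696, 652, 706, 623, 632)
)

fit_hr   <- lm(runs ~ hr, data = mlb)          # one predictor
fit_both <- lm(runs ~ hr + walks, data = mlb)  # two predictors

comparison <- data.frame(
  model  = c("runs ~ hr", "runs ~ hr + walks"),
  r2     = c(summary(fit_hr)$r.squared, summary(fit_both)$r.squared),
  adj_r2 = c(summary(fit_hr)$adj.r.squared, summary(fit_both)$adj.r.squared)
)
print(comparison)

# Adjusted R-squared by hand for the two-predictor model.
n <- nrow(mlb)  # observations
p <- 2          # predictors, not counting the intercept
adj_by_hand <- 1 - (1 - summary(fit_both)$r.squared) * (n - 1) / (n - p - 1)
print(adj_by_hand)
```

The r2 column can only go up as predictors are added; the adj_r2 column is the one that can tell you whether the extra predictor earned its place.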

The MLR cheat-sheet

Test yourself · R

Four quick checks

Based on the output above. Type-and-check.

a. Write the formula to model runs from hr and walks. (1 mark)

Add predictors with +.

formula
lm(runs ~ hr + walks, data = mlb)
b. From the output, what is the adjusted R²? (1 mark)
adj R²
Adjusted R-squared: 0.9979
c. Which function checks whether predictors are too alike (multicollinearity)? (1 mark)

Three letters — "variance inflation factor".

function
vif(fit) # from the car package
d. From the output, what is the coefficient on walks? (1 mark)

The Estimate on the walks row.

β̂ walks
walks 0.819
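The variance inflation factor from question c can also be sketched in base R, which helps show what the number means. For each predictor, regress it on the other predictors and take 1 / (1 - R²). The names r2_hr, r2_walks, vif_hr and vif_walks are illustrative; if the car package is installed, vif(fit) should report the same values for this model.

```r
mlb <- data.frame(
  hr    = c(245, 221, 198, 214, 177, 162),
  walks = c(520, 540, 505, 560, 480, 500),
  runs  = c(690, 696, 652, 706, 623, 632)
)

# VIF for hr: how well do the other predictors (here just walks) explain hr?
r2_hr  <- summary(lm(hr ~ walks, data = mlb))$r.squared
vif_hr <- 1 / (1 - r2_hr)

# VIF for walks: the same idea with the roles swapped.
r2_walks  <- summary(lm(walks ~ hr, data = mlb))$r.squared
vif_walks <- 1 / (1 - r2_walks)

print(c(hr = vif_hr, walks = vif_walks))
```

With only two predictors the two VIFs are always equal, because both depend on the same correlation between hr and walks. A value of 1 means no overlap; larger values mean the predictors carry more of the same information.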

lm() scales up.

One predictor or ten, it's the same call with more +s. The hard part is interpretation — which you've now done both ways.
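When the list of predictors gets long, typing every + becomes tedious. Two base R shortcuts are sketched below on the same mlb data: the dot in runs ~ . means "every other column in data", and reformulate() builds a formula from a character vector. The names fit_plus, fit_dot, predictors, f and fit_built are illustrative.

```r
mlb <- data.frame(
  hr    = c(245, 221, 198, 214, 177, 162),
  walks = c(520, 540, 505, 560, 480, 500),
  runs  = c(690, 696, 652, 706, 623, 632)
)

# Spelled out with +.
fit_plus <- lm(runs ~ hr + walks, data = mlb)

# "." stands for all columns of mlb except the response.
fit_dot <- lm(runs ~ ., data = mlb)

# Build the formula from a vector of predictor names.
predictors <- c("hr", "walks")
f          <- reformulate(predictors, response = "runs")
fit_built  <- lm(f, data = mlb)

print(f)
print(rbind(plus = coef(fit_plus), dot = coef(fit_dot), built = coef(fit_built)))
```

All three rows should match. The dot is convenient but picks up every column, so it is worth checking that the data frame holds only the variables you mean to use.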
