Chapter 1 · Test Yourself

Your turn

Same eight moves you just learned — new question. Type each answer and check it, or reveal the full working whenever you're stuck.


The scenario

When a pitcher throws four balls outside the strike zone, the batter gets a free base — a walk (BB). Walks put runners on for free, so the hunch is: teams whose pitchers walk more batters probably allow more runs. Let x = walks allowed and y = runs allowed, across seven MLB teams.

Team        Walks (x)   Runs allowed (y)
Braves      472         644
Mets        521         705
Phillies    498         672
Marlins     556         748
Nationals   541         731
Pirates     509         689
Cardinals   483         658

n = 7. Answers are checked with a little rounding tolerance, so don't sweat the last decimal.

(a) Calculate the sample means x̄ and ȳ. (2 marks)

Sum each column, divide by 7.

x̄ = (472+521+498+556+541+509+483) / 7 = 3580 / 7 = 511.43
ȳ = (644+705+672+748+731+689+658) / 7 = 4847 / 7 = 692.43

The fitted line always passes through (x̄, ȳ) = (511.43, 692.43), the average team in this sample.
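If you'd like to double-check the column sums by machine, here is a minimal Python sketch. Python and the names `walks`, `runs` and `mean` are our own choices for illustration, not something the chapter prescribes.

```python
# Data copied from the scenario table: walks allowed (x) and runs allowed (y).
walks = [472, 521, 498, 556, 541, 509, 483]
runs = [644, 705, 672, 748, 731, 689, 658]


def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


x_bar = mean(walks)
y_bar = mean(runs)
print(f"x_bar = {x_bar:.2f}, y_bar = {y_bar:.2f}")
```

Rounded to two decimals, both means should agree with the hand calculation.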

(b) Calculate Sxx = Σ(xᵢ−x̄)² and Sxy = Σ(xᵢ−x̄)(yᵢ−ȳ). (4 marks)

Build a deviation table, then sum the last two columns.

Team        xᵢ−x̄     yᵢ−ȳ     (xᵢ−x̄)²   (xᵢ−x̄)(yᵢ−ȳ)
Braves      −39.43   −48.43   1554.7    1909.6
Mets        9.57     12.57    91.6      120.3
Phillies    −13.43   −20.43   180.4     274.4
Marlins     44.57    55.57    1986.5    2476.7
Nationals   29.57    38.57    874.4     1140.0
Pirates     −2.43    −3.43    5.9       8.3
Cardinals   −28.43   −34.43   808.3     978.6
Σ           ≈ 0      ≈ 0      5501.8    6907.9

Sxy is large and positive, so teams that allow more walks also tend to allow more runs. (Sxx is a sum of squares, so it can never be negative.)
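The deviation table can also be built in a few lines of code. This is only a sketch in Python with variable names of our own choosing (`dx`, `dy`, `s_xx`, `s_xy`). Because it keeps full precision instead of rounding each deviation to two decimals, its totals can differ from a hand-built table by up to about one unit in the last digits.

```python
walks = [472, 521, 498, 556, 541, 509, 483]
runs = [644, 705, 672, 748, 731, 689, 658]

n = len(walks)
x_bar = sum(walks) / n
y_bar = sum(runs) / n

# One deviation from the mean per team, for each variable.
dx = [x - x_bar for x in walks]
dy = [y - y_bar for y in runs]

# Sxx sums the squared x-deviations; Sxy sums the cross-products.
s_xx = sum(d * d for d in dx)
s_xy = sum(a * b for a, b in zip(dx, dy))
print(f"Sxx = {s_xx:.1f}, Sxy = {s_xy:.1f}")
```

The deviations in each column should sum to zero (up to floating-point noise), which is a handy check on any hand-built table as well.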

(c) Estimate the slope β̂₁ and intercept β̂₀. (3 marks)

β̂₁ = Sxy / Sxx, then β̂₀ = ȳ − β̂₁ × x̄.

β̂₁ = Sxy / Sxx = 6907.9 / 5501.8 = 1.256
β̂₀ = ȳ − β̂₁ × x̄ = 692.43 − 1.256 × 511.43 = 50.07
ŷ = 50.07 + 1.256 × x

Slope: each extra walk allowed is associated with ≈ 1.256 more runs allowed, on average.
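The two formulas fit naturally into one small function. The sketch below is in Python, and `fit_line` is a hypothetical helper name of ours. One thing worth knowing: the intercept is sensitive to rounding, because the slope gets multiplied by x̄ ≈ 511. A full-precision intercept can therefore differ from a hand calculation based on a three-decimal slope by roughly a tenth of a run.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx                  # slope = Sxy / Sxx
    intercept = y_bar - slope * x_bar    # forces the line through (x_bar, y_bar)
    return intercept, slope


walks = [472, 521, 498, 556, 541, 509, 483]
runs = [644, 705, 672, 748, 731, 689, 658]

b0, b1 = fit_line(walks, runs)
print(f"y_hat = {b0:.2f} + {b1:.3f} * x")
```

Predictions inside the data range barely move with this rounding difference, since a slightly different intercept is offset by the slightly different slope.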

(d) Find the fitted value and residual for the Marlins (x = 556). (2 marks)

Plug x = 556 into the line. Residual = actual − fitted. (Enter the residual.)

ŷ = 50.07 + 1.256 × 556 = 748.41
e = y − ŷ = 748 − 748.41 = −0.41

A tiny residual — the model fits the Marlins almost perfectly.
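As a quick illustration, the same two steps in Python. The constants `B0` and `B1` are the rounded coefficients from the fitted line above, and the function names `fitted` and `residual` are ours.

```python
# Rounded coefficients taken from the fitted line y_hat = 50.07 + 1.256 * x.
B0, B1 = 50.07, 1.256


def fitted(x, b0=B0, b1=B1):
    """Value the line predicts at x."""
    return b0 + b1 * x


def residual(x, y, b0=B0, b1=B1):
    """Actual minus fitted: positive means the line under-predicts."""
    return y - fitted(x, b0, b1)


y_hat = fitted(556)          # Marlins: 556 walks allowed
e = residual(556, 748)       # Marlins: 748 runs allowed
print(f"fitted = {y_hat:.2f}, residual = {e:.2f}")
```

The sign convention matters: a negative residual means the team allowed fewer runs than the line predicts.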

(e) Calculate SSE and s². State the degrees of freedom. (3 marks)

Square all 7 residuals and sum for SSE; then s² = SSE / (n−2).

Team        yᵢ    ŷᵢ       eᵢ      eᵢ²
Braves      644   642.90   +1.10   1.21
Mets        705   704.45   +0.55   0.30
Phillies    672   675.56   −3.56   12.67
Marlins     748   748.41   −0.41   0.17
Nationals   731   729.57   +1.43   2.04
Pirates     689   689.37   −0.37   0.14
Cardinals   658   656.72   +1.28   1.64
Σ                          ≈ 0     18.17

df = n − 2 = 7 − 2 = 5
s² = SSE / (n−2) = 18.17 / 5 = 3.63 (s = 1.91 runs)
Exam trap: always divide SSE by n−2 for SLR, never n or n−1, because two parameters (β̂₀ and β̂₁) have been estimated.
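Squaring seven residuals by hand is where slips creep in, so a machine check is useful here. The Python sketch below refits the line at full precision and then computes SSE, the degrees of freedom and s². All variable names are our own. Since nothing is rounded along the way, the result can differ a little from a table built on rounded fitted values.

```python
walks = [472, 521, 498, 556, 541, 509, 483]
runs = [644, 705, 672, 748, 731, 689, 658]

n = len(walks)
x_bar = sum(walks) / n
y_bar = sum(runs) / n
s_xx = sum((x - x_bar) ** 2 for x in walks)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(walks, runs))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual = actual minus fitted, one per team.
residuals = [y - (b0 + b1 * x) for x, y in zip(walks, runs)]

sse = sum(e ** 2 for e in residuals)
df = n - 2               # two estimated parameters: intercept and slope
s2 = sse / df            # estimate of the error variance
s = s2 ** 0.5            # residual standard deviation, in runs
print(f"SSE = {sse:.2f}, df = {df}, s2 = {s2:.2f}, s = {s:.2f}")
```

With full-precision coefficients the residuals sum to zero up to floating-point noise, whereas residuals from rounded coefficients only sum to approximately zero.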
(f) Calculate R² and interpret it. (3 marks)

TSS = Σ(yᵢ−ȳ)², then R² = 1 − SSE / TSS.

TSS = Σ(yᵢ−ȳ)² = 8693.7
R² = 1 − SSE / TSS ≈ 0.998, with SSE carried over from the previous part

Interpretation: walks allowed explain 99.8% of the variation in runs allowed across these 7 teams — an excellent fit.
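To see R² come out of the same ingredients, here is a short Python sketch with names of our own choosing. It also checks a standard identity for simple linear regression: R² equals Sxy² / (Sxx × TSS), the squared sample correlation between x and y.

```python
walks = [472, 521, 498, 556, 541, 509, 483]
runs = [644, 705, 672, 748, 731, 689, 658]

n = len(walks)
x_bar = sum(walks) / n
y_bar = sum(runs) / n
s_xx = sum((x - x_bar) ** 2 for x in walks)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(walks, runs))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Total variation in y, and the part the line leaves unexplained.
tss = sum((y - y_bar) ** 2 for y in runs)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(walks, runs))

r2 = 1 - sse / tss
r2_from_correlation = s_xy ** 2 / (s_xx * tss)   # same quantity, different route
print(f"TSS = {tss:.1f}, R^2 = {r2:.3f}")
```

Keep in mind that with only seven teams, a very high R² describes this sample well but says little about how the relationship holds across the whole league.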

(g) A new team, the Tigers, allowed 530 walks. Predict their runs allowed. (1 mark)

Substitute x = 530. Is this interpolation or extrapolation?

ŷ = 50.07 + 1.256 × 530 = 715.75 ≈ 716 runs

x = 530 lies inside the observed range of walks (472–556), so this is interpolation, which is generally more trustworthy than extrapolating beyond the data.
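The interpolation check is easy to automate. In this Python sketch, `predict` is a hypothetical helper of ours that returns the prediction together with a label saying whether x falls inside the observed range. The coefficients are the rounded ones from the fitted line.

```python
# Rounded coefficients taken from the fitted line y_hat = 50.07 + 1.256 * x.
B0, B1 = 50.07, 1.256

walks = [472, 521, 498, 556, 541, 509, 483]


def predict(x, observed_xs, b0=B0, b1=B1):
    """Return (prediction, kind); kind flags whether x is inside the data range."""
    inside = min(observed_xs) <= x <= max(observed_xs)
    kind = "interpolation" if inside else "extrapolation"
    return b0 + b1 * x, kind


y_hat, kind = predict(530, walks)
print(f"{y_hat:.2f} runs ({kind})")
```

Calling `predict(600, walks)` would come back labelled as extrapolation, a reminder to treat that number with more caution.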


Home run — 7 / 7!

You just ran a full simple linear regression by hand, start to finish. That's the whole chapter.
