AI Basics with AK

Season 03 - Introduction to Statistics

Arun Koundinya Parasa

Episode 14 - Degrees of Freedom

Recap: Where We’ve Seen df Already

Episode	Test	Degrees of Freedom
09	T-Distribution	df = n − 1
11	One-sample t-test	df = n − 1
12	Independent t-test	df = n₁ + n₂ − 2
12	Paired t-test	df = n − 1
13	Chi-Square GoF	df = k − 1
13	Chi-Square Independence	df = (r−1)(c−1)

Degrees of Freedom - Intuition

Imagine 5 friends at a dinner table with 5 fixed seats.

Question: How many friends can choose their seat freely?

Friend 1 sits anywhere → free choice
Friend 2 sits anywhere remaining → free choice
Friend 3 sits anywhere remaining → free choice
Friend 4 sits anywhere remaining → free choice
Friend 5 has only one seat left → no choice

4 friends had freedom. 1 did not.

When you have n items and one constraint (the total is fixed), you have n − 1 free choices.

This is the core of df = n − 1.

A More Statistical Version

You have 5 numbers. Their mean is 10.

\[x_1, x_2, x_3, x_4, x_5 \quad \text{with} \quad \bar{X} = 10\]

How many values can you choose freely?

Say you pick: x₁ = 8, x₂ = 12, x₃ = 9, x₄ = 11

Now the sum must equal 50 (since mean = 10, n = 5).

\[x_5 = 50 - (8+12+9+11) = 50 - 40 = \mathbf{10}\]

x₅ is locked in. You had no choice.

4 values were free. 1 was constrained by the mean. df = n − 1 = 4
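
The same arithmetic can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original episode, and the helper name `last_value` is hypothetical.

```python
def last_value(free_values, mean, n):
    """Return the one value forced by a fixed mean once n - 1 values are chosen."""
    if len(free_values) != n - 1:
        raise ValueError("exactly n - 1 values can be chosen freely")
    required_sum = mean * n              # the fixed mean fixes the total
    return required_sum - sum(free_values)


free = [8, 12, 9, 11]                    # four free choices
x5 = last_value(free, mean=10, n=5)      # the fifth value is forced
print(x5)                                # 10
print(len(free))                         # 4, i.e. df = n - 1
```

Changing any of the four free values changes x₅, but never gives you a fifth choice.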

Why Estimating the Mean Costs One df
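
One way to see the cost, offered here as an assumption about what this section is meant to show: once the sample mean is computed from the data itself, the deviations from it must sum to zero, so the last deviation is fixed by the other n − 1. The sketch below reuses the five values from the previous example. `statistics.variance` is the standard-library sample variance, which divides by n − 1.

```python
import math
import statistics

data = [8, 12, 9, 11, 10]
n = len(data)
x_bar = sum(data) / n                      # estimated from the data itself

deviations = [x - x_bar for x in data]
print(sum(deviations))                     # 0.0, the constraint created by x_bar

# The last deviation is determined by the first n - 1.
forced_last = -sum(deviations[:-1])
print(forced_last == deviations[-1])       # True

# Only n - 1 deviations are free, so the sample variance divides by n - 1.
manual_variance = sum(d * d for d in deviations) / (n - 1)
print(math.isclose(manual_variance, statistics.variance(data)))  # True
```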

df in Each Test We’ve Covered

Every df formula reflects the same logic — how many free pieces of information remain after estimation.
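
As a rough sketch, the formulas from the recap table can be written as small Python functions. The function names and the sample sizes in the usage lines are hypothetical, chosen only for illustration. The independent t-test uses the n₁ + n₂ − 2 form from the recap.

```python
def df_one_sample_t(n):
    return n - 1                  # one mean estimated

def df_independent_t(n1, n2):
    return n1 + n2 - 2            # two means estimated, one per group

def df_paired_t(n_pairs):
    return n_pairs - 1            # one mean of the paired differences

def df_chi2_gof(k):
    return k - 1                  # k categories, total count fixed

def df_chi2_independence(r, c):
    return (r - 1) * (c - 1)      # r x c table with fixed margins


print(df_one_sample_t(25))         # 24
print(df_independent_t(12, 15))    # 25
print(df_paired_t(20))             # 19
print(df_chi2_gof(6))              # 5
print(df_chi2_independence(3, 4))  # 6
```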

The General Rule

df = (number of observations) − (number of parameters estimated from the data)

Every independent quantity you estimate from the data costs one degree of freedom.

What You Estimate	df Cost
One mean (x̄)	−1
Two means (x̄₁ and x̄₂)	−2
One proportion	−1
k category frequencies (with fixed total)	−1 (not −k)
Row totals + column totals in a table (with fixed grand total)	−1 − (r−1) − (c−1) = −(r+c−1)
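
The contingency-table case can be checked by counting. An r × c table has rc cells. The fixed grand total costs 1, the row totals cost r − 1 more, and the column totals cost c − 1 more, leaving rc − 1 − (r−1) − (c−1) = (r−1)(c−1) free cells. The sketch below fills a 2 × 3 table from its free cells and its margins. The helper `complete_table` and the counts are hypothetical.

```python
def complete_table(free_cells, row_totals, col_totals):
    """Fill an r x c table from its (r-1) x (c-1) free block and the margins."""
    r, c = len(row_totals), len(col_totals)
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row and column totals must share one grand total")
    if len(free_cells) != r - 1 or any(len(row) != c - 1 for row in free_cells):
        raise ValueError("free block must be (r-1) x (c-1)")
    table = [list(row) + [None] for row in free_cells]
    # The last cell of each free row is forced by its row total.
    for i, row in enumerate(table):
        row[-1] = row_totals[i] - sum(row[:-1])
    # The whole last row is forced by the column totals.
    last_row = [col_totals[j] - sum(row[j] for row in table) for j in range(c)]
    table.append(last_row)
    return table


free = [[10, 20]]                       # (2-1) * (3-1) = 2 free cells
table = complete_table(free, row_totals=[45, 55], col_totals=[30, 40, 30])
print(table)                            # [[10, 20, 15], [20, 20, 15]]
```

Only two of the six cells were chosen; the margins determined the other four.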

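The more you estimate, the less freedom remains. Fewer df means more uncertainty: for the t-distribution, that shows up as heavier tails and larger critical values, so a result must be more extreme to reach significance.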
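
To make the tail behaviour concrete, here is a small sketch that evaluates the standard Student's t density far from the centre for a few df values and compares it with the standard normal. It uses only the `math` module. The evaluation point x = 3 and the df values are arbitrary choices for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


x = 3.0
tail_densities = {df: t_pdf(x, df) for df in (2, 5, 30)}
for df, density in tail_densities.items():
    print(f"df={df:>2}: density at x=3 is {density:.4f}")
print(f"normal: density at x=3 is {normal_pdf(x):.4f}")
```

The density at x = 3 shrinks as df grows and approaches the normal value, which is the sense in which low df means heavier tails.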

A Common Misconception

What People Think df Means

❌ “It’s just a number you plug into a formula.”

❌ “Bigger df is always better.”

❌ “df only matters for t-tests.”

❌ “It’s the sample size minus one — always.”

These are all incomplete or wrong.

What df Actually Means

✅ It measures how much independent information your data carries after accounting for what you’ve already used to estimate.

✅ It sets the critical value of your test, and so how strong the evidence must be before you reject the null hypothesis.

✅ It shapes which distribution you compare against.

✅ It changes for every test depending on structure.

df is a measure of remaining statistical information.