AI Basics with AK

Season 03 - Introduction to Statistics

Arun Koundinya Parasa

Episode 14 - Degrees of Freedom

Recap: Where We’ve Seen df Already

Episode   Test                       Degrees of Freedom
09        t-distribution             df = n − 1
11        One-sample t-test          df = n − 1
12        Independent t-test         df = n₁ + n₂ − 2
12        Paired t-test              df = n − 1
13        Chi-Square GoF             df = k − 1
13        Chi-Square Independence    df = (r−1)(c−1)

Degrees of Freedom - Intuition

Imagine 5 friends at a dinner table with 5 fixed seats.

Question: How many friends can choose their seat freely?

  • Friend 1 sits anywhere → free choice
  • Friend 2 sits anywhere remaining → free choice
  • Friend 3 sits anywhere remaining → free choice
  • Friend 4 sits anywhere remaining → free choice
  • Friend 5 has only one seat left → no choice

4 friends had freedom. 1 did not.

When you have n items and one constraint (the total is fixed), you have n − 1 free choices.

This is the core of df = n − 1.

A More Statistical Version

You have 5 numbers. Their mean is 10.

\[x_1, x_2, x_3, x_4, x_5 \quad \text{with} \quad \bar{x} = 10\]

How many values can you choose freely?

Say you pick: x₁ = 8, x₂ = 12, x₃ = 9, x₄ = 11

Now the sum must equal 50 (since mean = 10, n = 5).

\[x_5 = 50 - (8+12+9+11) = 50 - 40 = \mathbf{10}\]

x₅ is locked in. You had no choice.

4 values were free. 1 was constrained by the mean. df = n − 1 = 4
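The same arithmetic can be sketched in a few lines of Python. The function name `last_value` is hypothetical, chosen only for this illustration; it returns the one value that a fixed mean forces once n − 1 values have been picked.

```python
def last_value(free_values, mean, n):
    """Return the single value forced by a fixed mean once n - 1 values are chosen."""
    if len(free_values) != n - 1:
        raise ValueError("exactly n - 1 values can be chosen freely")
    # The total is pinned at mean * n, so the last value is whatever is left over.
    return mean * n - sum(free_values)


free = [8, 12, 9, 11]          # the four freely chosen values from the example
x5 = last_value(free, mean=10, n=5)
print(x5)                      # 10
```

Changing any of the four free values changes x₅, but x₅ itself can never be chosen independently.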

Why Estimating the Mean Costs One df
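One common way to see this cost: deviations from the sample mean always sum to zero, so once n − 1 deviations are known the last one is fixed. This is why the sample variance divides by n − 1 rather than n. A minimal sketch, reusing the five numbers from the previous example as illustrative data:

```python
import statistics

data = [8, 12, 9, 11, 10]
mean = statistics.mean(data)

# Deviations from the sample mean are tied together by one constraint.
deviations = [x - mean for x in data]
dev_sum = sum(deviations)                      # always 0 for deviations from the sample mean

sum_sq = sum(d * d for d in deviations)
var_n_minus_1 = sum_sq / (len(data) - 1)       # divides by df = n - 1
var_n = sum_sq / len(data)                     # divides by n, systematically too small

print(dev_sum, var_n_minus_1, var_n)
print(statistics.variance(data) == var_n_minus_1)   # statistics.variance uses n - 1
```

Only four of the five deviations carry independent information, so the squared deviations are averaged over 4, not 5.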

df in Each Test We’ve Covered

Every df formula reflects the same logic — how many free pieces of information remain after estimation.
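The formulas from the recap can be collected as small helper functions. This is only a sketch: the function names are hypothetical, and the independent t-test uses the pooled form n₁ + n₂ − 2 shown in the recap.

```python
def df_one_sample_t(n):
    """n observations, one estimated mean."""
    return n - 1


def df_independent_t(n1, n2):
    """Two groups, two estimated means (pooled form from the recap)."""
    return n1 + n2 - 2


def df_paired_t(n_pairs):
    """n paired differences, one estimated mean difference."""
    return n_pairs - 1


def df_chi2_gof(k):
    """k categories, one constraint: the counts must add up to the fixed total."""
    return k - 1


def df_chi2_independence(r, c):
    """r x c table with row and column totals fixed."""
    return (r - 1) * (c - 1)


print(df_one_sample_t(25))         # 24
print(df_independent_t(10, 12))    # 20
print(df_chi2_independence(3, 4))  # 6
```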

The General Rule

df = (number of observations) − (number of parameters estimated from the data)

Every time you estimate something — you spend one degree of freedom.

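What You Estimate                                          df Cost
One mean (x̄)                                               −1
Two means (x̄₁ and x̄₂)                                      −2
One proportion                                             −1
k category frequencies (with fixed total)                  −1 (not −k)
Row and column margins in an r × c table (fixed total)     −1 − (r−1) − (c−1) = −(r+c−1)

Check for the table case: r·c cells − (r+c−1) = (r−1)(c−1), matching the Chi-Square Independence formula.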
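For the contingency-table case, the count of free cells can be checked directly: once the row and column totals are fixed, filling in an (r−1) × (c−1) block determines every other cell. A minimal sketch with made-up counts; `complete_table` is a hypothetical helper written for this illustration.

```python
def complete_table(free_cells, row_totals, col_totals):
    """Rebuild a full r x c table from its (r-1) x (c-1) top-left block and the margins."""
    r, c = len(row_totals), len(col_totals)
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row and column totals must share the same grand total")
    if len(free_cells) != r - 1 or any(len(row) != c - 1 for row in free_cells):
        raise ValueError("expected an (r-1) x (c-1) block of free cells")
    # The last column of each free row is forced by that row's total.
    table = [row + [row_totals[i] - sum(row)] for i, row in enumerate(free_cells)]
    # The whole last row is forced by the column totals.
    table.append([col_totals[j] - sum(table[i][j] for i in range(r - 1))
                  for j in range(c)])
    return table


# Illustrative 2 x 3 table: (2-1) * (3-1) = 2 cells are free, the other 4 are forced.
table = complete_table([[6, 14]], row_totals=[30, 20], col_totals=[10, 25, 15])
print(table)   # [[6, 14, 10], [4, 11, 5]]
```

The helper does not check that the forced cells come out non-negative; it only shows that they are determined.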

The more you estimate, the less freedom remains. For t-based tests, less freedom = more uncertainty = heavier tails = a larger critical value to clear.
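The heavier tails at low df can be seen with a small simulation. The sketch below draws samples from a standard normal (true mean 0), computes the one-sample t statistic, and counts how often |t| exceeds 1.96, the familiar normal cutoff. The function name, trial count, and seed are arbitrary choices for this illustration.

```python
import random


def tail_rate(n, cutoff=1.96, trials=20000, seed=0):
    """Fraction of simulated one-sample t statistics with |t| > cutoff (true mean = 0)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        mean = sum(sample) / n
        # Sample standard deviation with df = n - 1 in the denominator.
        sd = (sum((x - mean) ** 2 for x in sample) / (n - 1)) ** 0.5
        t = mean / (sd / n ** 0.5)
        if abs(t) > cutoff:
            hits += 1
    return hits / trials


small_df = tail_rate(3)    # df = 2
large_df = tail_rate(30)   # df = 29
print(small_df, large_df)
```

With df = 2 the statistic lands beyond 1.96 far more often than the 5% a normal curve would suggest, while with df = 29 the rate is close to 5%. That gap is why low-df tests use a larger critical value.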

A Common Misconception

What People Think df Means

“It’s just a number you plug into a formula.”

“Bigger df is always better.”

“df only matters for t-tests.”

“It’s the sample size minus one — always.”

These are all incomplete or wrong.

What df Actually Means

✅ It measures how much independent information your data carries after accounting for what you’ve already used to estimate.

✅ It controls how conservative your test is.

✅ It shapes which distribution you compare against.

✅ It changes for every test depending on structure.

df is a measure of remaining statistical information.

Thank You 🌊