AI Basics with AK

Season 03 - Introduction to Statistics

Arun Koundinya Parasa

Episode 11 - Type I & Type II Errors and One-Sample Tests

Recap: Episode 10 — Hypothesis Testing

| Concept | What It Means |
| --- | --- |
| H₀ (Null Hypothesis) | The default claim — nothing changed |
| H₁ (Alternative) | The interesting claim — something did change |
| p-value | Probability of seeing this data if H₀ were true |
| α (significance level) | Our threshold — usually 0.05 |
| Decision Rule | p < α → Reject H₀ |

Last episode we learned how to test. This episode we ask: what if we got it wrong?

Every Decision Has a Risk

The Problem

When we reject or fail to reject H₀ —

we are making a decision under uncertainty.

We never see the full truth.

We only see a sample.

And samples can mislead us.

So two kinds of mistakes are always possible.

The Two Mistakes

Type I Error Rejecting H₀ when it is actually true.

A false alarm.

Type II Error Failing to reject H₀ when it is actually false.

A missed signal.

Both are costly — just in different ways.

The Error Matrix

| | H₀ is True | H₀ is False |
| --- | --- | --- |
| Reject H₀ | ❌ Type I Error (α) | ✅ Correct — True Positive |
| Fail to Reject H₀ | ✅ Correct — True Negative | ❌ Type II Error (β) |
  • α = P(Type I Error) = significance level you chose
  • β = P(Type II Error) = depends on effect size, sample size, α

You control α directly. β is influenced — not directly set.
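The claim that α *is* the Type I error rate can be checked directly. A minimal simulation sketch (assuming NumPy and SciPy are available; the seed and sample sizes are arbitrary choices): we generate many samples where H₀ is genuinely true and count how often the test still rejects.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
n_experiments = 10_000
false_alarms = 0

for _ in range(n_experiments):
    # H0 is true here: every sample really does come from mean = 0
    sample = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p_value = stats.ttest_1samp(sample, popmean=0.0)
    if p_value < alpha:      # rejecting H0 here is a false alarm
        false_alarms += 1

type_1_rate = false_alarms / n_experiments
print(f"Observed Type I error rate: {type_1_rate:.3f}")  # hovers around alpha
```

The observed rejection rate lands near 0.05, which is exactly what choosing α = 0.05 promises.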

Real World Cost of Each Error

Type I Error — False Alarm

Medical trial: Drug actually does nothing — but your test says it works.

→ Patients receive an ineffective drug

Quality control: Machine is fine — but test flags it as faulty.

→ Production shuts down unnecessarily

Cost: Acting on something that isn’t real.

Type II Error — Missed Signal

Medical trial: Drug actually works — but your test misses it.

→ A beneficial treatment is never approved

Quality control: Machine is broken — but test says it’s fine.

→ Defective products reach customers

Cost: Failing to act when you should have.

Visualizing the Two Errors

Red = Type I Error (false alarm). Blue = Type II Error (missed signal). The critical value separates the two decision zones.

The Trade-Off Between α and β

Lowering α (e.g., 0.01)

  • Fewer false alarms ✅
  • Harder to reject H₀
  • More missed signals ❌
  • β increases

Use when: False alarms are very costly.

Example: Approving a dangerous drug.

Raising α (e.g., 0.10)

  • More false alarms ❌
  • Easier to reject H₀
  • Fewer missed signals ✅
  • β decreases

Use when: Missing a real effect is very costly.

Example: Cancer screening.
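The trade-off above can be made concrete with a small simulation sketch (NumPy/SciPy assumed; the effect size 0.5 and sample size 30 are illustrative choices, not from the episode). We generate samples where H₁ is true, then see how often each choice of α misses the real effect:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_mean, n, trials = 0.5, 30, 5_000  # H1 is true: real mean is 0.5, not 0

# One p-value per simulated experiment, testing H0: mean = 0
p_values = np.array([
    stats.ttest_1samp(rng.normal(true_mean, 1.0, size=n), popmean=0.0).pvalue
    for _ in range(trials)
])

# beta = fraction of experiments where we fail to reject (missed signals)
betas = {alpha: float(np.mean(p_values >= alpha)) for alpha in (0.01, 0.05, 0.10)}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f} -> beta ~ {beta:.3f}")
```

The stricter the α, the larger the β: tightening the false-alarm rate directly inflates the missed-signal rate, which is the trade-off in numbers.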

Now We Test — One-Sample Tests

A one-sample test answers a simple question:

“Is the mean of my sample significantly different from a known or claimed value?”

Two tools depending on what you know:

| Situation | Test to Use |
| --- | --- |
| σ (population std dev) is known | One-sample z-test |
| σ is unknown (use sample std dev s) | One-sample t-test |

In practice, σ is almost never known.

The t-test is what you’ll use most of the time.
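Here is what the common case looks like in practice, as a sketch with made-up numbers (the coffee-machine scenario and measurements are invented for illustration): σ is unknown, so we use `scipy.stats.ttest_1samp`.

```python
import numpy as np
from scipy import stats

# Hypothetical scenario: a machine claims to pour 250 ml per cup.
claimed_mean = 250.0
pours = np.array([246.1, 248.3, 247.0, 246.5, 245.8, 248.2, 246.9, 247.6])

# One-sample t-test: is the sample mean significantly different from 250?
t_stat, p_value = stats.ttest_1samp(pours, popmean=claimed_mean)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

if p_value < 0.05:
    print("Reject H0: the mean pour differs from 250 ml")
else:
    print("Fail to reject H0")
```

Note that the test never "proves" the machine is broken; it only says the observed sample would be very unlikely if the 250 ml claim were true.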

z-test vs t-test — Quick Decision Guide

| Question | Answer | Use |
| --- | --- | --- |
| Is σ known? | Yes | z-test |
| Is σ known? | No | t-test |
| Is n large (≥ 30) and σ unknown? | Yes | t-test (z ≈ t at large n anyway) |
| Is n small and σ unknown? | Yes | t-test — especially important |
| Not sure? | Always | Default to t-test |

The t-test was invented specifically because we rarely know σ in practice. It is the workhorse of one-sample testing.
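The guide's note that "z ≈ t at large n" can be seen numerically. A sketch with made-up parameters (mean 100, σ = 15, a true mean of 102, n = 500, all illustrative): we compute the z statistic by hand with the known σ, and the t statistic from the sample standard deviation s.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
mu0, sigma, n = 100.0, 15.0, 500
sample = rng.normal(loc=102.0, scale=sigma, size=n)

# z-test: sigma assumed known, reference distribution is the standard normal
z = (sample.mean() - mu0) / (sigma / np.sqrt(n))
p_z = 2 * stats.norm.sf(abs(z))

# t-test: sigma estimated by the sample std dev s
t, p_t = stats.ttest_1samp(sample, popmean=mu0)

print(f"z = {z:.3f} (p = {p_z:.4f})")
print(f"t = {t:.3f} (p = {p_t:.4f})")
```

At n = 500 the two statistics and p-values nearly coincide, because s is a good estimate of σ and the t distribution has converged toward the normal. At small n they diverge, which is exactly when the t-test matters most.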

Thank You 🌊