AI Basics with AK

Season 03 - Introduction to Statistics

Arun Koundinya Parasa

Episode 11 - Type I & Type II Errors and One-Sample Tests

Recap: Episode 10 — Hypothesis Testing

| Concept | What It Means |
| --- | --- |
| H₀ (Null Hypothesis) | The default claim — nothing changed |
| H₁ (Alternative) | The interesting claim — something did change |
| p-value | Probability of seeing this data if H₀ were true |
| α (significance level) | Our threshold — usually 0.05 |
| Decision Rule | p < α → Reject H₀ |

Last episode we learned how to test. This episode we ask: what if we got it wrong?

Every Decision Has a Risk

The Problem

When we reject or fail to reject H₀ —

we are making a decision under uncertainty.

We never see the full truth.

We only see a sample.

And samples can mislead us.

So two kinds of mistakes are always possible.

The Two Mistakes

Type I Error Rejecting H₀ when it is actually true.

A false alarm.

Type II Error Failing to reject H₀ when it is actually false.

A missed signal.

Both are costly — just in different ways.

The Error Matrix

| | H₀ is True | H₀ is False |
| --- | --- | --- |
| Reject H₀ | ❌ Type I Error (α) | ✅ Correct — True Positive |
| Fail to Reject H₀ | ✅ Correct — True Negative | ❌ Type II Error (β) |
  • α = P(Type I Error) = significance level you chose
  • β = P(Type II Error) = depends on effect size, sample size, α

You control α directly. β is influenced — not directly set.
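The claim that α *is* the Type I error rate can be checked directly. A minimal simulation sketch (assuming NumPy and SciPy are available; the seed and sample sizes are arbitrary choices): we generate many samples where H₀ is genuinely true and count how often the test still rejects.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
n_experiments = 10_000
false_alarms = 0

for _ in range(n_experiments):
    # H0 is true here: every sample really does come from mean = 0
    sample = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p_value = stats.ttest_1samp(sample, popmean=0.0)
    if p_value < alpha:      # rejecting H0 here is a false alarm
        false_alarms += 1

type_1_rate = false_alarms / n_experiments
print(f"Observed Type I error rate: {type_1_rate:.3f}")  # hovers around alpha
```

The observed rejection rate lands near 0.05, which is exactly what choosing α = 0.05 promises.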

Real World Cost of Each Error

Type I Error — False Alarm

Medical trial: Drug actually does nothing — but your test says it works.

→ Patients receive an ineffective drug

Quality control: Machine is fine — but test flags it as faulty.

→ Production shuts down unnecessarily

Cost: Acting on something that isn’t real.

Type II Error — Missed Signal

Medical trial: Drug actually works — but your test misses it.

→ A beneficial treatment is never approved

Quality control: Machine is broken — but test says it’s fine.

→ Defective products reach customers

Cost: Failing to act when you should have.

Visualizing the Two Errors

Red = Type I Error (false alarm). Blue = Type II Error (missed signal). The critical value separates the two decision zones.

The Trade-Off Between α and β

Lowering α (e.g., 0.01)

  • Fewer false alarms ✅
  • Harder to reject H₀
  • More missed signals ❌
  • β increases

Use when: False alarms are very costly.

Example: Approving a dangerous drug.

Raising α (e.g., 0.10)

  • More false alarms ❌
  • Easier to reject H₀
  • Fewer missed signals ✅
  • β decreases

Use when: Missing a real effect is very costly.

Example: Cancer screening.
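The trade-off above can be made concrete with a small simulation sketch (NumPy/SciPy assumed; the effect size 0.5 and sample size 30 are illustrative choices, not from the episode). We generate samples where H₁ is true, then see how often each choice of α misses the real effect:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_mean, n, trials = 0.5, 30, 5_000  # H1 is true: real mean is 0.5, not 0

# One p-value per simulated experiment, testing H0: mean = 0
p_values = np.array([
    stats.ttest_1samp(rng.normal(true_mean, 1.0, size=n), popmean=0.0).pvalue
    for _ in range(trials)
])

# beta = fraction of experiments where we fail to reject (missed signals)
betas = {alpha: float(np.mean(p_values >= alpha)) for alpha in (0.01, 0.05, 0.10)}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f} -> beta ~ {beta:.3f}")
```

The stricter the α, the larger the β: tightening the false-alarm rate directly inflates the missed-signal rate, which is the trade-off in numbers.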

Now We Test — One-Sample Tests

A one-sample test answers a simple question:

“Is the mean of my sample significantly different from a known or claimed value?”

Two tools depending on what you know:

| Situation | Test to Use |
| --- | --- |
| σ (population std dev) is known | One-sample z-test |
| σ is unknown (use sample std dev s) | One-sample t-test |

In practice, σ is almost never known.

The t-test is what you’ll use most of the time.
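Here is what the common case looks like in practice, as a sketch with made-up numbers (the coffee-machine scenario and measurements are invented for illustration): σ is unknown, so we use `scipy.stats.ttest_1samp`.

```python
import numpy as np
from scipy import stats

# Hypothetical scenario: a machine claims to pour 250 ml per cup.
claimed_mean = 250.0
pours = np.array([246.1, 248.3, 247.0, 246.5, 245.8, 248.2, 246.9, 247.6])

# One-sample t-test: is the sample mean significantly different from 250?
t_stat, p_value = stats.ttest_1samp(pours, popmean=claimed_mean)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

if p_value < 0.05:
    print("Reject H0: the mean pour differs from 250 ml")
else:
    print("Fail to reject H0")
```

Note that the test never "proves" the machine is broken; it only says the observed sample would be very unlikely if the 250 ml claim were true.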

z-test vs t-test — Quick Decision Guide

| Question | Answer | Use |
| --- | --- | --- |
| Is σ known? | Yes | z-test |
| Is σ known? | No | t-test |
| Is n large (≥ 30) and σ unknown? | Yes | t-test (z ≈ t at large n anyway) |
| Is n small and σ unknown? | Yes | t-test — especially important |
| Not sure? | Always | Default to t-test |

The t-test was invented specifically because we rarely know σ in practice. It is the workhorse of one-sample testing.
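The guide's note that "z ≈ t at large n" can be seen numerically. A sketch with made-up parameters (mean 100, σ = 15, a true mean of 102, n = 500, all illustrative): we compute the z statistic by hand with the known σ, and the t statistic from the sample standard deviation s.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
mu0, sigma, n = 100.0, 15.0, 500
sample = rng.normal(loc=102.0, scale=sigma, size=n)

# z-test: sigma assumed known, reference distribution is the standard normal
z = (sample.mean() - mu0) / (sigma / np.sqrt(n))
p_z = 2 * stats.norm.sf(abs(z))

# t-test: sigma estimated by the sample std dev s
t, p_t = stats.ttest_1samp(sample, popmean=mu0)

print(f"z = {z:.3f} (p = {p_z:.4f})")
print(f"t = {t:.3f} (p = {p_t:.4f})")
```

At n = 500 the two statistics and p-values nearly coincide, because s is a good estimate of σ and the t distribution has converged toward the normal. At small n they diverge, which is exactly when the t-test matters most.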

Thank You 🌊