Hypothesis Testing | 20 Core Problems

Unit 1

Core Concepts & Formulas

C1 Null & Alternative Hypothesis

H₀ (Null hypothesis): a statement of no effect, no difference, or equality.
H₁ / Hₐ (Alternative hypothesis): the claim we are trying to find evidence for.
H₀ is assumed true unless the sample provides sufficient evidence against it. We never "accept" H₀; we either reject it or fail to reject it.

C2 Significance Level (α)

α is the probability of rejecting H₀ when it is actually true (Type I error rate).
Common choices: α = 0.05 (5%) or α = 0.01 (1%).
We reject H₀ when p-value ≤ α.

C3 p-value

The probability of obtaining a test statistic at least as extreme as the observed value, assuming H₀ is true.
Small p-value → strong evidence against H₀.
p-value ≤ α → Reject H₀ | p-value > α → Fail to Reject H₀
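The decision rule above can be sketched in Python. This is a minimal illustration for a z statistic using only the standard library; the function names `z_p_value` and `decide` and the `tail` parameter are hypothetical names chosen for this sketch, not part of the original material.

```python
from statistics import NormalDist

def z_p_value(z, tail="two"):
    """p-value for a z statistic under the standard normal distribution.

    `tail` is a hypothetical parameter: "two", "left", or "right".
    """
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "right":
        return 1.0 - std_normal.cdf(z)
    if tail == "two":
        # Double the area beyond |z| in one tail
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'two', 'left', or 'right'")

def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when p-value <= alpha."""
    return "Reject H0" if p_value <= alpha else "Fail to reject H0"

p = z_p_value(1.50)  # two-tailed test with z = 1.50
print(round(p, 3), decide(p))
```

For a t statistic the same rule applies, but the tail area comes from a t distribution with the appropriate df, which the standard library does not provide.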

C4 Test Statistics

z-test (known σ, large n): z = (x̄ − μ₀) / (σ/√n)
One-sample t-test (unknown σ): t = (x̄ − μ₀) / (s/√n), df = n−1
Two-sample t-test: t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂)
Proportion z-test: z = (p̂ − p₀) / √(p₀(1−p₀)/n)
Chi-square test: χ² = Σ (O − E)² / E

C5 Type I & Type II Errors

Type I Error (α): Rejecting H₀ when it is TRUE — "false positive".
Type II Error (β): Failing to reject H₀ when it is FALSE — "false negative".
Power (1 − β): Probability of correctly rejecting a false H₀. Power increases with larger n or larger effect size.
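To see how power responds to sample size and effect size, here is a rough sketch for a right-tailed z-test with known σ. The function name and the scenario (μ₀ = 30, true mean 33, σ = 14) are illustrative assumptions, not values taken from the problems.

```python
from math import sqrt
from statistics import NormalDist

def power_right_tailed_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a right-tailed z-test (H1: mu > mu0) when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)     # rejection cutoff on the z scale
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # true effect in standard-error units
    return 1.0 - std_normal.cdf(z_crit - shift)

# Hypothetical scenario: power rises as n grows
for n in (25, 100, 400):
    print(n, round(power_right_tailed_z(30, 33, 14, n), 3))
```

When the true mean equals μ₀, the function returns α, which matches the definition of the Type I error rate.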

C6 One-Tailed vs. Two-Tailed Tests

Two-tailed: H₁: μ ≠ μ₀ → reject if |z| or |t| exceeds the critical value; p-value = 2 × (one-tail area).
Left-tailed: H₁: μ < μ₀ → reject in left tail only.
Right-tailed: H₁: μ > μ₀ → reject in right tail only.

C7 Conditions for Each Test

z-test: σ known; population normal or n ≥ 30.
t-test: σ unknown; population approx. normal or n ≥ 30 (CLT).
Proportion z-test: np₀ ≥ 10 and n(1−p₀) ≥ 10; random sample.
Chi-square GOF: all expected counts E ≥ 5; random sample.
Chi-square independence: all E ≥ 5; random sample; two categorical variables.

C8 Power & Sample Size

Power = P(reject H₀ | H₀ is false) = 1 − β.
Power increases when: n ↑, α ↑, effect size ↑, or σ ↓.
Minimum sample size for a proportion CI: n ≥ (z*/ME)² · p̂(1−p̂), always rounded up to the next whole number.
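The sample-size formula can be checked with a short Python sketch. The function name `min_sample_size` and its defaults (z* = 1.96, p̂ = 0.5) are assumptions made for illustration.

```python
from math import ceil

def min_sample_size(margin_of_error, z_star=1.96, p_hat=0.5):
    """Smallest n with n >= (z*/ME)^2 * p_hat * (1 - p_hat)."""
    raw = (z_star / margin_of_error) ** 2 * p_hat * (1.0 - p_hat)
    return ceil(raw)  # a sample size must be a whole number, so round up

print(min_sample_size(0.04))  # 600.25 rounds up to 601
```

Using p̂ = 0.5 is the conservative choice because p̂(1−p̂) is largest there.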

C9 Confidence Intervals & Tests

A two-tailed test at level α corresponds to a (1 − α)×100% CI.
If μ₀ falls outside the CI → reject H₀ at level α.
CI formula: x̄ ± z* · (σ/√n) or x̄ ± t* · (s/√n)
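As a sketch of the link between intervals and tests, the snippet below builds a z interval and checks whether μ₀ falls inside it. The numbers are borrowed from the bolt example in this guide (x̄ = 10.15, σ = 0.8, n = 64, μ₀ = 10); the function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Two-sided z confidence interval for a mean with known sigma."""
    z_star = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin

low, high = z_interval(10.15, 0.8, 64)
mu0 = 10
print(round(low, 3), round(high, 3))
# mu0 inside the 95% CI corresponds to "fail to reject" at alpha = 0.05
print("Fail to reject H0" if low <= mu0 <= high else "Reject H0")
```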

C10 Paired t-Test

Used when two measurements come from the same subject (before/after).
Let d = difference for each pair: t = d̄ / (s_d / √n), df = n−1.
This eliminates individual variability and increases power.

⭐ Must-Memorize Formulas & Rules

z-stat: z = (x̄ − μ₀)/(σ/√n)

t-stat (1-sample): t = (x̄ − μ₀)/(s/√n), df = n−1

Proportion: z = (p̂ − p₀)/√(p₀q₀/n)

Chi-square: χ² = Σ(O−E)²/E ; df = k−1 (goodness-of-fit with k categories) or df = (r−1)(c−1) (independence)

p-value ≤ α → Reject H₀ (statistically significant)

Type I error rate = α ; Type II error rate = β ; Power = 1 − β

Two-tailed p-value = 2 × (one-tail area)

Paired t: t = d̄ / (s_d/√n), where d = x₁ − x₂ for each pair

Unit 2

Concept Examples

Example 1 — One-Sample z-Test

A factory claims its bolts have mean diameter μ = 10 mm. A sample of n = 64 bolts gives x̄ = 10.15 mm, σ = 0.8 mm. Test at α = 0.05.

Step 1: H₀: μ = 10; H₁: μ ≠ 10 (two-tailed)
Step 2: z = (10.15 − 10)/(0.8/√64) = 0.15/0.1 = 1.50
Step 3: Critical z = ±1.96. Since |1.50| < 1.96, fail to reject H₀.

✓ Conclusion: Insufficient evidence to reject the claim. p-value ≈ 0.134 > 0.05.

Example 2 — Proportion z-Test

A company claims 60% of customers are satisfied. In a random sample of 200 customers, 108 are satisfied. Test at α = 0.05.

p̂ = 108/200 = 0.54
H₀: p = 0.60; H₁: p ≠ 0.60
z = (0.54 − 0.60)/√(0.60 × 0.40/200) = −0.06/0.0346 ≈ −1.73
p-value ≈ 0.083 > 0.05

✓ Fail to reject H₀. No significant evidence the satisfaction rate differs from 60%.
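The steps of this example can be reproduced with a small Python sketch that also checks the large-counts condition. The function name and the dictionary keys are assumptions for illustration; only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-tailed one-proportion z-test; returns the main quantities in a dict."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {
        "conditions_ok": n * p0 >= 10 and n * (1 - p0) >= 10,
        "p_hat": p_hat,
        "z": z,
        "p_value": p_value,
        "reject": p_value <= alpha,
    }

result = one_proportion_z_test(108, 200, 0.60)
print(round(result["z"], 2), round(result["p_value"], 3), result["reject"])
```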

Example 3 — Type I & II Error Context

A medical test screens for a disease. H₀: patient is healthy.

Type I Error: Diagnosing a healthy patient as sick (false positive). Probability = α.
Type II Error: Missing a sick patient (false negative). Probability = β.
In medicine, reducing β (increasing power) is often most critical.

✓ With n fixed, lowering α raises β. Increasing n lowers β at the same α, so a larger sample lets you keep both error rates small.

20 Questions

Core Problem Set


Problem 1 ★☆☆☆☆

Null & Alternative Hypothesis

A researcher wants to test whether the average resting heart rate of adults has changed from the historical mean of 72 bpm. Write the null hypothesis H₀ and state whether this requires a one-tailed or two-tailed test.

Problem 2 ★☆☆☆☆

Significance Level & Decision

A hypothesis test yields a p-value of 0.032. The significance level is α = 0.05. What is the correct statistical decision, and what does it mean in plain language?

Problem 3 ★★☆☆☆

One-Sample z-Test

A cereal company claims boxes contain a mean of 500 g. A consumer group tests 36 boxes and finds x̄ = 495 g, with population standard deviation σ = 12 g. Calculate the z-test statistic.

Problem 4 ★★☆☆☆

One-Sample t-Test

A sample of 16 students has a mean exam score of 78 with sample standard deviation s = 8. Test whether the population mean score differs from 75 at α = 0.05. Calculate the t-statistic and degrees of freedom.

Problem 5 ★★☆☆☆

Type I & Type II Errors

A quality control engineer tests whether a batch of microchips has a defect rate greater than 2%. She rejects H₀ and concludes the defect rate is too high — but later discovers the batch was actually fine.

(a) What type of error did she commit?
(b) What is the probability of this error equal to?

Problem 6 ★★☆☆☆

Proportion z-Test

A drug manufacturer claims its new pill reduces fever in more than 70% of patients. In a clinical trial of 200 patients, 150 experienced fever reduction. Test at α = 0.05.

State H₀ and H₁, then calculate the z-statistic for the proportion.

Problem 7 ★★★☆☆

Two-Sample t-Test

Two independent groups were given different study methods. Group A (n₁ = 20, x̄₁ = 82, s₁ = 6) and Group B (n₂ = 25, x̄₂ = 78, s₂ = 8). Assume unequal variances. Calculate the t-statistic for testing H₀: μ₁ = μ₂.

Problem 8 ★★★☆☆

Paired t-Test

Six athletes' sprint times (seconds) were recorded before and after a training program:

Athlete	Before	After	d = Before − After
1	11.2	10.8	0.4
2	10.9	10.5	0.4
3	11.5	11.1	0.4
4	10.7	10.3	0.4
5	11.0	10.6	0.4
6	11.3	10.7	0.6

d̄ = 0.433, s_d = 0.0816. Calculate the paired t-statistic.

Problem 9 ★★★☆☆

Chi-Square Goodness-of-Fit

A die is rolled 120 times. Expected frequency for each face is 20. Observed counts: {18, 22, 17, 25, 16, 22}. Calculate the chi-square test statistic.

Problem 10 ★★★☆☆

Chi-Square Test of Independence

A 2×2 contingency table shows survey results on gender vs. preference for a product:

	Likes	Dislikes	Total
Male	30	20	50
Female	40	10	50
Total	70	30	100

Calculate the expected count for the cell (Male, Likes) and state the degrees of freedom.

Problem 11 ★★★☆☆

Power of a Test

A test has α = 0.05 and Type II error rate β = 0.20. A researcher wants to increase the power of the test without changing α.

(a) What is the current power?
(b) Name TWO ways to increase it.

Problem 12 ★★★☆☆

Conditions for Testing

A researcher plans a one-proportion z-test with H₀: p = 0.15 and n = 50. Check whether the conditions for the test are met. If not, state what is violated and suggest a fix.

Problem 13 ★★★★☆

Confidence Interval & Hypothesis Test Link

A 95% confidence interval for a population mean is (48.2, 53.8). A researcher wants to test H₀: μ = 50 vs. H₁: μ ≠ 50 at α = 0.05.

Without performing any new calculations, state the conclusion and explain your reasoning.

Problem 14 ★★★★☆

z-Test Full Procedure

A hospital claims its average patient wait time is at most 30 minutes. A sample of 100 patients shows x̄ = 33.5 min, σ = 14 min. Test the hospital's claim at α = 0.01. State all four steps (hypotheses, test stat, p-value comparison, conclusion).

Problem 15 ★★★★☆

Two-Proportion z-Test

In City A, 240 out of 400 voters support a policy. In City B, 180 out of 360 voters support it. Test H₀: p₁ = p₂ at α = 0.05. Use pooled proportion p̂_c and calculate z.

Problem 16 ★★★★☆

Chi-Square GOF — Full Test

A genetics experiment expects offspring in the ratio 9:3:3:1 (total 160 plants). Observed: {92, 28, 26, 14}. Calculate χ² and state df and the conclusion at α = 0.05 (critical value χ²₀.₀₅,₃ = 7.815).

Problem 17 ★★★★☆

Sample Size Determination

A researcher wants a 95% CI for a proportion with margin of error no greater than 0.04. Using a conservative estimate of p̂ = 0.5, what is the minimum required sample size? (z* = 1.96)

Problem 18 ★★★★★

Interpretation of p-value

A study produces p = 0.003. A student says: "There is a 0.3% probability that the null hypothesis is true."

(a) Is this interpretation correct?
(b) Write the correct interpretation of this p-value.

Problem 19 ★★★★★

Paired vs. Independent: Choose Correctly

Scenario A: 30 patients measured before and after taking a medication.
Scenario B: 30 patients taking medication vs. 30 different patients taking a placebo.

For each scenario, state which test is appropriate (paired t-test or two-sample t-test) and explain why.

Problem 20 ★★★★★

Comprehensive — Full 4-Step Test

A nutritionist claims the mean daily sugar intake of teenagers exceeds 80 g. A random sample of 25 teenagers shows x̄ = 85.3 g, s = 12.5 g. At α = 0.05, test this claim using a one-sample t-test. Provide all four steps and state whether the result is statistically significant. (t*₀.₀₅,₂₄ = 1.711)

Answer Key

Full Solutions & Explanations

Q1 Null & Alternative Hypothesis

H₀: μ = 72 bpm ; Two-tailed test

H₀: μ = 72 (no change from historical mean)

H₁: μ ≠ 72 (mean has changed — could be higher or lower)

Since the researcher wants to detect any change (not just increase or decrease), this is a two-tailed test. The alternative hypothesis uses ≠.

Q2 Significance Level & Decision

Reject H₀ (p-value 0.032 < α 0.05)

Decision rule: Reject H₀ if p-value ≤ α

0.032 ≤ 0.05 → Reject H₀

In plain language: the observed result is statistically significant at α = 0.05. There is sufficient evidence to support the alternative hypothesis.

Q3 One-Sample z-Test Statistic

z = −2.50

Formula: z = (x̄ − μ₀) / (σ/√n)

z = (495 − 500) / (12/√36) = −5 / (12/6) = −5 / 2 = −2.50

Since |−2.50| > z* = 1.96, we would reject H₀ at α = 0.05. The boxes appear to be under-filled.

Q4 One-Sample t-Test

t = 1.50 ; df = 15

t = (x̄ − μ₀) / (s/√n) = (78 − 75) / (8/√16) = 3 / 2 = 1.50

df = n − 1 = 16 − 1 = 15

Critical value t*(15, two-tailed, α=0.05) ≈ 2.131. Since 1.50 < 2.131, fail to reject H₀. No significant evidence the mean differs from 75.

Q5 Type I Error

(a) Type I Error ; (b) Probability = α

She rejected H₀ (concluded defect rate was too high), but H₀ was actually true (batch was fine).

Rejecting a TRUE H₀ = Type I Error. Its probability equals the chosen significance level α.

Type I error: false positive. Type II error: false negative (failing to catch a real defect).

Q6 Proportion z-Test

H₀: p = 0.70 ; H₁: p > 0.70 ; z ≈ 1.54

p̂ = 150/200 = 0.75

z = (0.75 − 0.70) / √(0.70 × 0.30 / 200) = 0.05 / √(0.00105) = 0.05 / 0.0324 ≈ 1.54

Critical z (right-tailed, α=0.05) = 1.645. Since 1.54 < 1.645, fail to reject H₀. Insufficient evidence the rate exceeds 70%.

Q7 Two-Sample t-Test

t ≈ 1.92

SE = √(s₁²/n₁ + s₂²/n₂) = √(36/20 + 64/25) = √(1.8 + 2.56) = √4.36 ≈ 2.088

t = (82 − 78) / 2.088 = 4 / 2.088 ≈ 1.92

Note: rounding at intermediate steps may shift the last digit; accept values near 1.92. Welch's df ≈ 43, so the two-tailed critical t ≈ 2.02. Since 1.92 < 2.02, fail to reject H₀ at α = 0.05.
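To check the arithmetic, here is a minimal Python sketch of the unequal-variance (Welch) t statistic together with the Welch-Satterthwaite degrees of freedom. The function name `welch_t` is hypothetical; the critical value still has to come from a t table or statistics software.

```python
from math import sqrt

def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Welch two-sample t statistic and approximate degrees of freedom."""
    v1 = s1 ** 2 / n1  # squared standard error contributed by group 1
    v2 = s2 ** 2 / n2  # squared standard error contributed by group 2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for df
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

t, df = welch_t(82, 6, 20, 78, 8, 25)
print(round(t, 2), round(df, 1))
```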

Q8 Paired t-Test

t ≈ 13.0

d̄ = 0.433 s, s_d = 0.0816, n = 6

t = d̄ / (s_d/√n) = 0.433 / (0.0816/√6) = 0.433 / 0.0333 ≈ 13.0

df = 6 − 1 = 5. Critical t*(df = 5, one-tailed, α = 0.05) = 2.015. Since 13.0 ≫ 2.015, reject H₀. The training significantly improved sprint times.
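The same statistic can be computed from the raw before/after times. This is a sketch using the standard library; the function name `paired_t` is an assumption.

```python
from math import sqrt
from statistics import mean, stdev

before = [11.2, 10.9, 11.5, 10.7, 11.0, 11.3]
after = [10.8, 10.5, 11.1, 10.3, 10.6, 10.7]

def paired_t(x1, x2):
    """Paired t statistic for d = x1 - x2, with df = n - 1."""
    diffs = [a - b for a, b in zip(x1, x2)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    return d_bar / (s_d / sqrt(n)), n - 1

t, df = paired_t(before, after)
print(round(t, 2), df)
```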

Q9 Chi-Square GOF

χ² = 3.10

χ² = (18−20)²/20 + (22−20)²/20 + (17−20)²/20 + (25−20)²/20 + (16−20)²/20 + (22−20)²/20

= 4/20 + 4/20 + 9/20 + 25/20 + 16/20 + 4/20 = 0.2 + 0.2 + 0.45 + 1.25 + 0.8 + 0.2 = 3.10

df = 6 − 1 = 5. Critical χ²(5, 0.05) = 11.07. Since 3.10 < 11.07, fail to reject H₀; the die appears fair.
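A short Python sketch of the goodness-of-fit sum, assuming equal expected counts for a fair die. The function name `chi_square_gof` is hypothetical.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic and df = k - 1."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

observed = [18, 22, 17, 25, 16, 22]
# Fair die: each face is expected 120 / 6 = 20 times
expected = [sum(observed) / len(observed)] * len(observed)
chi2, df = chi_square_gof(observed, expected)
print(round(chi2, 2), df)
```

The same function works for unequal expected counts, such as those from a 9:3:3:1 ratio, as long as the expected list sums to the sample size.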

Q10 Chi-Square Independence — Expected Count

E(Male, Likes) = 35 ; df = 1

E = (Row Total × Column Total) / Grand Total = (50 × 70) / 100 = 3500/100 = 35

df = (rows − 1)(cols − 1) = (2−1)(2−1) = 1

All four expected counts here (35, 15, 35, 15) are ≥ 5, so the expected-count condition for the chi-square test of independence is met.
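The expected-count rule extends to every cell of the table. The sketch below computes all expected counts, the χ² statistic, and df for the 2×2 table from the problem; the function names are assumptions for illustration.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Chi-square statistic and df = (rows - 1)(cols - 1) for a contingency table."""
    expected = expected_counts(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

table = [[30, 20],   # Male: Likes, Dislikes
         [40, 10]]   # Female: Likes, Dislikes
print(expected_counts(table))
print(chi_square_independence(table))
```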

Q11 Power of a Test

Power = 0.80 (80%) ; Increase n ; Increase effect size

(a) Power = 1 − β = 1 − 0.20 = 0.80

(b) To increase power without changing α: (1) Increase sample size n — larger n reduces SE and makes it easier to detect effects. (2) Increase the true effect size (e.g. use a stronger treatment).

Reducing measurement variability (smaller σ) also increases power.

Q12 Conditions for Proportion Test

np₀ = 7.5 < 10 → Condition NOT met

np₀ = 50 × 0.15 = 7.5 < 10 ← VIOLATION

n(1−p₀) = 50 × 0.85 = 42.5 ≥ 10 ← OK

The success condition (np₀ ≥ 10) is violated. Fix: increase sample size to at least n ≥ 67 (since 67 × 0.15 = 10.05 ≥ 10), or use an exact binomial test.
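The condition check and the suggested fix can be sketched as follows. The function names and the `threshold` parameter are hypothetical; the threshold of 10 follows the rule stated in this guide.

```python
from math import ceil

def large_counts_ok(n, p0, threshold=10):
    """True when both n*p0 and n*(1 - p0) reach the threshold."""
    return n * p0 >= threshold and n * (1 - p0) >= threshold

def min_n_for_large_counts(p0, threshold=10):
    """Smallest n for which both expected counts reach the threshold."""
    n = ceil(threshold / min(p0, 1 - p0))
    # Guard against floating-point rounding in the division above
    while not large_counts_ok(n, p0, threshold):
        n += 1
    return n

print(large_counts_ok(50, 0.15))     # False: 50 * 0.15 = 7.5 < 10
print(min_n_for_large_counts(0.15))  # 67
```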

Q13 CI and Hypothesis Test Link

Fail to Reject H₀ — μ₀ = 50 is inside the CI

The 95% CI corresponds to a two-tailed test at α = 0.05.

μ₀ = 50 lies within (48.2, 53.8), so we fail to reject H₀: μ = 50 at α = 0.05.

If μ₀ were outside the interval (e.g. 47), we would reject H₀.

Q14 z-Test Full Procedure

z ≈ 2.50 ; Reject H₀

H₀: μ ≤ 30 ; H₁: μ > 30 (right-tailed)

z = (33.5 − 30)/(14/√100) = 3.5/1.4 = 2.50

p-value = P(Z > 2.50) ≈ 0.0062 < α = 0.01

Reject H₀. There is significant evidence at α = 0.01 that the mean wait time exceeds 30 minutes.

Q15 Two-Proportion z-Test

p̂_c = 0.5526 ; z ≈ 2.77

p̂₁ = 240/400 = 0.60 ; p̂₂ = 180/360 = 0.50

p̂_c = (240+180)/(400+360) = 420/760 ≈ 0.5526

SE = √(p̂_c(1−p̂_c)(1/n₁+1/n₂)) = √(0.5526×0.4474×(1/400+1/360)) = √(0.2472×0.005278) ≈ √0.001305 ≈ 0.03612

z = (0.60−0.50)/0.03612 ≈ 2.77

p-value ≈ 0.006 < 0.05; Reject H₀. The support rates differ significantly between cities.
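Because this calculation has several intermediate roundings, a quick Python check of the pooled two-proportion z-test may help. The function name `two_proportion_z` is an assumption; the p-value is two-tailed and uses the standard normal from the standard library.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and two-tailed p-value."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 = p2
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_pool, se, z, p_value

p_pool, se, z, p_value = two_proportion_z(240, 400, 180, 360)
print(round(p_pool, 4), round(se, 5), round(z, 2), round(p_value, 4))
```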

Q16 Chi-Square GOF — Genetics

χ² ≈ 2.31 ; df = 3 ; Fail to Reject H₀

Expected (9:3:3:1 from 160): E₁=90, E₂=30, E₃=30, E₄=10

χ² = (92−90)²/90 + (28−30)²/30 + (26−30)²/30 + (14−10)²/10

= 4/90 + 4/30 + 16/30 + 16/10 = 0.044 + 0.133 + 0.533 + 1.6 = 2.31

df = 4−1 = 3. χ² = 2.31 < 7.815, so fail to reject H₀. Data are consistent with the 9:3:3:1 ratio.

Q17 Minimum Sample Size

n ≥ 601

Formula: n ≥ (z*/ME)² × p̂(1−p̂)

n ≥ (1.96/0.04)² × 0.5 × 0.5 = (49)² × 0.25 = 2401 × 0.25 = 600.25

Always round UP: n ≥ 601. The conservative p̂ = 0.5 maximizes the required sample size.

Q18 Correct Interpretation of p-value

(a) Incorrect ; (b) See explanation

(a) INCORRECT. A p-value is NOT the probability that H₀ is true.

(b) Correct interpretation: Assuming H₀ is true, the probability of observing a test statistic as extreme as or more extreme than the one obtained is 0.003 (0.3%).

The p-value measures the strength of evidence against H₀ under the assumption H₀ is true. It says nothing about the probability that H₀ is true or false.

Q19 Paired vs. Independent Test

A: Paired t-test ; B: Two-sample t-test

Scenario A: Same 30 patients measured twice (before/after). The two measurements are dependent — use paired t-test. Let d = after − before for each patient. This removes individual variation.

Scenario B: Two completely separate groups of 30 patients each. Measurements are independent — use a two-sample (independent) t-test.

Paired tests have higher power when within-subject correlation is positive, because individual variability cancels out.

Q20 Comprehensive One-Sample t-Test

t ≈ 2.12 ; df = 24 ; Reject H₀ — Statistically Significant

Step 1 — Hypotheses: H₀: μ ≤ 80 ; H₁: μ > 80 (right-tailed, one-sample t-test)

Step 2 — Test statistic: t = (85.3 − 80)/(12.5/√25) = 5.3/2.5 = 2.12 ; df = 24

Step 3 — Decision: t* = 1.711 (one-tailed, α=0.05, df=24). Since 2.12 > 1.711, reject H₀.

Step 4 — Conclusion: At α = 0.05, there is statistically significant evidence that the mean daily sugar intake of teenagers exceeds 80 g.
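The four steps can be condensed into a short Python sketch. The standard library has no t distribution, so the critical value from the problem (1.711) is passed in rather than computed; the function name and argument names are assumptions.

```python
from math import sqrt

def one_sample_t_right_tailed(xbar, mu0, s, n, t_crit):
    """Right-tailed one-sample t-test using a critical value from a t table."""
    t = (xbar - mu0) / (s / sqrt(n))
    df = n - 1
    reject = t > t_crit  # reject H0 when t falls in the right-tail rejection region
    return t, df, reject

t, df, reject = one_sample_t_right_tailed(85.3, 80, 12.5, 25, t_crit=1.711)
print(round(t, 2), df, "Reject H0" if reject else "Fail to reject H0")
```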