z-Statistic vs. t-Statistic: A Simple Guide

“The z-score and t-score are not interchangeable—they are tools for different jobs.” Choosing the wrong one can lead to incorrect conclusions. This article clarifies when to use each statistic and why.

In statistics and econometrics, we often test whether a sample result is statistically significant. Two common tools are the z-statistic and the t-statistic. Both measure how far a sample mean lies from a hypothesized population mean, in units of standard error. The core difference is about knowledge: we use z when we know the population standard deviation (σ), and we use t when we must estimate it from the sample.

Core Concept: Known vs. Unknown Standard Deviation

The choice hinges on one piece of information: the population standard deviation (σ).

z-statistic: Use when σ is known and certain. This is rare in real-world research but common in quality control or standardized tests.
t-statistic: Use when σ is unknown and must be estimated from the sample data using the sample standard deviation (s). This is the standard in most econometric and social science research.

Example 1 z-Statistic: Factory Production

A factory knows from years of data that the weight of its cereal boxes has a population standard deviation σ = 5 grams. A new batch of 50 boxes has an average weight of 502 grams. We test if this is significantly different from the target of 500 grams.

z-score calculation:
z = (Sample Mean - Population Mean) / (σ / √n)
z = (502 - 500) / (5 / √50) = 2 / 0.707 ≈ 2.83

🔍 Explanation: Because the population standard deviation (σ=5) is known with certainty, we use the z-statistic. A z-score of 2.83 is compared to the standard normal distribution (z-distribution) to find the p-value.
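The calculation above can be reproduced in a few lines. This is a minimal sketch using only Python's standard library; the variable names are illustrative, and the two-sided alternative is an assumption based on the phrase "significantly different".

```python
from math import sqrt
from statistics import NormalDist

# Values from the cereal-box example
sample_mean = 502.0   # grams
target_mean = 500.0   # grams (hypothesized population mean)
sigma = 5.0           # known population standard deviation, grams
n = 50

standard_error = sigma / sqrt(n)
z = (sample_mean - target_mean) / standard_error

# Two-sided p-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"SE = {standard_error:.3f}, z = {z:.2f}, two-sided p = {p_value:.4f}")
```

Because |z| exceeds 1.96, the p-value falls below 0.05, so the batch mean differs significantly from the target at the 5% level.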

Example 2 t-Statistic: Student Survey

A researcher surveys 30 students to estimate average study hours. The population standard deviation is unknown. The sample mean is 15 hours/week, and the sample standard deviation s = 4 hours. We test if this differs from the national claim of 12 hours/week.

t-score calculation:
t = (Sample Mean - Claimed Mean) / (s / √n)
t = (15 - 12) / (4 / √30) = 3 / 0.73 ≈ 4.11

Degrees of freedom (df) = n - 1 = 29.

🔍 Explanation: The population standard deviation is unknown, so we use the sample standard deviation (s=4) as an estimate. This introduces extra uncertainty, so we use the t-statistic and compare the t-score of 4.11 to the t-distribution with 29 degrees of freedom.
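The same steps can be sketched in Python. The critical value 2.045 is not computed here; it is the two-sided 5% value for df = 29 that also appears in this article's comparison table. Variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the student survey example
sample_mean = 15.0   # hours per week
claimed_mean = 12.0  # national claim, hours per week
s = 4.0              # sample standard deviation (estimate of sigma)
n = 30

standard_error = s / sqrt(n)
t = (sample_mean - claimed_mean) / standard_error
df = n - 1

# Two-sided 5% critical value for df = 29, taken from a t-table
t_critical = 2.045

reject_null = abs(t) > t_critical
print(f"t = {t:.2f}, df = {df}, reject H0 at 5%: {reject_null}")
```

Note that the only change from the z calculation is the use of s in place of σ and a critical value that depends on df.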

The Distributions: Normal vs. t

The statistics follow different probability distributions, which affects the critical values for hypothesis testing.

Key Differences: z-Distribution vs. t-Distribution
Feature	z-Distribution (Standard Normal)	t-Distribution
Shape	Fixed, bell-shaped curve	Similar bell shape, but heavier tails
Parameter	None (mean=0, sd=1)	Degrees of freedom (df = n-1)
Tails	Thinner	Fatter (more area in the tails)
Use Case	Known σ, or as an approximation when n is large	Unknown σ (any n; matters most when n is small)
Critical Value (95% CI)	±1.96	Depends on df (e.g., ±2.045 for df=29)
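To see how the t critical value in the last row depends on df, the sketch below computes it numerically with the standard library only. It is an illustration, not a production routine: the helper names (`t_pdf`, `t_central_area`, `t_critical`) are made up for this example, the integration uses Simpson's rule, and the critical value is found by bisection. A statistics package would normally be used instead.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_central_area(a, df, steps=2000):
    """Area under the t density on [-a, a], by Simpson's rule (steps is even)."""
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # the density is symmetric around zero

def t_critical(df, confidence=0.95):
    """Two-sided critical value: the a whose central area equals `confidence`."""
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection on a monotone function
        mid = (lo + hi) / 2
        if t_central_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_critical = NormalDist().inv_cdf(0.975)
for df in (5, 29, 100, 1000):
    print(f"df = {df:>4}: t critical = {t_critical(df):.3f}  (z critical = {z_critical:.3f})")
```

The printed values shrink toward the normal critical value as df grows, which is the "heavier tails" row of the table expressed in numbers.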

⚠️ Common Pitfalls and Misconceptions

Pitfall 1: Using z when σ is unknown. This underestimates uncertainty, making results seem more significant than they are. Always use t when estimating σ from the sample.
Pitfall 2: Thinking n=30 is a strict rule. The "large sample" rule (use z if n > 30) is a simplification. Modern practice favors using t whenever σ is unknown, regardless of sample size.
Pitfall 3: Confusing the test statistic with the critical value. The z or t score you calculate is compared to a critical value from its respective distribution table to make a decision.
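Pitfall 1 can be checked with a small simulation. The sketch below is an illustration under stated assumptions: data are drawn from a standard normal distribution with the null hypothesis true, the sample size is n = 5, and the cutoff 2.776 is the two-sided 5% t-table value for df = 4. It counts how often each cutoff rejects a true null.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(0)  # fixed seed so the run is repeatable

n = 5            # small sample, so df = 4
trials = 20_000
z_crit = 1.960   # two-sided 5% critical value, standard normal
t_crit = 2.776   # two-sided 5% critical value, t with df = 4 (t-table)

false_pos_z = 0
false_pos_t = 0
for _ in range(trials):
    # The null hypothesis (mean = 0) is true by construction
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    stat = fmean(sample) / (stdev(sample) / sqrt(n))  # sigma estimated by s
    false_pos_z += abs(stat) > z_crit
    false_pos_t += abs(stat) > t_crit

rate_z = false_pos_z / trials
rate_t = false_pos_t / trials
print(f"False-positive rate with z cutoff: {rate_z:.3f}")
print(f"False-positive rate with t cutoff: {rate_t:.3f}")
```

With the t cutoff the false-positive rate stays near the nominal 5%, while the z cutoff rejects noticeably more often, which is what "results seem more significant than they are" means in practice.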

Decision Rule: When to Use Which?

Follow this simple flowchart in your analysis:

1. Do you know the population standard deviation (σ) with certainty?
- YES → Use the z-statistic and the standard normal distribution.
- NO → Proceed to step 2.
2. You must estimate σ from your sample. Use the t-statistic and the t-distribution with df = n-1.
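The decision rule can be written as a small helper. This is a sketch, not a library function: `mean_test_statistic` and its parameters are hypothetical names chosen for this example, and passing `sigma` is taken to mean that σ is genuinely known.

```python
from math import sqrt
from typing import Optional

def mean_test_statistic(sample_mean: float, null_mean: float, n: int,
                        sigma: Optional[float] = None,
                        s: Optional[float] = None) -> dict:
    """Pick z or t following the two-step rule, then compute the statistic.

    sigma: population standard deviation, if genuinely known
    s:     sample standard deviation, used only when sigma is unknown
    """
    if sigma is not None:
        # Step 1, YES branch: sigma known -> z, standard normal reference
        se = sigma / sqrt(n)
        return {"statistic": "z", "value": (sample_mean - null_mean) / se, "df": None}
    if s is None:
        raise ValueError("Provide sigma (known) or s (estimated from the sample).")
    # Step 2: sigma estimated by s -> t with n - 1 degrees of freedom
    se = s / sqrt(n)
    return {"statistic": "t", "value": (sample_mean - null_mean) / se, "df": n - 1}

print(mean_test_statistic(502, 500, 50, sigma=5))  # factory example
print(mean_test_statistic(15, 12, 30, s=4))        # student survey example
```

The two calls reuse the numbers from Examples 1 and 2, so the returned values should match the hand calculations there.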

In econometrics, we almost always deal with unknown population parameters, making the t-statistic the default tool for regression coefficient significance tests (t-tests).

Example 3 Econometric Regression

In a regression model: Wage = β₀ + β₁*Education + ε
We estimate the coefficient β₁ = 2.5 with a standard error (SE) of 0.8.

To test if education significantly affects wages (H₀: β₁ = 0):
t-statistic = (Estimated Coefficient - Null Value) / Standard Error
t = (2.5 - 0) / 0.8 = 3.125

We compare t = 3.125 to a t-distribution with df = n - 2 (the sample size minus the two estimated coefficients, β₀ and β₁) to find the p-value.

🔍 Explanation: In regression, the standard error of the coefficient is estimated from the sample data. Therefore, we use the t-statistic (not z) to test the significance of OLS regression coefficients, which is what standard econometric software reports.
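The sketch below shows where the standard error comes from in a one-regressor model. It first reproduces the t-statistic from the numbers quoted above, then fits a simple OLS line by hand. The education and wage figures are hypothetical, invented only to make the code runnable, and `slope_t_statistic` is an illustrative helper, not a library function.

```python
from math import sqrt

def slope_t_statistic(x, y, null_value=0.0):
    """Simple OLS of y on x (with intercept); returns slope, its SE, t, and df."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    df = n - 2                                  # two estimated coefficients
    sigma2_hat = sum(e ** 2 for e in residuals) / df
    se_b1 = sqrt(sigma2_hat / sxx)              # estimated from the sample
    return b1, se_b1, (b1 - null_value) / se_b1, df

# Numbers quoted in the example above
t_example = (2.5 - 0.0) / 0.8
print(f"t from the example: {t_example:.3f}")

# Hypothetical data (years of education, hourly wage), for illustration only
education = [10, 12, 12, 14, 16, 16, 18, 20]
wage = [14.0, 18.5, 16.0, 22.0, 27.5, 24.0, 31.0, 36.5]
b1, se_b1, t, df = slope_t_statistic(education, wage)
print(f"slope = {b1:.3f}, SE = {se_b1:.3f}, t = {t:.2f}, df = {df}")
```

The error variance is estimated from the residuals, so the standard error of the slope is itself an estimate, which is why the reference distribution is t with n - 2 degrees of freedom.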

Example 4 Large Sample Approximation

A poll surveys 1000 voters. The sample proportion preferring Candidate A is 54%. We test if this is > 50%. For a proportion, the standard error follows from the proportion itself, √(p(1 − p)/n), so there is no separate standard deviation to estimate.

For that reason the conventional test for a proportion is a z-test based on the large-sample normal approximation, and because the alternative here is one-sided (> 50%) the 5% critical value is 1.645 rather than ±1.96. Had this been a sample mean with σ estimated by s, the t-statistic would apply, but with df = 999 its critical values are virtually identical to the normal ones (about 1.962 versus 1.960 for a two-sided 5% test).

🔍 Explanation: This shows the convergence principle: as sample size (and degrees of freedom) increases, the t-distribution approaches the z-distribution. Once df is well above 100, the difference is negligible for practical purposes (the two-sided 5% critical value is about 1.98 at df = 100), but for a mean with unknown σ the correct theoretical basis is still the t-test.
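A minimal sketch of the poll calculation using the normal reference distribution. Two choices here are assumptions not stated in the text: the standard error is evaluated at the null proportion 0.50, and the test is treated as one-sided because the question is whether support exceeds 50%.

```python
from math import sqrt
from statistics import NormalDist

n = 1000
p_hat = 0.54   # sample proportion preferring Candidate A
p_null = 0.50  # value under H0

# Standard error evaluated at the null proportion (an assumption of this sketch)
se_null = sqrt(p_null * (1 - p_null) / n)
z = (p_hat - p_null) / se_null

# One-sided test (H1: p > 0.50)
p_value = 1 - NormalDist().cdf(z)
critical_one_sided = NormalDist().inv_cdf(0.95)

print(f"z = {z:.2f}, one-sided p = {p_value:.4f}, "
      f"5% one-sided critical value = {critical_one_sided:.3f}")
```

The statistic clears the one-sided cutoff comfortably, so at this sample size the conclusion would be the same under a normal or a t reference distribution.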
