t-test vs. Mann-Whitney U: when to use each

Updated March 2026

You have two independent groups and a continuous outcome. Should you use an unpaired t-test or a Mann-Whitney U test? This is one of the most frequent decisions in applied statistics, and the answer depends on whether your data meets the assumptions of the parametric test.

The short answer

Condition	Use this test
Data is approximately normal in each group, similar variances	Unpaired t-test
Unequal variances but approximately normal	Welch's t-test (the default in R's t.test; some tools, such as SciPy, need it requested explicitly)
Data is not normal (skewed, outliers, ordinal)	Mann-Whitney U test
Large samples (n > 30 per group), moderate non-normality	Either — t-test is robust here

What each test does

Unpaired t-test

The unpaired (independent samples) t-test compares the means of two groups. It assumes:

Independence — observations in the two groups are independent
Normality — the outcome variable is approximately normally distributed in each group
Equal variances — the variances in the two groups are similar (relaxed by Welch's correction)

When these assumptions hold, the t-test is the most powerful of the standard tests for detecting a difference in means: for a given sample size and significance level, it has the best chance of finding a real effect.
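As a minimal sketch of the two variants, assuming Python with NumPy and SciPy installed: `scipy.stats.ttest_ind` runs the equal-variance (Student's) test by default and Welch's version when `equal_var=False` is passed. The scores are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical scores for two independent groups (illustrative values only).
treatment = np.array([24.1, 19.8, 27.3, 22.5, 25.0, 21.7, 26.4, 23.9])
control = np.array([18.2, 20.5, 16.9, 19.4, 21.1, 17.6, 18.8, 20.0])

# Student's t-test: assumes equal variances (SciPy's default, equal_var=True).
student = stats.ttest_ind(treatment, control, equal_var=True)

# Welch's t-test: drops the equal-variance assumption.
welch = stats.ttest_ind(treatment, control, equal_var=False)

print(f"Student: t = {student.statistic:.2f}, p = {student.pvalue:.4f}")
print(f"Welch:   t = {welch.statistic:.2f}, p = {welch.pvalue:.4f}")
```

Because the two hypothetical groups are the same size, both calls give the same t statistic; only the degrees of freedom, and therefore the p-value, differ.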

Mann-Whitney U test

The Mann-Whitney U test compares the distributions (more precisely, the ranks) of two independent groups. It only assumes:

Independence — observations are independent
Ordinal data — the outcome can be ranked (which all continuous data can)

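It does not assume normality, which makes it the safer choice when the normality assumption is in doubt. It is not assumption-free, however: reading a significant result as a shift in medians requires the two distributions to have a similar shape and spread.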

Key distinction: The t-test compares means. The Mann-Whitney U test asks whether a value drawn from one group tends to be larger than a value drawn from the other. If both distributions have the same shape, these answer the same question. If the distributions differ in shape, they can give different answers, and the Mann-Whitney may be more meaningful.
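To make "tends to have larger values" concrete, here is a small sketch on made-up data. It assumes SciPy 1.7 or later, where `scipy.stats.mannwhitneyu` returns the U statistic for the first sample. The pain ratings are hypothetical.

```python
from scipy import stats

# Hypothetical pain ratings on a 0-10 scale (illustrative values only).
treatment = [2, 3, 3, 4, 2, 5, 3, 1, 4, 3]
control = [5, 6, 4, 7, 5, 6, 8, 5, 4, 6]

result = stats.mannwhitneyu(treatment, control, alternative="two-sided")

# U for the first sample counts the (treatment, control) pairs in which the
# treatment value is larger, with tied pairs counted as one half.
n_pairs = len(treatment) * len(control)
prob_treatment_larger = result.statistic / n_pairs

print(f"U = {result.statistic}, p = {result.pvalue:.4f}")
print(f"Estimated P(treatment > control) = {prob_treatment_larger:.3f}")
```

Dividing U by the number of pairs gives an estimate of how often a treatment value exceeds a control value, which is the quantity the test is about, rather than the difference in means.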

When to choose each test

Use the t-test when:

Shapiro-Wilk does not reject normality in either group (or Q-Q plots look approximately linear)
Your sample size is large enough (>30 per group) that the Central Limit Theorem provides robustness
You specifically care about comparing means
You want maximum statistical power for detecting a mean difference

Use Mann-Whitney U when:

Shapiro-Wilk rejects normality and your sample is small
Your data is ordinal (e.g., Likert scale, pain ratings)
Your data has pronounced outliers that would inflate the mean
The distributions are heavily skewed and medians are more meaningful than means
You want a test that does not rely on the normality assumption

A common misconception

Many researchers believe that if the Shapiro-Wilk test is significant, they must use Mann-Whitney. This is too rigid. Consider:

With large samples, Shapiro-Wilk is overpowered — it rejects normality for trivial departures that don't affect the t-test
The t-test is robust to moderate non-normality when sample sizes are equal and reasonably large
Look at the Q-Q plot: if the points are roughly on the line with minor wobbles, the t-test is likely fine

The decision should be based on the severity of the violation, not just whether Shapiro-Wilk's p-value is below .05.
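One way to look at severity rather than the p-value alone is to put Shapiro-Wilk next to a numeric summary of the Q-Q plot and the sample skewness. The sketch below assumes NumPy and SciPy. The simulated data (a mildly right-skewed gamma sample), the seed, and the choice of summaries are illustrative assumptions, not a prescribed procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed so the sketch is repeatable

# Hypothetical large sample with mild right skew (gamma with shape 20).
sample = rng.gamma(shape=20.0, scale=1.0, size=2000)

# Shapiro-Wilk: a small p-value means normality is rejected.
w_stat, shapiro_p = stats.shapiro(sample)

# Q-Q plot data without drawing it: qq_r is the correlation between the
# ordered sample and the theoretical normal quantiles (1.0 = straight line).
(_, _), (slope, intercept, qq_r) = stats.probplot(sample, dist="norm")

# Sample skewness: 0 for a symmetric distribution.
skewness = stats.skew(sample)

print(f"Shapiro-Wilk: W = {w_stat:.4f}, p = {shapiro_p:.2g}")
print(f"Q-Q correlation r = {qq_r:.4f}, skewness = {skewness:.2f}")
```

With a sample this large, a very small Shapiro-Wilk p-value can sit alongside a Q-Q correlation close to 1 and modest skewness. That is the situation described above, where the departure from normality is detectable but may not matter for the t-test.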

Effect sizes

Both tests have appropriate effect size measures:

Test	Effect size	Interpretation
t-test	Cohen's d	Small: 0.20, Medium: 0.50, Large: 0.80
Mann-Whitney U	Rank-biserial r	Small: 0.10, Medium: 0.30, Large: 0.50

Always report an effect size alongside the p-value. A statistically significant result with a tiny effect size may not be practically meaningful. See our effect sizes guide for more detail.
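Both measures in the table can be computed by hand. The sketch below uses only the Python standard library and assumes two common definitions: Cohen's d with a pooled standard deviation, and the rank-biserial correlation as the proportion of pairs favoring the first group minus the proportion favoring the second. Other variants exist, and the function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(x, y):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


def rank_biserial(x, y):
    """Share of (x, y) pairs with x > y minus share with x < y; ties count for neither."""
    wins = sum(a > b for a in x for b in y)
    losses = sum(a < b for a in x for b in y)
    return (wins - losses) / (len(x) * len(y))


# Hypothetical pain ratings (illustrative values only).
treatment = [2, 3, 3, 4, 2, 5, 3, 1, 4, 3]
control = [5, 6, 4, 7, 5, 6, 8, 5, 4, 6]

print(f"Cohen's d = {cohens_d(treatment, control):.2f}")
print(f"Rank-biserial r = {rank_biserial(treatment, control):.2f}")
```

Both values are negative here because the first group has the lower ratings; the sign only reflects the order in which the groups are passed.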

How to report each test in APA format

Unpaired t-test

APA format

An independent-samples t-test indicated that the treatment group (M = 23.4, SD = 5.1) scored significantly higher than the control group (M = 18.7, SD = 4.8), t(48) = 3.45, p = .001, d = 0.97, 95% CI [0.38, 1.56].

Mann-Whitney U test

APA format

A Mann-Whitney U test indicated that pain ratings were significantly lower in the treatment group (Mdn = 3) than in the control group (Mdn = 5), U = 156, p = .003, r = .45.

Note: for Mann-Whitney, report medians (not means), since the test is based on ranks.

Decision checklist

Check normality in each group (Shapiro-Wilk + Q-Q plot)
If normal in both groups: unpaired t-test
If normality fails but n > 30 per group and violation is moderate: t-test is likely fine — report the violation in your methods
If normality fails with small samples, strong skew, or ordinal data: Mann-Whitney U
If in doubt: report both tests. If they agree, that strengthens your conclusion; if they disagree, report both results rather than choosing the more favorable one
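The checklist can be written as a small function. This is a sketch only: `choose_test` and its parameters are hypothetical names, the normality and severity judgments are passed in as flags because they come from inspecting Shapiro-Wilk results and Q-Q plots, and checking for ordinal data first is an assumption about ordering that the checklist does not spell out.

```python
def choose_test(n1, n2, normal_in_both, ordinal=False, severe_violation=False):
    """Suggest a test for two independent groups, following the checklist.

    n1, n2            -- group sizes
    normal_in_both    -- your judgment from Shapiro-Wilk plus Q-Q plots
    ordinal           -- True for ordinal outcomes such as Likert items
    severe_violation  -- True for strong skew or pronounced outliers
    """
    if ordinal:
        return "Mann-Whitney U"
    if normal_in_both:
        return "unpaired t-test"
    # Normality fails: the t-test is likely fine only for larger samples
    # with a moderate violation (n > 30 per group, as in the checklist).
    if min(n1, n2) > 30 and not severe_violation:
        return "unpaired t-test (report the normality violation)"
    return "Mann-Whitney U"


print(choose_test(50, 50, normal_in_both=True))
print(choose_test(40, 45, normal_in_both=False))
print(choose_test(12, 15, normal_in_both=False))
```

The function does not cover the last step of the checklist: when the flags themselves are a close call, running and reporting both tests remains the fallback.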

Once you've decided which test to use, calculate the sample size you need with our free power analysis calculator — it supports both t-tests and non-parametric alternatives.

Join the beta to try this in GraphHelix — the AI checks normality and equal variances automatically, and suggests switching to Mann-Whitney U when assumptions are violated.