Understanding effect sizes: Cohen's d, eta-squared, and R-squared
A p-value tells you whether an observed effect is statistically distinguishable from zero. An effect size tells you how large the effect is. APA 7th edition requires effect sizes for all statistical tests, and a growing number of journals will reject manuscripts that report only p-values.
This guide covers the most common effect size measures, when to use each, and how to interpret them.
Why effect sizes matter
Consider two studies that both find p < .05:
- Study A: t(200) = 2.10, p = .037, d = 0.30
- Study B: t(20) = 3.15, p = .005, d = 1.41
Both are statistically significant. But Study A found a small effect in a large sample, while Study B found a very large effect in a small sample. Without the effect size, you'd treat these as equivalent — they're not.
Effect sizes are essential for:
- Practical significance — Is the effect large enough to matter in the real world?
- Power analysis — Planning sample sizes for future studies (try our free sample size calculator)
- Meta-analysis — Combining results across studies requires standardized effect sizes
- Publication — APA 7th edition mandates effect size reporting
Cohen's d: comparing two groups
Cohen's d measures the standardized difference between two group means. It expresses the difference in standard deviation units.
d = (M_1 − M_2) / SD_pooled
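In code, the pooled-SD version of the formula is a few lines. A minimal sketch (the two groups below are invented for illustration):

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Cohen's d: standardized mean difference using the pooled SD."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    # Pooled SD weights each group's variance by its degrees of freedom
    sd_pooled = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(group1) - mean(group2)) / sd_pooled

d = cohens_d([5, 6, 7, 8, 9], [3, 4, 5, 6, 7])
print(round(d, 2))  # 1.26 — a large effect by Cohen's benchmarks
```

Note that this is the independent-groups version; for paired designs, d is often computed on the difference scores instead.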
Interpretation benchmarks (Cohen, 1988)
| Magnitude | Cohen's d | What it means |
|---|---|---|
| Small | 0.20 | The groups overlap substantially; the difference is subtle |
| Medium | 0.50 | The difference is noticeable and often practically meaningful |
| Large | 0.80 | The groups are clearly different; the separation is visible in the distributions |
When to use Cohen's d
- Unpaired t-test (between two independent groups)
- Paired t-test (comparing two related measurements)
- Post-hoc pairwise comparisons after ANOVA
t(48) = 3.45, p = .001, d = 0.97, 95% CI [0.38, 1.56]
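A confidence interval like the one above can be approximated from d and the group sizes using the large-sample standard error sqrt((n1 + n2)/(n1·n2) + d²/(2(n1 + n2))). Exact intervals use the noncentral t distribution, so published values may differ slightly. A minimal sketch, assuming two groups of 25 (consistent with df = 48):

```python
import math

def d_confidence_interval(d, n1, n2, z=1.96):
    """Approximate 95% CI for Cohen's d (large-sample normal approximation)."""
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se

lo, hi = d_confidence_interval(0.97, 25, 25)
print(round(lo, 2), round(hi, 2))  # ≈ 0.38, 1.56
```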
Eta-squared and partial eta-squared: ANOVA effect sizes
Eta-squared (η²)
The proportion of total variance in the outcome explained by the grouping variable. Used with one-way ANOVA.
η² = SS_between / SS_total
Partial eta-squared (ηp²)
The proportion of variance explained by one factor after removing variance explained by other factors. Used with factorial ANOVA and repeated measures ANOVA.
ηp² = SS_effect / (SS_effect + SS_error)
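Both formulas can be computed directly from group data. A minimal sketch for the one-way case, where SS_total = SS_between + SS_error and the two measures therefore coincide (the groups below are invented):

```python
def eta_squared(groups):
    """One-way eta-squared: between-group SS over total SS."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    # Total SS: squared deviations of every observation from the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    # Between SS: each group's squared mean deviation, weighted by group size
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total

print(eta_squared([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))  # 0.5
```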
Interpretation benchmarks
| Magnitude | η² / ηp² | What it means |
|---|---|---|
| Small | .01 | The factor explains about 1% of variance |
| Medium | .06 | The factor explains about 6% of variance |
| Large | .14 | The factor explains about 14% or more of variance |
Important: For the same data, partial eta-squared is never smaller than eta-squared (its denominator excludes variance explained by other factors), and in a one-way ANOVA the two are identical. Don't compare partial eta-squared from one study to eta-squared from another. Always note which measure you're using.
R-squared: regression effect sizes
R² (the coefficient of determination) measures the proportion of variance in the outcome that is explained by the predictor(s) in a regression model.
- R² = .04 → The model explains 4% of variance (small)
- R² = .13 → The model explains 13% of variance (medium)
- R² = .26 → The model explains 26% of variance (large)
For multiple regression, report adjusted R², which penalizes for the number of predictors. This prevents overfitting from inflating the apparent effect size.
The model was significant, F(3, 96) = 8.42, p < .001, R² = .21, adjusted R² = .18.
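The adjustment can be computed from R², the sample size n, and the number of predictors k. A minimal sketch, using values mirroring the example report (n = 100 and k = 3, implied by F(3, 96)):

```python
def adjusted_r2(r2, n, k):
    """Adjusted R²: penalizes R² for the number of predictors k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(adjusted_r2(0.21, 100, 3), 3))  # ≈ 0.185
```

With few predictors and a decent sample, the penalty is modest; with many predictors and a small sample, adjusted R² can fall well below R² (and can even go negative).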
Other effect sizes you may encounter
| Effect size | Used with | Small / Medium / Large |
|---|---|---|
| Cramér's V | Chi-square test | .10 / .30 / .50 |
| Rank-biserial r | Mann-Whitney U, Wilcoxon | .10 / .30 / .50 |
| Odds ratio (OR) | Logistic regression | 1.5 / 2.5 / 4.3 (Rosenthal, 1996) |
| Pearson r | Correlation | .10 / .30 / .50 |
| Hazard ratio (HR) | Cox regression | Context-dependent; no universal benchmarks |
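Of these, Cramér's V is easy to compute by hand from the chi-square statistic, the sample size, and the table dimensions. A minimal sketch (the χ² value and table shape below are invented; note the small/medium/large benchmarks above strictly apply to 2×2 tables):

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V from a chi-square statistic and contingency table dimensions."""
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

# χ² = 10.0 from a 2×2 table with n = 100
print(round(cramers_v(10.0, 100, 2, 2), 3))  # 0.316 — a medium association
```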
Context matters more than benchmarks
Cohen himself offered the "small, medium, and large" benchmarks with the explicit caveat that they were conventions to fall back on when field-specific knowledge is lacking, not rules. In practice:
- In pharmacology, a d of 0.20 might be clinically meaningful if the drug has few side effects
- In education, a d of 0.40 could affect millions of students
- In basic neuroscience, d > 1.0 is common because experimental control is tight
Always interpret effect sizes in the context of your field, your intervention, and the practical consequences of the effect.
Reporting checklist
- Choose the effect size appropriate to your test (Cohen's d for t-tests, η2 for ANOVA, R2 for regression, etc.)
- Report the effect size alongside the test statistic and p-value — in the same sentence
- Include a confidence interval for the effect size when possible
- Note the magnitude label (small/medium/large) but interpret in context
- Use the same effect size measure consistently throughout a paper for the same type of comparison
Join the beta to try this in GraphHelix — every statistical test automatically reports the appropriate effect size with a magnitude label and 95% confidence interval.