
Sample Size & Power Calculator

Calculate the number of participants your study needs. Supports t-tests, ANOVA, chi-square, correlation, regression, two proportions, and animal studies. No install, no account required.


Power Analysis Resources

How to Choose an Effect Size

The effect size you enter is the single most consequential input in a power analysis. It determines your required sample size more than any other parameter. Choosing "medium" by default is tempting but rarely defensible.

Where to find your effect size (in preference order)

  1. Published meta-analyses — The gold standard. Search for meta-analyses in your field that report effect sizes for your outcome. A meta-analytic estimate pools data across multiple studies, giving you the most stable estimate of the true effect.
  2. Prior studies in your domain — If no meta-analysis exists, look at individual studies that measured a similar outcome with a similar population. Extract their reported effect sizes and use the median or a conservative estimate.
  3. Smallest effect size of interest (SESOI) — Ask: "What is the smallest effect that would be practically meaningful?" This approach, gaining traction in methodological literature, grounds your power analysis in clinical or practical relevance rather than statistical convention (Lakens, 2022).
  4. Pilot data — Use cautiously. Pilot studies typically have small samples, producing noisy effect size estimates that can be misleadingly large. If using pilot data, consider reducing the estimate by 20-30%.
  5. Cohen's benchmarks — A last resort. Cohen (1988) proposed small/medium/large benchmarks (e.g., d = 0.2/0.5/0.8 for t-tests), but these are arbitrary and do not correspond to equivalent magnitudes across different effect size metrics. Using them without justification weakens your methods section.

Effect size benchmarks by test type

| Test | Metric | Small | Medium | Large |
| --- | --- | --- | --- | --- |
| t-tests | Cohen's d | 0.2 | 0.5 | 0.8 |
| ANOVA | Cohen's f | 0.10 | 0.25 | 0.40 |
| Chi-square | Cohen's w | 0.10 | 0.30 | 0.50 |
| Correlation | r | 0.10 | 0.30 | 0.50 |
| Regression | f² | 0.02 | 0.15 | 0.35 |

Remember: these benchmarks describe what Cohen observed across the behavioral sciences in the 1960s. Your field may be different. An effect of d = 0.2 might be negligible in one context and transformative in another.
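To see how strongly the chosen effect size drives the required sample size, here is a minimal sketch for a two-sided independent-samples t-test. It uses the large-sample normal approximation (exact t-based tools such as G*Power typically return one or two participants more per group), and the function name is our own:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided independent-samples t-test.

    Normal approximation: n = 2 * ((z_alpha/2 + z_beta) / d)^2.
    Exact t-based calculators run slightly higher (e.g., 64 vs. 63 for d = 0.5).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile for the target power
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)

print(n_per_group(0.8))  # large effect  -> 25 per group
print(n_per_group(0.5))  # medium effect -> 63 per group
print(n_per_group(0.2))  # small effect  -> 393 per group
```

Note how halving the effect size roughly quadruples the required n: this is why the effect size deserves more attention than any other input.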

References: Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lakens, D. (2022). Sample size justification. Collabra: Psychology, 8(1).

Need help choosing? Read our guide to understanding effect sizes for worked examples across different study designs.

Post-Hoc Power: Why You Don't Need It

Post-hoc power (also called observed power or retrospective power) computes "the power your study had" using the observed effect size after data collection. It sounds useful but is mathematically uninformative.

Why post-hoc power is circular

Post-hoc power is a one-to-one function of the p-value. When your p-value is non-significant (p > 0.05), post-hoc power is always less than 50%. When your p-value is significant, post-hoc power is always greater than 50%. It cannot tell you anything the p-value doesn't already tell you.

As Hoenig and Heisey (2001) demonstrated, observed power calculated from the study's own data is essentially a transformation of the test statistic. Gelman (2019) was more blunt, calling post-hoc power calculations uninformative for the same reason.
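The circularity is easy to check numerically. The sketch below (our own helper, using the two-sided z-test approximation and ignoring the negligible opposite-tail term) computes "observed power" from nothing but the p-value:

```python
from statistics import NormalDist

_z = NormalDist()

def post_hoc_power(p_value, alpha=0.05):
    """'Observed power' for a two-sided z-test, derived from the p-value alone.

    Approximation: power = Phi(|z_obs| - z_crit), where |z_obs| is the test
    statistic implied by the p-value. No data beyond p is needed.
    """
    z_obs = _z.inv_cdf(1 - p_value / 2)   # |z| implied by the p-value
    z_crit = _z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    return _z.cdf(z_obs - z_crit)

print(post_hoc_power(0.05))  # exactly at the threshold -> 0.5
print(post_hoc_power(0.20))  # non-significant -> below 0.5
print(post_hoc_power(0.01))  # significant -> above 0.5
```

A p-value of exactly .05 always maps to 50% observed power, which illustrates why reporting it adds nothing.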

What to do instead

If your study produced a non-significant result and you want to understand whether you lacked power:

  • Sensitivity analysis — Given your actual sample size and alpha, what is the minimum effect size you could have detected with 80% power? This is informative because it uses a fixed (assumed) effect size, not the observed one.
  • Confidence interval for the effect size — Report the 95% CI around your observed effect. A wide CI that includes meaningful effect sizes suggests inadequate precision, which is a more direct and informative framing than "low power."
  • Prospective power for future studies — Use effect size estimates from your study (with appropriate caution about overestimation) to plan the next study.
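The first bullet can be sketched directly. Assuming a two-sided independent-samples t-test and the normal approximation (the function name is ours), the minimum detectable Cohen's d for a given per-group sample size is:

```python
from math import sqrt
from statistics import NormalDist

def min_detectable_d(n_per_group, alpha=0.05, power=0.80):
    """Minimum Cohen's d detectable at the given alpha and power.

    Normal approximation for a two-sided independent-samples t-test:
    d_min = (z_alpha/2 + z_beta) * sqrt(2 / n_per_group).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return (z_alpha + z_beta) * sqrt(2 / n_per_group)

print(round(min_detectable_d(50), 2))   # n = 50 per group  -> d_min = 0.56
print(round(min_detectable_d(200), 2))  # n = 200 per group -> d_min = 0.28
```

Reading the result: with 50 per group you were only well-powered for effects of roughly d = 0.56 or larger, a statement that stands regardless of what effect you happened to observe.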

This calculator supports sensitivity analysis mode — switch to it using the analysis mode toggle to determine the minimum detectable effect for your sample size.

References: Hoenig, J. M., & Heisey, D. M. (2001). The abuse of power: The pervasive fallacy of power calculations for data analysis. The American Statistician, 55(1), 19–24. Gelman, A. (2019). Don't calculate post-hoc power using observed estimate of effect size. Annals of Surgery, 269(1).

How to Report Power Analysis in Your Methods Section

Grant applications, ethics reviews, and journal submissions expect a clear power analysis in your methods section. Here is what to include and how to format it.

Required elements

  1. The statistical test you plan to use
  2. Alpha level (typically .05, two-tailed)
  3. Target power (typically .80 or .90)
  4. Expected effect size with justification for the value chosen
  5. Resulting sample size (per group and total)
  6. Attrition adjustment if applicable
  7. Software used for the calculation

APA-style template

For a two-group comparison:

"An a priori power analysis was conducted using [software] to determine the required sample size for an independent-samples t-test. With an expected effect size of d = [value] (based on [justification]), significance level of α = .05 (two-tailed), and power = .80, the required sample size was n = [value] per group (N = [total]). Accounting for an expected attrition rate of [X]%, we plan to recruit [adjusted N] participants."

For ANOVA:

"A power analysis using [software] indicated that to detect a medium effect (f = [value]) in a one-way ANOVA with [k] groups, α = .05, and power = .80, a minimum of n = [value] per group (N = [total]) would be required."

Common reviewer comments to avoid

  • "The power analysis uses Cohen's medium benchmark without justification" — Always cite the source of your effect size
  • "Post-hoc power is not meaningful" — Only report a priori power analyses
  • "It is unclear whether N refers to total or per-group" — Always specify both

Use the Copy APA snippet button in the results to get a pre-formatted methods paragraph you can paste directly into your manuscript. For help reporting your ANOVA results, see our guide to reporting ANOVA results in APA format.

Power Analysis for Animal Studies

Animal research presents unique challenges for power analysis. Ethics committees (IACUCs) require justification that a study uses the minimum number of animals necessary to produce scientifically valid results, in line with the "reduction" principle of the 3Rs (Replacement, Reduction, Refinement).

The resource equation method

When effect size estimates are unavailable or unreliable (common in early-stage animal research), the resource equation method provides a practical alternative to traditional power analysis. It uses the error degrees of freedom (E) of the planned ANOVA design:

  • E = N − k, where N is total animals and k is number of groups
  • Target: 10 ≤ E ≤ 20
  • Below 10: insufficient power to detect meaningful effects
  • Above 20: likely using more animals than necessary

For example, with 4 treatment groups, you need 14–24 total animals (3.5–6 per group, typically rounded to 4–6).
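The arithmetic above can be wrapped in a small helper (the function name is ours, and it implements only the simplified one-way form E = N − k):

```python
from math import ceil

def resource_equation_range(k):
    """Recommended total-animal range for a k-group one-way ANOVA design.

    Mead's resource equation targets error degrees of freedom E = N - k
    in the range 10 <= E <= 20. Returns (N_min, N_max, (per-group min, max)),
    with per-group counts rounded up to whole animals.
    """
    n_min, n_max = k + 10, k + 20
    per_group = (ceil(n_min / k), ceil(n_max / k))
    return n_min, n_max, per_group

print(resource_equation_range(4))  # 4 groups -> (14, 24, (4, 6))
```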

When to use each approach

| Method | Best for | Requirements |
| --- | --- | --- |
| Traditional power analysis | Studies with prior effect size estimates | Effect size, alpha, power |
| Resource equation | Exploratory studies, novel endpoints, no prior data | Number of groups only |
| Pilot study | Generating effect size estimates for the definitive study | Minimum viable sample |

This calculator supports the resource equation method — select "Animal study (resource equation)" in the test selection flowchart to calculate the recommended range of animals per group.

References: Mead, R. (1988). The Design of Experiments. Cambridge University Press. Festing, M. F. W., & Altman, D. G. (2002). Guidelines for the design and statistical analysis of experiments using laboratory animals. ILAR Journal, 43(4), 244–258.

Statistical Tests Supported

This calculator supports sample size and power calculations for nine statistical methods. Each method addresses a different research question.

| Test | Use when | Effect size metric |
| --- | --- | --- |
| Independent t-test | Comparing the means of two separate groups (e.g., treatment vs. control) | Cohen's d |
| Paired t-test | Comparing two measurements from the same subjects (e.g., before vs. after) | Cohen's d |
| One-sample t-test | Comparing a sample mean to a known population value | Cohen's d |
| One-way ANOVA | Comparing means across three or more groups | Cohen's f |
| Chi-square test | Testing association between two categorical variables | Cohen's w |
| Pearson correlation | Testing whether two continuous variables are linearly related | r |
| Multiple regression | Testing whether a set of predictors explains variation in an outcome | f² |
| Two proportions | Comparing success/failure rates between two groups | Cohen's h (arcsine-transformed difference) |
| Animal study (resource equation) | Planning animal experiments when effect size is unknown | Error degrees of freedom (E) |
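
The arcsine-transformed difference for two proportions is Cohen's h, which is straightforward to compute by hand. A minimal sketch (function name is ours; the illustrative proportions are made up):

```python
from math import asin, sqrt

def cohens_h(p1, p2):
    """Cohen's h: difference between two proportions on the arcsine scale.

    h = 2*asin(sqrt(p1)) - 2*asin(sqrt(p2)). The transform stabilizes the
    variance, so the same h is equally detectable anywhere on the 0-1 scale.
    """
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

# e.g., a 65% vs. 50% success rate gives h of roughly 0.30
print(cohens_h(0.65, 0.50))
```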

Not sure which test to use? Our guide to choosing a statistical test walks through the decision based on your research question and data type.