What Are Effect Size, Power, and Sample Size Calculation, and Why Do We Care?
By Brian Hunter, M.A.
You may have heard these three terms and found them confusing when approaching a dissertation or project that requires you to calculate and interpret them.
How are the three terms from the title defined?
Effect size (ES) is a name given to a group of statistics that measure the magnitude or strength of a treatment effect or phenomenon. ES measures are the common metric of meta-analyses, which summarize the findings from a specific area of research. The effect size tells us how easy or difficult it may be to detect an effect in a research project: the larger the effect, the fewer observations are needed to detect it.
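As an illustration of one common effect size statistic, here is a minimal sketch of Cohen's d (the standardized difference between two group means). The function name cohens_d and the depression scores are hypothetical, made up for this example, and are not taken from any study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical depression scores, invented for illustration only
treatment = [12, 15, 11, 14, 13]
placebo = [16, 18, 15, 17, 19]

d = cohens_d(treatment, placebo)
print(f"Cohen's d = {d:.2f}")
```

A negative d here simply means the treatment group scored lower than the placebo group; the magnitude is what is usually compared against benchmarks or earlier studies.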
The power of any test of statistical significance is defined as the probability that it will correctly reject a false null hypothesis. The question becomes how much power do you want in doing your test?
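To make the definition concrete, the sketch below computes power for the simplest case, a two-sided one-sample z test, where a closed-form answer exists. This is an assumption made for illustration (G*Power covers many other tests), and the function name z_test_power is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(d, n, alpha=0.05):
    """Power of a two-sided one-sample z test.

    d is the standardized effect size and n the sample size.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n)  # mean of the test statistic when the effect is real
    # Probability that the statistic falls in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


for n in (20, 50, 88, 150):
    print(n, round(z_test_power(0.30, n), 3))
```

With no true effect (d = 0) the function returns alpha, the Type I error rate, and for a fixed effect size the power rises as n grows, which is why sample size is the lever a power analysis adjusts.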
Sample size calculation is done to ensure that enough participants or observations are gathered for the hypothesis test to have enough power to detect a true effect if one is actually present. Sample size calculation, therefore, depends on effect size and power.
How do we calculate Sample Size?
Before calculating sample size, we need to know the alpha level, the desired power level, the effect size of the phenomenon under study, and the statistical procedure that will be used to test our hypothesis. Whew, that is a great deal to know. Where do we begin? We start by conducting what is known as a power analysis.
What is a Power Analysis?
1. The primary purpose of power analysis is to estimate sample size. First, the researcher must specify the power level they want to achieve. The conventional power level is usually .80 to .95, depending on your field of interest.
2. The calculations for power depend on the effect size of the phenomenon under study in the population. You can use published experiments similar to the one you will be conducting, or a meta-analysis done on your topic of interest, as a guide to finding or calculating the effect size yourself.
3. Use the conventional alpha level for your field. In the behavioral sciences, the usual alpha level is 0.05.
4. Choose what statistical test you will use to test your hypothesis.
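5. Finally, confirm the power level you chose in step 1, typically between .80 and .95. A power of .80 means that, if the null hypothesis is false and the true effect is as large as you assumed, the test has an 80% chance of rejecting it; the remaining 20% is the probability of a Type II error.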
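The way these ingredients combine can be sketched with the textbook normal-approximation formula for comparing two group means, n per group = 2 * ((z_alpha + z_power) / d)^2. This is a simplified stand-in chosen for illustration, not the exact method G*Power uses, and the function name n_per_group is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate per-group sample size for a two-sided, two-group comparison.

    Uses the normal approximation; d is Cohen's d.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the alpha level
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


for target in (0.80, 0.95):
    print(target, n_per_group(0.30, power=target))
```

Raising the power target or shrinking the assumed effect size both increase the required sample size. Exact calculations based on the t distribution, such as those in dedicated power software, typically come out slightly larger than this approximation.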
Now that you have done this, what is next?
How do we conduct a Sample Size Calculation?
Once you know your power level, effect size, alpha level, and statistical test for the hypothesis, you may use a free program known as G*Power (Faul, Erdfelder, Lang, & Buchner, 2007). Faul et al. (2007) developed this program at the University of Düsseldorf and have made it available to the public for free. G*Power is able to compute power analyses for many different hypothesis tests, such as t tests, F tests, χ2 tests, z tests, and some exact tests. G*Power can also be used to compute effect sizes and to display the results of power analyses graphically. The program may be downloaded for free, with the permission of the developers, at http://www.gpower.hhu.de/en.html.
Let’s look at an example of how to do this.
So, we decide that we are willing to accept a power of .80, and we find from a published meta-analysis that the effect size of the phenomenon we are studying is f = .30. (G*Power's one-way ANOVA procedure expects the effect size as Cohen's f, so an effect size reported as Cohen's d must be converted to f first.) By Cohen's conventions for f (.10 small, .25 medium, .40 large), the effect size in this case is medium. We want to do a one-way ANOVA with three groups (Treatment 1, Treatment 2, and Placebo) on the dependent variable of depression level to test our hypothesis regarding differential effects among two treatments and a placebo.
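Because the ANOVA procedure works with Cohen's f rather than d, it can help to see how f is defined. The sketch below assumes equal group sizes and a common within-group standard deviation; the function name cohens_f and the group means are hypothetical values chosen only for illustration.

```python
from math import sqrt


def cohens_f(group_means, within_sd):
    """Cohen's f for equal-sized groups.

    f is the standard deviation of the group means divided by the
    common within-group standard deviation.
    """
    k = len(group_means)
    grand_mean = sum(group_means) / k
    sd_of_means = sqrt(sum((m - grand_mean) ** 2 for m in group_means) / k)
    return sd_of_means / within_sd


# Hypothetical mean depression scores for two treatments and a placebo
f_three_groups = cohens_f([14.0, 16.0, 18.0], within_sd=5.0)

# With only two groups, f is half of Cohen's d: here d = 0.30 gives f = 0.15
f_two_groups = cohens_f([0.0, 0.30], within_sd=1.0)

print(round(f_three_groups, 3), round(f_two_groups, 3))
```

The two-group case shows why d and f should not be entered interchangeably: the same pair of means yields an f that is only half as large as d.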
Now, open G*Power and choose F tests, then choose ANOVA: Fixed effects, omnibus, one-way. Set alpha to .05, power to .80, effect size f to .30, and the number of groups to 3. G*Power does the calculation and produces the output shown below. We find that we need a total sample size of 111 (37 per group) to have enough power (.80) to detect an effect size of f = .30. Please see Table 1 and Figure 1.
F tests - ANOVA: Fixed effects, omnibus, one-way
Analysis: A priori: Compute required sample size
Input:
    Effect size f = 0.30
    α err prob = 0.05
    Power (1-β err prob) = 0.80
    Number of groups = 3
Output:
    Noncentrality parameter λ = 9.9900000
    Critical F = 3.0803869
    Numerator df = 2
    Denominator df = 108
    Total sample size = 111
    Actual power = 0.8034951
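The figures in this output can be sanity-checked without G*Power. The sketch below recomputes the noncentrality parameter (f squared times the total sample size) and the degrees of freedom, then estimates power by simulating many three-group experiments and counting how often the F statistic exceeds the critical F reported above. The simulation design (normal data, within-group SD of 1, group means of -a, 0, a, 4000 replications, the seed) is an assumption made for illustration and is not how G*Power computes power.

```python
import random
from math import sqrt


def one_way_f(groups):
    """F statistic for a one-way fixed-effects ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


EFFECT_F, N_TOTAL, K = 0.30, 111, 3
CRITICAL_F = 3.0803869  # taken from the G*Power output

noncentrality = EFFECT_F ** 2 * N_TOTAL  # lambda = f^2 * N
df_num, df_den = K - 1, N_TOTAL - K

# Means (-a, 0, a) with within-group SD 1 give f = a * sqrt(2/3)
a = EFFECT_F * sqrt(3 / 2)
true_means = (-a, 0.0, a)

rng = random.Random(2024)  # fixed seed so the run is repeatable
n_per_group = N_TOTAL // K
n_sims = 4000
rejections = 0
for _ in range(n_sims):
    groups = [[rng.gauss(mu, 1.0) for _ in range(n_per_group)] for mu in true_means]
    if one_way_f(groups) > CRITICAL_F:
        rejections += 1

simulated_power = rejections / n_sims
print(noncentrality, df_num, df_den, simulated_power)
```

The simulated power should land close to the "Actual power" line, within simulation error. For an exact figure, the noncentral F distribution (available, for example, as scipy.stats.ncf) can be evaluated at the critical F with these degrees of freedom and noncentrality parameter.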
Was that so difficult?
Once you understand what is involved and where to find those values and procedures, the whole idea of sample size estimation turns out not to be so intimidating. See the G*Power developers' notes on use and download below. Good luck in your research endeavors.
The developers provide a Short Tutorial of G*Power (PDF), written for G*Power 2 but still useful as an introduction. For more help, see the papers about G*Power listed below.
If you use G*Power for your research, the developers ask that you include one or both of the following references (depending on what is appropriate) in the papers in which you publish your results:
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175-191.
Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41, 1149-1160.
G*Power can be downloaded for Windows XP, Vista, 7, and 8 (32 and 64 bit) (about 20 MB). Please make sure to choose “unpack with folders” in your unzip tool.