Cohen’s d: Definition, Calculation, and Interpretation of Effect Size

In statistics, psychology, the social sciences, and many other research fields, understanding the magnitude of differences between groups is essential. P-values and significance tests indicate whether an observed difference is unlikely under the null hypothesis, but they do not convey how large or meaningful that difference is. This is where Cohen’s d comes in: a standardized measure of effect size that quantifies the difference between two means in standard deviation units. This blog post covers Cohen’s d in detail, including its definition, calculation, interpretation, applications, and limitations, followed by a conclusion and a Q&A section to clarify common questions.

What is Cohen’s d?

Cohen’s d is a statistical measure used to express the size of the difference between two group means relative to the variability observed in the data. It is a standardized effect size, meaning it is unit-free and allows comparison across different studies or variables measured on different scales.

The concept was introduced by Jacob Cohen to supplement significance testing by providing a measure of practical significance — how large an effect is, not just whether it exists.

Cohen’s d is commonly used in:

Comparing experimental and control groups in psychology and medicine.
Reporting alongside t-tests and ANOVA results.
Meta-analyses to synthesize findings across studies.

How is Cohen’s d calculated?

The formula for Cohen’s d is:

$d = \frac{M_1 - M_2}{s_p}$

Where:

$M_1$ = mean of group 1
$M_2$ = mean of group 2
$s_p$ = pooled standard deviation of both groups

The pooled standard deviation is the square root of a weighted average of the two group variances, with each variance weighted by its degrees of freedom:

$s_p = \sqrt{ \frac{ (n_1 - 1) s_1^2 + (n_2 - 1) s_2^2 }{ n_1 + n_2 - 2 } }$

$s_1, s_2$ = standard deviations of groups 1 and 2
$n_1, n_2$ = sample sizes of groups 1 and 2
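
As a quick check on hand calculations: when the two groups are the same size ($n_1 = n_2$), the weights cancel and the pooled standard deviation reduces to the root mean square of the two standard deviations:

$s_p = \sqrt{ \frac{ s_1^2 + s_2^2 }{ 2 } }$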

This formula assumes independent samples. For paired or repeated measures designs, a different approach using the standard deviation of difference scores is recommended.

Interpreting Cohen’s d

Cohen provided conventional benchmarks for interpreting the magnitude of d:

Cohen’s d	Effect Size Description
0.2	Small effect
0.5	Medium effect
0.8	Large effect

These thresholds are guidelines rather than strict rules. The meaning of small, medium, and large effects can vary by discipline and context. For example, in some fields, a d of 0.2 might be meaningful, while in others, a d of 0.5 could be considered trivial.
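
To make the benchmarks concrete, here is a minimal sketch that maps a value of d onto Cohen’s labels. The function name `describe_effect_size` is hypothetical, and two choices are assumptions of this sketch rather than part of Cohen’s guidelines: each benchmark is treated as the lower edge of a band, and anything below 0.2 is called "negligible".

```python
def describe_effect_size(d: float) -> str:
    """Map |d| onto Cohen's conventional labels.

    Assumptions of this sketch: each benchmark is the lower edge of a band,
    and values below 0.2 are labelled "negligible".
    """
    magnitude = abs(d)  # the sign only encodes direction, not size
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "medium"
    if magnitude >= 0.2:
        return "small"
    return "negligible"


for value in (0.1, 0.35, -0.6, 0.9):
    print(f"d = {value:+.2f} -> {describe_effect_size(value)}")
```

A helper like this is only a convenience for reporting; the label should still be read against what counts as a meaningful difference in your field.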

What does this mean practically?

A small effect (d = 0.2) means the group means differ by 0.2 standard deviations — a modest difference.
A medium effect (d = 0.5) indicates half a standard deviation difference.
A large effect (d = 0.8) means the means differ by 0.8 standard deviations, a substantial difference.

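Graphically, the larger Cohen’s d is, the less the distributions of the two groups overlap. Even at d = 0.8, though, two normal distributions with equal variances still share roughly two-thirds of their area, so a "large" effect does not mean the groups are cleanly separated.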
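
The overlap can be quantified with a short sketch. It assumes both groups are normally distributed with equal variances, in which case the shared area under the two curves (the overlapping coefficient) is 2Φ(−|d|/2), where Φ is the standard normal CDF. The function name `overlap_coefficient` is hypothetical.

```python
from statistics import NormalDist


def overlap_coefficient(d: float) -> float:
    """Shared area of two equal-variance normal curves whose means differ by d SDs."""
    return 2 * NormalDist().cdf(-abs(d) / 2)


for d in (0.2, 0.5, 0.8, 2.0):
    print(f"d = {d:.1f}: overlap = {overlap_coefficient(d):.0%}")
```

Under these assumptions the overlap is about 92% at d = 0.2, 80% at d = 0.5, and 69% at d = 0.8.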

Example Calculation of Cohen’s d

Imagine a study comparing test scores between two groups:

Group	Mean Score	Standard Deviation	Sample Size
Treatment	85	10	30
Control	75	12	30

Step 1: Calculate pooled standard deviation:

$s_p = \sqrt{ \frac{ (30 - 1)(10)^2 + (30 - 1)(12)^2 }{ 30 + 30 - 2 } } = \sqrt{ \frac{ 29 \times 100 + 29 \times 144 }{ 58 } } = \sqrt{ \frac{ 2900 + 4176 }{ 58 } } = \sqrt{122} \approx 11.05$

Step 2: Calculate Cohen’s d:

$d = \frac{85 - 75}{11.05} = \frac{10}{11.05} \approx 0.905$

Interpretation: The treatment group scores about 0.9 standard deviations higher than the control group, which is a large effect by Cohen’s benchmarks.
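
The same calculation can be reproduced from the summary statistics in the table. This is a minimal sketch using only the standard library; the function name `cohens_d_from_summary` is hypothetical.

```python
import math


def cohens_d_from_summary(m1, s1, n1, m2, s2, n2):
    """Return (d, pooled_sd) for two independent groups given mean, SD, and n."""
    # Weight each group's variance by its degrees of freedom
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    pooled_sd = math.sqrt(pooled_var)
    return (m1 - m2) / pooled_sd, pooled_sd


# Treatment: mean 85, SD 10, n 30; Control: mean 75, SD 12, n 30
d, pooled_sd = cohens_d_from_summary(85, 10, 30, 75, 12, 30)
print(f"pooled SD = {pooled_sd:.3f}, d = {d:.3f}")
```

Working from summary statistics like this is useful when only a published table is available and the raw scores are not.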

Why Use Cohen’s d?

Standardization: Cohen’s d expresses differences in standard deviation units, allowing comparison across studies with different measurement scales.
Complement to significance testing: While p-values indicate whether an effect is statistically distinguishable from zero, Cohen’s d quantifies how large that effect is.
Meta-analysis: Cohen’s d is widely used to aggregate effect sizes across studies.
Sample size planning: Knowing the expected effect size helps researchers calculate the sample size needed to detect an effect with adequate power.
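
For the sample size planning point, here is a rough sketch of how an expected d translates into a required group size for a two-sided, two-sample comparison. It uses the common normal approximation n ≈ 2((z₁₋α/₂ + z_power)/d)² per group, which slightly underestimates what an exact t-based power analysis would give. The function name `n_per_group` and its defaults (α = 0.05, power = 0.80) are assumptions of this sketch.

```python
import math
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group n for a two-sided, two-sample test (normal approximation)."""
    if d == 0:
        raise ValueError("d must be non-zero; no sample size can detect a zero effect")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / abs(d)) ** 2)


for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

The steep growth as d shrinks is the practical point: halving the expected effect size roughly quadruples the sample needed.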

Variations and Considerations

Choice of standard deviation: Sometimes, the standard deviation of the control group or the pretest is used instead of the pooled SD, depending on the design.
Paired samples: For repeated measures or paired designs, Cohen’s d can be calculated using the standard deviation of difference scores. The result depends on the correlation between the two measurements, so it is not directly comparable to a between-groups d without adjusting for that correlation.
Unequal sample sizes: The pooled SD formula accounts for different group sizes.
Interpretation context: Cohen himself cautioned that the 0.2/0.5/0.8 thresholds may not fit all fields. Researchers should consider domain-specific norms.
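
Two of the variations above can be sketched in a few lines. Dividing by the control group’s SD is conventionally called Glass’s delta, and dividing the mean difference score by the SD of the difference scores is often written d_z. The function names and the data below are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def glass_delta(treatment, control):
    """Mean difference standardized by the control group's sample SD."""
    return (mean(treatment) - mean(control)) / stdev(control)


def paired_d(before, after):
    """d_z: mean of the paired differences divided by their sample SD."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / stdev(diffs)


# Hypothetical independent groups
treatment = [85, 90, 78, 92, 88]
control = [75, 70, 80, 72, 68]
print(f"Glass's delta: {glass_delta(treatment, control):.2f}")

# Hypothetical pre/post scores for the same five people
before = [70, 68, 75, 80, 72]
after = [74, 71, 77, 85, 75]
print(f"Paired d_z: {paired_d(before, after):.2f}")
```

Because these variants use different denominators, a report should state which one was computed; their values are not interchangeable with the pooled-SD version.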

Applications of Cohen’s d

Field	Use Case Example
Psychology	Measuring treatment effects in clinical trials
Education	Comparing teaching methods on student performance
Medicine	Evaluating drug efficacy between treatment groups
Social Sciences	Assessing differences in attitudes or behaviors
Business	Comparing marketing strategies or interventions
Sports Science	Measuring performance differences between training methods

Limitations of Cohen’s d

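Sensitive to variability: Because d divides by the standard deviation, the same raw difference yields a smaller d when the data are more variable, which can make a practically meaningful difference look small.
Labels ignore direction: The small/medium/large labels describe only the absolute value of d. The direction of the effect is carried by the sign, which depends on which group is treated as group 1.
Assumes normality: The interpretation assumes roughly normal distributions, and pooling the standard deviation assumes the two groups have similar variances.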
Does not replace significance testing: It complements but does not substitute for hypothesis tests.

How to Calculate Cohen’s d in Statistical Software

Many software packages compute Cohen’s d automatically or via simple formulas:

SPSS: Available in t-test output or via syntax.
R: Packages like effsize provide functions such as cohen.d().
Python: Libraries like pingouin or manual calculation using NumPy and SciPy.

Example in Python:

```python
import numpy as np

# Example scores for two independent groups
group1 = np.array([85, 90, 78, 92, 88])
group2 = np.array([75, 70, 80, 72, 68])

mean1, mean2 = np.mean(group1), np.mean(group2)
# ddof=1 gives the sample (n - 1) standard deviation
std1, std2 = np.std(group1, ddof=1), np.std(group2, ddof=1)
n1, n2 = len(group1), len(group2)

# Pooled SD: square root of the df-weighted average of the two variances
pooled_std = np.sqrt(((n1 - 1)*std1**2 + (n2 - 1)*std2**2) / (n1 + n2 - 2))
cohen_d = (mean1 - mean2) / pooled_std

print(f"Cohen's d: {cohen_d:.2f}")
```

Conclusion

Cohen’s d is a fundamental statistic for quantifying the size of differences between two groups in a standardized way. By expressing differences in standard deviation units, it facilitates meaningful interpretation and comparison of effects across studies and disciplines. While it complements significance testing by focusing on effect magnitude rather than mere existence, it requires careful consideration of context, variability, and study design.

Understanding Cohen’s d empowers researchers, students, and practitioners to communicate results more effectively and make informed decisions about the practical significance of their findings.

Q&A: Common Questions About Cohen’s d

Q1: What does a Cohen’s d of 0 mean?
A Cohen’s d of 0 indicates no difference between the two group means.

Q2: Can Cohen’s d be negative?
Yes, the sign of Cohen’s d indicates the direction of the difference (which group has a higher mean). The magnitude reflects effect size.

Q3: Is Cohen’s d only for two groups?
Primarily, yes. Cohen’s d measures the effect size between two means. For more than two groups, eta-squared or partial eta-squared is typically used.

Q4: How is Cohen’s d different from p-values?
P-values indicate whether an effect is statistically significant, while Cohen’s d quantifies the size of the effect, regardless of significance.

Q5: When should I use Cohen’s d instead of other effect sizes?
Use Cohen’s d when comparing two means, especially in experimental or quasi-experimental designs. For relationships between variables, correlation coefficients like Pearson’s r are more appropriate.

Q6: Can we use Cohen’s d in meta-analysis?
Yes, Cohen’s d is widely used in meta-analyses to aggregate and compare effect sizes across studies.

By mastering Cohen’s d, you enhance your statistical literacy and ability to interpret research results with nuance and clarity. Whether you are conducting experiments, reading scientific literature, or synthesizing evidence, Cohen’s d is an indispensable tool in your analytical toolkit.

Data Science Blog