From Wikipedia,
the free encyclopedia
An effect size describes
how large the relationship is
between two variables. This
information is important in
scientific research. Often it is
useful to know not only whether an
experiment had an effect, but also
the size of any effects. Effect
sizes are also helpful in
practical situations, for the
purpose of making decisions.
For example, if aliens were to
land on earth, how long would it
take for them to realise that, on
average, males are taller than
females? The answer relates to the
effect size of the difference in
height between men and women. The
larger the effect size, the easier
it is to see that men are taller.
If the height difference were
small, then it would take quite a
while (and much sampling) to
notice that men were, on average,
taller than women.
The concept of an effect size
appears in everyday language. For
example, a weight loss program may
boast that it leads to an average
weight loss of 30 pounds. In this
case, 30 pounds is an indicator of
the claimed effect size. Another
example is that a tutoring program
may claim that it raises school
performance by one letter grade.
This grade increase is the claimed
effect size of the program.
In
inferential statistics, an
effect size is the size of a
statistically significant
difference. Effect sizes, along
with the sample size (N) and the
critical alpha level, determine
power in statistical hypothesis
testing.
In
meta-analysis, effect sizes
are used as a common measure which
can be calculated for different
studies and then combined into
overall analyses.
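The way effect size, N and alpha jointly determine power can be sketched numerically. The example below is a rough illustration only: it assumes a two-sided comparison of two equal-sized groups, uses a normal approximation rather than an exact t-based calculation, and the function name `approx_power` is hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison.

    Assumptions (not from the article): equal group sizes, a standardized
    mean difference d, and a normal approximation to the test statistic.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    shift = d * math.sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Power rises with a larger effect size, a larger N, or a more lenient alpha.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: power with 64 per group = {approx_power(d, 64):.2f}")
```

With a zero effect size the function returns alpha itself, which is the false-positive rate one would expect.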
Types of effect sizes
Pearson r correlation
Pearson's r
correlation is one of the most
widely used effect sizes. It can
be used when the data are
continuous or binary; thus
Pearson's r is arguably the
most versatile effect size. This
was the first important effect
size to be developed in
statistics, and it was introduced
by
Karl Pearson. Pearson's r
can vary in magnitude from -1.00
to 1.00, with -1.00 indicating a
perfect negative relationship,
1.00 indicating a perfect positive
relationship, and zero indicating
no relationship between two
variables.
Another often used measure of
the size of the relationship
between two variables is the
square of r, often referred
to as "r-squared" or the
coefficient of determination.
It is a measure of the proportion
of variance shared by the two
variables and varies from zero to
1.00.
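As a minimal sketch of how r and r-squared are obtained from paired observations, the following computes both from first principles. The data (hours studied and test scores for five students) are invented purely for illustration, and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data, made up for this example.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]

r = pearson_r(hours, scores)
r_squared = r ** 2  # proportion of variance shared by the two variables
print(f"r = {r:.3f}, r-squared = {r_squared:.3f}")
```

Because r-squared is a square, it discards the sign of the relationship: r = -0.5 and r = 0.5 both give r-squared = 0.25.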
Cohen's d
Another simple effect size is
Cohen's d, which is the
difference between two means
divided by the pooled
standard deviation for those
means. Thus,
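$$d = \frac{\mathrm{mean}_1 - \mathrm{mean}_2}{\sqrt{(\mathrm{SD}_1^2 + \mathrm{SD}_2^2)/2}}$$
where $\mathrm{mean}_i$ and
$\mathrm{SD}_i$ are the
mean and standard deviation for
group i, for i = 1, 2.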
Advice on how to interpret the
resulting effect size varies,
but the most widely accepted
guideline is that of Cohen (1992),
under which 0.2 indicates a small
effect, 0.5 a medium effect and
0.8 a large effect.
So, in the example of aliens
observing men's and women's heights,
the data (from a UK representative
sample of 1000 men and 1000 women)
could be:
- Men: Mean Height =
1754 mm; Standard Deviation =
70.00 mm
- Women: Mean Height =
1620 mm; Standard Deviation =
64.90 mm
The effect size (using Cohen's
d) would equal 1.99. This
is very large and aliens should
have no problem in detecting that
there is a substantial height
difference.
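The figure of 1.99 can be checked with a few lines of code. This sketch assumes the pooled standard deviation is taken as the root mean square of the two group standard deviations, which is appropriate here because the two groups are the same size; the function name `cohens_d` is hypothetical.

```python
import math


def cohens_d(mean1, sd1, mean2, sd2):
    """Cohen's d for two groups of equal size.

    Assumption: the pooled SD is the root mean square of the two SDs.
    """
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled_sd


# Height data from the example: men first, then women (all values in mm).
d = cohens_d(1754, 70.00, 1620, 64.90)
print(f"Cohen's d = {d:.2f}")  # prints "Cohen's d = 1.99"
```

The difference in means is 134 mm and the pooled standard deviation is about 67.5 mm, so the two group averages lie almost two standard deviations apart.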
One point worth noting, though,
is that in some cases it may be
wise to use a pooled standard
deviation while in other cases it
makes more sense to use just one
of the standard deviations (e.g.,
pre-treatment standard deviation
in a therapeutic trial). However,
perhaps the best method is to use
Hedges' ĝ, as below.
Freely available software (freeware)
will compute most effect size
statistics (e.g., The Effect Size
Generator, GPower).
Hedges' ĝ
Hedges and Olkin (1985) noted
that one could adjust effect size
estimates by taking into account
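the sample size. The problem with
Cohen's d is that the
outcome is heavily influenced by
the denominator in the equation.
If one standard deviation is
larger than the other, the
denominator is weighted in that
direction and the effect size is
more conservative. Arguably, it
makes more sense to give greater
weight to the standard deviation
from the larger sample. Hedges'
ĝ incorporates sample size
in two ways: its denominator pools
the standard deviations weighted
by their respective sample sizes,
and the overall effect size is
then adjusted by a correction
based on the total sample size.
The formula for Hedges' ĝ (as used
by software such as the Effect
Size Generator) is
$$\hat{g} = \frac{\mathrm{mean}_1 - \mathrm{mean}_2}{\sqrt{\dfrac{(N_1 - 1)\,\mathrm{SD}_1^2 + (N_2 - 1)\,\mathrm{SD}_2^2}{N_1 + N_2 - 2}}} \times \left(1 - \frac{3}{4(N_1 + N_2) - 9}\right)$$
where $N_i$ is the sample size of
group i, and $\mathrm{mean}_i$ and
$\mathrm{SD}_i$ are the mean and
standard deviation for group i,
for i = 1, 2.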
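The two sample-size adjustments described for Hedges' ĝ can be sketched as follows. This is an illustration under stated assumptions, not necessarily what any particular program implements: it uses the commonly cited form with a sample-size-weighted pooled standard deviation and the small-sample correction factor 1 - 3/(4(N1 + N2) - 9). The function name `hedges_g` is hypothetical.

```python
import math


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Hedges' g: standardized mean difference with a small-sample correction.

    Assumption: pooled variance weighted by (n - 1) for each group, followed
    by the approximate correction factor 1 - 3 / (4 * (n1 + n2) - 9).
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    uncorrected = (mean1 - mean2) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return uncorrected * correction


# Height example: 1000 men and 1000 women (all values in mm).
g_large = hedges_g(1754, 70.00, 1000, 1620, 64.90, 1000)

# Same means and SDs, but imagining only 10 people per group.
g_small = hedges_g(1754, 70.00, 10, 1620, 64.90, 10)

print(f"g with 1000 per group = {g_large:.3f}")
print(f"g with 10 per group   = {g_small:.3f}")
```

With 1000 people per group the correction is negligible and the result is close to the Cohen's d of 1.99; with only 10 per group the estimate is shrunk noticeably, reflecting the extra uncertainty of small samples.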
Odds ratio
The
odds ratio is another useful
effect size. It is appropriate
when both variables are binary.
For example, consider a study on
spelling. In a control group, two
students pass the class for every
one who fails, so the odds of
passing are two to one (or more
briefly 2/1 = 2). In the treatment
group, six students pass for every
one who fails, so the odds of
passing are six to one (or 6/1 =
6). The effect size can be
computed by noting that the odds
of passing in the treatment group
are three times higher than in the
control group (because 6 divided
by 2 is 3). Therefore, the odds
ratio is 3. However, it should be
noted that odds ratio statistics
are on a different scale to
Cohen's d. So, this '3' is not
comparable to a Cohen's d of '3'.
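The arithmetic of the spelling example can be written out in code. This is a minimal sketch; the function names `odds` and `odds_ratio` are hypothetical, and the counts are the per-failure ratios given in the example rather than actual class sizes.

```python
def odds(passes, fails):
    """Odds of passing: number who pass for every one who fails."""
    if fails == 0:
        raise ValueError("odds are undefined when nobody fails")
    return passes / fails


def odds_ratio(treat_pass, treat_fail, control_pass, control_fail):
    """Odds of passing in the treatment group relative to the control group."""
    return odds(treat_pass, treat_fail) / odds(control_pass, control_fail)


control_odds = odds(2, 1)    # two pass for every one who fails
treatment_odds = odds(6, 1)  # six pass for every one who fails
ratio = odds_ratio(6, 1, 2, 1)
print(f"control odds = {control_odds}, treatment odds = {treatment_odds}, "
      f"odds ratio = {ratio}")
```

An odds ratio of 1 would mean the treatment made no difference to the odds of passing; values above 1 favour the treatment group and values below 1 favour the control group.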
References
Lipsey, Mark W., & Wilson,
David B. (2001). Practical
meta-analysis. Sage: Thousand
Oaks, CA.