Clinical baseline:
information gathered at the
beginning of a study, against which
changes observed during the study are
measured. In a clinical trial, this
is a participant's health status
before he or she begins the
trial. Baseline
measurements are used as a
reference point to determine a
participant's response to the
experimental
treatment.
Baseline data should
adequately describe the population
in the trial.
This means including
demographic variables, known
factors that
influence the outcome
(including medications being taken
by participants), factors that are
likely to modify any benefit of
treatment, and those that may
predict adverse reactions. These
factors are called potential
"confounders", because, if they
are imbalanced between the
treatment groups at baseline, they
may result in an apparent
treatment effect when none exists,
or mask an effect that does exist.
Baseline data should also
include any factors (especially
known potential confounders) that
have been used as strata for
randomisation. Stratified
randomisation, described in detail
earlier [2], is used when a baseline
characteristic, such as tumour
stage, is known to affect outcome
risk; the characteristic is
therefore included in the
randomisation algorithm to
minimise imbalances between
treatment groups. This is
particularly useful in small
studies.
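As a rough illustration of the idea, the sketch below implements one common way of stratifying: permuted blocks drawn separately within each stratum. The function name `make_stratified_allocator`, the block size of 4, the seed and the "early"/"late" tumour-stage labels are all assumptions made for this example; the article does not prescribe a particular algorithm.

```python
import random
from collections import Counter, defaultdict


def make_stratified_allocator(arms=("A", "B"), block_size=4, seed=2003):
    """Return an allocate(stratum) function using permuted blocks per stratum.

    Hypothetical helper for illustration only; real trials use validated,
    concealed allocation systems.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    pending = defaultdict(list)  # stratum -> unused assignments in current block

    def allocate(stratum):
        # Start a freshly shuffled block when this stratum has none left.
        if not pending[stratum]:
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
            pending[stratum] = block
        return pending[stratum].pop()

    return allocate


allocate = make_stratified_allocator()
# Eight hypothetical participants, stratified by tumour stage.
stages = ["early", "late", "late", "early", "late", "late", "early", "early"]
assignments = [(stage, allocate(stage)) for stage in stages]
counts = Counter(assignments)  # (stage, arm) -> number of participants
print(counts)
```

Because each stratum fills complete blocks here, both arms end up with the same number of early-stage and late-stage participants, which is the balance the stratification is meant to protect.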
If the
study population contains
subgroups of particular interest,
the characteristics defining these
subgroups, and the numbers or
proportions in each subgroup, should
be stated. For example, in a
long-term trial of a new
medication for preventing heart
attack, diabetes mellitus would be
a potential confounder (as people
with diabetes have a much higher
risk of heart attack than similar
people without diabetes). Those
with diabetes in this study would
also be an interesting subgroup in
whom the effects of the
intervention might be different.
Similarly, concurrent therapy with
aspirin (which would substantially
reduce the risk of heart attack)
could confound the trial results
if there were an imbalance between
trial groups in the proportions of
patients taking aspirin; aspirin
therapy might also influence the
likelihood of adverse reactions to
study therapy. Baseline factors
can be determined from interviews,
physical examination, laboratory
measures or imaging studies.
Measurement
Baseline
data are measured as close as
possible to the time that
participants are randomly
allocated to study groups, and in
all cases, should be measured
before the allocated treatment
commences (information collected
after the commencement of trial
treatment may have been altered by
the treatment itself, and is
generally not regarded as baseline
data). Ideally, baseline data
should be collected on all
patients screened for eligibility,
as this would provide further
information about the
generalisability of the trial
population. However, this is not
always practicable or affordable,
so some variables (eg, tissue
biopsy, measurement of genetic
markers, expensive imaging tests)
are measured only in actual
participants randomly allocated to
a trial group.
For
factors that are not constant,
the conditions under which the
baseline data are collected should
be stated in the methods section
of the study report. For example,
it should be clear whether blood
pressure recordings were measured
sitting or supine, or after a
specified rest period; also
whether a single reading, the
average of several readings, the
highest of two, or the first of
two or more, was used.
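To see why the recording rule needs to be reported, the following sketch reduces the same set of readings to a baseline value under several of the rules mentioned above. The function `baseline_value`, the rule names and the three systolic readings are hypothetical and chosen only for illustration.

```python
from statistics import mean


def baseline_value(readings, rule):
    """Reduce repeated readings to a single baseline value under a named rule."""
    if not readings:
        raise ValueError("at least one reading is required")
    if rule == "first":
        return readings[0]
    if rule == "mean":
        return mean(readings)
    if rule == "highest_of_two":
        return max(readings[:2])
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical systolic readings (mmHg) for one participant at one visit.
readings = [146, 158, 140]
for rule in ("first", "mean", "highest_of_two"):
    print(rule, baseline_value(readings, rule))
```

The same participant has a baseline of 146, 148 or 158 mmHg depending on the rule, which is why the methods section should state which one was used.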
Baseline data as entry criteria
In some circumstances,
threshold levels of one or more
baseline variables will form part
of the entry criteria for the
study. In this case, if an extreme
value of a baseline factor, such
as high blood pressure, is
required to
qualify a person for entry
into a study, potential
participants whose value on the
day of screening is more extreme
(higher) than their usual level
will be more likely to qualify for
entry. On a second baseline reading,
the average blood pressure for
this group will be lower and will more
accurately reflect their usual
blood pressure; this is known as
regression towards the mean [3]. For
this reason, remeasuring factors
required for entry is desirable,
to establish a more realistic
group average value of the
characteristic at baseline.
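A small simulation can make regression towards the mean concrete. It is only a sketch under assumed numbers: usual systolic pressure is drawn from a normal distribution with mean 150 and standard deviation 10 mmHg, each reading adds independent day-to-day noise with standard deviation 8 mmHg, and the entry threshold of 160 mmHg is hypothetical.

```python
import random
from statistics import mean

rng = random.Random(42)
THRESHOLD = 160  # hypothetical entry criterion (mmHg)

screening, repeat = [], []
for _ in range(20000):
    usual = rng.gauss(150, 10)        # assumed usual blood pressure
    first = usual + rng.gauss(0, 8)   # reading on the day of screening
    second = usual + rng.gauss(0, 8)  # independent repeat reading
    if first >= THRESHOLD:            # only qualifiers enter the study
        screening.append(first)
        repeat.append(second)

mean_screening = mean(screening)
mean_repeat = mean(repeat)
print(f"qualified: {len(screening)}")
print(f"mean at screening: {mean_screening:.1f}")
print(f"mean on repeat:    {mean_repeat:.1f}")
```

Nothing has been done to these simulated participants between the two readings, yet the group mean on the repeat reading is lower, because people were selected partly on an unusually high screening value.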
Presentation
The baseline characteristics
are usually presented in the first
table in a report. Care should be
taken to include the necessary
descriptive information without
overwhelming readers with
unnecessary details. For example,
in the recent AFFIRM trial
comparing rate control with rhythm
control of atrial fibrillation,
the published first table has 16
baseline characteristics, each
with a mean and percentage value
for the overall group, and for
both treatment groups separately,
together with P values [4]. The
resulting table of 107 values and
four footnotes may make it
difficult for some readers to
extract the key information [5]. A
simpler presentation appears in
the FRISC II study of invasive
compared with non-invasive
treatments for unstable coronary
artery disease [6]. This presents
more baseline characteristics
(20), but by minimising detail
(omitting overall group and P
values), allows a more rapid
comparison of the characteristics
between groups.
Comparability between groups
If randomisation has been
performed correctly, the groups
should be similar in baseline
characteristics, except for the
play of chance. Stratification in
the randomisation process further
restricts the extent of chance
imbalances [2]. For continuous
variables (such as blood pressure,
age, cholesterol level), the
similarity of the treatment groups
should be assessed by comparing
relevant summary measures (mean
and standard deviation, or median
and range). For categorical
factors (such as sex, disease
stage), the numbers and
proportions in each category level
should be shown for each treatment
group. The more similar the
treatment groups, the more
credibly the trial results
reflect a true effect of
treatment, especially if
unadjusted analyses are presented.
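The summaries described above can be sketched in a few lines. The six participant records, the field names and the helper functions are invented for this example; a real baseline table would be produced from the trial database with validated software.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical participants; values are illustrative only.
participants = [
    {"group": "treatment", "age": 60, "sex": "F"},
    {"group": "treatment", "age": 64, "sex": "M"},
    {"group": "treatment", "age": 68, "sex": "M"},
    {"group": "control", "age": 58, "sex": "F"},
    {"group": "control", "age": 66, "sex": "F"},
    {"group": "control", "age": 62, "sex": "M"},
]


def by_group(rows):
    """Split rows into a dict keyed by treatment group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["group"], []).append(row)
    return groups


def summarise_continuous(rows, key):
    """Return {group: (mean, standard deviation)} for a continuous variable."""
    result = {}
    for group, members in by_group(rows).items():
        values = [m[key] for m in members]
        result[group] = (mean(values), stdev(values))
    return result


def summarise_categorical(rows, key):
    """Return {group: {level: (count, percentage)}} for a categorical factor."""
    result = {}
    for group, members in by_group(rows).items():
        counts = Counter(m[key] for m in members)
        result[group] = {
            level: (n, 100 * n / len(members)) for level, n in counts.items()
        }
    return result


age_summary = summarise_continuous(participants, "age")
sex_summary = summarise_categorical(participants, "sex")
print(age_summary)
print(sex_summary)
```

Note that the sketch reports descriptive summaries for each group side by side and deliberately computes no P values.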
Use of P values to assess randomisation
Use of statistical tests to
compare the balance and/or values
of baseline characteristics
between the study groups and the
presentation of P values are not
uncommon. However, many authors
assert that this is
inappropriate [3,5,8-10]. If
randomisation has been performed
correctly, chance is the only
explanation for any observed
difference between groups at the
outset of the study, in which case
statistical tests become
superfluous. Consequently,
significance tests on the baseline
data can be readily justified only
if the randomisation process is
suspected to have failed or to have
been flawed [8]. It is
worth remembering that, if 20
baseline characteristics are
presented from a trial using
simple randomisation, it is more
likely than not that at least one
characteristic will show a
significant imbalance between
groups at two-sided P < 0.05 by
chance alone (actual likelihood,
64%, if the characteristics are
independent of one another).
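The 64% figure can be checked with a one-line calculation, assuming the 20 comparisons are independent and each has a 5% chance of a false-positive result. The function name below is a label chosen for this sketch.

```python
def prob_at_least_one_significant(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests is significant
    at level alpha when no true difference exists."""
    return 1 - (1 - alpha) ** n_tests


p_twenty = prob_at_least_one_significant(20)
print(f"{p_twenty:.2%}")

# Smallest number of independent tests for which a chance finding
# becomes more likely than not.
n = 1
while prob_at_least_one_significant(n) <= 0.5:
    n += 1
print(n)
```

Under the same assumptions, a spurious "significant" imbalance is already more likely than not once 14 characteristics are tested. Baseline characteristics are often correlated in practice, so the true probability will differ somewhat from this idealised figure.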
In any case, providing P values
is not a substitute for carefully
describing, in the results
section, any imbalances between
study groups that may be
clinically important. For example,
in a trial of a thrombolytic drug,
a 1% baseline difference in
history of previous intracranial
haemorrhage may not be
statistically significant, but
could still affect haemorrhagic
stroke rates after treatment (an
outcome of the study), and hence
could be regarded as potentially
clinically significant. If there
are imbalances that are considered
important to the final study
results, they should be accounted
for by an adjusted analysis of the
data, not simply noted with a P
value in the first table [7].
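The article does not say which adjusted analysis to use. As one possible illustration, the sketch below compares a crude risk ratio with a Mantel-Haenszel risk ratio stratified by diabetes, the confounder from the earlier heart attack example. All counts are invented, and they are chosen so that the treatment has no effect within either stratum while the treatment group contains more people with diabetes.

```python
def risk_ratio(events_t, n_t, events_c, n_c):
    """Crude risk ratio: risk in the treatment group over risk in control."""
    return (events_t / n_t) / (events_c / n_c)


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel risk ratio pooled over strata of a baseline factor."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        total = s["n_t"] + s["n_c"]
        numerator += s["events_t"] * s["n_c"] / total
        denominator += s["events_c"] * s["n_t"] / total
    return numerator / denominator


# Hypothetical counts: risk is 20% with diabetes and 4% without,
# in both trial groups, but diabetes is imbalanced at baseline.
strata = [
    {"name": "diabetes", "events_t": 20, "n_t": 100, "events_c": 10, "n_c": 50},
    {"name": "no diabetes", "events_t": 4, "n_t": 100, "events_c": 6, "n_c": 150},
]

crude = risk_ratio(
    sum(s["events_t"] for s in strata),
    sum(s["n_t"] for s in strata),
    sum(s["events_c"] for s in strata),
    sum(s["n_c"] for s in strata),
)
adjusted = mantel_haenszel_rr(strata)
print(f"crude RR: {crude:.2f}, adjusted RR: {adjusted:.2f}")
```

The crude comparison suggests a 50% higher risk with treatment, which disappears once the imbalance in diabetes is accounted for. Regression-based adjustment is another common choice; this stratified estimate is simply the shortest to show.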
Other uses of baseline data
A longer-term benefit of
collecting comprehensive baseline
data is that, after outcome data
become available, it allows the
estimation of risk of the outcome
in the control group, related to
various baseline characteristics.
This effectively uses the control
group as an epidemiological cohort
study, providing contemporary
information about predictors of
disease outcomes.
In summary, careful planning
and collection of baseline data
enable the conduct of a
high-quality trial and allow
readers to assess clearly the
internal and external validity of
the study.
References
- Burgess DC, Gebski VJ, Keech
AC. Baseline data in clinical
trials. Med J Aust 2003; 179:
105-107.
- Moher D, Schulz K, Altman D.
The CONSORT statement: revised
recommendations for improving
the quality of reports of
parallel group randomised
trials. Lancet 2001; 357:
1191-1194.
- Beller EM, Gebski VJ, Keech
AC. Randomisation in clinical
trials. Med J Aust 2002; 177:
565-567.
- Friedman L, Furberg C,
DeMets D. Fundamentals of
clinical trials. 3rd ed. New
York: Springer, 1998.
- Wyse DG, Waldo AL, DiMarco
JP, et al for AFFIRM
Investigators. A comparison of
rate control and rhythm control
in patients with atrial
fibrillation. N Engl J Med 2002;
347: 1825-1833.
- Pocock SJ, Assmann SE, Enos
LE, Kasten LE. Subgroup
analysis, covariate adjustment
and baseline comparisons in
clinical trial reporting:
current practice and problems.
Stat Med 2002; 21: 2917-2930.
- Wallentin L, Swahn E, Kontny F,
et al for FRISC II
Investigators. Invasive compared
with non-invasive treatment in
unstable coronary-artery
disease: FRISC II prospective
randomised multicentre study.
Lancet 1999; 354: 708-715.
- Gebski V, Keech AC.
Statistical methods in clinical
trials. Med J Aust 2003; 178:
182-184.
- Matthews J. An introduction
to randomised controlled
clinical trials. London: Arnold,
2000.
- Altman DG, Doré CJ.
Randomisation and baseline
comparisons in clinical trials.
Lancet 1990; 335: 149-152.
- Senn S. Testing for baseline
balance in clinical trials. Stat
Med 1994; 13: 1715-1726.