Scientists use mathematical tools collectively known as statistics to rigorously analyze the information they collect and make sense of their findings. When statistical tools are applied to biological problems, the field is sometimes called biostatistics. Studies depend on statistics to distill unwieldy amounts of data into a manageable description and to test whether an intervention had a real effect on an outcome (for example, whether a drug actually improved a disease).
The term “significant” has a specific meaning in science publications. Scientists use the p value as a standard way of conveying how confident readers can be that a result is real. The take-away message for looking at statistics in a scientific publication: the lower the p value, the more confident researchers are that the differences they found are real, or as they say, “significant.”
The p value shows how likely it is that a difference at least as large as the one observed would occur by chance alone, if there were no real effect. Historically, a p value of .05 has been used as the threshold for significance, often described as giving “95 percent confidence” in a result. For example, if a publication states that 50 percent of patients’ cancers improved after taking a certain drug, with a p value of .05, this means that there is only a 5 percent probability that a difference that large would have appeared by chance if the drug actually had no effect.
Statisticians argue that .05 is an arbitrary cutoff, a holdover that seems archaic in an age when computers can easily crunch large numbers. But the fact remains that .05 is still a magic number that provides an easy gauge of whether a finding is significant or not. Findings with a p value of .05 are often marked by * in charts and graphs, while a p value of .01 (indicating 99 percent confidence) is marked by **.
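For readers comfortable with a little code, here is one way to see where a p value comes from: a simple permutation test in Python. All the patient counts below are made up for illustration; the idea is to ask how often pure chance would produce a difference between groups at least as large as the one observed.

```python
import random

random.seed(0)

# Hypothetical numbers for illustration: 30 of 50 treated patients
# improved, versus 20 of 50 untreated patients.
treated_improved, treated_n = 30, 50
control_improved, control_n = 20, 50
observed_diff = treated_improved / treated_n - control_improved / control_n

# Permutation test: if the drug had no effect, the 50 improvements
# would be scattered at random between the two groups. Shuffle the
# outcomes many times and see how often chance alone produces a
# difference as big as the one observed.
outcomes = [1] * (treated_improved + control_improved) + \
           [0] * (treated_n + control_n - treated_improved - control_improved)
trials = 10_000
extreme = 0
for _ in range(trials):
    random.shuffle(outcomes)
    diff = sum(outcomes[:treated_n]) / treated_n - \
           sum(outcomes[treated_n:]) / control_n
    if diff >= observed_diff:
        extreme += 1

p_value = extreme / trials  # share of chance shuffles at least this extreme
print(p_value)
```

The smaller this fraction, the harder it is to explain the observed difference as a fluke, which is exactly what a low p value conveys.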
One criticism of p values is that statistical significance says nothing about practical importance. A finding with a high degree of statistical significance could have very little use to patients, while clinically meaningful differences might be overlooked because they don’t meet the arbitrary cut-off. There is a movement among clinicians and statisticians to find better ways of presenting the practical importance of findings. But for now, significance tests provide a useful summary of the findings.
Some of the concerns over the practical value of statistical significance can be remedied using a 95 percent confidence interval. This is an indication of the precision of the scientist’s estimate about a process—such as whether a drug made an impact on a disease. For example, say a scientist reports that 50 percent of patients in her study had shrinkage of their tumors, with a 95 percent confidence interval of 40 percent to 60 percent. This indicates that she is 95 percent certain that the actual percentage of patients with shrinkage is somewhere between 40 percent and 60 percent. P values and confidence intervals are related, but the latter conveys more information about how precise the estimate of the effect is. Journals are beginning to require that researchers use confidence intervals.
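The tumor-shrinkage interval above can be reproduced with a few lines of Python. The study size is not given in the text, so a hypothetical 100 patients is assumed here; the calculation is the standard normal-approximation interval for a proportion.

```python
import math

# Hypothetical study size for illustration: 100 patients, 50 of whom
# had tumor shrinkage.
n = 100
shrunk = 50
p_hat = shrunk / n  # observed proportion: 0.5

# Normal-approximation 95 percent confidence interval:
# estimate plus or minus 1.96 standard errors.
se = math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"{low:.0%} to {high:.0%}")  # prints "40% to 60%"
```

Notice that a bigger study (a larger n) shrinks the standard error and therefore narrows the interval, which is what "a more precise estimate" means in practice.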
Another important concept in medical publications is risk. There are two main ways of presenting risk: absolute risk, which is the chance of developing a disease over a period of time, and relative risk, which compares the chance of developing a disease in two groups of people (such as those receiving a treatment and those who are not). Many reports state relative risk reduction rather than absolute risk reduction because it often looks like a more impressive finding. For example, a decrease in risk from 4 percent to 2 percent is a 50 percent relative risk reduction, but only a 2 percent absolute risk reduction. But consider this: A decrease in risk from 80 percent to 40 percent is also a 50 percent relative risk reduction, but with a much more impressive 40 percent absolute risk reduction.
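The two scenarios in that paragraph can be checked with simple arithmetic. The short Python sketch below uses exactly the risk figures from the text:

```python
def risk_reductions(control_risk, treated_risk):
    """Return (absolute, relative) risk reduction for a treatment."""
    absolute = control_risk - treated_risk        # plain difference in risk
    relative = absolute / control_risk            # difference as a share of baseline risk
    return absolute, relative

# Scenario 1 from the text: risk falls from 4 percent to 2 percent.
arr1, rrr1 = risk_reductions(0.04, 0.02)
# Scenario 2 from the text: risk falls from 80 percent to 40 percent.
arr2, rrr2 = risk_reductions(0.80, 0.40)

print(round(arr1, 2), round(rrr1, 2))  # 0.02 absolute, 0.5 relative
print(round(arr2, 2), round(rrr2, 2))  # 0.4 absolute, 0.5 relative
```

Both scenarios show the same 50 percent relative risk reduction, even though the absolute benefit differs twenty-fold, which is why reports that quote only relative risk can mislead.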
Risk is used in prospective studies—those that take a group of people and follow them over a period of time to see how they fare with or without an intervention. Another type of comparison is a retrospective study, in which researchers study patients who already have a condition to try to determine what might have caused them to develop the problem. In this type of study, odds ratios are used to compare the probability of an event for two groups.
An odds ratio of 1 means that the event is equally likely in both groups; an odds ratio greater than 1 means the event is more likely in the first group, and an odds ratio less than 1 means it is less likely in the first group. An odds ratio is calculated by dividing the odds of the event in the treated or exposed group by the odds in the control group. For example, if a treatment reduced the odds of a cancer recurring by 25 percent, the odds ratio for the treated group compared to the control group would be 0.75.
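An odds ratio of 0.75 can come from a table of counts like the hypothetical one below (the numbers are invented so the arithmetic matches the example; odds are the number of patients with the event divided by the number without it).

```python
# Hypothetical counts for illustration: cancer recurrence in two groups.
control_recur, control_ok = 25, 75   # odds of recurrence = 25/75, about 0.33
treated_recur, treated_ok = 20, 80   # odds of recurrence = 20/80 = 0.25

odds_control = control_recur / control_ok
odds_treated = treated_recur / treated_ok

# Odds ratio: odds in the treated group divided by odds in the controls.
odds_ratio = odds_treated / odds_control
print(round(odds_ratio, 2))  # prints 0.75
```

A value of 0.75 says the treated group's odds of recurrence are 25 percent lower than the control group's, matching the example in the text.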
One last statistical technique to be aware of is correlation, which is used to indicate whether and how strongly pairs of variables are related. Correlation does not show whether one thing caused the other. The changes could be entirely coincidental. For example, if someone found that being a high-school dropout correlated with a higher risk of lung cancer, that doesn’t mean that dropping out of high school causes lung cancer, or that finishing high school protects against it.
However deep you choose to go into learning about statistical methods, keep some basics in mind when assessing the work, including the sample size and who funded the research. Also, even as a non-scientist, you should be able to see the logic of the question being asked and whether the results support the conclusion.