Hill made it clear, however, that such a study had to comply with some pretty strict criteria in order to be considered valid. These criteria are worth restating, because they stand in sharp contrast to the bulk of epidemiological research:
- Strength
Is the association strong enough that we can rule out other factors?
- Consistency
Have the results been replicated by different researchers, and under different conditions?
- Specificity
Is the exposure associated with a very specific disease as opposed to a wide range of diseases?
- Temporality
Did the exposure precede the disease?
- Biological gradient
Are increasing exposures associated with increasing risk of disease?
- Plausibility
Is there a credible scientific mechanism that can explain the association?
- Coherence
Is the association consistent with the natural history of the disease?
- Experimental evidence
Does a physical intervention show results consistent with the association?
- Analogy
Is there a similar result to which we can draw a relationship?
Above all, as Brignell emphasises, correlation does not prove causation. He draws an analogy with growing tomatoes and fertiliser. It can easily be shown that increasing use of fertiliser will increase tomato yields. But fertiliser does not cause tomatoes; it merely promotes the process of growth. The same goes for smoking and lung cancer. Smoking may massively promote the growth of lung cancer, but from this we cannot conclude that smoking causes the tumours. We can only draw conclusions about causation if we actually investigate the mechanisms that lie behind the link between smoking and lung cancer. Hill and Doll had nothing to say about why cancer occurs; they just provided us with a very valuable lead in the investigation.
Nonetheless, it is an entirely reasonable conclusion to draw that smokers will, on average, die younger than non-smokers, and we do not need to know the precise mechanism to conclude that giving up smoking is prudent from a health viewpoint.
What is not reasonable is the response to this one, classic study. First, it has provided the justification for state intervention in lifestyle in an unprecedented way. Secondly, it has encouraged the proliferation of other studies, which make grand statements about disease based on correlations far weaker than those found by Hill and Doll.
Brignell’s book is a handy demolition of the science and statistics behind this epidemic of epidemiology. He shows how statistical tests were originally developed, based on certain assumptions. However, these assumptions have long since been forgotten, so that meeting certain abstract criteria has been elevated above whether the results are actually of any real-world importance.
The most important of these is the test for statistical significance. The idea behind this is that patterns can be found in any set of random results. For example, in the spiked office, there are a number of people who were born in May or June, but none were born in July. It would be possible to draw the conclusion that there is something special about being born in May or June that predisposes people to become journalists. This would be a bizarre conclusion to draw from just a handful of people. In fact, the spread of birthdays is completely coincidental.
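The point about coincidental patterns can be put into rough numbers. The sketch below is only an illustration: it assumes every month is equally likely as a birth month, and the office sizes are hypothetical, not the real headcount at spiked. It calculates how likely it is that at least one month of the year has no birthdays at all in a group of a given size.

```python
from math import comb


def chance_of_empty_month(n_people: int, months: int = 12) -> float:
    """Probability that at least one month has no birthdays at all.

    Assumes each person is equally likely to be born in any month, which is
    a simplification (months differ in length and in birth rates).
    Uses inclusion-exclusion over the number of months left empty.
    """
    if n_people < 1:
        raise ValueError("n_people must be at least 1")
    total = 0.0
    for k in range(1, months):
        sign = 1 if k % 2 else -1
        total += sign * comb(months, k) * ((months - k) / months) ** n_people
    return total


if __name__ == "__main__":
    for staff in (10, 20, 40):  # hypothetical office sizes
        print(staff, round(chance_of_empty_month(staff), 3))
```

For any group of fewer than 12 people the answer is necessarily one, and it falls only gradually as the group grows, so a month with no birthdays in a small office needs no special explanation.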
In research it is therefore useful to have a preliminary statistical test of results, to see how likely it is that they could be due to blind chance. The usual benchmark is that if the chances of a set of results being coincidental are less than five per cent, it is reasonable to go on to assess whether the results are actually meaningful.
There are two problems with this benchmark. First, just because a study passes this test does not mean its results aren’t a complete coincidence. In fact, by definition, about five per cent of studies of factors with no real effect would be expected to pass this test even though the results are meaningless.
Secondly, just because the results are statistically significant doesn’t mean they are practically significant. Brignell gives the example of a book called The Causes of Cancer, written by Richard Doll and Richard Peto. Illustrating Doll’s fall from previous high standards, the book describes some deaths of people in their 80s and 90s as ‘premature’.
These days, however, it seems that any result that passes this ‘p-test’ is increasingly regarded as significant. Five per cent sounds like a low risk of results being meaningless, until you realise that researchers often plough through many, many potential risk factors (what Brignell calls a ‘data dredge’), look for an apparently significant result, then speculate about some kind of mechanism to explain it, no matter how bizarre. So a test designed as an initial filter to weed out spurious results is used to give credence to them.
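The arithmetic of a data dredge can be sketched in a few lines. The example below is not taken from Brignell's book. It assumes the factors are tested independently at the five per cent level, and the simulated 'risk factors' are coin flips with no real effect by construction. The function names and group sizes are hypothetical choices for illustration.

```python
import random
from statistics import NormalDist

ALPHA = 0.05  # the conventional five per cent benchmark


def chance_of_false_alarm(n_factors: int, alpha: float = ALPHA) -> float:
    """Probability that at least one of n independent null factors passes."""
    return 1 - (1 - alpha) ** n_factors


def dredge(n_factors: int, n_subjects: int = 200, seed: int = 0) -> int:
    """Count how many purely random 'risk factors' look significant.

    Each factor is a fair coin flip per subject in two equally sized groups
    (cases and controls), compared with a two-proportion z-test.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided threshold
    hits = 0
    for _ in range(n_factors):
        cases = sum(rng.random() < 0.5 for _ in range(n_subjects))
        controls = sum(rng.random() < 0.5 for _ in range(n_subjects))
        pooled = (cases + controls) / (2 * n_subjects)
        se = (pooled * (1 - pooled) * 2 / n_subjects) ** 0.5
        diff = abs(cases - controls) / n_subjects
        if se > 0 and diff / se > z_crit:
            hits += 1
    return hits


if __name__ == "__main__":
    for n in (1, 20, 100):
        print(n, round(chance_of_false_alarm(n), 3))
    print("spurious hits among 1000 random factors:", dredge(1000))
```

With 20 candidate factors, the chance that at least one passes by coincidence is already about 64 per cent, which is why a single 'significant' result pulled from a long list deserves little weight on its own.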
Thus he provides a huge list of different factors that have, at one time or other, been accused of causing cancer: abortion, acetaldehyde, acrylamide, acrylonitrile, Agent Orange, alar, alcohol, air pollution, aldrin, aflatoxin, arsenic, asbestos, asphalt fumes, atrazine, AZT…and that’s just the letter ‘A’.
There are also a number of techniques in epidemiology for imposing assumptions on to data. The best known of these is trend fitting. No set of data will exactly fit a pattern, but often a clear trend can be found nonetheless. However, many studies appear to resort to drawing a line through an apparently unconnected series of measurements to demonstrate an underlying effect.
Epidemiology can be an effective tool when applied to the spread of infectious disease. Unfortunately, there really isn’t anything like enough infectious disease in the developed world to justify the existence of so many departments and researchers. In fact, the overwhelming cause of death in the developed world is old age - a factor that is, incredibly, frequently ignored by researchers. A person in their eighties is a thousand times more likely to develop cancer than someone in their thirties. This factor is so powerful that for most of the causes of disease studied, a very minor underestimation of the effect of age can wipe out any putative effect from the factor in hand.
Age is obvious - but many other confounding factors are not. Therefore, we return to Hill’s first criterion: to be sure there is actually something going on, the effect must be strong. Otherwise, any apparent effect may prove to be entirely illusory.
A topical example of this is passive smoking, and in particular what Brignell calls ‘the greatest scientific fraud ever’. In 1992, the US Environmental Protection Agency published a meta-study, bringing together many other studies on passive smoking. Unfortunately, the results were negative. It appeared that passive smoking was not a health risk at all. Mere facts could not be allowed to get in the way of a health scare, so some imagination was applied to the problem. One negative study was removed - but the meta-study still produced no statistically significant result.
So the goalposts were not so much moved as widened. The organisation found that there was a greater than five per cent chance that the results were coincidental, but less than 10 per cent - so they accepted them anyway. In other words, the EPA accepted a bigger risk that the effect they found was purely due to chance, quite at odds with standard practice.
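The effect of widening the goalposts can be illustrated with a short calculation. This is a sketch under stated assumptions, not a reconstruction of the EPA's analysis. The relative risk of 1.19 corresponds to the 19 per cent increase discussed in this article, but the standard error is a hypothetical value chosen only to show how the choice of confidence level can change the verdict.

```python
from math import exp, log
from statistics import NormalDist


def relative_risk_interval(rr: float, se_log_rr: float, confidence: float):
    """Two-sided confidence interval for a relative risk, on the log scale."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = log(rr)
    return exp(centre - z * se_log_rr), exp(centre + z * se_log_rr)


RR = 1.19         # a 19 per cent increase in risk
SE_LOG_RR = 0.10  # hypothetical standard error, for illustration only

if __name__ == "__main__":
    for level in (0.95, 0.90):
        low, high = relative_risk_interval(RR, SE_LOG_RR, level)
        verdict = "excludes" if low > 1 else "includes"
        print(f"{level:.0%} interval: {low:.2f} to {high:.2f} "
              f"({verdict} a relative risk of 1, meaning no effect)")
```

With these illustrative numbers, the 95 per cent interval dips below a relative risk of 1, so 'no effect' cannot be ruled out, while the narrower 90 per cent interval sits just above it. The data are the same in both cases; only the threshold has changed.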
The increased risk of lung cancer they found - 19 per cent - was frankly too small to have been reliably detected given the methods they used. There are lots of ways in which inaccuracy could have crept into this final result. For example, is it really possible to merge the results of many different studies, all with different methodologies and subjects, accurately? How could someone’s actual exposure to environmental smoke be measured over the course of years? Were all the people who said that they were non-smokers absolutely honest? As indicated above, were other possible contributory factors such as age, gender and income controlled for accurately?
We can be pretty confident about Hill and Doll’s conclusions about lung cancer because the effect they found is massive - an increased risk of 2,400 per cent. To suggest that such a small effect as 19 per cent could be accurately measured in this way is like trying to time a race with a sundial.
That has not prevented smoking being banned in public places on the grounds that thousands of people might die from inhaling second-hand smoke. The public health agenda is therefore driven and justified by research that is more often than not completely worthless.
It is undoubtedly the case that Hill and Doll’s study has caused people to give up smoking and extended many lives as a result. But it has also inspired a heap of unnecessary panics based on dodgy research, and public health campaigners only too willing to tell us how to live our lives.
The Epidemiologists: Have They Got Scares For You, by John Brignell, is published on 1 July 2004 but can be ordered now from the author via his website, Numberwatch.
(1) The mortality of doctors in relation to their smoking habits; a preliminary report, British Medical Journal, 26 June 1954; see also the fiftieth anniversary follow-up report, Mortality in relation to smoking: 50 years’ observations on male British doctors, British Medical Journal, 26 June 2004