Methods Available, But Rarely Used, to See Doping Prevalence; Current Estimates Alarming

A recent report by Olivier de Hon, Harm Kuipers, and Maarten van Bottenburg gathers four different kinds of data on elite sports and attempts to combine them into one single estimate of the prevalence of doping. One study they draw on is research by Dr. Jim Stray-Gundersen and his colleagues about the doping-marred 2001 World Ski Championships in Falun, Sweden. Stray-Gundersen found, based on analysis of blood parameters rather than analytical testing for banned substances, that doping was “prevalent and effective” in cross-country skiing. One of six Finnish athletes banned at those Championships, Virpi Kuitunen (pictured above), returned from her ban and went on to win five more World Championships gold medals, two Olympic bronze medals, two Tour de Skis, and an overall World Cup title.

“…An estimation of 14-39 percent of current adult elite athletes [have] intentionally used doping.”

This is the striking conclusion to the abstract – the summary paragraph that leads academic publications – of a scientific article published this month. If you’re anything like me, you might be surprised to see such a high value committed to print, yet also wearily resigned to the possibility that this might represent reality. Still, given the almost-complete absence of balanced estimates in the media, how did the authors arrive at these numbers? Are they hyperbolic, or should we consider them the best estimate available to us?

The article, published in the academic journal Sports Medicine, focuses on the uncertainty that exists concerning the prevalence of doping in elite sports. Whilst frequently a topic of heated discussion, both online and between friends in the bar, the question of how many athletes are doping is difficult for most of us to answer without relying on gut instincts. What makes this article particularly interesting is that the lead author, Olivier de Hon, works for the Dutch Anti-Doping Agency. It is clear that those working within anti-doping efforts share the frustration that the prevalence of doping in sport is unknown (or, at least, unpublished and unavailable to the wider public).

Olivier de Hon, the lead author of the study, is the head of Scientific Policy for Doping Autoriteit, the national anti-doping organization of the Netherlands. (Photo: dopingautoriteit.nl)

To be clear, prevalence is a group-level measure. It does not concern individuals, but rather the proportion of competing athletes that are resorting to doping methods. The authors consider ‘elite’ athletes to be those competing in international competitions, as well as the highest national championship in each sport, and exclude ‘masters athletes’.

The article addresses the various sources of evidence that we might make use of if we wanted to estimate the prevalence of doping amongst today’s elite athletes. The authors describe four broad categories of evidence that we might use to gather information, discussing the pros and cons of each in turn. As discussed below, they prefer objective measurements to anecdotes or extrapolations based on the isolated admissions of athletes. They get their number by a technique generally referred to as “data triangulation”: using multiple sampling strategies, assessing their relative strengths, weaknesses, and biases, and combining all that information into an estimate of reality.

Inferences from performances

Outstanding sporting performances are often accompanied by suspicions of doping. However, if we take such a thought process to its logical conclusion, then all competitions become meaningless, since every winner is automatically convicted of doping and the sport itself loses the very values that first attracted fans. Of course, many fans do become tired of the doping allegations that persistently accompany some sports and choose to turn their back on certain sports.

The article acknowledges that some people have made “(semi)scientific” analyses of performances, based on sections of races. This will be particularly familiar to keen followers of the Tour de France, as debate surrounding such analyses erupts online every July as the peloton passes through the mountains, following ascents that have been used on many occasions over the years, allowing time comparisons to be made. De Hon and his coauthors caution that however much attention such analyses gather, no work of this type has been published in peer-reviewed journals, where it would be subjected to the scrutiny of trained scientists prior to being accepted for print.

Nevertheless, there seems to be a general – if not absolute – agreement that performances in endurance sports have stalled or even fallen in recent years, following a peak in the 1990s and early 2000s. A similar stall followed the introduction of random drug tests in 1989, a phenomenon that has since been subjected to rigorous statistical review.

However, the present-day stagnation in times and performances does not necessarily indicate a decline in the number of athletes indulging in doping; it may simply indicate that more stringent anti-doping efforts have reduced the amount of doping any individual is prepared to undertake. Furthermore, other sports have seen the rate of improvement increase, most obviously men’s sprinting, where the world record has tumbled dramatically over the last decade. This has led some sports scientists to speculate, based on their statistical analyses, that we may be witnessing the emergence of a novel and highly effective doping procedure.

Inferences from published personal accounts

Published accounts of personal experiences via press interviews or autobiographies give personal insights into the world of elite sport. Particularly in cycling, autobiographical accounts of doping have been published in recent years. These accounts help to paint a picture of potential doping use but the authors of the review caution that these are best considered as the equivalent of case reports in the medical literature – whilst interesting and potentially indicative of areas deserving particular attention, they are ultimately subjective accounts based on individual experience. The authors caution that humans tend to legitimise their own behaviours based on a perception that others are doing the same, even if there is no proof that this is so. Therefore, in terms of the key question we wish to address – how prevalent is doping in elite sport? – personal accounts are of little use, despite the extensive attention they gain from the media.

Laboratory-based analyses

When WADA releases its annual number of positive analytical findings, it does not weed out those where the positive test was deemed irrelevant due to an approved Therapeutic Use Exemption – for instance, anyone who had been cleared to use an inhaler to treat asthma, like Marit Bjoergen of Norway. (photo: Fischer/Nordic Focus)

You might think that the results of doping tests would be the best source of information on the prevalence of doping use. Indeed, since 2003 the World Anti-Doping Agency (WADA) has published an annual overview of adverse results, including data from all Olympic and Paralympic sports. Between 1987 and 2013, the percentage of what they call ‘findings’ (adverse or atypical analytical results) for doping tests has fluctuated between 0.96 percent and 2.45 percent, and since 2005 has generally been around 2 percent.

However, the authors explain a number of problems with relying on a count of adverse findings to estimate the prevalence of doping. Firstly, all prohibited substances have a time window within which they can be detected (for some this is as short as a few hours). We know from confessions of convicted dopers that doping athletes are aware of this limitation, and this has led to countermeasures amongst dopers, such as ‘microdosing’, in which prohibited substances are used in doses too small to generate a positive result in doping tests. (There are plenty of other reasons for the low number of positive tests, including a lack of consistent compliance with the WADA Code by anti-doping agencies around the world.)

A second issue is that the official data summarises all adverse findings reported by WADA-accredited laboratories, even when an athlete may have had a valid medical reason for using the substance (as proven by a Therapeutic Use Exemption, or TUE). Thus, as well as underestimating doping due to difficulties of chemical detections, the overviews of laboratory results simultaneously contain an “inherent overestimation” of intentional doping.

Overall, the direct use of laboratory results is unlikely to paint an accurate picture of the prevalence of doping in elite sport, say the authors, though they note that if the annual summaries offered more detail and were made more accessible then these issues could be reduced considerably.

However, test results can be used to estimate the prevalence of doping via a more indirect route, based on the effects that many doping methods have on certain biological parameters that, because of their wider importance for human health, are very well studied. Because prevalence examines patterns across a field of competitors, rather than individuals, analyses of blood/urine samples can be used to estimate a likely prevalence of doping, even when no individual sample offers conclusive proof of doping.

By way of explanation, imagine a scenario in a hall, where people are meeting for speed-dating. After some time watching the crowd, we have the feeling that the men are unusually tall. If – hypothetically – women are more likely to be attracted to tall men, then we might suspect that some of the men have artificially increased their height using concealed wedges inside their shoes. Eager to test our suspicions, we quickly rush to the exit door and, whilst we’re too embarrassed to explicitly ask each man to take off his shoes, we take a quick measurement of each man’s height. Based on the height measurements alone, we cannot say whether an individual was unnaturally tall, because being 6’2″ tall is only a little less likely than being 6’1″ (if the wedges added 1″). But using all the measurements together, we can compare the distribution of heights in our sample with that of the wider population, and statistical tools allow us to estimate what proportion of men must be cheating.
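To make the speed-dating example concrete, here is a minimal Python sketch of one very simple way such an estimate could be made. It is not the method used in any of the studies discussed here. It assumes that every cheating man adds exactly the same wedge height and that we know the true population mean; the function name, the 70-inch mean and the toy measurements are all invented for illustration.

```python
from statistics import mean


def estimate_wedge_fraction(heights_in, population_mean_in, wedge_in=1.0):
    """Method-of-moments estimate of the share of men wearing wedges.

    If a fraction p of the men each add `wedge_in` inches, the sample mean
    rises by p * wedge_in above the population mean, so
    p = (sample mean - population mean) / wedge_in.
    """
    if wedge_in <= 0:
        raise ValueError("wedge_in must be positive")
    shift = mean(heights_in) - population_mean_in
    # Clamp to [0, 1]: sampling noise can push the raw ratio outside it.
    return min(1.0, max(0.0, shift / wedge_in))


# Toy data: eight measured heights in inches (hypothetical values).
# The 70-inch population mean is an assumption made for this example.
measured = [69.0, 70.0, 71.0, 72.0, 69.0, 70.0, 71.0, 72.0]
print(estimate_wedge_fraction(measured, population_mean_in=70.0))  # prints 0.5
```

Note that no individual height in the toy data looks suspicious; only the group average, half an inch above the assumed population mean, hints that roughly half the men may be standing on one-inch wedges. A real analysis would use the full shape of the distribution and report uncertainty, not a single number.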

Returning to the problem of doping in sport, anti-doping efforts generate data on each individual (haematocrit level, for example). Haematocrit level is the volume percentage of red blood cells in the blood, and is variable in the general human population, and thus athletes. Higher haematocrit levels represent a greater oxygen carrying capacity of the blood, so increasing this will improve aerobic output. However, this value is naturally fairly stable within individuals, despite marked differences between individuals. But blood doping methods allow athletes to boost their haematocrit values. Thus, if blood doping is occurring within a group of athletes, the distribution of haematocrit values will change, with an overabundance of high values and a scarcity of low values. As with the height example, if we know the expected distribution, we can estimate the proportion of individual athletes that are altering their blood values through doping, even if we cannot identify the particular individuals that are cheating.

Of particular interest to cross-country skiers will be a study of haematocrit values amongst participants at the famously doping-marred 2001 World Ski Championships. Of the tested skiers who finished in the top 50 of their events, 17 percent were found to have ‘highly abnormal’ blood profiles, and a further 19 percent were ‘abnormal’. Even more worrying, amongst athletes winning medals (i.e., placing in the top 3), 50 percent had blood profiles scored as ‘highly abnormal’, compared with only 3 percent amongst those placing between 41st and 50th. De Hon and his coauthors suggest that such a direct relationship between performance and blood profiles is not to be expected, because a whole variety of factors will influence results within such a closely matched group of competitors. As a result, they conclude that in cross-country skiing blood doping reached a level where it was ‘performance determining’, rather than merely performance enhancing.

Still, the study of skiers was published more than a decade ago, and more recently blood manipulation is likely to be more subtle. In accordance with this, amongst elite cyclists, the proportion of ‘extreme’ blood profiles fell from 11 percent to 2 percent between 2001 and 2009. However, if we search for abnormal values, we will still miss athletes who naturally have a relatively low haematocrit value but who, through illegal methods, elevate their blood profile substantially whilst keeping it within the ‘normal’ range. Such athletes are, unfortunately, the very people who gain the greatest performance benefit from blood doping when a cap (e.g., 50 percent haematocrit) is in place. A major advantage of the method used in the study of skiers is that because it considers the values from all tested athletes, it can detect the absence of low values, even if doping athletes maintain their blood profile within the ‘normal’ range.
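The idea of spotting a scarcity of low values can be sketched in a few lines of Python. This is an illustration only, not the statistical procedure used in the published studies. It rests on a strong, hypothetical assumption: that every doping athlete ends up above some low cutoff, while clean athletes fall below it at a known reference rate. The function name, the cutoff of 40 percent, the reference share of 25 percent and the sample values are all invented.

```python
def estimate_doping_from_low_values(values, low_cutoff, expected_low_share):
    """Estimate the doping share from the scarcity of low values.

    Assumes (hypothetically) that every doping athlete ends up above
    `low_cutoff`, while clean athletes fall below it at the reference
    rate `expected_low_share`. Then
        observed_low_share = (1 - p) * expected_low_share
    and solving for p gives the estimate returned here.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < expected_low_share <= 1:
        raise ValueError("expected_low_share must be in (0, 1]")
    observed_low_share = sum(v < low_cutoff for v in values) / len(values)
    estimate = 1 - observed_low_share / expected_low_share
    # Clamp to [0, 1]: small samples can give a raw value outside that range.
    return min(1.0, max(0.0, estimate))


# Toy haematocrit values in percent (hypothetical). Every value sits below
# a 50 percent cap, so none would be flagged individually.
sample = [39.0, 42.5, 44.0, 45.5, 46.0, 47.0, 48.0, 49.0]
print(estimate_doping_from_low_values(sample, low_cutoff=40.0,
                                      expected_low_share=0.25))  # prints 0.5
```

In this toy sample only one athlete in eight falls below the cutoff, where the assumed reference population would lead us to expect two. Under the stated assumption, that shortfall corresponds to about half the group having raised their values, even though every reading is within the ‘normal’ range.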

To date, this approach has only been used once, in a study of blood manipulations in elite athletes in track and field that examined more than 7000 blood samples from 2737 athletes over a ten-year timespan. It was estimated that 14 percent of athletes were blood doping, with nationality found to be a major influence on prevalence.

Other sports, including biathlon, possess the data to allow such analyses, and de Hon and his coauthors call for more studies, including work exploring the differences between countries, teams or performance levels. Given the availability of the data, if governing bodies are willing to make it available, and the clear public interest in the results of such analyses, it is difficult to explain why this has not already been done.

Biathlon has been using a variation on this method to target athletes for additional testing or even to determine which samples should be stored for later re-analysis. But they have kept the methods used to implement recent bans close to their chest. And it is not clear that other international federations are doing the same thing – although they probably should.

Questionnaires

On the face of it, relying on athlete questionnaires to gain an estimate of the level of cheating seems rather far-fetched. And, indeed, the authors note that whilst questionnaires have repeatedly been used on North American high school athletes or European students to estimate the prevalence of doping (15 individual studies are cited), self-response questionnaires are of questionable value when addressing controversial or illicit topics, where those questioned may feel a pressure to give socially accepted answers.

This obviously presents a big problem if we wish to get information from the athletes themselves about doping. Here, the article introduces us to the Randomised Response Technique, a survey method that allows the truth to be provided anonymously, even in response to direct questioning. Each respondent first rolls dice (a random number generator or other process that randomly generates a finite set of outcomes can alternatively be used) and, depending on the outcome, is obliged to answer ‘yes’, ‘no’ or with the truth. Crucially, the researcher does not know the outcome of the dice roll, so cannot know whether the athlete is giving a forced answer or an honest answer. However, if we know the probability of each option, we can later calculate how many of the ‘yes’ and ‘no’ responses were forced and how many were honest. As with the analysis of overall patterns in biological measurements (e.g., haematocrit), individual athletes can’t be picked out as dopers but we do gain an estimate of the likely prevalence of doping, and the anonymity will hopefully give the athletes the confidence to be sincere. The method sounds strange, but it has been used with great success for over 35 years in social sciences.
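The arithmetic behind the Randomised Response Technique is simple enough to sketch in Python. The dice rule below is an assumption chosen for illustration (a single die where a 1 forces ‘yes’, a 6 forces ‘no’, and 2 to 5 require the truth); the article does not say which rule any particular study used, and the function name and the survey numbers are hypothetical.

```python
def rrt_prevalence(yes_count, n_respondents, p_forced_yes, p_forced_no):
    """Forced-response estimate of the true share of 'yes' answers.

    The observed 'yes' rate is a mix of forced and honest answers:
        observed_yes = p_forced_yes + p_truth * prevalence
    where p_truth = 1 - p_forced_yes - p_forced_no.
    Solving for prevalence gives the value returned here.
    """
    if n_respondents <= 0:
        raise ValueError("n_respondents must be positive")
    p_truth = 1 - p_forced_yes - p_forced_no
    if p_truth <= 0:
        raise ValueError("some respondents must be asked for the truth")
    observed_yes = yes_count / n_respondents
    estimate = (observed_yes - p_forced_yes) / p_truth
    # Clamp to [0, 1]: with few respondents the raw value can stray outside.
    return min(1.0, max(0.0, estimate))


# Hypothetical survey: 300 athletes, 110 of whom answered 'yes'.
# Assumed die rule: 1 -> forced 'yes', 6 -> forced 'no', 2-5 -> truth.
estimate = rrt_prevalence(110, 300, p_forced_yes=1 / 6, p_forced_no=1 / 6)
print(round(estimate, 3))  # prints 0.3
```

In this made-up survey we expect about 50 of the 300 ‘yes’ answers to have been forced by the die, leaving roughly 60 honest ‘yes’ answers amongst the roughly 200 athletes who were asked for the truth, or about 30 percent. The researcher never learns which individual answers were honest, which is the whole point.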

To date, only one study has used the Randomised Response Technique to investigate doping amongst elite athletes. This study (available in full, open-access), published in 2007, asked contemporary German athletes in Olympic disciplines – including sports as varied as cycling, weightlifting, baseball, basketball, swimming and sailing – whether they had ever used banned substances or methods to enhance their performance. This study estimated 26-48 percent of respondents had used banned drugs or methods. Furthermore, when comparing athletes competing at national and international levels, those currently competing at international level were more likely to answer ‘yes’ when asked if they had ever used illegal doping methods.

Reaching conclusions

The article concludes by drawing together what little published evidence there is, and weighting it according to its objective reliability. The authors suggest that the best approach is likely to be a combination of questionnaires using the RRT and estimates based on physiological parameters, as both offer objective data. Unfortunately studies using either approach are extremely rare, and the authors call for more work to be conducted and made public. Forced to base their estimate on just two studies – an RRT questionnaire of German athletes and an analysis of blood profiles in track and field – the authors conclude that 14-39 percent of elite athletes are doping, a fairly staggering figure if one had based expectations on the failure rate of doping tests. But the authors are clear on this discrepancy: “current doping control test results show a distinct underestimation of true doping prevalence.”

Whilst governing bodies of sports may prefer to look away, sports fans with a critical eye will be all too aware of this issue: as sporting careers like Lance Armstrong’s demonstrate, it is possible for an athlete to be doping throughout their competitive career without failing a doping test. Intriguingly, a recent study analysed waste water from fitness centres and detected an array of steroids and stimulants, including testosterone, methyltestosterone and amphetamine, and de Hon and coauthors suggest that the application of such technology at major events, such as the Olympics, may in future provide a cheap and easy means of establishing whether doping is occurring amongst a group of athletes.

At present, then, even experts in the field cannot offer a precise value for the prevalence of doping in sport. The director general of WADA, David Howman, is quoted as stating that the true doping prevalence amongst elite athletes is likely “a double-digit figure”. So despite being arguably the best placed person in the world to proffer an opinion, Howman is able to offer only a very vague estimate – an indictment, perhaps, of the lack of openness on this topic. As the article’s final sentence states, “Tools to evaluate the prevalence of doping use in sports are readily available; they only need to be used more often.” So why are we only talking about it?

–Simon Evans is a British-Australian postdoctoral researcher at the Evolutionary Biology Center of Uppsala University in Sweden. An avid skier ever since arriving in Scandinavia, he is the owner of a five-hour Vasaloppet finishing time. His previous FasterSkier reportage, a first-person account of a trip to the Swedish early-season ski mecca of Bruksvallarna, can be found here.