The science community is now closing in on an example of scientific fraud at Duke University. The story sounds awfully familiar.
ANIL POTTI, Joseph Nevins and their colleagues at Duke University in Durham, North Carolina, garnered widespread attention in 2006. They reported in the New England Journal of Medicine that they could predict the course of a patient’s lung cancer using devices called expression arrays, which log the activity patterns of thousands of genes in a sample of tissue as a colourful picture. A few months later, they wrote in Nature Medicine that they had developed a similar technique which used gene expression in laboratory cultures of cancer cells, known as cell lines, to predict which chemotherapy would be most effective for an individual patient suffering from lung, breast or ovarian cancer.
At the time, this work looked like a tremendous advance for personalised medicine—the idea that understanding the molecular specifics of an individual’s illness will lead to a tailored treatment.
This would be an incredible step forward in chemotherapy. Sensitivity to anti-tumor drugs is the holy grail of chemotherapy.
Unbeknown to most people in the field, however, within a few weeks of the publication of the Nature Medicine paper a group of biostatisticians at the MD Anderson Cancer Centre in Houston, led by Keith Baggerly and Kevin Coombes, had begun to find serious flaws in the work.
Dr Baggerly and Dr Coombes had been trying to reproduce Dr Potti’s results at the request of clinical researchers at the Anderson centre who wished to use the new technique. When they first encountered problems, they followed normal procedures by asking Dr Potti, who had been in charge of the day-to-day research, and Dr Nevins, who was Dr Potti’s supervisor, for the raw data on which the published analysis was based—and also for further details about the team’s methods, so that they could try to replicate the original findings.
The raw data is always the place that any analysis of another’s work must begin.
Dr Potti and Dr Nevins answered the queries and publicly corrected several errors, but Dr Baggerly and Dr Coombes still found the methods’ predictions were little better than chance. Furthermore, the list of problems they uncovered continued to grow. For example, they saw that in one of their papers Dr Potti and his colleagues had mislabelled the cell lines they used to derive their chemotherapy prediction model, describing those that were sensitive as resistant, and vice versa. This meant that even if the predictive method the team at Duke were describing did work, which Dr Baggerly and Dr Coombes now seriously doubted, patients whose doctors relied on this paper would end up being given a drug they were less likely to benefit from instead of more likely.
In other words, the raw data was a mess. The results had to be random.