The Daily WTF is a site that collects programmers’ horror stories. I think the following horror story [it’s the second story on the page] provides a good example of why it’s important to double-check the code of scientific software.
Long ago, I worked as a programmer at a university’s hearing research lab. They were awarded a large government grant to study the effects of different kinds of noise on hearing. For the really loud and really faint noises, the researchers used animal subjects with ears that are similar to human ears. Specifically, chinchillas.
The chinchillas would be put into a special chamber for several hours at a time to have their hearing tested. Since the little rodents don’t respond so well to questions like “Which sound is louder?”, a good amount of time had to be spent training them to jump over a little bar in their chamber whenever they heard a beep.
Because a large part of the research project was to study the long-term effects of noise on hearing, the tests would have to be run twenty-four hours a day, seven days a week, for several years. Obviously, it was pretty important that the chinchilla testing be automated. Not very important, though. If it had been very important, they would have had someone other than a grad student write it.
I joined the team about a year into the project and was tasked with rewriting the beep-jump-reward program. It was a ridiculous mess of spaghetti code that seemed to have more GOTO statements than actual code. There were no comments anywhere nor any documentation on what the program’s algorithm was for controlling the beeps and rewards.
After a little while, I was able to figure out the algorithm and rewrite the application. A month or two later, the rewrite was put into production. I documented my work, said my goodbyes, and moved on to my next contract.
A year or so later, the researchers compiled the data and noticed some very surprising results: the chinchillas were a lot more hearing-impaired than they should have been. While this may not seem too big a deal, the findings would have some serious ramifications. Occupational noise-exposure laws would be changed, lawsuits would be filed, and billions would be spent correcting the issue.
Before publishing the results, another team of researchers went over the data and study with a fine-toothed comb to ensure that the results were correct. And whammo, they found a bug in my code. Under certain conditions, one part of the application did not correctly check that the chinchilla jumped at the right time. This meant that the program would deny the chinchilla a food pellet, giving it negative feedback when it had in fact done the right thing. This led to some rather confused chinchillas that had no idea when they were actually supposed to jump.
In the end, over a year’s worth of data was thrown out, a few man-years of work was wasted, and there were a whole lot of cute little rodents that were rather confused and hard-of-hearing. I still feel bad for deafening those poor chinchillas…
This story highlights the secondary importance that many scientists still accord to software in their research. In this case, the writing of a critical piece of software was assigned to a naive grad student who worked without oversight. The researchers apparently did not stop to think that one minor error in that software would invalidate the results of the entire experiment.
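To make "one minor error" concrete, here is a purely hypothetical sketch in Python of how a reward check could fail only under certain conditions. The story does not say what language the program was written in, how it measured time, or what the actual bug was. Every name and number below (the two-second response window, the seconds-since-midnight clock, both function names) is an assumption made up for illustration.

```python
# Hypothetical illustration only: none of these names, units, or rules
# come from the original program described in the story.

SECONDS_PER_DAY = 86_400
RESPONSE_WINDOW_S = 2  # assumed: a jump counts if it follows the beep within 2 s


def should_reward_buggy(beep_s: int, jump_s: int) -> bool:
    """Decide on a pellet, with times given as seconds since midnight.

    Works for almost every trial, but fails when the beep falls just
    before midnight and the jump just after: the difference goes negative.
    """
    return 0 <= jump_s - beep_s <= RESPONSE_WINDOW_S


def should_reward(beep_s: int, jump_s: int) -> bool:
    """Same decision, but the delay is computed modulo one day so a trial
    that spans midnight is handled like any other."""
    delay = (jump_s - beep_s) % SECONDS_PER_DAY
    return delay <= RESPONSE_WINDOW_S
```

In a system that runs around the clock, a boundary case like a beep at 86399 and a jump at 0 is exactly the kind of input an independent reviewer would try. The two functions agree on ordinary trials and disagree only there, which is why a bug of this shape can sit unnoticed in production for a year.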
Note that the error was only caught by a second, completely independent team of researchers who went over the first team’s work “with a fine-toothed comb”. Note also that they didn’t just review the data the software generated but the actual software itself. That was never done with the CRU software until a whistleblower exposed it to the world.
This is how software used in finance, the military and (hopefully) most regulatory agencies is double- and triple-checked. Contrast this with the superficial and amateurish way that the CRU (and presumably all other) climatologists created, maintained and tested their software.
The extraordinary thing about the entire political-scientific “climate change” complex is the vast disconnect between the unprecedented magnitude of the public policy we will base on this data and the quality of the software code that generates the data. It is like finding out that the software used to decide whether or not to launch nuclear weapons was written by an overly bright teenager in his high school computer lab.
Given that hundreds of millions of lives depend on us getting the science on “climate change” just right, we need to hold the software that climatologists use to the same standards we demand for financial, medical and military software.
If we don’t get it right, deaf chinchillas will be the least of our worries.