More on Crappy Scientific Software

Via Slashdot comes this article in the Guardian that reinforces the points I made in my previous post: No One Peer-Reviews Scientific Software, Scientists are Not Software Engineers and Scientific Peer-Review is a Lightweight Process.

The article makes the same points: (1) there is little to no professional quality control in the creation and maintenance of scientific software, and (2) scientific software should be as open and scrutinized as scientific hardware.

This observation is especially important:

Computer code is also at the heart of a scientific issue. One of the key features of science is deniability: if you erect a theory and someone produces evidence that it is wrong, then it falls. This is how science works: by openness, by publishing minute details of an experiment, some mathematical equations or a simulation; by doing this you embrace deniability. This does not seem to have happened in climate research. Many researchers have refused to release their computer programs — even though they are still in existence and not subject to commercial agreements.

(Note: In this context, “deniability” means that the hypothesis or theory must be constructed so it can be proven wrong, i.e., that you can deny the truth of it.)

Scientific hypotheses differ from hypotheses in other fields specifically because they can be conclusively proven wrong by experiment.

A scientific hypothesis becomes a theory only after the experiments that could prove it wrong have been attempted repeatedly and have failed to do so. Key to that repeatability is that all scientists understand the minute details of each attempt so that they can reproduce it exactly.

Keeping scientific software secret destroys reproducibility. If you have two or more programs whose internals are unknown, how do you know why they agree or disagree in their final outputs? Perhaps they disagree because one made an error the other did not, or perhaps they agree because both make the same error. You can never know whether you have actually reproduced someone else’s work unless you know exactly how they got the answer they did.
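
To see why the internals matter, here is a minimal sketch in Python (a toy of my own construction, not code from any actual climate program): two routines that both claim to compute the variance of the same data set. Treated as black boxes they simply disagree; only the source code reveals that one of them uses a formula that loses precision.

```python
data = [1e8 + x for x in (4.0, 7.0, 13.0, 16.0)]  # large offset stresses floating point
n = len(data)
mean = sum(data) / n  # exactly 100000010.0

# "Program A": the textbook sum-of-squares formula. It subtracts two
# nearly equal 17-digit numbers, so most of the significant digits cancel.
var_a = (sum(x * x for x in data) - n * mean * mean) / n

# "Program B": the two-pass formula, which is numerically stable.
var_b = sum((x - mean) ** 2 for x in data) / n

print(var_a, var_b)  # 22.0 vs. 22.5 under IEEE-754 doubles; 22.5 is correct
```

An outside reviewer holding only the two outputs could not even say which program, if either, is correct.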

There is no compelling reason to keep scientific software secret. In the case of science on which we base public policy, and on whose outcomes the lives of millions may depend, such secrecy could be lethal.

13 thoughts on “More on Crappy Scientific Software”

  1. So why don’t these yahoos use products like SAS and P-STAT (no doubt there are others)? What brought the buffoons down was statistical analysis, and at least some of their defense is that they are scientists, not statisticians. Fine, use professional statistical-analysis software.

  2. When Judah Folkman first proposed the principle of angiogenesis as a target for cancer treatment, he was ridiculed. One reason was that he was a surgeon meddling in molecular biologists’ field, but more importantly, his early studies could not be duplicated by other labs. Eventually, he loaned lab staff to help other labs get started. Here is an interview. There is even a book about his struggle to gain acceptance.

    Not only did he provide all his data and methods, he provided lab assistants to get others started. Some of those people had been ridiculing him because they couldn’t duplicate his results and they accused him of publishing false conclusions. It was a huge struggle.

    A bit of a contrast with the climate folks.

  3. The Theory that Jack Built

    This is the theory that Jack built.

    This is the flaw
    that lay in the theory that Jack built.

    This is the mummery
    hiding the flaw
    that lay in the theory that Jack built.

    This is the summary
    based on the mummery
    hiding the flaw
    that lay in the theory that Jack built.
    ….
    –from The Space Child’s Mother Goose, by Frederick Winsor and Marian Parry

  4. One of the threats of Obamacare is “consensus guidelines.” I have no problem with the evidence based guidelines and I know no doctors who resist learning when it is presented in a manner that is neutral to financial considerations. I attended an annual program on congestive heart failure last weekend that USC puts on every January. I would say there were a thousand physicians paying $125 for the day. Many of the big national meetings are dying out because doctors don’t have the money they had 25 years ago to attend meetings. I go to one about every three months. They do a good job and all the evidence is laid out. Consensus guidelines have little data behind them and tend to be the opinion of older professors of various fields who spend a weekend noodling over what the best way to treat disease X is. They are also suspiciously interested in cost. Now, I am interested in cost, probably more than most of my colleagues, but there is a place for one and a place for the other.

  5. The worst, most amateurish code I have ever seen is that produced by scientists: control flow like a bowl of spaghetti, global variables everywhere. A nightmare to understand, as that poor programmer at the Hadley CRU noted in his in-line comments – so, perfect ground for hiding little ‘adjustments’ and ‘tweaks’.

  6. Michael Kennedy – I wonder how many areas of medical “consensus” would be changed if completely opened up to the public?
    Oh, not all that much. Just say the name ‘Marshall Protocol’ here and see what response you get to naming an independent medical research project. Nobody will bother to read the science; they’ll just call it quackery. And yet, it works well enough that those who’ve gone through it are willing to step out of the shadows and take the abuse when we claim it works.

  7. The bizarre part of all of this is that there are scientific journals that publish computer and statistical code… A normal scientist would publish his work there simply to get more publications out of the same research program. There is a reason they want to keep this secret, instead of adding to their publication count…

  8. @dearieme:

    Yes. In science, falsifiability means that others can demonstrate that you are in the wrong. In politics, deniability means that others cannot demonstrate you are in the wrong, even though you actually are.

  9. In addition to the code that performs the calculations, there is a known set of problems with computer math: a particular combination of computer language and hardware can make the math come out wrong. I have personally encountered such problems a number of times and have been dismayed at the lack of understanding about computer precision and accuracy. Ask someone who writes structural code or missile-control code. (A small demonstration appears after the comments.)

    This is compounded by human nature, which treats every number that comes out of a computer as the result of a correct calculation (i.e., gospel).

    Finally, I have personally witnessed scientists and engineers who are shown an obviously wrong model and do not say anything negative about it.

  10. NedLudd,

    The entire field of chaos mathematics can trace its origins back to a rounding error that caused a program to return significantly different results even though it appeared to be fed the same data each run. What kind of program was the computer running, you might ask?

    A weather model.

    Unfortunately, a lot of people can program a little, so they have a tendency to believe that they understand high-level programming. This tendency grows worse if they are in charge of creating the logical models that the programs implement. They forget that there is a great deal of minutiae in every program that the high-level view never sees. There are also peculiarities in languages, compilers, libraries, procedures, techniques and hardware. Just because you understand the big picture does not mean you understand the details of implementation, nor the possible errors that could result from those details. (A toy demonstration of that kind of sensitivity appears after the comments.)

    A physicist can explain all the forces acting on a bridge, and he might even be able to sketch out an abstract model of a bridge. However, you wouldn’t want to cross a bridge that had actually been welded together by a bunch of physicists. Just because they understand the bridge abstractly does not mean that they understand how to make a good weld.

    Yet in climate modeling we are in exactly this type of scenario. We’ve let the people who might have the best understanding of the abstract relationships in the climate also be the hands-on people who turn those abstractions into working computer code.

    As we would expect, the climate bridge is more than a little shaky.

  11. Ned/Shannon…a horrible example of this kind of problem occurred with a Patriot missile battery in the Gulf War. The battery’s radar identified a Scud missile and started to track it, but soon lost the track because of a *rounding error*. The Scud hit a barracks, and quite a few people were killed.

    The Patriot’s radar processing system maintained a “range gate,” i.e., the area in space within which it expected the next sighting of the missile to occur. The rounding problem led to an error in the range-gate calculation which accumulated very slowly over time: Patriots were normally rebooted before the problem made its appearance, but this one, operating under combat conditions, had run for a long continuous period. (The arithmetic is sketched after the comments.)

    The Israelis had run one of their Patriots for a long time, found the problem, and reported it to Raytheon. A patch had been created and was on its way to the Gulf when the above incident happened.
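
Here is the small demonstration promised in comment 9: a minimal Python sketch of the precision problems described there (the numbers are toy values of my own, not anything from real scientific code). The point is simply that binary floating point cannot represent 0.1 exactly, and that addition is not even associative, so results depend on implementation details the high-level math never mentions.

```python
# 0.1 has no exact binary representation, so repeated addition drifts.
total = 0.0
for _ in range(1_000_000):
    total += 0.1
print(total)             # 100000.00000133288, not 100000.0
print(total == 100_000)  # False

# Floating-point addition is not associative either, so the *order*
# of operations (a compiler or language detail) changes the answer.
a, b, c = 1e16, -1e16, 1.0
print((a + b) + c)       # 1.0
print(a + (b + c))       # 0.0 -- c vanishes into rounding
```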
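
And the toy demonstration promised in comment 10: the logistic map in its chaotic regime, a standard textbook toy model (my choice of example, not any actual weather code). Two runs that differ by a rounding-sized nudge in the starting value decorrelate completely within about forty steps, which is the same kind of sensitivity behind the weather-model story.

```python
# Logistic map, r = 4 (chaotic regime). A perturbation of 1e-12 roughly
# doubles every step, so it reaches order 1 after ~40 iterations.
r = 4.0
x, y = 0.2, 0.2 + 1e-12  # identical except for a rounding-sized nudge

for step in range(1, 51):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    if step % 10 == 0:
        print(f"step {step:2d}: x={x:.6f}  y={y:.6f}  |x-y|={abs(x - y):.1e}")
```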
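
Finally, the arithmetic behind the Patriot story in comment 11, using the figures commonly cited from the GAO's post-war report (the sketch is mine, not Raytheon's code). The system clock counted tenths of a second, but 0.1 chopped to the register's 23 fractional bits is about 9.5e-8 short per tick, and that shortfall accumulates with uptime.

```python
import math
from fractions import Fraction

# 1/10 chopped (truncated) to 23 fractional bits, as in the Patriot's
# 24-bit fixed-point clock register per the published analyses.
tenth_chopped = Fraction(math.floor(Fraction(1, 10) * 2**23), 2**23)
err_per_tick = float(Fraction(1, 10) - tenth_chopped)  # ~9.5e-8 s short per tick

ticks = 100 * 3600 * 10             # clock ticks in 100 hours of uptime
time_error = ticks * err_per_tick   # accumulated clock error in seconds
scud_speed = 1676.0                 # approximate Scud closing speed, m/s

print(f"clock error after 100 hours: {time_error:.4f} s")    # ~0.3433 s
print(f"range-gate shift: {time_error * scud_speed:.0f} m")  # ~575 m
```

A third of a second sounds trivial until you multiply it by the speed of an incoming missile.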

Comments are closed.