More on Crappy Scientific Software

Via Slashdot comes this article in the Guardian that reinforces the points I made in my previous posts: No One Peer-Reviews Scientific Software, Scientists are Not Software Engineers and Scientific Peer-Review is a Lightweight Process.

The article makes the same points: (1) there is little to no professional quality control in the creation and maintenance of scientific software, and (2) scientific software should be as open and scrutinized as scientific hardware.

This observation is especially important:

Computer code is also at the heart of a scientific issue. One of the key features of science is deniability: if you erect a theory and someone produces evidence that it is wrong, then it falls. This is how science works: by openness, by publishing minute details of an experiment, some mathematical equations or a simulation; by doing this you embrace deniability. This does not seem to have happened in climate research. Many researchers have refused to release their computer programs — even though they are still in existence and not subject to commercial agreements.

(Note: In this context, “deniability” means that the hypothesis or theory must be constructed so it can be proven wrong, i.e., that you can deny the truth of it.)

Scientific hypotheses differ from hypotheses in other fields specifically because scientific hypotheses can be conclusively proven wrong by experiment.


Peer Review as Talisman

Mark Steyn says:

Like all the poodles of the environmental beat, Margot O’Neill repeats those magic words “peer review” every couple of paragraphs like a talisman to ward off evil deniers.

From my “Scientific Peer-Review is a Lightweight Process”:

By the way that proponents of Catastrophic Anthropogenic Global Warming (CAGW) wave it about as a talisman to ward off criticism, a lay person could be excused for thinking that peer review is a rigorous process that is central to the functioning of science and that verifies the conclusions of a scientist’s research.
 
Peer review is nothing like that.
 
Peer review isn’t even central to science. Science functioned fine for centuries without peer review and scientists who work in secret or proprietary environments do not use it. Instead, peer review serves economic and social functions related to scientific publishing and does nothing else. Peer review somewhat protects the integrity of scientific media, not the quality of science itself.

I would just like to point out that Mark Steyn steals from the best. ;-)

Scientific Scandals Past

Sometimes forgotten lessons get rediscovered. While writing up a comment on the breast cancer guidelines brouhaha, I dredged up what turned out to be an inappropriate analogy, but one that is useful elsewhere.

Remember Lynxgate? The allegation at the time (early 2000s) was that forest service employees falsely added lynx hairs to collection samples in order to get habitat declared protected when it should not have been. After investigation, a more complicated story emerged, one of false consensus, unauthorized controls/faked samples, and a general finding that there was no conspiracy.

The 1998 Weaver survey, at the time considered reliable but since discredited, showed a much more extensive lynx habitat than the federal three-year survey was detecting. Independently, a couple of government employees decided to submit control samples of lynx hair, one obvious, the other less so. They did this without going through the normal process for creating such controls, a process that would have ensured their data did not get mixed in with the rest of the survey results. The intention, as reported to the investigators, was to ensure that lynx hair was not being misidentified as that of domestic cats (feral domestic cats do sometimes live in the woods).

The lesson that a false consensus can make scientists skip certain safeguard protocols got buried as the right found itself embarrassed and the left uninterested in any sort of blood sport against people on its side.

Fast forward to today’s Climategate. In the Harry Read Me file we find, about 40% of the way in:

If an update station matches a ‘master’ station by WMO code, but the data is unpalatably
inconsistent, the operator is given three choices:

[BEGIN QUOTE]
You have failed a match despite the WMO codes matching.
This must be resolved!! Please choose one:

1. Match them after all.
2. Leave the existing station alone, and discard the update.
3. Give existing station a false code, and make the update the new WMO station.

Enter 1,2 or 3:
[END QUOTE]

You can’t imagine what this has cost me – to actually allow the operator to assign false
WMO codes!! But what else is there in such situations? Especially when dealing with a ‘Master’
database of dubious provenance (which, er, they all are and always will be).

False codes will be obtained by multiplying the legitimate code (5 digits) by 100, then adding
1 at a time until a number is found with no matches in the database. THIS IS NOT PERFECT but as
there is no central repository for WMO codes – especially made-up ones – we’ll have to chance
duplicating one that’s present in one of the other databases. In any case, anyone comparing WMO
codes between databases – something I’ve studiously avoided doing except for tmin/tmax where I
had to – will be treating the false codes with suspicion anyway. Hopefully.
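To make the scheme Harry describes concrete, here is a minimal sketch in Python. It is not the CRU code. The function name make_false_code and the set of existing codes are made up for illustration, and I am assuming the bare multiple (code times 100) is tried first before counting upward, which the quoted text does not spell out.

```python
def make_false_code(legit_code, existing_codes):
    """Return an unused stand-in for a 5-digit WMO code.

    Hypothetical helper: multiplies the legitimate code by 100, then
    adds 1 at a time until the result matches nothing in existing_codes.
    """
    candidate = legit_code * 100
    # Assumption: the bare multiple is checked first, then +1, +2, ...
    while candidate in existing_codes:
        candidate += 1
    return candidate


# Made-up example: two false codes for station 12345 are already taken.
known = {1234500, 1234501}
print(make_false_code(12345, known))  # 1234502
```

Note that the uniqueness check only covers whatever database is passed in, which is exactly the weakness Harry flags: a made-up code can still collide with one present in some other database. The sketch also never caps the count, so after 100 taken codes it would wander into the range belonging to the next station number.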

One of the things that happened in Lynxgate was that the “obvious” control being sent in was not so obvious to the lab, which had in other contexts seen plenty of legitimate samples that were just as sloppy. They treated it as legitimate data.

So what happens if somebody randomly decides to give the CRU at the UEA a bit of control data with not-so-unusual but falsely high values? In 2 out of the 3 choices the control will be included with the rest of the data. In option 3, a false station would be added to the list of WMO stations and used going forward. This is part of the process of good databases going bad and bad ones not being corrected that Harry famously complained about just a little bit later in the same file.

Somebody will, if they haven’t already, claim that nobody would ever just submit false data, and that this can all be explained away by climate station central offices not keeping up with new stations in the field. And that would sound plausible, unless you’ve forgotten that obscure scandal that wasn’t, Lynxgate, where they did just that based on the mistaken conclusions of a soon-to-be-discredited study.

But this isn’t the only past scandal that is illustrative of the large potential problems facing CRU. Pulling in Briffa’s suspect Yamal chronology you have an additional difficulty. It seems that some data points are more equal than others in the climate game. Any unusually influential data points now have to also get traced back to an actual station, something that hasn’t been done on any of them.

And how good are those actual stations? Anthony Watts’ experiment over at surfacestations.org is pointing to the answer “not very”. If you look at a global map of stations it’s amazing how many of the stations are in the USA. Watts’ survey of all USHCN stations is 82% complete and only 10% of stations have an NOAA error rating of less than 1C.

So without any conspiracy we seem to be betting trillions on science that does not adequately coordinate to prevent control data from entering real data sets, has practices in the discipline that are inadequate to guard against undue weight, and is taking large chunks of its data from weather stations whose error bars far exceed the global warming signal we’re all supposed to be worried about.

At this point a finding of “no conspiracy” would not reassure me. It should not reassure us at all.

Bias Confirmed

Megan McArdle, an AGW true believer, seems to think that most of the problems highlighted by Climategate are due to confirmation bias. That is where the experts tend to accept data that is in line with what they expect, while assuming that anything which goes against the prevailing theory must just be faulty in some way.

I’d agree with her except for the way the people involved in the scandal went against the law to delete emails, hatched plans to punish other scientists whose work showed different results, and even worked to discredit scientific journals which dared to publish contrary research.

That sort of willing participation in unethical and illegal behavior doesn’t fit any definition of “confirmation bias” I’ve ever come across. Crooks, liars, cheats and con artists act like that, not respectable scientists who simply put a bit more weight on one side of the scale.

It is certainly true that the history of science is rife with examples of confirmation bias. But, while debate and disagreement might become heated, it is rare to come across a case where one side of the issue actively schemes to silence its opponents by deliberately causing them some form of harm.

In this instance, I suppose the AGW dissenters should be grateful that only their careers were damaged.

UPDATE
The Belmont Club has a post that is worth your time.