The physicist Richard Feynman defined scientific honesty as:
It’s a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty–a kind of leaning over backwards. For example, if you’re doing an experiment, you should report everything that you think might make it invalid–not only what you think is right about it: other causes that could possibly explain your results; and things you thought of that you’ve eliminated by some other experiment, and how they worked–to make sure the other fellow can tell they have been eliminated.
The Lancet Iraqi Casualty Survey is a dishonest piece of work. Setting aside all concerns about its methodology and practical implementation, it is easy to see that the paper was written in an intentionally deceptive manner designed not for scientific clarity but for political impact.
I thought I would detail this dishonesty in the form of a fisking. Pack a lunch; it’s a long one.
[Download the PDF of the study so you can follow along and keep me honest]
As I mentioned in my last post, the major deception in the paper is the highly selective inclusion and exclusion of the outlier Falluja cluster throughout the paper. The Falluja cluster proved highly problematical for several reasons.
First, it returned a death-by-violence rate so high that it meant that at least 1% of the entire Iraqi population (>250,000) had died in the war, almost all due to Coalition airstrikes. Since this clearly was not the case (you couldn’t hide that many bodies), it was clear that the Falluja cluster was an outlier, a statistical freak that, for whatever reason, returned a nonsensical result.
Falluja was also a problem cluster because it was collected by a wholly different method than all the other clusters. Down in the depths of the paper the authors spend a good half-page explaining how the measurement of the Falluja cluster differed from all the others [p6 pg 6-7, p7 pg1]. Unlike in the other clusters, GPS couldn’t be used to randomly select the cluster based on demographics, so they did it manually. They had to be wary of insurgents and did a quick in-and-out. In other clusters, less than 8% of the households were abandoned; in Falluja 44% were, meaning that a lot of the results were actually extrapolation.
This presents the possibility that far more deaths had occurred than were reported and the interviewees that remained were the relatively lucky ones (underestimating mortality), or large numbers of residents had fled elsewhere and were still alive. Thus, the deaths reported by the remaining families might represent a disproportionate number of deaths from the larger community that used to live in the area, leading the interview data to overestimate mortality.
Not even considered in the paper is the very real possibility that the insurgents/jihadists who controlled Falluja at the time were able to influence the results. Even if they did not do so, many of the Sunnis in Falluja were hostile to the Coalition and might have taken the opportunity to exaggerate.
So the Falluja cluster was highly problematical even before it returned its off-the-chart death rate. For an honest researcher, the Falluja data presented three choices. He could: (1) include the cluster in all analysis since he had no a priori reason for excluding it, (2) declare the cluster’s data highly suspect and exclude it from all analysis, while taking care to inform the readers that he had done so, or (3) provide dual analyses, one of which included the Falluja cluster and one of which did not, again taking great care to differentiate the two.
The authors of this study chose option negative four, i.e., confuse the issue as much as humanly possible. Roberts et al weave the Falluja data through the paper like a poison skein. Virtually every major statement they make is grounded on the Fallujah data. If those data are excluded, most of what they state is not true, yet they almost never make it clear when they are relying on the Falluja data and when they aren’t.
(Let me define some acronyms for convenience. IFC is Including Falluja Cluster. EFC is Excluding Falluja Cluster)
Let the fisking begin.
The paper goes off the rails in its opening Summary [p1]. The summary is super important because it is all that 99% of the world will ever see of the study. The media will report only the summary. The summary is where the spin really happens in a scientific paper.
The Findings paragraph starts well.
The risk of death was estimated to be 2·5-fold (95% CI 1·6–4·2) higher after the invasion when compared with the preinvasion period. Two-thirds of all violent deaths were reported in one cluster in the city of Falluja. If we exclude the Falluja data, the risk of death is 1·5-fold (1·1–2·3) higher after the invasion.
That’s all kosher.
We estimate that 98000 more deaths than expected (8000-194000) happened after the invasion outside of Falluja and far more if the outlier Falluja cluster is included.
So here we see Falluja honestly defined as outlier, but why don’t they include the estimated deaths IFC? Probably because the estimated range would immediately leap up past 250,000, which is an embarrassingly high number. Instead they bury the Falluja number in the depths of the paper.
The major causes of death before the invasion were myocardial infarction, cerebrovascular accidents, and other chronic disorders whereas after the invasion violence was the primary cause of death.
So is this true IFC, EFC or both? How can you tell? You can’t without looking at the study’s internals.
This is only true IFC and only just barely (51%). EFC, deaths from violence caused just 24% of all deaths. [p5 pg3]. Further, the breakdown of deaths by disease type is just pointless and silly, except that it creates the impression that deaths by violence are larger later on. It’s sleight of hand. I’m surprised they didn’t list infected hang nails as a cause of death.
Violent deaths were widespread, reported in 15 of 33 clusters,
IFC or EFC? It’s IFC. EFC is 14 of 32 clusters. The actual distribution of the 21 EFC deaths over the 14 affected clusters is not provided in the study. It’s entirely possible that 13 clusters saw one death and 1 cluster saw 8. In fact, given the comically broad confidence interval of this study, this type of distribution is more likely than not.
and were mainly attributed to coalition forces.
Curiously, this is never broken down inside the study, save as a passing reference that is IFC.
Most individuals reportedly killed by coalition forces were women and children.
This is easily the most provocative sentence in the Summary. Strangely, it is not supported even IFC. Table 2 shows that of 73 deaths IFC, 33, or 45%, were women and children. Even adding in the two elderly deaths gives us only 48%. Weird. Of more interest is that EFC, 62% of deaths are of military-age males and only 38% are of women, children and the elderly. These data make the exact opposite emotional point that the authors sought to make. The 2-to-1 adult-male to everyone-else ratio is what we would expect to see if most of the targets of Coalition attacks were military.
Correction: dsquared has pointed out in the comments that I made an error here. The sentence actually says that the deaths caused by the Coalition are mostly women and children, so I needed to subtract the 12 deaths identified as being caused by non-Coalition actors. Those deaths are of 11 adult males and 1 adult female. So IFC, the ratio of deaths among women and children must be calculated as a percentage of 61 total deaths: 28 children + 4 women for a total of 32, which is 52%.
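For anyone who wants to check the corrected arithmetic, here is a minimal sketch in Python. The counts are the ones quoted in this post from Table 2 and page 7 of the study; the variable names are mine and purely illustrative.

```python
# Counts as quoted in this post (IFC, i.e. including the Falluja cluster).
violent_deaths = 73   # all violent deaths in Table 2
non_coalition = 12    # violent deaths not attributed to the Coalition
children = 28         # Coalition-attributed deaths of children under 15
women = 4             # Coalition-attributed deaths of adult women

# Deaths attributed to the Coalition are what is left after removing
# the non-Coalition deaths.
coalition = violent_deaths - non_coalition

women_and_children = children + women
share = women_and_children / coalition

print(f"{women_and_children} of {coalition} = {share:.1%}")
```

The share comes out a little over one half, which is what the Summary sentence needs IFC.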
Unfortunately, based on the paper it is impossible to know how many of the non-Coalition deaths occurred outside of Falluja so I can’t calculate the percentages EFC.
Fortunately, Les Roberts has apparently provided this information in an email.
I am forced to admit that the additional information does support the Summary’s statement. In fact it supports it to such a degree that I am surprised they didn’t break it out separately as it really helps their case.
The information that Roberts provides says that all 12 of the non-Coalition attributed deaths occurred outside of Falluja. That means of the 21 deaths EFC only 9, or 43%, are attributed to the Coalition. The gender and age breakdown becomes 3 adult males, 1 adult female, 4 children and 1 elderly. Percentage-wise that is 33% adult male, 11% adult female, 44% children, 11% elderly. So 67% of deaths EFC attributed to the Coalition are of women, children or the elderly. The ratio of non-combatants to possible combatants is higher EFC than IFC.
Wonder why they never pointed that out in the paper? Plausibility perhaps?
The risk of death from violence in the period after the invasion was 58 times higher (95% CI 8·1-419) than in the period before the war.
This is IFC. EFC is something like 22 times higher.
The Interpretation is the money graf for the entire paper. This is what will enter the worldwide dialog to be enshrined as revealed truth.
Interpretation Making conservative assumptions, we think that about 100000 excess deaths, or more have happened since the 2003 invasion of Iraq
This is only a conservative assumption IFC. Apparently, “conservative” means tossing out obviously bad outliers. EFC it is in fact the mainline finding. Indeed, there is a 50% chance that the actual number is lower. An actual “conservative” estimate would be about one interval down, which given the wide CI would fall somewhere in the 60,000-70,000 range. 58% of those would be deaths from violence EFC, so combat deaths would run 34,800-40,600, of which 62% were military-age males. That probably doesn’t have the same impact as a nice round 100,000, though.
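The back-of-the-envelope range above can be reproduced in a few lines of Python. This is only a sketch: the 60,000-70,000 range and the 58% violence share are the rough figures used in this post, not values reported by the study, and the variable names are mine.

```python
# Rough "one interval down" range for excess deaths, as guessed in this post.
low_estimate = 60_000
high_estimate = 70_000

# Share of excess deaths attributed to violence EFC, as used in this post.
violence_pct = 58

# Integer arithmetic keeps the results exact.
violent_low = low_estimate * violence_pct // 100
violent_high = high_estimate * violence_pct // 100

print(f"{violent_low:,} to {violent_high:,} deaths from violence")
```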
Violence accounted for most of the excess deaths
This is true both IFC and EFC. But I think few realize this is in fact a ratio of deaths from violence to deaths from illness or accident. Increasing the health of the general population will make it appear that violence is getting worse even if the total number of people dying drops. Still, it is unambiguously supported by the study.
and air strikes from coalition forces accounted for most violent deaths.
This is only supported IFC. The study does not break down causes of deaths due to violence as being IFC or EFC. All such mentions are IFC. In fact, data are presented that could mean that most deaths from violence EFC were not due to airstrikes, but the study does not provide sufficient detail to be sure.
We have shown that collection of public-health information is possible even during periods of extreme violence.
As long as you don’t mind a huge amount of slop in your data.
Our results need further verification…
On that we can agree.
…and should lead to changes to reduce non-combatant deaths from air strikes.
(This last bit captures the utter pointlessness of this study. What in the entire study would help anybody reduce non-combatant deaths at all? This study is completely useless in any practical sense. Despite the snotty implication in the conclusion that the US shows a depraved indifference to civilian casualties, the study provides no guidance or information whatsoever that could be used to improve the situation. It is clearly of no value to anyone save as an anti-Coalition propaganda tool. I would be much more tolerant of the imperfections of the internals of this study if anybody could demonstrate any practical use for it.)
I think it is clear that the Summary for this study, which, again, is what most of the world will see and act on, is deliberately written to conceal the use of the Falluja cluster to make the most attention-grabbing claims. EFC is a much, much different report. Honest researchers would have made that clear.
The conflation of IFC and EFC continues in the body of the paper.
From [p5 pg3]:
The main causes of death reported for the 14·6 months before the invasion were myocardial infarction, cerebrovascular accidents, and consequences of other chronic disorders, accounting for 22 (48%) reported deaths (table 2). After the war began, violence was the most commonly reported cause of death, either including (73/142 [51%]) or excluding (21/89 [24%]) the Falluja data, followed by myocardial infarction and cerebrovascular accidents (n=18) and accidents (n=13; table 2)
Violence is only clearly the most common cause of death IFC. It is only the case EFC if illness is needlessly broken up into subcategories. Here the authors are arranging the information to convey a certain impression. EFC 21 of 89 deaths, or 24%, were the result of intentional violence. The remaining 76% of deaths resulted from illness or accident. Is it really honest to say that EFC “violence was the most commonly reported cause of death”?
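The two shares are easy to verify. A minimal sketch in Python, using only the counts quoted from the paper in the passage above (variable names are mine):

```python
# Post-invasion deaths as quoted from the paper [p5 pg3].
ifc_violent, ifc_total = 73, 142   # including the Falluja cluster
efc_violent, efc_total = 21, 89    # excluding the Falluja cluster

ifc_share = ifc_violent / ifc_total
efc_share = efc_violent / efc_total

# Everything that is not violence (illness and accident) EFC.
efc_nonviolent_share = 1 - efc_share

print(f"IFC violence share:     {ifc_share:.0%}")
print(f"EFC violence share:     {efc_share:.0%}")
print(f"EFC non-violence share: {efc_nonviolent_share:.0%}")
```

Violence only reaches a bare majority of deaths when the Falluja cluster is counted; without it, roughly three in four deaths are non-violent.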
Let’s try a little quiz.
From [p5 pg4]
Figure 2 shows the number of deaths reported during the study period, disaggregated as non-violent deaths, violence in Falluja, and violence in all other clusters. An increase of violent death was noted during the occupation, and violence was geographically widespread, with violent deaths reported in 15 of 33 clusters (45%).
A) IFC or EFC?
Violence-specific mortality rate went up 58-fold (95% CI 8·1-419) during the period after the invasion.
B) IFC or EFC?
Table 2 includes 12 violent deaths not attributed to coalition forces, including 11 men and one woman. Of these, two were attributed to anti-coalition forces, two were of unknown origin, seven were criminal murders, and one was from the previous regime during the invasion
C) IFC or EFC?
Of the 28 children killed by coalition forces (median age 8 years), ten were girls, 16 were boys, and two were infants (sex was not recorded). Aside from a 14-year-old boy, all these deaths were children 12 years or younger.
D) IFC or EFC?
Hell, they’re all IFC.
I am beginning to wonder if the authors actually understand what an outlier is. I would love to know how many of the “12 violent deaths not attributed to coalition forces” occurred outside of the Falluja cluster. Since only 21 deaths total are EFC, if 12 of them were non-Coalition related that would change the complexion of the study immensely. If only 4 or so were EFC that would alter the assertion that most of the excess deaths were the result of Coalition action. The study doesn’t provide this kind of breakdown.
I also like the “Of the 28 children killed by coalition forces…” Wow, fire up the Star Wars imperial march. Of course, EFC that would be “of the 4 children killed by coalition forces.” I guess that lacked punch.
This survey indicates that the death toll associated with the invasion and occupation of Iraq is probably about 100000 people, and may be much higher
How much higher? What about 200,000+ higher! Why didn’t you put that in the Summary? Shouldn’t people know your study says IFC that 300,000+ people have died? Good lord, the bodies must be stacked up like cordwood! Why the soft sell? Oh right, it’s an insanely high number and obviously wrong. Better toss out that cluster. Glad you didn’t use its results everywhere else in the study.
We have shown that even in extremely difficult circumstances, the collection of valid data is possible, albeit with limited precision.
Limited precision. Heh. What’s an order of magnitude or a factor of three? Give them a break! This study is vitally important for reasons nobody can explain to me.
In this case, the lack of precision does not hinder the clear identification of the major public-health problem in Iraq–violence.
I am so glad someone spent money to determine that violence is a major public-health problem in a war zone. And they did it so precisely too!
Several limitations exist with this study.
Please stop. I can’t breathe.
Second, the January, 2002, to March, 2003, rate applied to the 366 births recorded in the interview households post-invasion would project 10·4 infant deaths, whereas we noted 21 to have happened.
IFC or EFC? Has to be IFC. The study only recorded 10 nonviolent infant deaths, and only 4 from violence EFC, so to get 21 deaths they would have to be IFC.
Second, as Spiegel and colleagues documented in Kosovo,21 there can be a dramatic clustering of deaths in wars where many die from bombings. The cluster survey methodology we used may have, by chance, missed small areas where a disproportionate number of deaths occurred, or conversely, selected a neighbourhood that was so severely affected by the war that it represents virtually none of the population and thus has skewed the mortality estimate too high. The results from Falluja merit extra consideration in this regard.
Extra consideration obviously means injecting the results into every statement unless expressly noted otherwise.
Despite widespread Iraqi casualties, household interview data do not show evidence of widespread wrongdoing on the part of individual soldiers on the ground. To the contrary, only three of 61 incidents (5%) involved coalition soldiers (all reported to be American by the respondents) killing Iraqis with small arms fire.
IFC or EFC? Has to be IFC since only 21 deaths are reported EFC. Anything over that has to be IFC.
The remaining 58 killings (all attributed to US forces by interviewees) were caused by helicopter gunships, rockets, or other forms of aerial weaponry
I would love to see an actual breakdown of all of this IFC and EFC. 12 of the deaths from violence were from non-Coalition actors. How many of those were IFC or EFC? It is possible to construct a distribution from these hints that 15 of 21 deaths from violence resulted from causes other than airstrikes. If most of those were EFC, it would invalidate the assertion that most people died in Coalition airstrikes.
Many of the Iraqis reportedly killed by US forces could have been combatants. 28 of 61 killings (46%) attributed to US forces involved men age 15-60 years, 28 (46%) were children younger than 15 years, four (7%) were women, and one was an elderly man.
IFC, but you knew that, didn’t you?
The Geneva Conventions have clear guidance about the responsibilities of occupying armies to the civilian population they control. The fact that more than half the deaths reportedly caused by the occupying forces were women and children is cause for concern.
Of course, here in the study’s concluding paragraph it is IFC all the way.
I think I have shown that the study’s authors, despite labeling the Falluja cluster an outlier, and identifying it as a cluster measured differently from all of the others, rely on it heavily to draw the overwhelming majority of their conclusions. In fact, they use it in almost every statement or conclusion except for their “conservative” estimate.
Without Falluja most of this study absolutely evaporates. Without Falluja most of the victims of violence are not women and children as the authors asserted both in the opening Summary and the closing paragraph. Without Falluja, violence isn’t the major cause of death, illness and accident are. Without Falluja it is even possible that most of the victims of violence didn’t die in Coalition airstrikes.
Falluja may be a statistical outlier but the authors use it as the heart of the study. They are dishonest to do so.
97 thoughts on “Fisking Falluja”
This is a pretty good fisking, but it probably won’t sway anyone. Seems that minds are pretty much made up even before they get a chance to see the report.
Have you Chicagoboyz been taking vitamins? Holy guacamole, I can’t keep up!
Reposting this from the last thread:
I still disagree about the “do a better study, and publish that” comment. The tremendous error bars, plus the enormous sensitivity of the study’s result to a single cluster indicate that the study has almost no utility in differentiating between numbers of dead.
As an example, consider what’s probably the best estimate of civilian deaths, (direct deaths, not excess) Iraqbodycount.com. They measure by direct enumeration, have a well-defined minimum and maximum. Currently, they report a minimum of 17,085 civilian deaths, and a maximum of 19,457. Of those two numbers, which is the closest to the truth? This study won’t tell you. The error bar is too big. How about 18,271 and 36,542? That’s the Iraqbodycount.com mean and twice the mean. Using the Iraqbodycount.com methodology, 36,000 deaths is utterly absurd. Using the results of this study, it’s almost as likely as 18,000.
If you wanted to get a useful estimate of excess dead, a far better approach would be to examine morgue records, and perhaps do a statistical sampling to determine what percentage of dead in certain areas don’t get brought to the morgues. That might give you an error of 10,000 or so — certainly you’d get something comparable to the Iraqbodycount.com error bar. Doing the study this way — even stipulating that the researchers did it with the purest of intentions — gives you error bars that are too big to do anything useful with them, like tell the difference between a credible and a ridiculous number of civilian dead.
If the past is any guide, your critiques will shortly be conclusively countered, either in this comment thread or on blogs run by people who believe in the integrity of the Roberts study as surely as you doubt it.
I would like to pre-comment on a common rebuttal that concerns the handling of the Fallujah outlier.
“The manner in which Roberts et al. declared one cluster to be an outlier and altered study design to remove it from some subsequent analyses is completely ordinary and/or accepted practice among statisticians.”
Commenters posting such critiques are making appeals to authority. At this point in the discussion, I think it is fair to ask them to provide citations to the authorities on which they are basing their claims. Those of us who are not statisticians should have the chance to see if Authorities do indeed speak as Roberts’ supporters claim they do.
Handling outliers when sampling heterogeneous populations is hardly an obscure statistical backwater. Roberts is not the first study to use clusters to measure heterogeneous characteristics of a group of people.
So it should be fairly easy to point to:
1. A web page at a site that is broadly considered to be authoritative and reliable, e.g. NSF, universities, Medscape, PubMed;
2. An epidemiology textbook that describes the situation in question, and suggests applying the method Roberts used;
3. A paper in the peer-reviewed literature, in which an analogous outlier problem was confronted, then dealt with comparably to how Roberts handled Fallujah.
For (2) and (3), tomorrow (Weds 3/23) afternoon at 5pm ET, I am willing to visit a local med school library, photocopy and scan the cited textbook pages, and download the pdfs of the cited articles. Interested parties can email me at amackay-0001~at~usa~dot~net for these files.
Citations should contain enough information to permit the reference to be located. This means author, publisher, year, and page for textbooks–and I’m unlikely to find obscure or out-of-date editions. For articles, I’d ask for the first author, journal, volume #, page, year, and, preferably, PubMed PMID number.
This offer isn’t unlimited, as I, too, have to work for a living. Nonetheless, at least for this thread, I am willing to severely discount those appeals to authority that don’t cite chapter and verse from sources that reasonable, informed lay people can agree upon as relevant and authoritative.
AMac, I think the locus classicus is “Some Remarks on Wild Observations”, by William Kruskal in Technometrics in 1960, plus the seven references in its bibliography. Here’s yer link, including the wonderful passage
In the case of this study, what we learn from the Fallujah datapoint is that there were some very high-violence clusters which saw a very high death rate indeed. Since the majority of Iraq was not like Fallujah, it would be wrong (and not “embarrassingly high”, whatever the hell that means; I suspect that Shannon Love is judging the researchers by his own disgraceful standards) to print a number which extrapolated the Fallujah death rate to the majority of Iraq. However, since the number of deaths due to violence in Fallujah was clearly material relative to the total excess death rate, nor was it appropriate to ignore this data point entirely.
Other than that, this “fisking” is, to put it bluntly, weaker than beer piss. There is nothing there to argue with. It takes the form:
I am quite prepared to believe that Shannon Love can’t read a scientific paper, but I’m damned if I can see what it has to do with the death rate in Iraq. The rest of it takes the form:
as if people who died in Fallujah weren’t dead. What is truly astonishing is that the Chicago Boyz “fisking” apparently now agrees that the actual survey was sound, and that based on the evidence collected, it is likely that thousands of people died in Iraq as a result of the invasion, but moves on with nary a pause to castigate … not the fact that those deaths happened, but the manner in which they were reported! This is doubly appalling; dishonest in the first place because you now apparently admit that your previous claims that the study was “Bogus” are wrong but haven’t acknowledged that fact, much less apologised for it. And disgusting in the second place because you are making it clear that the only reason you cared about this study in the first place had nothing to do with life and death in Iraq but was simply concerned with the possible political effect on the Republican Party. This is absolutely shameful and I’ve half a mind to write to Milton Friedman and tell him what kind of rubbish is going out under his picture.
There is as far as I can see, only one specific criticism of the study in this post and (quel surprise) it is wrong. Shannon comments, in regard to the claim that the majority of deaths attributed to coalition forces were of women and children:
This is easily the most provocative sentence in the Summary. Strangely, it is not supported even IFC. Table 2 shows that of 73 deaths IFC, 33, or 45% percent, were women and children. Even adding in the two elderly deaths gives us only 48%. Weird
The only thing that’s “weird” is that Shannon missed the explanation of this result, half way down the right hand column of page 7 (by the way, how do you “bury” something in an eight page paper ?!?). Of the 73 deaths through violence in table 2, 61 of them were attributed to coalition forces; of these, 28 were of children under 15 years and 4 were of women. This is not difficult to find (I used the built-in search of Acrobat to look for the word “women”) and the fact that Shannon missed it reinforces my impression that he was not actually reading the study, but rather selectively looking for points to try and knock off it. Which makes the quote from Feynman at the top exhorting toward the maximum possible humility and honesty in exposure, look a little bit embarrassed at the company it is keeping.
The use of the Feynman quote is particularly revolting, given that Love is still pushing an argument (the “cluster sampling critique”) that he knows to be wrong. For shame.
PS: We appear to have a lot of undertakers on the internet, given the confidence with which people make assertions about the graveyard industry and whether or not “you can’t hide that many bodies”, “the dead would be piling up like cordwood”, etc. This is, frankly, crap. Iraq’s death rate has varied significantly over time; it has been higher than the 12.3 cum-Fallujah crude mortality rate for much of that time (the crude death rate was 16.0 in 1970)
Zach, Iraqbodycount only gives a lower bound to the number of deaths since it is clear that not all deaths are reported in the media. How many journalists were there in Falluja during the attack, for example? Thus claiming that “36,000 deaths is utterly absurd” is simply wrong. All it would mean is that there are lots of unreported deaths, which is quite reasonable.
Those of you who oppose the Lancet study because you think it is bad science rather than because you don’t want to know how many have been killed should join this petition in asking for a better study to be made:
Wow. Some people truly do not get it.
What S. Love has done here, apparently unknown to him, is to demonstrate that the authors of the study are indeed consistent in their use of EFC and IFC.
This may strike many of you not versed in statistics nor experienced in publishing peer-reviewed papers as absurd. Let me explain.
The authors DO NOT throw out the Fallujah cluster data from the study when reporting their raw data. This is indeed appropriate. Every instance that Love reports the use of IFC data was when the authors were reporting raw data.
The authors DO throw out the Fallujah cluster data, as an outlier, when reporting their calculated extrapolations based on the raw data. This is also appropriate. Every instance that Love reports the use of EFC data is when the authors are reporting such extrapolated figures.
The authors are perfectly consistent in the way they present the data, as Love has proven, unbeknownst to him.
“I was in favour of excluding the Fallujah data before I was against it”
I am quite prepared to believe that Shannon Love can’t read a scientific paper, but I’m damned if I can see what it has to do with the death rate in Iraq. The rest of it takes the form:
If your so smart, how come you didn’t say whether it is or isn’t EFC?
However, since the number of deaths due to violence in Fallujah was clearly material relative to the total excess death rate, nor was it appropriate to ignore this data point entirely.
Are you arguing here that although the number of deaths in Fallujah is not like the rest of Iraq, how they died is similar?
Andjam, I’m not arguing with you at all.
“The authors DO throw out the Fallujah cluster data, as an outlier, when reporting their calculated extrapolations based on the raw data. This is also appropriate.”
So you contend that when the authors emphatically state in their study Summary that:
“Most individuals reportedly killed by coalition forces were women and children.”
and in the concluding paragraph:
“The fact that more than half the deaths reportedly caused by the occupying forces were women and children is cause for concern.”
…that they are throwing out the Falluja cluster data?
I don’t think so.
Try drinking less coffee.
The 1960 paper you cite, Some Remarks on Wild Observations, speaks precisely to the point in question. At 1200 words, it is shorter than S. Love’s post (3000 words), and barely longer than your contribution (920 words). Anybody who cares enough about this issue to have read this far in the Comments should click the link and read Kruskal before proceeding.
You quote accurately from the paper. However, other extracts suggest that, pace S. Love, the Roberts analysis strayed far from the standards Kruskal proposed:
dsquared’s claim is that what S. Love complains about in this post amounts to Roberts “carrying out an analysis both with and without the suspect observations.” The claim of the Fisking itself is precisely the opposite: that Roberts misleads the Lancet’s generalist audience, conflating the IFC and EFC analyses to produce a desired result.
I will re-read Roberts before commenting further (use this link instead of the one supplied in the post), except to remark that–IIRC–nowhere did Roberts clearly state the mean and confidence interval for the IFC data set. If so, this violates Kruskal’s guidelines at a very basic level.
dsquared, your many disagreements with S. Love are noted, as is their bilious manner of delivery. Given limited time, I have tried to restrict my comment to the central point. By citing Kruskal and commenting on its applicability, I think you have moved the debate forward.
Disputo, do your comments also apply to the authors’ much-quoted Interpretation of their findings, as found on page 1 of the study?
‘I would love to know how many of the “12 violent deaths not attributed to coalition forces” occurred outside of the Falluja cluster. Since only 21 deaths total are EFC, if 12 of them were non-Coalition related that would change the complexion of the study immensely.’
The authors have provided this information in an email upon request. All 12 violent deaths not attributed to coalition forces were ex Fallujah. 9 were blamed on the coalition, 3 from small arms fire and 6 in bombings.
I don’t suppose it’s plausible that any party in Iraq would use women or children under 15 as combatants, especially in Fallujah. Nah, never.
The claim of the Fisking itself is precisely the opposite: that Roberts misleads the Lancet’s generalist audience, conflating the IFC and EFC analyses to produce a desired result.
Have a look at Disputo’s excellent and very clear comment above; the numbers are not conflated. Raw numbers (like the fact that the majority of coalition-attributed deaths were of women and children) include the Fallujah data while extrapolated numbers (like the 100,000 deaths estimate) exclude it.
The mean and confidence interval for the post-invasion death rate including the Fallujah cluster is given on page 4, bottom right-hand corner. It’s 12.3, CI 1.4-23.2, design effect 29.3.
I think that a “bilious manner” of delivery is entirely appropriate in this case, and given that Shannon Love has accused the Johns Hopkins team of falsifying their data and intentionally acting as propagandists for “the Fascist elements in Iraq”, I think that my comment actually constitutes turning the heat down a little bit!
Yeah. And the investigators rushed to publish the study before the U.S. election out of pure public-spiritedness. I am shocked that anyone would even consider attributing political motives to them.
dsquared (9:59am), one comment.
In response to my claim (8:16am)
Yes, and the IFC data is also given in the abstract, as “The risk of death was estimated to be 2.5 fold (95% CI 1.6-4.2) higher after the invasion when compared with the preinvasion period”
To be more specific, the money quote of this paper is the numerical estimate of excess deaths, presented in the abstract as “98,000 more deaths than expected (8,000-194,000)…” This is the sentence that launched a thousand articles–“100,000 dead!”, not “12.3, CI 1.4-23.2, design effect 29.3!” Kruskal’s criterion, again:
Roberts should have been obliged to present the IFC excess fatality calculations along with those for EFC data set. This would have made clear to the lay audience of the Lancet that “the broad conclusions of the two analyses are quite different.” Some might then proceed to “view any conclusions from the experiment with very great caution.”
That you, as a statistician, place more credence in the death-rate or the risk-of-death summaries than in the casualty-figure summary does not address this point.
As an officer on a fractious non-profit board, I know the value of having an adverse, oppositional document on record.
It gives those who wasted great resources, both rhetorical and reputational, on opposing U.S. military action in Iraq, something to point their continuing criticisms to.
Without the Lancet study, any number of Lefty journals, socialist politicians, and other anti-Americans world-wide would be beached boats in search of water, camels without a desert, baseballs without a bat.
Their shrill defense of this bogus media creation tells all. Shannon, your post is a halogen bulb to these cave-dwellers. Keep it up.
The very principle that you cite is precisely the one which the study followed. “If the broad conclusions of the two analyses are quite different, I should view any conclusions from the experiment with very great caution.” Exactly!
So if you have data which implies a very large number of excess deaths, but you have reason to regard one cluster as an outlier, the responsible thing to do is exclude the outlier and see whether the “broad conclusion” is affected. In this case it isn’t, but the excess death number is now lower. So you report the lower number while noting that it is a conservative estimate.
That is exactly what was done: the number emphasised, which “launched a thousand articles” as you put it, is the lower, more conservative number.
It seems to me your quarrel is with the thousand articles. If so you are on solid ground. Many of them were crap. But the study itself is sound.
I am shocked that anyone would even consider attributing political motives to them.
Jonathan, the post “A Lie in a Labcoat” is not “attributing political motives” to the research team in some general sense. It is making two very specific claims:
1) that the Johns Hopkins team intentionally falsified their study
2) that they did so in order to provide propaganda for fascists in Iraq.
Both of these claims are defamatory; apparently you and Shannon believe that you would have some sort of “public figure” defence if sued, which is none of my business. But trying to claim that you’re not making a serious charge here is unlikely to help you.
I think you’re being a bit precious here. If you want the “excess deaths” number based on the death rate estimated cum-Fallujah, all you have to do is subtract 5.0 (the preinvasion death rate) from 12.3 and multiply by the population of Iraq (26m) and by 1.5 (to gross up the annual rate to eighteen months). That gives me about 285,000 excess deaths, comparable with the 98k number. All the “analysis” has been done here; all that has happened is that the authors haven’t done the multiplication for you because, as Disputo says, it’s bad form to present extrapolated numbers on the basis of outliers.
… oh hang on, no, it’s even a bit better than that. Look at page 5, middle of the right hand column, where they say that the Fallujah cluster on its own would be extrapolated to about 200,000 excess deaths in the 3% of Iraq represented by the cluster (about 200,000 in Fallujah plus about 98,000 in the rest of Iraq is not far from the 285k figure I got by multiplication above, hurray). I think the authors were pretty flawless here. In general, I am actively amazed that pretty much every time someone has made a decent, well-informed criticism of the study, I go back and read it again and find that it’s dealt with – this implies to me that the peer review that this paper went through, despite being truncated, must have been very thorough indeed. The paper has far fewer methodological and presentational flaws than I would normally expect from a journal article.
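For anyone who wants to check the multiplication above, here is a minimal Python sketch. It uses only the figures quoted in this comment (death rates of 12.3 and 5.0 per 1,000 per year, a population of 26 million, and eighteen months); the function name `excess_deaths` is made up for illustration and is not from the study.

```python
# Back-of-the-envelope check of the excess-death arithmetic quoted above.
# All inputs are the figures stated in the comment; nothing here comes
# from the study's own code or data files.

def excess_deaths(post_rate, pre_rate, population, years):
    """Excess deaths implied by a change in crude death rate.

    Rates are deaths per 1,000 people per year.
    """
    return (post_rate - pre_rate) / 1000.0 * population * years

# Cum-Fallujah estimate: (12.3 - 5.0) per 1,000, 26m people, 1.5 years.
cum_fallujah = excess_deaths(post_rate=12.3, pre_rate=5.0,
                             population=26_000_000, years=1.5)
print(f"by multiplication: about {cum_fallujah:,.0f}")  # roughly 285,000

# The two pieces reported separately: Fallujah-type areas plus the rest of Iraq.
sum_of_pieces = 200_000 + 98_000
print(f"sum of the two pieces: {sum_of_pieces:,}")
```

The two routes land within about 5% of each other, which is all the comment claims.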
Btw, the Lancet doesn’t really have a lay audience; its readership consists of British doctors.
Kevin Donoghue (11:18am):
> It seems to me your quarrel is with the thousand articles.
That’s not the subject of discussion here.
If I may be briefly philosophical, I think one divide is between those who believe:
While others are more skeptical. My own reasoning:
It is perhaps unsurprising that we are not bridging this chasm.
dsquared (11:32am) wrote:
>AMac, I think you’re being a bit precious here.
Thank you, I suppose.
Returning to the subject under discussion:
>If you want the “excess deaths” number based on the death rate estimated cum-Fallujah…
EFC, “98,000 more deaths than expected (8,000-194,000)…” Seven words.
IFC, “298,000 more deaths than expected (?,000-??,000)…” You took circa 260 words to arrive at the preceding sentence. Yes, you certainly could have been briefer, and, given your training, you can probably estimate the confidence interval figures (I can’t).
Recall Kruskal’s words:
Roberts et al. have not presented the two analyses in the form most accessible to the lay target audience. One may take lay, in this context, to mean not conversant with sophisticated statistical methods, a fair description of the majority of the Lancet’s readers.
One may take lay, in this context, to mean not conversant with sophisticated statistical methods, a fair description of the majority of the Lancet’s readers.
No. You cannot get an MD in the UK without taking a course in medical statistics.
Thanks for the legal advice but you misinterpreted my comment. Believe it or not, I don’t speak for Shannon. While I agree with many of Shannon’s arguments, it is also true, as your own involvement in a group blog suggests you appreciate, that not every contributor here necessarily agrees with everything that every other contributor writes.
To return to the substance of my comment: I was attributing political motives to the investigators, who it appears to me had an anti-war agenda. Can I prove it? No. But given their eagerness to publish before the election, as well as their eagerness to overlook what seem to me to be serious weaknesses in their sampling methodology, I think it’s a reasonable inference and deserves to be taken into account by anyone considering the worth of their study.
>No. You cannot get an MD in the UK without taking a course in medical statistics.
Perhaps amusingly, these two posts are heartfelt testimonies to the difference between passing a grad-level statistics course and being conversant with sophisticated statistical methods.
When y’all finish the Lancet study of Iraqi deaths, would you turn all this statistical talent toward the numerical anomalies of voting patterns in King County Washington? I mean, how one goes about getting a sample size greater than the population and all that …
given your training, you can probably estimate the confidence interval
No I couldn’t; what we’ve got here is two distributions, not one (the baseline whole-Iraq population plus the high-violence population represented by Fallujah). The combination of these two population means is at least potentially a meaningful number, but no linear combination of their variances is. In order to get a confidence interval for the combined population, you would have to a) make some quite strong assumptions and b) do some really quite high-faluting numerical integration, because the combined distribution would not have a standard form. (I actually have quite a lot of experience with this sort of computation, because I deal with financial data, which tends to give you a combination of “normal” volatility, plus occasional clusters of very high volatility during crashes. I can therefore say with a degree of authority that there is no very good way to make general statements about percentile confidence levels of this sort of distribution)
The sensible thing to do here if you are interested in the answer to the question (rather than interested in the distributions themselves from a technical statistics point of view) is to just give the baseline number and report the Fallujah cluster as a separate piece of information. Which is what they did.
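A small numerical illustration of why the combined population is awkward to summarise: for a mixture of two populations, the mean is just a weighted average of the component means, but the variance picks up an extra "between-group" term on top of the weighted average of the component variances. The sketch below assumes the 97%/3% split mentioned elsewhere in this thread; the means and variances are invented purely for illustration and are not taken from the study.

```python
# Mean and variance of a two-component mixture (law of total variance).
# The weights echo the 97% / 3% split discussed in the thread.
# The means and variances are HYPOTHETICAL numbers chosen only to show
# the effect; they are not the study's estimates.

def mixture_mean_var(weights, means, variances):
    """Return (mean, variance) of a finite mixture distribution."""
    total = sum(weights)
    w = [x / total for x in weights]
    mean = sum(wi * mi for wi, mi in zip(w, means))
    # Within-group variance plus the spread of group means around the overall mean.
    var = sum(wi * (vi + (mi - mean) ** 2)
              for wi, vi, mi in zip(w, variances, means))
    return mean, var

weights = [0.97, 0.03]        # baseline Iraq vs. high-violence areas
means = [8.0, 200.0]          # hypothetical death rates per 1,000 per year
variances = [4.0, 2500.0]     # hypothetical variances

mix_mean, mix_var = mixture_mean_var(weights, means, variances)
naive_var = sum(w * v for w, v in zip(weights, variances))

print(f"mixture mean:               {mix_mean:.2f}")
print(f"weighted average variance:  {naive_var:.2f}")
print(f"actual mixture variance:    {mix_var:.2f}")
```

Even with the exact variance in hand, the mixture is bimodal rather than bell-shaped, so turning that variance into a percentile confidence interval still needs the strong assumptions and numerical work described above.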
“1) that the Johns Hopkins team intentionally falsified their study
2) that they did so in order to provide propaganda for fascists in Iraq.”
Actually, I never suggested that the Johns Hopkins team falsified their study. Rather I asserted that they presented the results in a dishonest fashion intended to spin the results for maximum political effect.
For example, why did they assert 100,000 additional deaths when their data actually says 98,000? I think they did so not for any valid scientific reason but rather because 100,000 is a more attention grabbing number. The choice of the number was governed by marketing, not scientific concerns.
I do think it likely that the study was planned and publicized purely for political effect but I did not assert that it was done by or for the direct benefit of the Fascists. (I do think it likely the Fascists sought to subvert the study in Falluja independently.) I DID assert that the only EFFECT of the study vis-a-vis Iraq itself was to provide propaganda for the Fascists. The Fascists in Iraq probably danced a little when it was released. The authors of the study are presumably smart enough to realize this would be one of the effects of the study but went ahead with it anyway.
The study is entirely useless for implementing any kind of policy change that might make life safer for the Iraqi people.
The only effect this study will have on the people of Iraq is to prolong the war and all its killing and suffering.
I devoutly wish that one of the authors or sponsors of this study would sue me. I would love for them to have to defend their work in court.
I feel like Chicagoboyz has become my new journal club!
I want to address Amac’s request from way back in this thread regarding his concerns about “appeals to authority.” I just did a quick Google search for “dealing with outliers” and this is the 1st hit I found.
If you scroll down you’ll find a bit about deletion. I thought this might be a nice one to show you because it’s not super technical. And I wouldn’t want Amac to have to run to the nearest university library and check out a dusty copy of some 1960s Leslie Kish book or something.
So here’s the thing. You really never want to have to throw out data, whether it be an individual observation or a whole cluster. Aside from the fact that data collection is difficult and expensive, you’re messing with your ability to make inferences when you throw away data. Why? Because frequentist statistics is based on the assumption of a random sampling process, so when you start throwing stuff out you are biasing your sample. The reason why you’d throw out an outlier is not just because it’s extreme, but because it’s extreme in a way that makes you think it’s not real and maybe some sort of error. Like if you have a 20 year old that weighs 15 pounds in your sample: that’s basically impossible and is probably a data entry error or something. If you read the Roberts et al paper, it’s the extremity of Fallujah *coupled* with the fact they were not able to collect data in a way that was consistent with the other clusters (not allowed to use GPS). This gives them concerns about the cluster’s validity. If this is truly a problem, statistics won’t be able to fix it, and throwing out the cluster is a reasonable option to present to the reader. If it’s just an extreme cluster, then the “design effect” will allow for the extra variance from the more extreme clustering. It’s not always obvious what problem you’re dealing with.
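To make the “design effect” idea a bit more concrete, here is a rough Python sketch using the usual survey-sampling convention that a clustered sample of size n behaves, for variance purposes, like a simple random sample of size n divided by the design effect. The inputs are figures quoted elsewhere in this thread (a design effect of 29.3 with Fallujah included, and roughly 7,800 individuals sampled); applying them together like this is an assumption made for illustration, not a calculation from the paper.

```python
import math

# Rough illustration of what a design effect does to precision.
# Inputs are figures quoted in this thread, combined here as an
# ASSUMPTION for illustration; the paper does not present this calculation.

def effective_sample_size(n, design_effect):
    """Size of the simple random sample with the same variance."""
    return n / design_effect

def interval_widening(design_effect):
    """Factor by which a confidence interval widens versus simple random sampling."""
    return math.sqrt(design_effect)

n_individuals = 7_800   # approximate number of people covered by the survey
deff = 29.3             # design effect quoted for the estimate including Fallujah

n_eff = effective_sample_size(n_individuals, deff)
widen = interval_widening(deff)

print(f"effective sample size: about {n_eff:.0f}")
print(f"interval widening factor: about {widen:.1f}")
```

In other words, a very extreme cluster does not get smoothed away; it shows up as a much wider interval around the estimate.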
A few short points:
1. The Lancet study estimates death rates, not war dead. The US government has provided estimates for death rates in Iraq. In the CIA factbook you’ll find a figure of 5.66 for 2004, of 5.84 for 2003 and of 6.02 for 2002. They do therefore provide a number for excess death, and it’s negative, ie the war has “saved lives”.
2. Death rates, however, change for a variety of reasons, weather or changes in the age structure of the population for example. An increase or decrease in death rates may have reasons that are not at all, or only tenuously, causally related with a particular military intervention.
3. Anonymous epidemiologists of standing defending the Lancet study’s methods have been claimed. Is there a reference to a detailed explanation by a scientist of standing with relevant experience as to what to make of the various death rate estimates available?
4. No military power provides estimates for civilians killed through collateral damage. The US has in the past provided body counts of combatants killed, a practice that led to accusations of the US military inflating body counts to look as if they were winning, or worse, of being bloodthirsty, measuring success only through death and encouraging slaughter by making the body count a yardstick.
5. Without control of territory it is impossible to provide very accurate figures of enemy or civilian deaths, or to easily distinguish between them.
6. The Interim Government of Iraq now provides figures for Iraqi police, National Guard and civilians killed in terrorist and military operations.
7. Figures from morgues and hospitals are also released, with some indication of how many are due to ordinary crime.
8. No figures have been published by the Iraqi Government or the coalition on civilians killed before full control of Iraqi territory was established. There is, however, a compensation scheme in place and some figures have been released on how much has been paid out (in particular in the case of Najaf and Fallujah).
9. Not publishing such figures is standard practice, and there is no deviation from previous conflicts. Such estimates have routinely been provided only by NGOs or the country affected, often with considerable delay before accurate data could be established through detailed investigations.
10. In my opinion, further data should be provided, with the Iraqi government in a leading role, and funds provided by the coalition both to finance investigations, and to provide compensation, as is already the case.
11. I think that the number of civilians killed accidentally by coalition weaponry is around 3000, with a majority due to small arms fire.
Jonathan, could we get this clear; are you prepared to defend Shannon’s two very serious challenges or not? If so, you presumably won’t mind my replying in kind. If not, then I’m not minded to argue; the main authors certainly wanted to stop the war (if you had done a piece of research which suggested that a war was responsible for 100,000 excess deaths, wouldn’t you be against it) and probably wanted John Kerry to beat George Bush in the election. But this does not change the means or standard deviations of the survey results, and to claim that it does is the purest and simplest form of argumentum ad hominem.
Shannon, I would have hoped that recent experience with me (including the fiasco over your disastrous “cluster sampling” argument, which I note you are no longer defending, although still happy to link to it) would have taught you that you do not tend to have very good luck when you try to get away with lying to me. But apparently, not so much.
On my 1) above, you accused the authors of colluding with the insurgents in order to include the Fallujah cluster. That’s “falsifying the study intentionally”.
On my 2) above, you wrote
That’s “intentionally acting as propagandists”. The clue is in the final clause of the final sentence, which, in case you forgot what you wrote, reads “I conclude the effect was intended”.
And finally …
For example, why did they assert 100,000 additional deaths when their data actually says 98,000? I think they did so not for any valid scientific reason but rather because 100,000 is a more attention grabbing number. The choice of the number was governed by marketing, not scientific concerns.
Well, think what you want, but the paper is actually quite clear. It says that the figure of 98,000 refers to the 97% of the country represented by clusters other than the Fallujah cluster. 98,000 divided by 0.97 is 101030, which rounds to “about 100,000”. I can no longer remember whether this is the second, third or fourth stupid calculation mistake you have made when discussing this study, and yet you have the temerity to post quotations from Richard Feynman about scientific honesty? You are embarrassing yourself.
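The scaling step in this reply is easy to reproduce. A minimal sketch, assuming (as the reply does) that the 98,000 figure covers 97% of the country and that the remaining 3% is simply scaled up at the same rate; the variable names are illustrative.

```python
# Scale the ex-Fallujah excess-death estimate up to the whole country,
# ASSUMING the remaining 3% has the same excess death rate as the 97%.

ex_fallujah_excess = 98_000   # estimate for the 97% of Iraq outside the Fallujah cluster
share_covered = 0.97

whole_country = ex_fallujah_excess / share_covered
print(f"scaled estimate: about {whole_country:,.0f}")
print(f"to the nearest thousand: {round(whole_country, -3):,.0f}")
```

Given the width of the confidence interval, rounding the result to “about 100,000” loses nothing meaningful.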
When you decide to delete “A Lie in A Lab Coat” and “Fisking Fallujah”, as you presumably will do at some point in the near future, could I prevail on your good nature so as to send me a copy of the comments thread, by way of a souvenir?
Thanks for the excellent fisking, Shannon. The political hacks who tried to foist this pseudo-statistics on the pre-election public need to have their heinies whacked. Lancet could use a new editorial staff as well. Rush job, hack job, what stupidos.
it doesn’t really help their case, if the case is to show that there have been tens of thousands of bombing deaths of women and children.
5 deaths do extrapolate to around 15,000, and 9 deaths to around 30,000 out of the 60,000 violent deaths and 100,000 total claimed excess death.
But those 5 deaths represent just two incidents. Two interviews in other words represent over a seventh of the total excess death, and all ex Falluja killing of women and children by the coalition.
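The extrapolation in this comment can be sketched in a few lines of Python. The multiplier of roughly 3,000 extrapolated deaths per recorded death is the round figure used in this thread, not a number taken from the paper, so treat the outputs as order-of-magnitude only.

```python
# Order-of-magnitude extrapolation from recorded deaths to national totals.
# ASSUMPTION: each recorded death stands for roughly 3,000 deaths nationally,
# the round figure used by commenters in this thread.

DEATHS_PER_RECORDED_DEATH = 3_000

def extrapolate(recorded_deaths):
    """National deaths implied by a count of recorded deaths in the sample."""
    return recorded_deaths * DEATHS_PER_RECORDED_DEATH

women_and_children = extrapolate(5)    # the two incidents discussed above
coalition_attributed = extrapolate(9)  # all ex-Fallujah deaths blamed on the coalition

share_of_total = women_and_children / 100_000  # against the headline excess-death figure

print(f"5 recorded deaths -> about {women_and_children:,}")
print(f"9 recorded deaths -> about {coalition_attributed:,}")
print(f"share of the 100,000 headline figure: {share_of_total:.0%}")
```

With this multiplier, 9 deaths come out nearer 27,000 than 30,000, and the two incidents account for about 15% of the headline figure, which is indeed a little over one seventh.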
Dsquared likes to attack minor errors, I haven’t yet seen him address any of the serious concerns in earnest, and the tone of his latest posting is such that I would not reply to it. Sadly, I am more and more of the opinion that for him, it’s more about petty US centred partisanship than about concern for Iraqis.
Interestingly, Mike Harwood points out that the fact that not a single death is classified as being of an “insurgent” isn’t exactly in the study’s favour, considering that other estimates of “insurgents” killed range from around 15,000 to 25,000.
I see that a fairly major correction has been added; for this, very much thanks, and by way of good faith I will now withdraw the epithet “dishonest”.
Heiko do you have some sources for those insurgent death statistics? They’d be much appreciated. Thanks.
>I wouldn’t want Amac to have to run to the nearest university library
Not to worry, I said I’d be here, and I am :-)
Robin High’s Outlier essay is also excellent reading, thanks for the pointer. I have learned something about statistics and something about data presentation today.
Andjam, I’m not arguing with you at all.
I’m flattered that you seek my opinion, but as I think I indicated in my previous comment, apparently not clearly enough, I think that Shannon can defend his own arguments quite well enough without my assistance. And since Shannon has already argued that you mischaracterized his “two very serious challenges” I am unsure as to what you want from me. If I find Shannon’s main argument persuasive, and say so, am I now required to defend him against every objection that you raise? I don’t think so, and that’s certainly the case with my last comment, which I made primarily to record my own opinion.
I would not necessarily be against such a war. First of all, civilian-death data weren’t available in 2002/2003, when the important decisions were made — it’s post-war hindsight to suppose otherwise. Second of all, if my study showed a range of deaths as wide as the Lancet study showed I would not have trusted the results and would have discounted them. Finally, and I think most important, civilian deaths are not, and should not be, the main concern in deciding whether to go to war.
What I actually wrote was this:
So I made a speculative inference, which I plainly labeled as such, about the anti-war agenda of the Lancet investigators, and suggested that their having such an agenda is relevant in any consideration of their research. That is not an inappropriate point to make in the context of a controversy whose participants’ scientific positions correlate closely with their political positions.
BTW, are you arguing that the investigators didn’t have an anti-war agenda? And how many of their supporters do not have an anti-war agenda?
The more I read over the methodology the more spurious I believe the results are.
My own tentative conclusions–
1. I lack the expert knowledge to say if Roberts’ choice of sampling tools was correct. In the absence of compelling reasons to doubt it, their cluster sampling approach seems reasonable.
2. Roberts knew, or should have known, that postwar mortality in Iraq would be geographically heterogeneous. That they did not provide for the possibility of having to treat a violent cluster is prima facie evidence of poor study design.
3. Common sense, not statistical expertise, strongly suggests that the Roberts study was grossly underpowered. The huge ranges encompassed within the confidence intervals are evidence of this. Instead of 33 clusters, some much higher number was required. The infant mortality “check” was ambiguous given the lack of clarity on prewar figures, and presumably infant mortality patterns are less heterogenous than adult mortality patterns.
4. The Fallujah outlier was not handled according to the suggestions of either Kruskal or High, the only two Authorities on the subject that have been cited here.
5. While the study has other problems, I can’t evaluate their seriousness. But every study has problems.
6. S. Love is basically correct in charging that Roberts’ text often muddies the waters between IFC and EFC data. dsquared is therefore incorrect, in my opinion, in claiming that it is clear, and especially in supposing that it would be clear to the intended readership (not statisticians).
7. By Roberts’ own account, this study was done quickly, and expedited through peer-review to make it to print in time to affect the US presidential election. Roberts have lost any presumption of impartiality.
8. If Roberts had the worst of motives for doing this study, it would have been a simple matter for them to “tighten up” the data to achieve a more impressive confidence interval. Who would ever know–certainly not the Lancet’s peer-reviewers! There has been no evidence presented that Roberts acted in concert with jihadis or Baathists in Fallujah, or that they sympathized with them, or that the primary data is anything but what Roberts claim, or that the analysis contains phony numbers. Yes, the text of the article must be parsed carefully because Roberts’ presentation is partisan. Beyond that, S. Love should re-think the bases for the grave charges he has made, presumably in anger, about Roberts’ motives.
I’m going home.
More objectivity from Lancet:
Heiko, I think the number of civilians accidentally killed by coalition forces is approximately 28,433. Roughly 2679 were from ricochets off building walls and another 6992 were the result of traffic accidents.
The rest died from being kicked in the head by American cavalry horses, like those Prussian guys–I derived this result using the Poisson distribution and then pulling it out of my ass.
I’m still working on the number killed by American bombs and rockets used in urban areas, a number I think I can obtain without too much difficulty by sitting in my armchair, doing a bit of googling, and then slapping down any darn figure I choose.
Data available from morgues (eg information on rough proportions of those killed being due to ordinary crime and gunshot wounds), press accounts of bombing incidents, numbers of Iraqi police killed, the numbers published by the Iraqi Health Ministry on people killed in terrorist and military operations, considerations of how many wounded there’d be per death confirmed by the Health Ministry figures, knowledge of other bombing operations, soldiers’ accounts, the experiences of Iraqi bloggers, kill ratio estimates by American officials and independent analysts, and in a few instances actual estimates for specific engagements (like today’s nearly 100 insurgents, or the numbers provided for Fallujah and Najaf which add up to 2000-3000), information available on compensation payments,
they all give a pretty consistent picture, and that’s a significant increase of violent crime, equivalent to something like 10,000 deaths, “insurgent” and regime military deaths around 15,000-25,000 and civilians killed by the coalition around 3000, with small arms fire edging out bombing deaths.
I’ll look up some of the sources for estimates of “insurgent”/Baathist/islamist combatants/terrorist deaths (either kill ratios or analyst estimates).
I am confident that my estimates are much more reliable (not to mention meaningful) than anything you can extract from 2 self-reported incidents ex Fallujah involving 5 women and children, and 1 report of a man dying as a result of aerial weaponry, or for that matter from the Lancet’s Fallujah data, which is contradicted by all other information available on the number of civilians likely to have been killed in that town (including the insurgents own claims).
Heiko — would you care to write up your approach and results?
Hey Amac, I hope you had a good library trip. I find that reading stat books can be exhausting.
Amac, you’ve been pretty upfront about your lack of detailed knowledge here. So I don’t quite understand how you go from starting your post with a disclaimer of specialist knowledge, to making very specific criticisms of the statistical methods of the study. Not surprisingly, because of this, you’ve made a few mistakes.
2. On your “2”, there is no “prima facie evidence of poor study design”, because there is no way in which you can “provide for the possibility of having to treat” an outlier; otherwise it’s not an outlier. If the observed death rates are coming from two different underlying distributions, then there’s nothing you can do to smooth that out, and any attempt to do so would be wrong. The correct study design would be one that gives you the maximum chance of finding out that there are high-violence clusters if there are, and it doesn’t seem obvious to me that this study fails on that criterion.
On your 3, common sense is almost always a poor guide to good statistical practice. The large confidence intervals are evidence of the fact that death is a rare event, and that the variance postinvasion was very large, not evidence that 7800 individuals in 33 clusters was too small a sample. A larger sample would always be better, but this is not a reason to believe that the means and the confidence intervals in the sample you have are wrong without any other evidence (it’s also not a reason to believe in an underestimate rather than an overestimate). The infant mortality figures weren’t anomalous, and your “assumption” that infant mortality is homogeneous is unfounded and likely to be wrong.
On your 4, you’re just wrong and I apologise for not having explained this more clearly. As far as I can see, you’re still wanting an extrapolated number cum-Fallujah to have been given. This would have been a possibly seriously misleading point estimate and could not have had confidence levels calculated for it for the reasons I discussed above (along with a bit of gratuitous showing off about my own stats chops). This is not what Kruskal meant by “carry out the analysis” in both ways; the outlier was dealt with perfectly sensibly.
On your 5, the fact that “every study has problems” clearly isn’t a reason to distrust this one unless you can say what those problems are. In general, I’ve found a distressing tendency among Lancet critics to make airy, general methodological criticisms and shy away from saying what specific errors they think have been caused. I believe this rhetorical technique was originated by Stephen Milloy and it’s one of the reasons I don’t like him.
On your 6, I’m afraid that once more, I have to ask you to be specific. What particular false impression are you saying that “lay readers” of the Lancet would have got from the paper? The only way in which I can think the summary misleads readers is that it inclines them to believe that the most likely figure is 100,000, when based on the full dataset there is reason to believe it is higher. But I doubt that’s what you mean.
General accusations of “muddying the waters” make no sense in this context; the distinction is only important when making an assessment of the statistical validity of the claims, which is something that “lay readers” are not able to do anyway.
Your 7 is argumentum ad hominem, pure and simple, Roberts could be the second coming of Michael Moore and it would not affect one single standard deviation in this study.
Heiko: Does your figure include deaths in Fallujah, Najaf and Samarra? If so, how?
Here is a good example on why it is hard to get casualty figures from Iraq:
US sources claim 85 rebels and 11 Iraqi soldiers were killed when a camp was overrun. The rebels apparently still are in that camp and claim that only 11 of them were killed, and then from aerial bombardment not ground combat. Local hospitals report they have received no bodies. Who do we believe?
Geez, late to the party again. I kept waiting around for more comments at the “Lie in a Lab coat” thread, not thinking that everyone had migrated to a new thread (I don’t frequent Chicago Boyz).
At the risk of coming off as a lazy self-plagiarist, I’m going to repeat some of my points from the previous thread.
Until a defender of the study provides some explanation for the discrepancy between the 30,000 excess deaths (ex-Falluja) claimed by the authors to have occurred through coalition bombing, as opposed to the 17,000 deaths represented by the actual data (6 bombing deaths ex-Falluja), I can’t see how the 100,000 excess death estimate can be viewed as reliable.
Examining the volatility of the small numbers represented by the various excess death subsets takes this one step further. As I mentioned before, an identical study (same methodology, same sample size) is absolutely capable of providing us with 30 bombing deaths ex-Falluja, and in such a manner that none of these deaths could be written off as outlier deaths. That’s if the study defenders are correct that we have tens of thousands of bombing deaths, ex-Falluja. Of course, such a finding pushes the excess deaths from bombing alone well past 100,000, based on the extrapolation practices exhibited by the authors in the actual study.
But that’s only part of the quandary raised by this proposition. It can also be taken in the other direction. Is an identical survey capable of providing a zero result for bombing deaths, ex-Falluja? I say, of course it is. The actual survey counts 6 deaths, from 3 incidents. Can anyone effectively argue that a zero result isn’t reasonably possible from a second survey, given these small numbers?
Before anyone tries, remember what Heiko mentioned earlier. The Lancet study identifies one insurgent death out of the 100,000 excess death estimate, and he’s a “maybe!” The few estimates that coalition and Pentagon spokespersons have publicly released for insurgent deaths are generally from 15,000 to 25,000. Each excess violent death from the Lancet survey represents roughly 3,000 extrapolated deaths. Therefore, the Lancet study picks up 3,000 insurgent deaths (with the “maybe” caveat attached), while the actual toll could be as high as 25,000. With this in mind, I think it’s fair to say an identical survey could report that there were zero deaths from coalition bombings, even though we know this isn’t the case.
What we’re left with then, is this:
The number one cause for the 100,000 excess death figure (as claimed by the authors) is coalition bombing. The authors claim approximately 30,000 dead from coalition bombing, but their own data indicates 17,000. Because of the extremely small numbers making up the number of deaths and incidents from bombing, subsequent identical cluster studies could cause this purported number one cause of excess death to either become non-existent altogether, or dwarf the previous 100,000 estimate all by itself. In essence, the number one cause of death could move from 30,000, to well over 100,000, or to zero, based solely on the neighbourhoods selected for subsequent studies.
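The volatility argument above can be given rough numbers. What follows is only a minimal sketch, assuming that the number of bombing incidents a repeat survey picks up follows a Poisson distribution whose mean equals the 3 incidents actually observed. That is a modelling assumption made here for illustration; it appears nowhere in the study or in this thread. The factor of roughly 3,000 extrapolated deaths per sampled death is the one quoted above, and the function names are hypothetical.

```python
import math

# Figures quoted in the comment above (ex-Falluja).
OBSERVED_INCIDENTS = 3            # bombing incidents recorded by the survey
OBSERVED_DEATHS = 6               # bombing deaths recorded by the survey
DEATHS_PER_SAMPLED_DEATH = 3000   # rough extrapolation factor quoted above


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson model (an assumption)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def extrapolate(sampled_deaths, factor=DEATHS_PER_SAMPLED_DEATH):
    """Scale deaths found in the sample up to a national figure."""
    return sampled_deaths * factor


# Chance that an identical repeat survey records no bombing incidents at all,
# if the true expected count really were 3.
p_zero = poisson_pmf(0, OBSERVED_INCIDENTS)

print(f"P(zero incidents in a repeat survey) = {p_zero:.3f}")
print(f"{OBSERVED_DEATHS} sampled deaths extrapolate to {extrapolate(OBSERVED_DEATHS):,}")
```

Under that assumption a repeat survey would record no bombing incidents about 5% of the time, and 6 sampled deaths extrapolate to 18,000, close to the 17,000 quoted above. The true incident rate is itself uncertain and deaths arrive in clumps, so the real spread could be wider than this simple model suggests.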
I see no way around this. One can’t simply rely on the study’s relative risk ratio increase post invasion and claim that this solves the problems presented by the volatility from the key subset causes of death. First of all, the increase in the relative risk, for the most part, gives us the 100,000 excess death estimate, whether we pretend it doesn’t exist or not. Second of all, certain key death subsets, for all intents and purposes, ARE the excess death figure. If we accept the authors’ extrapolations, deaths from coalition bombing and accidents ALONE account for approximately 54,000 of the excess deaths. If key subsets such as these can’t withstand wild fluctuations from study to study, how can we invest much confidence in a 100,000 excess death estimate, based on a single study?
Moreover, on top of this, there still exists the lack of corroborative accounts, be they anecdotal or media and government supplied, of multiple catastrophic death tolls from coalition air strikes to account for a tens-of-thousands death toll from bombing. I know I’m repeating myself, but the Serbian bombing campaign involved the dropping of 25,000 pieces of aerial ordnance. The most reliable estimates of civilian deaths (Human Rights Watch, the Michael Mandel indictment attempt) number 500 to 1,800. This should give any objective person pause before accepting the 30,000 bombing death estimate, keeping in mind as well that the study authors are on record (letter to the Independent newspaper) suggesting that a further 50,000 to 70,000 deaths may well have occurred in the excluded Falluja cluster, primarily from air strikes.
D Squared, I have great respect for your statistical expertise and intelligence. However, when it comes to this question of bombing deaths, I have to disagree that
“……common sense is almost always a poor guide to good statistical practice.”
Sorry, my link failed. It’s to the Richard Garfield interview at EPIC.
If you take the upper bound of Heiko’s numbers, which are unscientific but consistent with reports, that’s about 40,000 violent deaths, which isn’t too far out of whack with the Lancet paper. However, the Lancet study was done 6 months ago, many of the violent deaths have occurred since then, and its “conservative estimate” is high. The methodology used means that bias in any one direction could skew the results significantly. The samples were also taken of a very reactionary culture at a time when, I believe, anti-Americanism was very high. And it looks like little effort was made to verify the survey results. The results are very imprecise and not very practical.
I’m curious to know what would happen if the study was repeated.
This is absurd. Some basic points and assumptions about the survey are all one should need to consider:
1) The authors had a political motivation, even if how they used what data they could collect is sound. Given this bias, it appears reasonable to assume they went looking for rather than recording data (one can do this without inventing data).
2) The authors didn’t have enough solid data to make extrapolations that are in any way meaningful. It’s pretty clear the sample size was too small.
3) They don’t appear to have factored in the possibility that many areas they went to had populations either fearful of insurgents or sympathetic to insurgents, thus increasing the chance of getting bad data.
4) No other estimate of deaths from any reputable source comes anywhere near the figures they suggest.
5) The authors didn’t appear to attempt to compare the pre-war death rate data they collected with other sources (this is especially relevant given the pre-war death rates we can find from reputable sources mentioned in previous posts).
6) Perhaps most important, they don’t bother to estimate the number of people wounded, which would be ridiculously high given their estimates of those killed by bombing.
7) The timing and locations they chose to collect data in were nowhere near conducive to scientific study. There was no way for the authors to effectively support or even compare their conclusions with records from hospitals, morgues, civil authorities, Iraqi and US military and police, or any other group one would think prudent to consult.
aaron, if you are curious what a repeat study would give I can only urge you to sign this petition:
It’s the “anti-war” crowd that while they accept the Lancet study as the best data available today still wants to see better studies, while the “pro-war” people refuse.
Mike, I remember that you did some good work on this subject (and got hold of the numbers from Garfield) at Tim Lambert’s site, so refresh my memory, please?
Until a defender of the study provides some explanation for the discrepancy between the 30,000 excess deaths (ex-Falluja) claimed by the authors to have occurred through coalition bombing
Where does this claim appear? I can’t find it; that may be because I’ve stared at this article so long that I’m going word-blind, but it may also be that this is a piece of extrapolation not in the actual report. I’ve searched for “coalition”, “bombing” and “30” but I’m coming up blank. Is it in the paper, or was the claim made elsewhere?
If the 30,000 figure is a result of applying the cum-Fallujah breakdown of causes to the ex-Fallujah total excess deaths extrapolation, then it wouldn’t surprise me if it wasn’t in the study, because I would not think that a legitimate calculation. This would be an example of Disputo’s excellent point above about the use of outliers and the distinction between raw and extrapolated numbers. But I seem to remember that you had a quite detailed argument about these numbers over at Deltoid; I don’t suppose you could give me the link so I can bone up on your point, please?
that’s wrong. The US government has through the CIA factbook an opinion on excess death. I’d like to know how they made their estimates, but it’s just not true that there isn’t any.
The Lancet study’s sample size is so small that it cannot estimate civilians killed by coalition aerial weaponry or small arms fire with any accuracy (2 reported incidents literally account for all the children and women killed through the use of aerial weaponry ex Fallujah).
At best, you can argue that it might be large enough to have a stab at death rates.
What I don’t understand about this study and the comments posted here about it is that the base from which all the excess deaths were calculated was taken as “The crude mortality rate was 5.0 per 1000 people per year…” (page 4). But according to the CIA FactBook for Iraq the mortality per 1000 people has been:
1995: 6.82
1996: 6.57
1997: 6.33
1998: 6.57
1999: 6.56
2000: 6.4
2001: 6.21
2002: 6.02
2003: 5.84
2004: 5.66
Now, as I understand it, if the crude mortality rate is much above 5.0, most, if not all of the excess deaths disappear. Can someone who supports the study explain why the study uses 5.0 as the mortality rate? What is their source?
From page 3 of the study: “crude mortality, expressed as deaths per 1000 people per year, was defined as: (number of deaths recorded/number of person-months lived in the interviewed households)x12x1000.”
Would that the CIA were so helpful. Since the post-invasion mortality is measured in the same way, the excess deaths figure stands as the best evidence we have.
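The definition quoted from page 3 of the study is simple enough to write down directly. A minimal sketch follows; the function name is hypothetical and the input figures are made up purely to illustrate the arithmetic. They are not the study's actual counts.

```python
def crude_mortality(deaths, person_months):
    """Deaths per 1000 people per year, following the definition quoted above:
    (deaths recorded / person-months lived) x 12 x 1000."""
    if person_months <= 0:
        raise ValueError("person_months must be positive")
    return deaths / person_months * 12 * 1000


# Illustrative inputs only (NOT taken from the study): 50 deaths recorded
# over 120,000 person-months of household exposure.
example_rate = crude_mortality(deaths=50, person_months=120_000)
print(f"Crude mortality: {example_rate:.1f} per 1000 people per year")
```

Because the same formula is applied to the pre-invasion and post-invasion periods, any comparison between the two uses rates measured on the same footing.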
Heiko, a number without any description of how it was derived is useless, *especially* if it comes from one party to a conflict. Most likely the CIA figure is just some rehash of the Iraqi government figures already known to be inaccurate. If the US government wants to present a credible figure they will have to hire independent researchers who work openly, stating the method they use, but as General Franks so eloquently put it, “We don’t do body counts”.
The Lancet study certainly was too small to make any kind of reliable breakdown of the casualties like the examples you give. The more specific the questions you want to answer, the larger the sample needs to be. However, it was large enough to give a rough estimate of the total number of excess deaths.
Ralph and Heiko; the CIA does not actually provide numbers for “2002”, “2003” and “2004”. They provide numbers for “2002 (est)”, “2003 (est)” and “2004 (est)”. As you no doubt guess, that little three letter word means “estimated”. The factbook does not give any details of how it comes up with that estimate (for which I do not blame it at all; it would be a monster exercise to cite every number in every factbook entry, and entirely not worth doing). I suspect, looking at the “Notes and Definitions” section that from year to year, if they don’t have updated census numbers, they just update the rate for changes in the age structure of the population – so, the Iraq “est” number has been falling because the population has got younger (I have not checked whether the population has actually got younger; if not, then something else is driving the estimates).
However, I am more or less certain that the CIA Factbook estimate is not based on fieldwork, because there has been no census carried out in Iraq since 2002, and no reliable information about death rates for rather longer; all the published work for the last few years has been based on estimates of one kind or another. So the answer to the question “why does the study use 5.0 as the mortality rate?” is “this was the rate that they found when doing their fieldwork and there is no better estimate to work from”. In any case, since the study was structured as a cohort study, they would have had to have used the number they found anyway in order to calculate the risk ratios.
CCW, those demands are meaningless unless they are applied to all sources of information. By your reasoning I would be able to deduce that the Administration’s refusal to perform a proper body count was because the number was so high and we are back where we started.
Of course the factors you mention can be important which is why we have things like peer review. If people who confess to know little of statistics choose to assume that these are the most relevant considerations in an article published in a not famously excitable journal then all is chaos.
In any case the fuss about the Fallujah (current population?) sample is bizarre. Had the Fallujah data been included it would have produced a far higher estimate of death rates. It was not included in the main estimates not merely because the results were extreme but because conditions there made it impossible to conduct the survey exactly in accordance with the planned methodology. The only reason I can see for finding this to be a flaw in the survey is that it was in fact falsely exaggerated by “insurgents”. But as Thomas points out above it is not obvious how the insurgents would spin the numbers if they were given the chance. If Coalition strategy was to force the population to repudiate the insurgents, would they really try to say the Americans would kill them all?
Criticisms of the Lancet report’s methodology, data and bias would be much more convincing and productive if they were conducted in parallel to other statistics on the same subject. So for example we know that Iraq Body Count will systematically understate the numbers. Heiko may eventually improve upon that with a little extrapolation and a lot of triangulation but seems likely to be susceptible to the same tendency and under-reporting of deaths in places of extreme disorder such as Fallujah and Najaf. The CIA numbers are very odd indeed, apparently suggesting that the war cut death rates and appearing in the face of official refusal to provide numbers or conduct counts and the frequent justifications for doing so. For sure it will have problems with bias and is somewhat wanting in openness. Where is the vitriol directed towards the CIA?
Mike, it is not true that the authors claimed 30,000 dead from coalition bombing outside Falluja. One of the authors said that in an interview. I asked him about it and he said that it was a mistake.
Just to clarify that I certainly don’t think that “vitriol” is appropriate toward the CIA for producing their estimated numbers; I would bet dollars to doughnuts that these estimates just fall out of a spreadsheet formula based on the age structure. The CIA is producing a factbook on every country in the world on limited resources and it’s entirely understandable that they should use estimates like this, but that doesn’t mean we should take them at face value.
the Iraqi Body Count database has its uses:
But you need to be aware of what they are counting.
Look at the last few entries eg:
Al Tameen, target Shiite funeral, weapon (suicide bombing), deaths about 50.
If you do this systematically (which is a lot of work, and to present that coherently, with explanations as to how and why one is distributing the blame, it is even more work) you find that only about 1 in 6 of the deaths are actually likely to be directly due to coalition weaponry.
It is also not at all clear that this represents an undercount. The deaths have not all been verified, let alone verified as civilian. For example, General Waed Yussuf Yacoub, officer in charge of the Kirkuk criminal investigation department is listed as a victim. I would find it very hard to describe him as a civilian, let alone a civilian killed by the coalition.
When you see AFP and Al Jazeerah as the sources, it’s likely that they used Iraqi stringers as their primary source. Some of these Iraqi stringers may present biased accounts.
Clearly, some deaths have been missed. I know of some incidents described by Najma that I do not believe have made it into the Iraqi Body Count database. But, on the other hand, the Iraqi Body Count lists a lot of people, where it’s unsure whether they really have been killed, or whether they were civilians (and many that I would definitely not class as civilians).
I think dsquared is probably right that the CIA factbook estimates don’t reflect any new data. I’m pretty sure he’s right that the estimates are based on population age data.
The US has in the past done detailed body counts of enemy fighters, and this was abandoned after accusations that this created incentives for inflating said body counts, be it through lying on the part of soldiers or by soldiers being given an incentive to kill when there were alternatives.
The US was consequently accused simultaneously of providing inaccurate figures and, by providing those figures, of encouraging needless death.
It seems reasonable to me that numbers should be provided by independent sources and the government of the territory in question.
In the case of Iraq, due to the actions of the insurgents, few independent, non-Iraqi analysts would currently wish to perform detailed field work (who wants to end up beheaded merely for being a foreigner?). And for about the last year, there are various figures provided by the Iraqi government. The gap that needs verification is the period between early 2003 and early 2004. I expect this to be dealt with over the next few years, but it’ll require extensive and likely expensive field work, looking at individual cases in depth, and is not really a current priority.
I believe that the numbers are too small for statistical extrapolation to be useful, particularly because the veracity of people’s claims would need to be investigated, and as the deaths will be so heterogeneous and clustered that extrapolation from a random sample of the population will likely be less enlightening than informed extrapolation from known, reported incidents.
Hi Aaron and dsquared,
it’s clearly more than just the age distribution that goes into the estimate.
The CIA factbook gives a death rate for Zimbabwe of 23.3. Its population is much younger and its birth rate (death rates in the first year are very high, a birth rate of 33 and an infant mortality of 10% implies about 3.3 deaths per 1000 inhabitants just for infants) similar to Iraq.
What I want to see is a detailed discussion of the various estimates available by an expert who knows how they have been derived. None has been provided yet by those claiming that the Lancet study is superior to the alternative estimates available.
Zimbabwe is an interesting case, because initially there was some improvement under Mugabe, just like under Saddam. At roughly the same time, things went downhill for Iraq and Zimbabwe, with the rather important difference that in Zimbabwe there were no comprehensive UN sanctions or US invasions.
Merely misrule allowed things to go downhill, something to keep in mind when trying to apportion blame for the rise in death rates under Saddam in the 1990’s.
I confess to not understanding what an AIDS epidemic in Zimbabwe has to do with violent deaths in Iraq. In any case, Heiko, perhaps you would be so good as, pending this mythical survey in which the CIA Factbook decides to put its methodology on display (and attract, presumably, an army of know-nothing critics), to stop trying to throw mud at the Lancet study? This is the methodological version of Kaplan’s fallacy; using uncertainty about methods as an argument for believing that one method is better than another.
“One should openly acknowledge science is political and not be afraid to get stuck into the debate…to me that’s one of the failures of science. It sees itself as being very apolitical, and that’s just nonsense.”
– Richard Horton
Heiko, the Lancet estimate is superior because we know how it was derived. Even with the large uncertainties that is far preferable to any figure that CIA estimates without telling how they did it.
If you don’t agree with this I made an estimate myself that 400,000 civilians have been killed by US troops in Iraq since the invasion, and since I refuse to give any explanation for how I derived that figure you’d better take it at face value and assume it is just as reliable as the CIA figure.
“Any thing CIA is bad”. Make sure you tighten the elastic on your tinfoil hat.
The CIA factbook is a good source. As dsquared pointed out, though, the estimates are based on historical data and trends and not on recent observations and events.
I have issued a challenge to Lancet defenders in this post.
“Can anybody point out information contained in the study, as published in Oct 2004, that would let a real world decision maker make changes to policy, strategy or tactics that would have saved lives in Iraq?”
Please pile on.
I have replied. While you’re issuing challenges, would you care to respond to mine – to either defend or abandon your “cluster sampling critique”?
Thanks for your 1:30am response to my post of 3/23 6:36pm.
My point 2 interpreted Roberts’ failure to anticipate very violent clusters–in a study of known-to-be-heterogeneous violence–as evidence of poor study design. You disagree; we’ll have to leave it at that.
I’m upfront about my lack of specialist statistical knowledge. Like many educated people, I often make informed judgements based on what I have learned, combined with common sense. That you, an expert in this area, see things differently is not altogether surprising. Tellingly, you say “common sense is almost always a poor guide to good statistical practice.” (point 3) As far as the fundamentals of statistical practice, my old stats prof would dissent. I suspect Richard Feynman would, as well.
For point 4, that “the Fallujah outlier was not handled according to the suggestions of either Kruskal or High”: perhaps if you had written the abstract and discussion, the paper would have been a closer approximation to that ideal. The criticism stands, for the reasons already stated.
On 5, you correctly disparage “airy, general methodological criticisms,” but that is itself an airy criticism, not particularly germane to what I have written. Google will assist any casual readers of this thread who are flummoxed by my allusion to “other problems.”
As to my assertion that Roberts “muddies the waters,” (point 6), I’m referencing the contents of the very post to which this thread is attached. My method was to print out S. Love’s fisking, your rebuttal (3/23 2:47am), Disputo’s critique (3/23 6:43am), and then go through the Lancet paper.
Since you accuse me of “argumentum ad hominem, pure and simple,” I’ll copy point 7 in its entirety:
“By Roberts’ own account, this study was done quickly, and expedited through peer-review to make it to print in time to affect the US presidential election. Roberts has lost any presumption of impartiality.” link to account. dsquared, perhaps you can be quick to offer insults to correspondents with a point of view different from your own.
I’ve tried to stick to the main points, as the conversation has moved on. Between us, we’ve supplied enough text and links to allow interested readers to draw their own conclusions. That’s the objective (at least, my objective).
Give up, dsquared ignores what is inconvenient to him and relies on repetition of his assertions and overly verbose writing to distract. It is very likely that the methodology would overstate or understate deaths and the classifications of those deaths. Dsquared is a political hack who doesn’t want his talking points questioned. The one-sidedness of his thinking, as well as his tactics, is evidenced in his posts, which are there for everyone to read.
“I have issued a challenge to Lancet defenders….”
I appreciate Steven DenBeste’s point about giving out free ice-cream and being asked why there are no nuts and raisins in it, but all the same I have to say that, if you decorate a website with pictures of great social scientists and announce therein that you are about to demolish an important scientific paper, your readers are entitled to expect that you will stick with the job until you have either accomplished that task or admitted that the paper is a bit better than you thought.
If, at long last, you concede that the Lancet study was a scholarly piece of work then by all means let’s move on. If not then please come up with something that you wouldn’t be ashamed to put in front of the guys who gave Chicago its academic reputation.
“It says that the figure of 98,000 refers to the 97% of the country represented by clusters other than the Fallujah cluster. 98,000 divided by 0.97 is 101030, which rounds to ‘about 100,000’.”
So they either rounded up from 98,000(2,000) or down from 101,030(-1,030). This in no way changes the validity of my criticism. I am pointing out that they rounded to the more marketable number of 100,000 instead of providing the actual number, whatever that was. There was no technical reason to round at all.
Moreover, if they did calculate the figure as you suggest the researchers were incredibly sloppy. Remember this is a cluster study. Each cluster represents a unique area of Iraq calculated by the demographic software. If a cluster is removed, the remaining cluster must be recalculated to stretch over the area of the missing cluster.
I suppose it is possible that they recalculated everything and ended up at exactly 100,000 but what are the odds of that?
Sorry, Amac, I really am lost in verbiage (largely of my own construction). I’m looking for specific criticisms and coming up short. I don’t understand:
1. What your particular complaint is about the handling of the outlier, and how you think it doesn’t comply with the Kruskal guidelines
2. What specific methodological flaws you think this study has (if you’re endorsing some of Shannon’s, it would help one heck of a lot if you could provide links).
3. What errors you think a layman would be likely to make as a result of the “misleading” summary. I don’t think it is misleading; I think it’s very accurate as a summary of the evidence. Therefore, I don’t think it’s on to just say “this was misleading” without saying how it was misleading. I think I’ve dealt with all of Shannon’s points with respect to this; if you don’t agree, once more, links or excerpts would be very helpful, thank you.
So they either rounded up from 98,000(2,000) or down from 101,030(-1,030). This in no way changes the validity of my criticism.
Hahahahaha. You can’t seriously mean this, can you? Of course it does. Down is not the same as up. You screwed up. Admit it.
Moreover, if they did calculate the figure as you suggest the researchers were incredibly sloppy. Remember this is a cluster study. Each cluster represents a unique area of Iraq calculated by the demographic software. If a cluster is removed, the remaining cluster must be recalculated to stretch over the area of the missing cluster.
This is, to put it bluntly, nonsense. I am shocked that you have the temerity to mention “this is a cluster study” when you have been caught out in a complete fabrication on the subject of cluster studies in the past (one which, I hate to repeat myself, you have yet to admit to). Do you really have so little respect for me that you expected me to fall for this?
Each cluster represents a unique area of Iraq calculated by the demographic software.
Which demographic software? Each cluster represents the thirty households closest to a random geographically selected location and is extrapolated to represent “3% of the population”. The study says, several times, that each cluster represents 3% of Iraq.
If a cluster is removed, the remaining cluster must be recalculated to stretch over the area of the missing cluster
This is just simply meaningless verbiage. It really doesn’t refer to anything. The transformation involved is just a scaling.
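For anyone who wants to check the arithmetic being argued over, the scaling is a single division. A minimal sketch, using only the 98,000 and 97% figures quoted in this thread; the function names are hypothetical, and rounding to the nearest 10,000 is an assumption about what “about 100,000” means, not something stated in the paper.

```python
def scale_to_whole_country(estimate, share_covered):
    """Scale an estimate covering part of the population up to all of it."""
    return estimate / share_covered


def round_to_nearest(value, step):
    """Round value to the nearest multiple of step."""
    return int(round(value / step) * step)


ex_fallujah_estimate = 98_000   # figure quoted in the thread
share_covered = 0.97            # 32 remaining clusters, each taken as ~3% of Iraq

scaled = scale_to_whole_country(ex_fallujah_estimate, share_covered)
print(f"Scaled estimate: {scaled:,.0f}")
print(f"Rounded to nearest 10,000: {round_to_nearest(scaled, 10_000):,}")
```

On these figures the scaled value sits just above 101,000, so “about 100,000” is reached by rounding down, which is the point in dispute above.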
The CIA World Factbook is also the first hit on Google for a search on ‘Iraq’, so the web community seems to find it useful (or the CIA manages to trick its way up the rankings, or whatever).
But comparing their stats to the Lancet’s is an apples and oranges game, and it’s probably wise to keep them out of the equation(s) here.
Someone asked for and someone else (dsquared, actually) provided the Lancet’s error bounds for their post-war mortality calculations with and without the Fallujah cluster, viz:
The mean and confidence interval for the post-invasion death rate including the Fallujah cluster is given on page 4, bottom right-hand corner. It’s 12.3, CI 1.4-23.2, design effect 29.3
Now one noteworthy number here is the “design effect” (DE) of 29·3. This is a measure of the reliability of a survey using cluster sampling (CS) when compared to a similar-sized survey using a simple random sample (SRS). Where the DE is 2, the effective sample size of a random CS survey when compared to an SRS with the same number of sampling units is halved.
So to drop the acronyms, a random cluster sample survey of 1000 households whose “design effect” is 2·0 will have the effective sample size – and reliability – of a simple random survey of 500 households where each was selected independently of all the others (you can guess why that isn’t done very often, and if you can’t, just think gasoline prices, travel time, wages, money).
And how does anyone know what the design effect of a particular survey will be? Well, you can’t work this out until you’ve collected at least some results, because the variation between clusters plays an important role in calculating it. Suffice it to say that the smaller the variation found to exist between clusters, the smaller the design effect, and the less a survey will suffer in comparison to an SRS, the (expensive) ideal.
Readers who’ve been paying attention will be able to work out for themselves the “effective sample size” of a cluster sample survey of 988 households whose design effect is 29·3. That the authors of the Lancet study decided to drop the Fallujah outlier is hardly surprising when you consider their discovery that:
Much better to base estimates on an effective sample size of 494 households than on a sample with an effective size of 34!
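The effective sample sizes quoted above follow from dividing the number of households by the design effect. A minimal sketch of that arithmetic; the function name is hypothetical, and the design effect of 2.0 in the second call is simply the value implied by the 494 figure above, not a number taken from the paper.

```python
def effective_sample_size(n_units, design_effect):
    """Size of the simple random sample that would be as reliable as a
    cluster sample of n_units with the given design effect."""
    if design_effect <= 0:
        raise ValueError("design_effect must be positive")
    return n_units / design_effect


# Generic example from the text: 1000 households, design effect 2.0.
print(effective_sample_size(1000, 2.0))

# With the Fallujah cluster: 988 households, design effect 29.3.
print(round(effective_sample_size(988, 29.3)))

# The 494 figure corresponds to a design effect of 2.0 on 988 households.
print(effective_sample_size(988, 2.0))
```

The larger the variation between clusters, the larger the design effect and the smaller the effective sample, which is why a single extreme cluster has such a large influence here.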
I leave it to fact-checker extraordinaire dsquared to issue all needful corrections, etc., regarding the above.
dsquared asked AMac (1:09pm):
What is your particular complaint about the handling of the outlier, and how do you think it doesn’t comply with the Kruskal guidelines?
Kruskal: “My own practice in this sort of situation is to carry out an analysis both with and without the suspect observations. If the broad conclusions of the two analyses are quite different, I should view any conclusions from the experiment with very great caution.”
AMac, 3/23 5:18pm: The claim of [S. Love’s fisking is] that Roberts misleads the Lancet’s generalist audience, conflating the IFC and EFC analyses to produce a desired result.
AMac, 3/24 11:59am: [I came to agree with S. Love’s charge that Roberts conflated IFC and EFC analyses.] My method was to print out S. Love’s fisking, [dsquared’s] rebuttal (3/23 2:47am), Disputo’s critique (3/23 6:43am), and then go through the Lancet paper.
dsquared, of course you disagreed with me, and will continue to do so. I am not expecting consensus. Interested readers can do as I did, and draw their own conclusions. Perhaps some will find your and Disputo’s points to be convincing.
What specific methodological flaws do you think this study has?
This refers to my point 5 of 3/23 6:36pm, reproduced in its entirety: While the study has other problems, I can’t evaluate their seriousness. But every study has problems.
Let me rephrase this minor point: I’ve focused my writing to where I thought I might advance the discussion. Critics have highlighted other potential problems. While every study has problems, elucidation of the facts and contexts of these other aspects of Roberts will depend on exchanges between those critics and Roberts’ defenders.
What errors do you think a layman would be likely to make as a result of the “misleading” summary?
(This response assumes no serious methodological flaws in the study.)
–That the study has clearly shown that, conservatively, at least about 100,000 excess deaths have happened since the 2003 invasion of Iraq. This Oh That Liberal Media website post gives seven examples of reporters who, indeed, took that as the take-home message.
–That the statement “after the invasion, violence as the primary cause of death” is supported by much stronger data than 21 violent deaths out of 90 total, or 73 violent deaths out of 142 total, if the excluded cluster is re-included. Recall Kruskal.
–That “Violent deaths were widespread, reported in 15 of 33 clusters, and were mainly attributed to coalition forces” is supported by much stronger data than 21 violent deaths in the non-outlier clusters, of which 9 were reported as due to coalition action. Yes, the sophisticated reader can glance up two paragraphs, see 33, mentally subtract 1, and deduce that the sentence must refer to outlier-included statistics. Recall Kruskal.
–That “Violence accounted for most of the excess deaths [antecedent for most: “about 100,000”] and air strikes from coalition forces accounted for most violent deaths” is supported only if the Fallujah cluster excluded from the immediately-prior sentence is re-included for this statement. More than violating Kruskal’s precept, this grammatical sleight of hand invites misinterpretation.
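The counts cited in the points above can be laid side by side in a quick sketch. It uses only the numbers already given in this comment (21 of 90, 73 of 142, and 9 coalition-attributed deaths); the variable names are illustrative, and nothing here attempts the paper's own rate or confidence-interval calculations.

```python
# Death counts as cited in the points above.
violent_excl, total_excl = 21, 90    # Fallujah cluster excluded
violent_incl, total_incl = 73, 142   # Fallujah cluster included
coalition_excl = 9                   # coalition-attributed, Fallujah excluded

share_excl = violent_excl / total_excl
share_incl = violent_incl / total_incl
coalition_share_excl = coalition_excl / violent_excl

# What the single Fallujah cluster contributes, by subtraction.
fallujah_violent = violent_incl - violent_excl
fallujah_total = total_incl - total_excl

print(f"Violent share of deaths, Fallujah excluded: {share_excl:.1%}")
print(f"Violent share of deaths, Fallujah included: {share_incl:.1%}")
print(f"Coalition share of violent deaths, Fallujah excluded: {coalition_share_excl:.1%}")
print(f"Fallujah cluster alone: {fallujah_violent} violent of {fallujah_total} deaths")
```

On these counts, violence is the majority cause only when the outlier cluster is included, which is the Kruskal point.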
As you say, you think the Summary is an accurate account of the evidence. Again, I am not expecting consensus. Again, interested readers will draw their own conclusions.
I don’t think that going further rounds will induce many long-distance thread-readers to change their opinions on these matters. Though I won’t post again on this, I will of course look forward to reading any response, likely disagreeing with much of it.
Thanks also for the overall politeness you have shown me on this thread, though this appreciation is blunted by the rather, um, vivid language you and your pals use at your own site in describing the mental faculties and personality traits of those of us who have commented negatively on Roberts. I confess the latest Crooked Timber thread has the flavor of one of Michael Kinsley’s gaffes, (paraphrasing), what happens when a politician says what he really means.
The CIA world factbook is very useful – as a collection of publicly available information. If I want to look up the size of the population of a country, its area or stuff like that I often use it. On the other hand they are trying to fill in a standard form for all countries, and that means that if they don’t have a number they guess, as one must assume they did when it comes to Iraqi mortality.
Love, it is common practice to round a number to indicate the number of significant digits in it. A number of 98,000 would suggest that the real number was between 97,500 and 98,500, which is misleading. Yes, you can specify the standard deviation or 95% confidence interval to give an exact description of the uncertainty, and that was what was done in the article, but to the wider audience that would just be confusing, and even worse, use it in a press release and you can be certain that some journalists will drop that qualifier and just say 98,000 dead. Rounding to 100,000 was the most honest thing to do.
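To make the rounding convention concrete, here is a small sketch. It assumes the ordinary reading of significant figures, where a reported number is taken to be accurate to within half a unit of its last significant digit. The helper names are made up for illustration, and this is not the confidence-interval calculation from the paper.

```python
import math


def round_sig(value, sig):
    """Round a positive integer to `sig` significant figures."""
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig - 1 - exponent)


def implied_interval(value, sig):
    """Range a reader would infer from a number quoted to `sig` figures."""
    exponent = math.floor(math.log10(abs(value)))
    half_width = 0.5 * 10 ** (exponent - sig + 1)
    return (value - half_width, value + half_width)


# 98,000 quoted to two significant figures suggests 97,500 to 98,500.
print(implied_interval(98_000, 2))

# Rounded to one significant figure, the same estimate becomes 100,000.
print(round_sig(98_000, 1))
```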
“I’m upfront about my lack of specialist statistical knowledge. Like many educated people, I often make informed judgements based on what I have learned, combined with common sense. That you, an expert in this area, see things differently is not altogether surprising. Tellingly, you say “common sense is almost always a poor guide to good statistical practice.” (point 3) As far as the fundamentals of statistical practice, my old stats prof would dissent. I suspect Richard Feynman would, as well.”
Google on “Kahneman Tversky”. Not only do laypersons (non-statisticians) do HORRIBLY in judging situations that rely on probabilistic reasoning, even statisticians do horribly on it. Dsquared is right on and there is voluminous evidence to support his notion that “common sense” gets statistics wrong.
D Squared, Tim:
Sorry for not replying sooner, had to get some sleep. Thanks for the compliment D Squared.
You’re both correct that the 30,000 bombing death estimate, ex-Falluja, does not specifically appear in the study text. This was arrived at from two sources. One is the EPIC (Education for Peace in Iraq) interview that study author Richard Garfield gave in November 2004. This is the interview that Tim is referring to when stating that he contacted Garfield, and Garfield is now on record that he was “mistaken.”
I’ve provided the URL to the Garfield interview in my previous post. The article contains a bar graph apportioning the deaths from the 100,000 excess death estimate. 30,000 deaths are attributed to coalition bombing, and 57,600 for all violent deaths. Just below the graph, Garfield makes this statement, which he has now apparently corrected:
“In areas of Iraq, with the exception of the North, all had a rise in the mortality rate and most were due to violence. Real change was in deaths due to violence.[The majority of the 57,600 deaths due to violence was attributed to air assaults.]”
The second source for the 30,000 bombing death estimate comes from the study itself, and is contained in its “money quote,” from page one:
“Interpretation: Making conservative assumptions, we think that about 100,000 excess deaths, or more have happened since the 2003 invasion of Iraq. Violence accounted for most of the excess deaths and air strikes from coalition forces accounted for most violent deaths.”
While there was some discussion at Deltoid that the authors may have also been lumping the Falluja deaths in when making this conclusion, it seemed to me that the consensus was this premise couldn’t be defended, especially after you yourself linked to the Garfield interview on December 15th. The fact that one of the study authors had pegged ex-Falluja bombing deaths at 30,000 put an end to any substantive argument that the authors were including Falluja.
I think we had all come to accept that the study’s “Interpretation” statement simply couldn’t have been made so sloppily and ambiguously, given its importance to the study. Since the risk of violent death had risen 58-fold post-invasion, and based on the above statement, it was agreed at Deltoid (and elsewhere) that 58,000 of the 100,000 excess death estimate came from violence. For simplicity, the rounded figure of 60,000 was consistently referred to at Tim’s site. Again, based on this key statement from the study itself, that the majority of violent deaths were from coalition air strikes, the figure of 30,000 bombing deaths was adopted.
To be fair Tim, I think you should acknowledge that the 30,000/60,000 figures were adopted as accurate by yourself, myself and other regular commenters during the early Lancet threads at your site. D Squared, I believe you were also in the boat with the rest of us, but I can’t confirm that without retracing all of Tim’s very lengthy threads. I think I’m on firm ground when I say I don’t recall you arguing to the contrary at the time.
Just to set the record straight then Tim, I think it’s somewhat unfair to imply that I’ve been pushing inaccurate numbers of my own making in this area. Here’s a quote from you on December 9th, from your “Lott on the Lancet” post, November 28th:
“Mike, thanks for getting the breakdown of the figures. The pre and post-invasion periods were of unequal lengths, so you would need to adjust for that when working out how many excess deaths there were. I think you get about 60% from violence (not counting Falluja).”
I’ll understand if you don’t want to reveal the contents of Garfield’s explanation, out of respect for his privacy. If on the other hand you could provide some details concerning his correspondence with you, I’d be very interested. I’m curious as to whether he provided any explanation on HOW he managed to make this error.
I have several concerns with the revelation that Garfield has retracted his figures and statement from the EPIC interview. Of course the most obvious one is, how could one of the study’s leading figures get something this crucial so wrong? I’m also concerned with the fact that other study authors such as Les Roberts and Gilbert Burnham reinforced in interviews the point that the 100,000 figure excluded Falluja. In my view, this very much calls into question their motivation for wording the key “Interpretation” statement the way they did. This also renews the controversy over why the authors failed to clarify in the study the breakdown and distribution of non-coalition caused violent deaths. Without clarification obtained from the authors outside the study, it was impossible to discount the Interpretation statement that most of the excess violent deaths in the 100,000 estimate were from air strikes. Now, we not only know this statement is incorrect, we also have one of the authors admitting it.
I’ve seen many heap blame on the media for the widespread misrepresentation of the 100,000 death estimate, while at the same time deflecting blame from the authors for this. Tim, I think your revelation concerning Garfield’s admission changes this dramatically, and in fact the authors deserve the lion’s share of the blame for this.
Mike, I’m pretty sure I never endorsed a 30K figure; I seem to remember specifically bowing out of that thread because I didn’t have the time to devote to the sort of work that you were doing.
Garfield is AFAICT off base here; the data does not support robust estimates on the breakdown numbers, although it does on the overall risk ratios. My very first Lancet post said right up front that I don’t like extrapolated numbers and this is a good example of why; you can’t base extrapolated numbers on outliers, but you also can’t discard the information from clusters like Fallujah. I’d note that, of course, this is the benefit of a really good peer review like the Lancet study received; it weeds out loose and indefensible statements like this one, which is why there aren’t any in the actual paper.
On the other hand, in my second most recent Lancet post, I pointed out that it is a bit much for people to be banging on about this or that detail of the breakdown of the extrapolated number when they are a) visibly not calling for a more accurate, better-resourced study to be carried out b) visibly not condemning the coalition forces for “not doing body counts” and c) trying their level best to minimise or ignore the whole study and its conclusions, which are robust.
(neutral observers will see that there is a sort of travelling “Lancet study community” who swap comments on all sorts of different websites and remember all their past debates. You can join this community if you like; the only real membership criterion is that, because we’re a bit serious about the science, we don’t call people “scientific whores”)
Just in case it wasn’t clear enough above, Mike has a genuine and valid criticism here; Garfield’s statements, although not made in the Lancet paper, were a genuine mistake and should not have been made. Score twenty points for sensible Lancet critics who do the work and base their arguments on facts about the data rather than baseless accusations about motives.
Thanks again for the kind words. I do in fact recall you expressing your disagreement with the use of extrapolated numbers, particularly attributions for individual death subsets. I may well be wrong to suggest you accepted the 30,000/60,000 figures.
That being said, the extrapolated numbers are to some extent simply another way of expressing the increase in the relative risk. In other words, criticism of the extrapolations is also criticism of the accuracy of the increase in the relative risk. The data used to calculate both are the same.
I won’t bother to repeat my argument from my original post, dealing with my perception of the volatility from study to study of certain key death subsets, partly because the argument has admittedly lost some of its punch in the wake of Garfield’s correction.
For the record, I have been in agreement with Tim at Deltoid over the need for additional studies. On one occasion, I suggested the best option might be an actual count for the Falluja cluster, accompanied by a broader cluster sampling for the rest of the country. However, now I’m of the belief (as I mentioned in my post at the “labcoat” thread here) that an actual count of all violent deaths is the best and most accurate approach, at least for purposes of determining the violent death component of the aggregate excess death figure. As I mentioned previously, such a process is nearing completion in the Former Yugoslavia, and is shattering the previously accepted death figure of 250,000.
I do have one disagreement with you, D Squared, in relation to your statement that
“I’d note that, of course, this is the benefit of a really good peer review like the Lancet study received; it weeds out loose and indefensible statements like this one, which is why there aren’t any in the actual paper.”
I believe the contentious “Interpretation” statement on the study’s page 1 must clearly be viewed as a “loose and indefensible statement” when dealing with the portion of it that labels coalition air strikes as accounting for the majority of violent deaths, especially now that Garfield has retracted his near-identical statement from the EPIC interview. I agree though, that this has no effect on the aggregate excess death figure or the risk ratio.
Mike, I’m pretty sure I never said that there were 30,000 dead from bombing. I did say that of the 60,000 violent deaths, half were from coalition activity. Garfield’s exact words were
Thanks for the additional details regarding Garfield’s correspondence with you. However, from my perspective, it muddies the water further in the context of the Interpretation statement from the study. And yes, I think you’re correct that you didn’t explicitly commit to a 30,000 bombing death figure.
When Garfield states that
“‘Majority of deaths from the air’ was mistated as being based on non-falluja experience.”
is he referring to a misstatement in both his EPIC interview and the Interpretation statement from the paper, or just the former? Clearly, his last statement to you only refers to his own error at EPIC.
I can’t see how he can separate his error at EPIC from the statement in the paper, because they’re essentially saying the same thing, and in doing so, making the same error.
To put this another way, is this really Garfield’s faux pas, or do all the authors share it? I can’t reconcile the fact that Garfield’s error also seems to be identical to the error made by observers and commenters not involved in the study, and was based largely on the key statement from the study. I believe the 60,000/30,000 figures had actually been adopted at your site before we were aware of Garfield’s interview.
In the study, and by many in the “anti-war” crowd, excess death and “war dead” caused by the coalition seem synonymous, when they clearly aren’t.
Traffic accidents are just not morally equivalent to soldiers shooting children carelessly.
And excess death for this conflict doesn’t have to be defined over such a short period as two to three years. The invasion will have effects for decades, and not just in Iraq, but worldwide.
Worldwide, Unicef claims that 11 million children die who wouldn’t have to. There are a million traffic accidents, and the death tolls from AIDS and malaria.
The true impact of the invasion lies in the longer term picture, and whether it contributes to a stable, democratic Mid East, that allows Asian nations to grow and substantial resources to be devoted to solving Africa’s problems.
Of course, not everything in the future depends on the outcome in Iraq, but the figure of excess deaths or people saved over the next few decades will very likely dwarf the impact on the death rate in Iraq that’s been experienced so far.
Excess dead can disappear, and excess “lives saved” can be cancelled out by all the lives lost in a future civil war and chaos in the Mid East.
War dead stay war dead, a soldier who shoots a child cannot make that undone, no matter how much infant mortality falls.
The moral culpability for war dead is very different from excess death.
I would like to close with the role of accidents in the claimed excess death figure.
After thinking the matter through, I think that the importation of hundreds of thousands of second hand cars has the potential for 10 or 20,000 extra deaths.
Iraq has many young people, people in an age group known to tend towards risky driving. I believe many of those would have had access to a car for the first time over the last two years, so not only do we have more young drivers on the roads, but worse drivers with hardly any experience, who’d have to cope with much more crowded roads, and who’d be driving very old cars.
The Lancet study’s results would be consistent with an extra toll of over 20,000 from road traffic accidents. But should they be counted as “war dead”?
“The moral culpability for war dead is very different from excess death.”
Heiko, this is something I think everyone here agrees on, and nowhere in the Lancet study do the authors equate war dead with excess dead. Critics and people using the study for political statements often do it, but that is hardly the fault of the authors.
I suppose I fault the authors for this “confusion” more so than you do.
“The information that Roberts provide says that all 12 of the non-Coalition attributed deaths occurred outside of Falluja.”
Boy, that seems really odd, considering all the execution chambers, torture chambers et al. found in Fallujah by the Coalition forces.
One wonders how those deaths could have failed to make the study. Methinks their statistical analysis wasn’t the only thing in here that was severely flawed.
The obvious control to all this theoretical debate is the US military’s casualty figures. The US has suffered ~10,000 disabling casualties, i.e. soldiers wounded in battle who needed hospitalisation or mortuarisation. Only about 1500 of these have died, about half the usual fatality ratio, mainly thanks to the US’s superb emergency care facilities.
The US military has about 150,000 soldiers in-country, about ten times the estimated size of the actual guerilla force (this is the standard counter-insurgency ratio). That is about the same number of Iraqi persons who, the study predicts, have received disabling wounds in this conflict. (Obviously the Iraqi medical facilities are inferior, thereby giving a higher mortality ratio.)
Why would it be surprising that the US would inflict one Iraqi military casualty per annum for every one soldier that is on active service in this theatre of operations? Given the kind of firepower at its disposal, the urban context of the battle and the popular nature of the insurgency it is surprising that the US has not inflicted more casualties. (I put this down to smart weaponry, careful targeting and discriminatory use of fire power.)
Or, to put it another way, I would not be at all surprised to see that the US killed ten times as many insurgents as it lost with serious wounds. ie 10,000 US WIA to 100,000 Iraqi KIA. The US military’s force amplifiers are easily an order of magnitude greater than the organic weapons used by the insurgents.
After thinking the matter through, I think that the importation of hundreds of thousands of second hand cars has the potential for 10 or 20,000 extra deaths
Think some more. This would be tantamount to assuming that 1 in 10 of the new drivers killed someone in a car in their first year of ownership. That takes quite some doing when you spend most of your time queuing for petrol.
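The ratio is easy to reproduce under stated assumptions. “Hundreds of thousands” of cars is not a number, so the fleet sizes below are hypothetical round figures chosen only to bracket the claim; the death tolls are the 10,000 and 20,000 quoted above, and the function name is made up for illustration.

```python
def deaths_per_new_car(extra_deaths, imported_cars):
    """Extra road deaths implied per imported car."""
    return extra_deaths / imported_cars


# Hypothetical fleet sizes; the original comment gives no exact figure.
assumed_fleets = (100_000, 200_000, 500_000)
quoted_deaths = (10_000, 20_000)

for cars in assumed_fleets:
    for deaths in quoted_deaths:
        ratio = deaths_per_new_car(deaths, cars)
        print(f"{deaths:>6,} deaths / {cars:>7,} cars = 1 in {1 / ratio:.0f}")
```

The 1-in-10 figure corresponds to 20,000 deaths among 200,000 cars, or 10,000 among 100,000; larger assumed fleets give a lower ratio.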