Vindication Is So Sweet

Way back in October of 2004 I posted a critique of a study published in the Lancet that purported to show that:

…about 100000 excess deaths, or more have happened since the 2003 invasion of Iraq. Violence accounted for most of the excess deaths and air strikes from coalition forces accounted for most violent deaths.

I called foul immediately, and I ended up writing a series of posts detailing my arguments. Now I find out from Michelle Malkin (via Instapundit) that David Kane, Institute Fellow at the Institute for Quantitative Social Science at Harvard University, has authored a paper, soon to be presented, that demonstrates using detailed statistics just how deceptive (my adjective) the original study was.

Kane shows that if the Falluja cluster is included in the statistical calculations, the confidence interval dips below zero, which is a big no-no. Since the study’s raw data remain a closely guarded secret, Kane cannot be absolutely certain that the inclusion of the Falluja cluster renders the study mathematically invalid…

…but that’s the way to bet.

In science, replication is the iron test. I find it revealing that no other source or study has come close to replicating the original study. All my original points still stand.

Ah, vindication is sweet.

58 thoughts on “Vindication Is So Sweet”

  1. Shannon – Congratulations. And thanks for your work then.

    (By the way, there was one very active leftist — I forget his name — who defended the studies endlessly. Has he been heard from?)

  2. Shannon, I think you’ve really done a lot of good work on this and deserve your pat on the back. I wish I had seen all of your stuff earlier. The crux of the problem is that people can churn out this kind of rubbish quite quickly, but it takes a lot of effort and diligence to dispute it. I was also posting on the subject a number of times. I have a summary of problems that I found in the study here. It contains some issues that haven’t been addressed elsewhere.

  3. Good for you, Shannon. My own hackles were raised by the fact that the SD was greater than the mean – usually, but not always, a sign of poor sampling and invalid statistics. You did a great job putting numbers to all our doubts. As my advisor used to say – if you can’t put a number to it, it’s a religious discussion.

  4. Careful Shannon. The dispute is ongoing, with numerous challenges to Kane’s claims. The Lancet authors say they stand by their results.
    You may also want to consider how civilized society regards people who insist, for example, that no more than 2 or 3 million were killed by the Nazis, as opposed to the widely quoted 6 million figure. People tend to question their motives before or without even considering their math.

  5. “The dispute is ongoing, with numerous challenges to Kane’s claims.”

    I don’t mean this sarcastically — In all seriousness, what challenges?

    “You may also want to consider how civilized society regards people who insist, for example, that no more than 2 or 3 million were killed by the Nazis, as opposed to the widely quoted 6 million figure. People tend to question their motives before or without even considering their math.”

    When I’ve seen this come up, that’s usually not the comparison. I’ve just never seen somebody who is claiming only 3 million people died in the Holocaust have their motives widely impugned.

    What is common is for someone to claim it was only a few hundred thousand and then be attacked. In that situation, they can say it was minor compared to everything else that happened–and because that estimate is so far from everybody else’s and is such a different event in its magnitude, its effect on the world, its effect on Jews, and its place among large genocides in history–that it’s the equivalent of saying the Holocaust didn’t happen at all.

    With Lancet, you had one study claiming 600,000 deaths and literally every other study claiming it was a fraction of that. I have not seen a single other study even remotely in the ballpark of what Lancet claimed.

    So you had a situation where war opponents could finally latch on to a figure and say we were the equivalent of Saddam Hussein (a stat which most anti-war activists I’ve dealt with now take at face value), and everything else, if it wasn’t for Lancet’s claim, pointing to that analogy as being very difficult to make. IBC data, UN data, and even Iraq Body Count all contradict Lancet.

    In this case, the Holocaust analogy is reversed. Lancet is the outlier, not its critics. It’s the one who’s saying something very different happened than everybody else.

  6. Pingback: Gay Patriot
  7. Oliver Suess-Barnkey,

    The dispute is ongoing, with numerous challenges to Kane’s claims

    I don’t know. Looks like Kane is holding his own so far. His main problem is that the raw data for the study was never provided to anyone and has now “been lost” so Kane has to infer what the data was.

    Besides, I found another study which appears to refute at least one of the main and most often quoted assertions of the 2004 Lancet studies. I am working on a post of it now.

  8. You may also want to consider how civilized society regards people who insist, for example, that no more than 2 or 3 million were killed by the Nazis, as opposed to the widely quoted 6 million figure. People tend to question their motives before or without even considering their math.

    Which is pretty much what Roberts and friends are doing. They are lying to foster the fortunes of the remains of a Fascist regime. You might want to consider how civilized society regards liars who seek to frustrate the cause of democracy and self-determination.

    What you don’t understand is that there is something far more important going on here than the mere politics of the moment. Major scientific institutions have been hijacked. In effect, Johns Hopkins and the Lancet are complicit in a self-serving lie. If we don’t fight this thing here our scientific institutions will collapse and we will have huge problems long after Iraq has become a footnote in history.

  9. “His main problem is that the raw data for the study was never provided to anyone and has now “been lost” so Kane has to infer what the data was.”

    There’s no good reason not to have provided the raw data to other researchers, especially critical ones like Kane, but a host of bad reasons not to do so. Then it mysteriously disappears.

    Funny, how I still have my old, yellowed, index cards with handwritten notes from grad school research and the Lancet has lost the original data from a blockbuster and highly controversial study published in one of the world’s most prestigious journals. And the editors have no copies either.

    Just think of the odds.

  10. Shannon, the political nature of your motives is transparent here. Any fair-minded reader can conclude beyond doubt that you view Lancet/Johns Hopkins’ conclusions substantially as an affront to your political views about the war.

    There is nothing necessarily wrong with political motives, other than the attempt to camouflage them as a desire to defend “scientific institutions.”

    As for your declaration of “vindication” in an ongoing dispute, why do you think you should declare victory when the researcher you’re relying on, by your own admission, is merely “holding his own” in the dispute?

    Shannon writes:

    “His main problem is that the raw data for the study was never provided to anyone and has now “been lost” so Kane has to infer what the data was.”

    This is case-closing evidence that Kane’s observations are speculative as regards the Lancet/Johns Hopkins results. Careful reading of Kane’s comments shows that he presents his critique as just that and, moreover, painstakingly avoids assertions about Lancet/Johns Hopkins’ motives.

    If Shannon’s motives were scientific, rather than political/rhetorical, why would he declare victory based on inferences versus data?

    I hope Shannon shares my unalloyed desire for an accurate count of civilian deaths caused by the U.S.-led invasion of Iraq. If he does, surely he will agree that the best way to dismiss the Lancet/Johns Hopkins figures is to come up with more verifiable ones, which, neither Kane nor anyone else I’m aware of, has done or even claimed to do. I’m sure all readers welcome sincere contributions to that effort.

    Tatyana, You seem to have misinterpreted my comments. I wasn’t comparing the U.S. military to its WWII German counterpart. I was comparing people who use statistical analysis to question the extent of the holocaust to people who use statistical analysis to question the extent of civilian deaths in Iraq.

    I made the comparison as a suggestion to Shannon that rhetorical assertions about Lancet’s motives doom his credibility as an objective scientific and/or mathematical observer.

    We live in an era of historical revisionism. Japanese “scholars,” many with nakedly political motives, increasingly question the death toll of the Nanking massacre. People on several political sides of discussions about Cambodian history debate the Khmer Rouge death toll. Journalists cite the U.S.-Vietnam war native death toll as variously between 1 million and 3 million.

    In each of these cases and others like them, fair-minded people will be skeptical about the assertions of people with obvious political/ideological stakes in the discussion.

  11. “I hope Shannon shares my unalloyed desire for an accurate count of civilian deaths caused by the U.S.-led invasion of Iraq. If he does, surely he will agree that the best way to dismiss the Lancet/Johns Hopkins figures is to come up with more verifiable ones, which, neither Kane nor anyone else I’m aware of, has done or even claimed to do.”

    Oliver, the UN and Iraq Body Count both have figures for this and they’re far different from Lancet. Every study I have seen on this (and there have been a number) differs from Lancet and the fact that, despite that, its numbers are the ones most people are familiar with is what irritates everybody here.

    The UN and IBC both have figures for total civilian deaths (not deaths just from coalition forces as the Lancet study said it had provided), but total deaths by everybody, including terrorist attacks on civilians (which caused the bulk of the deaths), and including regular crime in Iraq – Iraq Body Count put it at ~60,000 and the UN put it at ~70,000.

    Again, though, this is including regular crime (not just from the war). According to the UN, those numbers drop to 20,000 to 30,000 (from 2003-2006) if you take crime out of the picture.

    Now, that sucks, but it’s a far cry from Lancet’s 100,000 excess deaths by coalition forces, it’s a far cry from most of the deaths being the result of our air strikes, and it’s an incredibly far cry from the Lancet study that said 600,000 people had died as a result of the war.

    A bunch of people have compiled figures on this throughout the war, both before and after Lancet, and every single one of them that I’m familiar with differed greatly from it. Lancet, however, fits the narrative that most anti-war activists wanted and so those were the figures that they publicized. Lancet is the one who’s contradicting all other scholarship on this. It’s the one who’s being revisionist here, not the people who critique it.

  12. Another problem Kane encounters is that the authors of the study refuse to release their code. Deltoid is a great source to follow these things. Warning – don’t jump in unless you are up to their level – just follow it for what it is worth.

  13. AD–You’re not correctly describing what the 2004 Lancet article asserted. It’s not entirely your fault–the summary at the beginning and many leftwing summaries are inaccurate in the same way. The claim that most deaths came from air strikes only follows if you include the Fallujah cluster. Excluding Fallujah, there were (taking the midrange figure) about 60,000 deaths from violence and roughly 40 percent of these were from coalition forces.

    IBC doesn’t claim it counts all the deaths. They don’t believe the Lancet figures, but they do acknowledge that their numbers are an undercount.

    I’m agnostic on the numbers myself. It’s significant that there are large scale polls done in Iraq as recently as early 2007, but nobody has tried to replicate the Lancet studies and determine the true number once and for all. If the US government wanted to know the truth, they could have sponsored an enormous survey by independent researchers, but they never have.

  14. Donald, touché, but I’d argue it’s equally significant the Lancet guys didn’t release a lot of their data for review either.

  15. As I said, the Lancet/Johns Hopkins numbers are to date the most verifiable. I never suggested they were the only numbers.

  16. Oliver’s comments are very illustrative of a type of argument the left has used successfully and frequently in attacks on the morality of “right-wing” entities.

    What they do is bring up a ridiculously enlarged body count for “right-wing” entities while denouncing them as lacking morals, and when someone (a) challenges their numbers or (b) compares them with their left-wing antagonist, leftists accuse them of dishonoring the dead by engaging in an “accounting of death”. Under this scheme, the leftist can effectively attack any “right-wing” entity that produced as much as one death, leaving the impression it is morally as bad as Mao Tse-tung or Pol Pot or Hitler (left’s preferred).

    State-produced death is definitely something that must be minimized if not eliminated. But so is the misuse of science for rhetorical purposes. Long live Shannon for setting the record straight.

  17. I’m not sure how frequently my arguments are used, but I would thank Tokyo Tower for observing that they have been successful.

  18. Oliver Suess-Barnkey,

    Shannon, the political nature of your motives is transparent here. Any fair-minded reader can conclude beyond doubt that you view Lancet/Johns Hopkins’ conclusions substantially as an affront to your political views about the war.

    That is extremely true. However, you don’t seem to realize that the paper was intentionally written to be an affront to the pro-democracy advocates! The authors intentional misrepresented the results of their study in order to create support for abandoning the people of Iraq. Of course they pissed me off, they’re fighting, intentionally or not, for the other side.
    In order to create the propaganda tool they wanted, the authors lied about the results of their own study! When you exclude the massively outlying Falluja cluster the study really says approximately 100,000 excess deaths occurred over the study period. Of those deaths, 76% were from accident, disease or old age. Only 24% of deaths resulted from violence, and of those the majority resulted from attacks by insurgents.
    However, in their Interpretation paragraph, the one that became the sound bite that everyone heard, they reported 100,000 deaths, most caused by violence and most violent deaths caused by Coalition airstrikes. That is a lie. No single set of data presented in the paper supports that contention in the least.
    Do you now understand why I object on scientific as well as political grounds?
    By the way, if you want to raise the specter of political biases, did you know that one of the authors, Dr. Richard Garfield, conducted a study under the auspices of Saddam’s reign back in 2000 that purported to show fantastically high (1 in 10) infant mortality rates in Iraq? It was used as a propaganda tool in Saddam’s quest to get sanctions lifted. After the invasion, the study was shown to be utterly corrupted and Garfield had to formally withdraw it. Neither of the two Lancet studies supported it either. Garfield also wrote a glowing study of Castro’s health care system. See a pattern here?
    Further, Les Roberts openly acknowledged that he did the study quickly and rushed it into publication specifically in an attempt to influence the 2004 presidential election.

    If Shannon’s motives were scientific, rather than political/rhetorical, why would he declare victory based on inferences versus data?

    I don’t. I claim vindication that a professional statistician has confirmed one of my many objections, i.e. that the confidence interval is comically wide. If the motive behind this paper hadn’t been to manufacture a propaganda weapon, this paper would have never, ever made it into publication in Lancet with that wide of an interval. I have NEVER seen a major paper with a confidence interval that wide. If you can find one I would really like to see it.

    I hope Shannon shares my unalloyed desire for an accurate count of civilian deaths caused by the U.S.-led invasion of Iraq.

    I do. What you fail to grasp is that the actual findings of the study compare very closely to other studies such as the UN report, the Iraqi government and the Iraq Body Count. The study actually showed only around 30,000 deaths from violence once the Falluja outlier was removed. If the authors had not intentionally obscured and misrepresented their findings they would have had a fairly decent study.
    Bad data is far worse than no data at all. When we have no data we move carefully, continually questioning. When we have bad data, however, we charge ahead and make major errors without realizing it. For example, the real data of the study suggest we should improve access to health care and do more to suppress insurgent violence. The fraudulent interpretation of the study suggests we should restrict Coalition air strikes.
    Even if we knew nothing about the results of the study we should not make any decisions based on the study for the simple reason that the accuracy of its basic methodology has never been determined. Only two other cluster sampling studies of war zones have been conducted and the results of those do not agree with other methods. So, is this study’s methodology more accurate or less? How would you tell objectively? You can’t, and the fact that so many embrace the study as nigh revealed wisdom and insinuate that people who question it are akin to Holocaust deniers suggests to me that it is they and not I who have the problem with political biases.
    In any case, you should obtain a copy of the study and evaluate it for yourself. (I can mail you one if you can’t find it online). Simply go through the study and keep careful track of which statements rely on the Fallujah cluster to be true and which do not. Take note of how all the most often repeated and most contentious assertions arise from a fraudulent combination of two separate sets of data.
    Don’t trust anyone else. Verify for yourself.

  19. Oliver Suess-Barnkey,

    A further thought:

    If Shannon’s motives were scientific, rather than political/rhetorical, why would he declare victory based on inferences versus data?

    Why do my motives matter to the argument at all? I am not claiming to be an authority figure. I don’t say that anyone should distrust the study because I am so damn smart or experienced. Instead, I have rather meticulously shown why the numbers, data sets and other technical attributes do not agree with the study’s Interpretation/Main Finding.

    It doesn’t matter who I am or what my motives might be. The study is what it is. The numbers are all written down.

    In my opinion, anyone who raises questions of motive in a technical discussion has revealed themselves to be either clueless or disinterested in the truth.

  20. I went over to Deltoid and read through the 130+ comments there and added my own. I didn’t like what I saw there. In between the impenetrable mathematics (I am not a statistician) there was quite a bit of politics as well.

    —————-

    There are two conversations going on here, statistical and political. The statistical dominates but it is pretty clear that for some participants Mr. Kane’s paper must be false because, if true, it would help destroy a mythic narrative that they need to advance their political cause and science should bend to the need of the political (#118 seems to be the most explicit of this group). This implies both that scientists have a shared political world view and that that world view should come first in setting direction no matter where little inconveniences like facts would lead. Both are false for the vast majority of scientists but it is important for the long-term health of enlightenment values that such assaults must be repulsed even-handedly.

    No matter what the statistical merit of the paper (and I’m not competent to judge those), I hope that the vast majority of the participants would note and reject the idea of politics first, science second no matter what stripe of politics is being pimped. If not, I would suggest that Mr. Kane go elsewhere for his critique as he will not get an honest hearing here, just better or worse disguised political attacks. Talk about Michelle Malkin drew multiple critical remarks because many wanted to ensure her political myth telling was not enhanced but go back and look at the other side in this very thread. There seems to be no even-handedness so far about trying to keep politics out of the debate and after 130 comments, it really should have shown up by now.

    On the statistical front I note a few disparaging remarks (rapture) about negative death rates and how this would only be possible in high in-migration areas. There were quite a few refugees from Saddam’s Iraq (my understanding is that a million fled). Many of those people went home after the US overthrew his regime, and those camps were emptied during the time frame of the 2004 study. So far as I have been able to divine, nobody has taken into account the effects the returning refugees should have had on the 2004 study. As best I can tell, there were approximately 300,000 returnees, many of them single male military deserters with low death rates. Depending on how *they* clustered, a local zone’s death rate might very well have dropped.

    Eli Rabett in #104 asks where are the lower zones of mortality post invasion. I would suggest starting to look in the Kurdish zones for example which are generally peaceful and doing quite well. The US military posts a quarterly report to Congress and one of the features is a graph showing violence by province. The top 4 or 5 provinces generate about 80% of the violence. At the other end of the graph you might find your zones of lower mortality, especially Shia areas that were heavily under the thumb of Saddam’s goons before and are now receiving their fair share of medical and other supplies. It isn’t directly relevant for this study time period but the 4 Shia provinces that have reverted to local control are probably better off today and might have been better off in the study time period too. Good politicians, good administrators would have started their work long before the formal provincial handover had taken place.

  21. “Kane shows that if the Falluja cluster is included in the statistical calculations, the confidence interval dips below zero, which is a big no-no.”

    You’re talking rubbish. Just from bonehead statistics, the study was about “excess deaths” – i.e. it was derived from the difference between two numbers. If the “confidence interval dips below zero”, it simply means that the null hypothesis (that there were no excess deaths) remains not disproven at that confidence level. Not a “big no-no” at all but a normal outcome.

    Indeed, Kane says this himself on page 2 of his paper:

    “The Lancet authors cannot reject the null hypothesis that mortality in Iraq is unchanged”.

    So you don’t even understand what Kane is stating.

    And the more advanced criticisms can be found here.

    Specifically, Kane is not distinguishing between two types of clusters – “war zones” and “non-war zones”. To quote a commentator:

    Why don’t you try redoing your analysis of P(CMRPost) etc. using the assumption of an unobserved covariate? This unobserved covariate is a categorical variable, takes values 0 or 1, and its regression coefficient is about – what – 7? So in any cluster where it takes the value 1, it shifts the mean of the mortality to the right by (estimated fallujah CMR – CMR in remainder of iraq). The unobserved covariate is, obviously, measuring the presence or absence of a major military conflict. You can model various assumptions about the distribution of this variable, but the best assumptions are obvious: before the war, it has a value of 0 with constant probability 1. After the war, it has a binomial distribution with p=(number of clusters in fallujah)/(number of clusters in the country). I bet this will make p=1/32, approx.
    […]
    David, it’s not an “interesting model” which I should go away and work out – it’s the answer to your conundrum. What you are doing is claiming that a very very unusual event observed in a sample of 32 observations is a true random draw from the same distribution as the other 31 observations, when we have STRONG evidence to suspect that it is a random draw from a different distribution. Fallujah was a planned event, in which the US government moved a lot of resources to ensure that the mean of the observations drawn from that area would be shifted significantly to the right. You can’t claim that it is simply another observation from the same sample as the other 31.
    […]
    David, at present your entire paper rests on your assumption that the reported confidence intervals are based on parametric estimates from a unimodal distribution. The paper says pretty specifically that they were calculated by bootstrapping from a bimodal empirical distribution. This is a big problem for your paper. The nature of the dataset makes it very clear that the bootstrap was the correct way to calculate the confidence interval for the risk ratio. That is another big problem for your paper. Finally, your paper has the implication that it would be sensible for a statistician to conclude that the discovery of mass deaths in Fallujah is evidence in favour of the proposition that the death rate had fallen. That’s the really, really big problem for your paper.

    As to why the difference between unimodal and non-unimodal distributions makes a difference – see here.

  22. Phoenician in a time of Romans,
    Not a “big no-no” at all but a normal outcome.
    I was being flippant but in an accurate study we wouldn’t expect the null hypothesis (in this case, that mortality decreased or remained the same) to show any significant probability at all. Kane points out:

    First, any empirical researcher is vaguely suspicious of a result which
    just barely rejects the primary null hypothesis, in this case, that mortality
    in Iraq is unchanged. Given this testimony from Roberts and Burnham,
    isn’t it likely that a small change in the model specification would lead to a
    confidence interval which includes zero?

    Zero in this case indicates no excess deaths while a negative number indicates improved mortality. As the confidence interval touches zero or laps over to negative the probability increases that reality is actually something different than what the study’s main finding says. By common definition, anything over a 5% chance that the null hypothesis is true is non-trivial. Kane asserts:

    Given the data, there is a 10% chance that ∆CMR

    Delta CMR is the difference between post and pre mortality rates. That difference is used to calculate excess deaths. Excess deaths only occur if delta CMR>0. In other words there is a 10% chance that zero excess deaths occurred or that mortality actually improved.

    What you are doing is claiming that a very very unusual event observed in a sample of 32 observations is a true random draw from the same distribution as the other 31 observations, when we have STRONG evidence to suspect that it is a random draw from a different distribution.

    Hallelujah! I argued incessantly that cluster-sampling was a highly inaccurate method to use in a war zone. Cluster-sampling only produces accurate results with highly homogeneous distributions of the sampled phenomenon. Since deaths from violence in a war zone are highly heterogeneous, the method is inherently flawed.

    You can’t claim that it is simply another observation from the same sample as the other 31.

    That is a crying shame because the study is only statistically valid if any particular event is roughly as likely to occur in one cluster as in another. If any one cluster has a significantly different chance of an event then the cluster must be excluded.
    None of this explains why the authors kept the confidence interval for the Falluja data a secret. Given that the inclusion of the Falluja data radically changes the outcome of the study, they had no valid excuse for doing so.
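
To make the arithmetic in this exchange concrete, here is a minimal sketch in Python of how a change in crude mortality rate (delta CMR) converts into an excess-death count, and why an interval for delta CMR that reaches zero gives an excess-death interval that reaches zero too. The function name `excess_deaths` and every number below (population, period length, rates) are illustrative assumptions, not figures from the Lancet study or from Kane's paper.

```python
def excess_deaths(delta_cmr_per_1000, population, years):
    """Convert a change in crude mortality rate into an excess-death count.

    delta_cmr_per_1000: post-invasion CMR minus pre-invasion CMR,
                        in deaths per 1000 people per year.
    population:         number of people the rate applies to.
    years:              length of the post-invasion period in years.
    """
    return delta_cmr_per_1000 * population * years / 1000


# Hypothetical inputs, chosen only to keep the arithmetic easy to follow.
POPULATION = 20_000_000
YEARS = 1.5

# A point estimate of +3 deaths per 1000 per year.
point = excess_deaths(3.0, POPULATION, YEARS)

# A hypothetical interval for delta CMR that straddles zero.
low, high = (excess_deaths(d, POPULATION, YEARS) for d in (-1.0, 7.0))

print(f"point estimate: {point:,.0f} excess deaths")
print(f"interval:       {low:,.0f} to {high:,.0f}")
```

Because the conversion is a straight multiplication, the sign of delta CMR carries through unchanged: a negative lower bound on delta CMR means the data cannot rule out zero (or fewer) excess deaths at that confidence level, which is the point both commenters are arguing over.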

  23. “In my opinion, anyone who raises questions of motive in a technical discussion has revealed themselves to be either clueless or disinterested in the truth.”

    and then,

    “The paper was intentionally written to be an affront to the pro-democracy advocates! The authors intentional [sic] misrepresented the results of their study in order to create support for abandoning the people of Iraq.”

    So which is it, Shannon? Is it ok to question motives, or not?

  24. Tweed Says,

    Is it ok to question motives, or not?

    It’s okay to question motives AFTER you have had the technical discussion. Which I did, ad nauseam.

    The only concrete evidence of political motive is the fact that the study is flawed and/or its results were intentionally misrepresented by the authors. Without a technical discussion first, you just have a shouting match.

  25. I was being flippant but in an accurate study we wouldn’t expect the null hypothesis (in this case, that mortality decreased or remained the same) to show any significant probability at all.
    Really? I guess I must have cheated in that last study I did on searching behaviour online then. “There is insufficient reason to reject the null hypothesis” is a perfectly acceptable result from a study, which is what Kane is stating is the case (wrongly, I believe, based on the discussion on that thread).
    Zero in this case indicates no excess deaths while a negative number indicates improved mortality. As the confidence interval touches zero or laps over to negative the probability increases that reality is actually something different than what the study’s main finding says.
    No – when the confidence interval includes zero, the study’s main finding is that there is insufficient reason to reject the null hypothesis at that confidence level. The range of the confidence interval is a statement about that finding. What Kane was saying in your quote above, if I understand correctly, was that he was dubious about the results given the sensitivity of the confidence interval to the technique chosen.
    Hallelujah! I argued incessantly that cluster-sampling was a highly inaccurate method to use in a war zone. Cluster-sampling only produces accurate results with highly homogeneous distributions of the sampled phenomenon. Since deaths from violence in a war zone are highly heterogeneous, the method is inherently flawed.
    Which is why the outlier of Fallujah was not included. The method is flawed if you assume a unimodal distribution – which the authors didn’t.
    Think for a moment – you’re in the position of saying that without including war zones, the sample shows a better than 95% chance that a lot of extra people died – but that if you include the war zones there’s a chance nobody extra died because of the war.
    That doesn’t make sense. The reason it doesn’t make sense is because of a statistical artifact that occurs if and only if you treat the full sample – war zones and non-war zones – as a unimodal distribution. The authors didn’t, as the people in that thread spent much time trying to explain to Kane. Again, examine this.
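
The artifact described in this comment can be reproduced with a toy data set. The sketch below is only an illustration under stated assumptions: the per-cluster values are made up (32 ordinary clusters plus one extreme one), the helper names `normal_ci` and `bootstrap_ci` are hypothetical, and the plain percentile bootstrap shown here is a textbook version, not necessarily the procedure the Lancet authors or Kane used. It compares a normal-approximation interval for the mean (which implicitly treats the data as one bell-shaped distribution) with a bootstrap interval built from the empirical distribution.

```python
import random
import statistics


def normal_ci(values, z=1.96):
    """Approximate 95% CI for the mean: mean +/- z * s / sqrt(n)."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / n ** 0.5
    return mean - z * se, mean + z * se


def bootstrap_ci(values, n_resamples=5000, seed=0):
    """Percentile bootstrap 95% CI for the mean (resampling with replacement)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    return means[int(0.025 * n_resamples)], means[int(0.975 * n_resamples) - 1]


# Made-up change in mortality rate per cluster (deaths per 1000 per year).
ordinary = [-1, 0, 1, 2, 3, 4, 5, 6] * 4   # 32 unremarkable clusters
with_outlier = ordinary + [200]            # plus one extreme "war zone" cluster

ci_without = normal_ci(ordinary)
ci_with = normal_ci(with_outlier)
ci_boot = bootstrap_ci(with_outlier)

print("normal CI, outlier excluded:", ci_without)
print("normal CI, outlier included:", ci_with)
print("bootstrap CI, outlier included:", ci_boot)
```

With these made-up numbers, the normal-approximation interval sits above zero when the extreme cluster is left out and straddles zero once it is added, even though the added cluster shows far more deaths, because the single outlier inflates the standard deviation symmetrically. The bootstrap interval on the same 33 values stays above zero at its lower end and is strongly skewed to the right, which is the distinction between unimodal and bimodal treatment that the commenter is pointing at.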

  26. Ah-hah – so that’s the error in HTMLing that causes that problem. Sorry about that.

  27. How could Shannon get Kane’s point so wrong?

    Kane’s entire analysis is based on the exclusion of Falluja, but Shannon says the problem is that Falluja was included.

  28. There was what I consider a telling interview on a show called “This American Life” in 2005. I jotted down the parts of it I found most interesting, and since Kane’s paper deals so much with the issue of their Fallujah cluster, it seems relevant:

    Host: At the end of three weeks there was only one more cluster to survey. The team had saved it for the end because it was the most dangerous one: Fallujah. Remember, this is September 2004. Insurgents controlled the city, and it’s basically under siege from the Coalition. They’re shelling it regularly.

    Les Roberts: It just seemed crazy to go there. And I said to Riyadh: Riyadh, we had been to 32 of our 33 picked neighborhoods. We had actually only thought in the end we would get to 30. We aimed for 30 and picked 33 with the thought that 10% of places would be too unstable for us to get to. So we’ve done better than we had expected. We have a terrible story to tell. The mortality is way up. Whatever you find in Fallujah is not going to change the story. Think of what we’re going to gain. We’re going to gain nothing.

    And he said, “God picked those random locations. God wants me to do this work. I must do this.”

    And we went back and forth and back and forth, and, I was brought up Catholic, and I had never really thought about it or understood it until that moment in time, but in my head I actually built up a weight. What’s the likelihood of something bad is going to happen to these guys and how bad is that and what’s the likelihood of something good coming from what they do, and how good is that, and I sort of put a weight on each of them. As I spoke with Riyadh, he actually did not have the capacity to do that, because for him, doing God’s will and this work were inseparable. He couldn’t separate out risk because that was separating out, sort of, faith. The more we spoke the more that I understood that on some very, very fundamental level that we couldn’t communicate with each other about our motives here. And in the end he went.

    TAL: Only one other interviewer agreed to go to Fallujah with Riyadh, a doctor who had relatives there he wanted to check up on. Their car was stopped three times on the way into the city. Heading to their random spot, they saw devastation everywhere. Houses were bombed. Rubble lay in the streets. The block they stopped on was no different. They had to visit 52 households to get the requisite number of interviews. 23 homes were either temporarily or permanently abandoned. Neighbors said that in the abandoned houses most people had died, but this data couldn’t be substantiated. So it wasn’t even included in the survey results. In the 30 households they did survey, there were 53 deaths. 52 of these were violent deaths. All but one caused by coalition weapons. 24 of the people killed by coalition bombs and bullets were under 12 years old. And with that, the survey was over.

    (end transcript)

    Some alarm bells go off for me.

    Roberts says to Lafta that: “Whatever you find in Fallujah is not going to change the story.”

    But that’s exactly what happened. Lafta’s Fallujah data “changed the story” drastically:

    1. Instead of an extremely weak guesstimate of “100,000” excess deaths with a CI down to 8,000, the Lancet study – because of the Fallujah data Lafta insisted on providing – then claimed to show that it had to be “100,000 or more”. The story changed from an extremely unreliable estimate that “100,000” excess deaths had occurred, but which could very easily be far far lower, to one where they were almost “sure” that there was *more* than 100,000, and that this was even a “conservative” estimate.

    2. Instead of 43% of violent deaths (43% of about 58,000) or only about 25% of total deaths (25% of the “100,000” headline) being caused directly by Coalition forces, with “most deaths” being caused by NON-Coalition actors (insurgents, crime etc.), the Lancet study could now claim – solely because of what Lafta brought back from Fallujah – that now all of a sudden:

    “Eighty-four percent of the violent deaths were reported to be caused by the actions of Coalition forces and 95 percent of those deaths were due to air strikes and artillery”

    3. They could now claim “a 58-fold increase in death from violence” instead of around an 18-fold increase that would have been weakly estimated prior to Lafta doing his thing in Fallujah.

    4. Instead of the Coalition having killed mostly adult males, as was the case in the data prior to Lafta going to Fallujah, now almost half the people the Coalition were killing were little helpless children.

    ALL of the study’s main “findings” were drastically shifted by this Fallujah decision of Riyadh Lafta. Everything that provided this study a means to present itself as anything other than a relatively worthless coin-toss, and which did not show anything new that was very damning against Coalition forces in particular, all came about entirely from Riyadh Lafta’s decision to go to Fallujah and what he gave to Les Roberts from there.

    This all seems pretty, um, convenient. Could it be that Riyadh Lafta saw the pre-Fallujah results (an extremely weak 100,000 ‘excess deaths’ estimate, with only about 25% caused directly by Coalition violence, and most of those victims adult males) and thought that this was simply not a good enough “story”, and that he (and “God”) wanted a better “story” to be featured and publicized in the headlines of the US press?

    He and one companion went into Fallujah, with no oversight (with Roberts safely tucked away in a hotel room), no means to check or verify the data they collected, and no means to verify whether they collected it in anything like a “random” fashion, or if they just went in there and announced to the city that everyone who had deaths to report should come forward and do so (or perhaps the people who stopped their car three times on their way into the city announced this for them).

    Whatever went on there, what is clear is that, after having done so, suddenly, the far more powerful (and perhaps far more ideologically convenient) “story” materialized.

    Since they don’t provide anything to actually prove what they did in the first place (and are also so secretive even with the data on top of this), all the complicated math of the kind the Lancet authors did, and the kind which David Kane is doing, may just be so much mental masturbation built on top of bad data, to which such math simply doesn’t apply.

    Draw your own conclusions.

  29. Martin, you have to read the studies.

    The Falluja cluster was NOT included in them. This is exactly what Kane’s analysis is based on–the fact that Falluja was NOT included.

    The way this story has spread across blogs reminds me of what Mark Twain said: “A lie circles the earth before the truth even puts its shoes on.”

  30. Tweed, I’ve read it.

    You can’t get “Eighty-four percent” of violent deaths caused by Coalition, or “95 percent due to air strikes and artillery” without including Fallujah.

    You also can’t get a “58-fold increase in death from violence” without including Fallujah.

    etc. etc.

    Work it out with the data if you don’t believe me.

    What they did is not use it in the “100,000”, but still used it indirectly there to create the impression that their study had set a floor of 100,000 on excess deaths (which is part of what Kane is on about).
    The authors rarely said when they were or weren’t using it, which is why you don’t know this.

  31. Tweed,

    Kane’s entire analysis is based on the exclusion of Falluja

    You are hallucinating. The entire point of Kane’s article is to try to reconstruct the missing confidence interval for the data set with the Falluja cluster included.

    The Falluja cluster was NOT included in them.

    The Falluja cluster is most definitely included in the paper’s main finding. The sentence in the Interpretation that says: “Violence accounted for most of the excess deaths and air strikes from coalition forces accounted for most violent deaths.” is only true with the Falluja cluster included.

    Les Roberts et al VERY dishonestly switch back and forth between two data sets throughout the article without any indication they have done so. One dataset excludes the Falluja cluster and the other includes it. They use the Falluja excluded set to provide a plausible number of excess deaths but then switch to the Falluja included set to provide a breakdown of the cause of death and the nature of the victims.

    That is why I claim this is a political hatchet job. The authors cherry picked their data to create exactly the story they wanted to tell. If anyone else had done this in any other context everyone would be screaming bloody murder about it.

  32. Phoenician in a time of Romans,

    “There is insufficient reason to reject the null hypothesis” is a perfectly acceptable result from a study, which is what Kane is stating is the case

    It is acceptable if you actually state that. Roberts et al did not. They have kept the confidence interval of the data set with the Falluja-cluster included secret. They only published the CI for relative risk with Falluja included.

    Kane is attempting to infer the CI for change in mortality from the CI given for pre- and post-invasion mortality. He finds that 1.5 CI runs from -160,000 to +659,000. That translates to a 10% chance that mortality either improved or stayed the same.

    This is important in statistics because it gives you an estimate of the data’s overall reliability. This number means that if we duplicated the study numerous times, 1 in every 10 studies would find that mortality had improved, which is very unlikely to be true. Further, if we duplicated the study many times, the 9 out of 10 positive results would vary significantly from one another. One study would say 100,000 excess deaths, the next would claim 400,000. Obviously, such results are useless.

    Which is why the outlier of Fallujah was not included.

    Yes it was. The main finding is wholly dependent on the Fallujah cluster. I can understand how you might not see that. The authors did everything possible to obscure that fact. Very dishonest.

    Think for a moment – you’re in the position of saying that without including war zones, the sample shows a better than 95% chance that a lot of extra people died – but that if you include the war zones there’s a chance nobody extra died because of the war.

    Completely wrong. The confidence interval is not a statement about how many people died but rather a statement about the chance that the study accurately measured any particular change in mortality rates within the CI range. Adding Falluja doesn’t statistically lower the chance that more people died, instead, it destroys the accuracy of the actual number of excess deaths. Think about it, there is a 1 in 10 chance that, no matter how many actual excess deaths occurred, the study will report zero or negative (improved mortality).

    The method is flawed if you assume a unimodal distribution – which the authors didn’t.

    The paper gives no indication that they used anything other than a normal distribution. Given the stated design of the study i.e. that clusters were interchangeable, there was no reason for doing so.

    That doesn’t make sense. The reason it doesn’t make sense is because of a statistical artifact that occurs if and only if you treat the full sample – war zones and non-war zones – as a unimodal distribution. The authors didn’t, as the people in that thread spent much time trying to explain to Kane. Again, examine this.

    I’m not sure of the provenance of that chart but the bottom one shows an asymmetric distribution with a rather big chunk of the curve south of zero. That would tend to support Kane although by other means.
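
The 10% figure cited for the interval running from -160,000 to +659,000 can be checked with a short calculation. This is only a sketch under stated assumptions: it treats the quoted range as a symmetric 95% interval from a normal distribution (the comment's own label for the interval is unclear, and this is not necessarily how Kane derived it), and the variable names are illustrative.

```python
from statistics import NormalDist

# Interval endpoints quoted in the comment.
ci_low, ci_high = -160_000, 659_000

# ASSUMPTION: a symmetric 95% interval from a normal distribution.
center = (ci_low + ci_high) / 2
z95 = NormalDist().inv_cdf(0.975)          # about 1.96
sigma = (ci_high - ci_low) / (2 * z95)     # implied standard error

# Probability mass at or below zero excess deaths under that assumption.
p_nonpositive = NormalDist(center, sigma).cdf(0)
print(f"implied chance of zero or fewer excess deaths: {p_nonpositive:.3f}")
```

Under these assumptions the result is about 0.12, in the neighborhood of the 1-in-10 figure. A skewed sampling distribution would give a different answer.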

  33. It is acceptable if you actually state that. Roberts et al did not.

    That is because they did not reach that conclusion – Kane did. My comment was directed at your original statement that “… the confidence interval dips below zero, which is a big no-no.” You were wrong.

    Kane is attempting to infer the CI for change in mortality from the CI given for pre- and post-invasion mortality.

    Using what assumptions about distribution?

    The paper gives no indication that they used anything other than a normal distribution.

    From the Deltoid comment thread already referenced:

    David, what are you going on about here talking about the authors assuming normal distributions? A quick read back of Roberts et al (2004) reveals, at the bottom of p3, that

    “As a check, we also used bootstrapping to obtain a non-parametric confidence interval under the assumption that the clusters were exchangeable. The confidence intervals reported are those obtained by bootstrapping. The numbers of excess deaths (attributable rates) were estimated by the same method, using linear rather than log-linear regression.”

    So basically your analysis is completely tangential. Roberts et al got their confidence intervals by bootstrapping from the empirical distribution of their data. The empirical distribution of the data wasn’t normal and it wasn’t nearly normal.
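
For readers unfamiliar with the technique named in that quotation, here is a minimal sketch of a percentile bootstrap over clusters. It is not the authors' code: the per-cluster values are invented, the helper name `bootstrap_ci` is hypothetical, and the only thing taken from the paper is the idea of resampling whole clusters as if they were exchangeable.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean, resampling whole clusters."""
    rng = random.Random(seed)
    n = len(values)
    # Resample n clusters with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical per-cluster figures with one extreme cluster, NOT the study's data.
clusters = [1, 4, 2, 6, 3, 5, 2, 7, 4, 3, 6, 2, 5, 4, 3, 6, 150]

mean = statistics.fmean(clusters)
lo, hi = bootstrap_ci(clusters)
print(f"mean {mean:.2f}, bootstrap 95% CI ({lo:.2f}, {hi:.2f})")
```

Because every resample is drawn from the observed values, the interval follows the skew of the data: it stretches further above the mean than below it and, with all-positive inputs, cannot cross zero. That is a property of the method and does not by itself settle which interval better describes the real survey.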

  34. “(By the way, there was one very active leftist — I forget his name — who defended the studies endlessly. Has he been heard from?)”

    You’re probably referring to Tim Lambert, a computer professor in Australia and more or less professional troll at:
    http://scienceblogs.com/deltoid/

    When I challenged the study data right after it came out he ripped me and I shot back, the result of which is that he has become obsessed with me and has a special “Fumento” category on his website. It is sadly one of the weaknesses of the blogosphere that those with the shortest working hours by definition have the most hours to blab their faces off on things they know absolutely nothing about and to pursue vendettas.

  35. Kane’s critique has been removed from the Harvard Institute for Quantitative Social Science Web site.

    Interestingly, that hasn’t even slowed down the “Lancet proved wrong” meme from spreading like measles across the right-wing blogosphere.

    I haven’t the math skills to understand some of the more arcane arguments about the study’s veracity, but this much seems clear: Shannon either misunderstands or is deliberately misrepresenting whether Fallujah was included in the study. After saying its inclusion is problematic, he later backtracks and tries out the theory that it was included in some places and not in others.

    “They use the Falluja excluded set to provide a plausible number of excess deaths but then switch to the Falluja included set to provide a breakdown of the cause of death and the nature of the victims.”

    So we are clear, Shannon, Falluja was EXCLUDED in the body count. Your critique has, from the beginning, been based on the assertion that the study overestimates the body count. Your latest canard about cause of death and nature of victims shouldn’t distract anyone from the fact that you got the most fundamental fact wrong.

    Neither of those positions squares with what Kane has to say. I would quote from Kane’s paper, but it’s curiously become unavailable online.

    Shannon can easily clear this up by quoting the parts of Kane’s work where Kane makes the claim that INCLUDING Falluja distorts the body count, which is what Shannon initially claimed.

    The best way I can see for Shannon to extricate himself here is to admit that he jumped the gun on “vindication,” and that crowing prematurely reeks of political motives and subjectivity. An observer concerned with preserving scientific method and falsifiability could never make such a boast.

    In general, the fevered responses to the Lancet/Johns Hopkins Iraq death estimates have betrayed a certain lack of strategic political understanding.

    By turning what could have been an objective mathematical dispute into a political rallying cry, conservative ideologues undermine a cornerstone of their Iraq position, which is that a large number of civilian deaths is acceptable because the U.S. doesn’t target civilians and because those deaths serve a greater cause that will, in the long run, prevent ongoing bloodshed.

    By accusing liberals of using the study to boost their case, conservatives unwittingly confirm that even they believe the massive civilian body count in Iraq is persuasive evidence that the invasion is a failure and/or was a mistake.

    Not that I think the rapid spread of the phony meme that Kane has debunked the Lancet is a good thing. It does create a kind of noise that will allow some committed war supporters to remain in denial about its effects. But that is mitigated, or even more than offset, by unwittingly underscoring the relevance of civilian deaths, even as the U.S. military shamelessly “doesn’t do civilian body counts.”

  36. tweed,

    So we are clear, Shannon, Falluja was EXCLUDED in the body count

    I totally, completely, enthusiastically, joyously, consummately, entirely, fully, thoroughly, utterly, wholly, perfectly, altogether, without doubt, exhaustively, you go girl, absolutely agree with you that Falluja was not included in the widely quoted 100,000 excess deaths.

    That has rather been one of my chief complaints since day one.

    Again, here is the Interpretation paragraph from the abstract:

    Making conservative assumptions, we think that about 100000 excess deaths, or more have happened since the 2003 invasion of Iraq. Violence accounted for most of the excess deaths and air strikes from coalition forces accounted for most violent deaths. We have shown that collection of public-health information is possible even during periods of extreme violence. Our results need further verification and should lead to changes to reduce non-combatant deaths from air strikes.

    Now, can you tell me (1) if all the information in the paragraph comes from excluding Falluja, (2) if Falluja data is included, what is the authors’ stated rationale for doing so, and (3) is the scientific or mathematical justification given valid in your opinion?

    Your critique has, from the beginning, been based on the assertion that the study overestimates the body count.

    No, my critique from the beginning has been that the paper, but not the actual data from the study, mis-reported the study’s finding about the percentage of deaths caused by violence and percentage of violent deaths attributed to the Coalition. I am perfectly fine with total EXCESS deaths of 100,000 as long as the authors are truthful about how the study’s data shows they died.

    100,000 deaths from statistically earlier deaths of the elderly, increased illness, increased accidents, increased crime, insurgent attacks and coalition attacks seems perfectly reasonable to me. In fact, only a slightly lower number can be inferred from other sources such as the Iraqi ministry of health or UN studies.

    I should point out that Kane’s critique has nothing to do with the body count. Instead, he is examining the variance in the paper’s stated increase in the risk of death.

    The most important result from L1 is the first sentence of the Findings section.

    “The risk of death was estimated to be 2.5-fold (95% CI 1.6 – 4.2) higher after the invasion when compared with the pre-invasion periods.”

    Unfortunately, if the other results presented in L1 are correct, this confidence interval is wrong. It is too narrow, especially at the lower end. The Lancet authors cannot reject the null hypothesis that mortality in Iraq is unchanged.

    That risk of death estimate definitely does include the Falluja cluster:

    Findings: The risk of death was estimated to be 2·5-fold (95% CI 1·6–4·2) higher after the invasion when compared with the preinvasion period. Two-thirds of all violent deaths were reported in one cluster in the city of Falluja. If we exclude the Falluja data, the risk of death is 1·5-fold (1·1–2·3) higher after the invasion.

    The math debate centers around whether the researchers used a symmetrical variance distribution (i.e., the chance that an error was equally likely to be higher or lower) or an asymmetrical variance (i.e., the chance that an error would be more likely to be either higher or lower).

    Kane assumed that the variance was symmetrical whereas his critics believe the variance is asymmetrical since the researchers used bootstrapping (using the distribution of the real data to set the distribution of the variance) to get the reported results. Since data including Falluja is massively skewed upward, then the distribution of variance must be skewed upward as well which means that the variance could not drop below zero.

    Kane might be wrong in his assumption. It is interesting that the researchers reported their bootstrap results but not the non-bootstrapped ones. This raises the possibility that the non-bootstrapped results do support the null hypothesis as Kane says.

  37. Thanks Shannon, for clearing that up for me. I had thought you were challenging the study’s body count, which, in final form, was more than 600,000, not 100,000.

    I do remember that you wrote:

    “Even the most casual student of military history or, indeed, just a curious person with access to Google, should instantly know that the 250,000 figure would be far too high based on the direct observation of facts on the ground. You can’t kill that many people without leaving massive physical evidence. There is absolutely no precedent for killing that high a percentage of the population with air power (or even ground forces) but leaving so few clues that the information could only be teased out by an epidemiological study.”

    But I see that you’ve changed your position now, which is a good, honorable thing.

  38. tweed,

    I had thought you were challenging the study’s body count, which, in final form, was more than 600,000, not 100,000.

    You’ve got the wrong study. The 2004 Study found 100,000 dead, the 2006 study found 600,000+ dead. Both Kane and I are talking about the 2004 study.

  39. Phoenician,

    The Roberts team made two estimates, one including the bus, and one excluding the bus, and concentrated on the one excluding the bus; then concluded that even if you exclude the bus the average occupancy on weekends went up

    No, they did not. The main finding in the 2004 study contains information that, based on the contents of the study, can only be true if Falluja is included. That has been my chief complaint all along. Almost every paragraph in the study depends on data from the Falluja cluster. Read it yourself.

  40. So, Mr. Kane points at a large pile of bodies, which Roberts et al. excluded, and tells us that it is evidence that things might be better than we thought. In other words, a pile of bodies is evidence of reduced mortality. If your favorite baseball team has a long winning streak, do you think that should lower the probability of their winning the pennant?

  41. JohnP,

    I have no idea what you’re talking about.

    If you are repeating that odd little conception that the inclusion of the Falluja cluster makes the study more accurate then you simply do not understand statistics.

    Statistics is all about extrapolating from small representative samples of a much larger population. To have any hope of doing that accurately, statistics must judge the likelihood that any particular sample represents a part of the larger population. It does that by seeing how far the sample diverges from the other samples. This is called variance. The combination of the variances of each sample tells us the variance of the entire set of samples. This in turn allows us to predict how closely the collective samples represent the larger population.

    We have learned from both mathematics and practical experience that outlying samples raise variance in all directions. A sample much lower than the others widens the collective variance on the high side as well as the low side. A sample much higher than the others likewise widens the variance on both the high side and the low side. (This is simplified a little.)

    The problem with Falluja is that it is a massive outlier. It has over two times as many deaths by violence as all the other 32 clusters combined (52 vs 21). The study’s authors themselves call it a significant outlier. Had they then excluded it from their main findings they would have been on solid ground. The study however, includes the Falluja data in its main finding. If it does not, it does not include the data which supports its main finding.

    In any case, Kane’s argument is that the estimated change in mortality, which the paper definitely states includes the Falluja cluster, cannot have the confidence interval that the authors state.
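
The scale of the outlier described in this comment can be put in numbers. The 52 and 21 violent-death counts and the 33 clusters come from the thread. How the 21 deaths were spread across the other 32 clusters is not stated here, so the split below (21 clusters with one death, 11 with none) is purely an assumption for illustration.

```python
import statistics

falluja_violent = 52   # violent deaths in the Falluja cluster (from the thread)
other_violent = 21     # violent deaths in the other 32 clusters combined

# Share of all reported violent deaths that came from the single cluster.
share = falluja_violent / (falluja_violent + other_violent)
ratio = falluja_violent / other_violent

# ASSUMED split of the 21 deaths over 32 clusters, for illustration only.
others = [1] * 21 + [0] * 11
var_without = statistics.variance(others)
var_with = statistics.variance(others + [falluja_violent])

print(f"Falluja share of violent deaths: {share:.0%}, ratio {ratio:.2f} to 1")
print(f"sample variance without: {var_without:.2f}, with: {var_with:.2f}")
```

Under that assumed split, one cluster accounts for roughly seven in ten violent deaths and multiplies the between-cluster sample variance by a factor of several hundred, which is the sense in which a single outlier dominates the spread.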

  42. It seems that David Kane is changing his paper, following the comments at Deltoid. He writes:

    “I will send you the next draft of the paper so that you (and others) can see for yourself. That draft will be non-trivially different from this one and the major cause of the differences will be these comments”
