A lot of people have taken me to task for calling the Lancet Iraqi Mortality Survey [PDF] an example of scientific corruption. I still stand by this claim.
Many seem to equate scientific corruption with falsification of data, but there are many ways to create a false impression even if the underlying data is sound. (I will expand on this in a subsequent post.)
One easily grasped example of the Lancet study’s dishonesty is the key sentence in the Summary, the one repeated in the media worldwide, that pegs the “conservative” estimate at 100,000 excess deaths. The actual estimate given is 98,000. What pure scientific purpose is served by rounding the number up to 100,000? There is no technical reason for doing so. They chose that number because big, round numbers stick in people’s minds. It’s a number chosen only for its marketing value.
More damning is the utter practical uselessness of the study’s findings. The cover story for the study is that it is a medical epidemiology study intended to provide decision makers with information they can use to reduce the mortality rate in Iraq from all causes.
When one looks at the study as actually published, however, it provides no solid information on which a decision maker could act in Oct 2004 or later. Indeed, the study obscures such data as the age, gender, combat status and means of death of those reported killed. It doesn’t report any kind of detailed time series that would let decision makers determine whether the people reported killed died in the major combat phase, and it produces widely different scales and causes of death depending on whether the outlier Falluja cluster is included or not.
But I could be wrong, so let me issue this challenge:
Can anybody point out information contained in the study, as published in Oct 2004, that would let a real world decision maker make changes to policy, strategy or tactics that would have or will save lives in Iraq?
Please be as specific as possible.
I think the answer is “No.” I think this proves this study was designed, conducted, written up and published purely for its hoped-for political impact in America and the rest of the Western world. Trying to save Iraqi lives was never a major consideration.
To me that is scientifically dishonest and represents corruption of our scientific institutions.
(Update:) Let me rephrase the question slightly.
Given that: (1) the study is universally recognized as having been carried out under very difficult conditions; (2) these difficult conditions seriously raised the risk of significant error in the study; and (3) the consequences of acting on erroneous information in a war zone could be catastrophic:
What data provided by the study was so important that it was worth the risk of conducting the study under adverse conditions and getting a wildly inaccurate result? What data had to be provided in Oct 2004? What data couldn’t wait?
Oh Shannon!
1. You’re repeating a mistake I’ve already pointed out to you:
One easily graspable example in the Lancet study is the key sentence in the Summary reproduced world wide that pegs the “conservative” estimate at 100,000 excess deaths. The actual estimate is 98,000. What pure scientific purpose is served by rounding the number up to 100,000? Of course, there is none. They chose that number because big round numbers stick in people’s minds. It’s a number chosen only for its marketing value.
No. The number 100,000 is chosen because the figure of 98,000 deaths refers to “the 97% of Iraq represented by all the clusters except Falluja” (page 5). 98,000 divided by 0.97 is 101030, which rounds down to 100,000. Yet another correction of a calculation error needed, I’m afraid.
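(For anyone who wants to check that arithmetic, a two-line sketch in Python:)

```python
# Checking the extrapolation described above.
excess_ex_falluja = 98_000            # estimate for the 97% of Iraq outside Falluja
scaled = excess_ex_falluja / 0.97     # extrapolate to the whole country
print(scaled)                         # 101030.9..., roughly 101,000
print(round(scaled, -5))              # 100000.0 -- one significant figure
```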
2. My criticism of you in particular has been that you have made, and continue to make, methodological criticisms which are invalid, and that you are not prepared to either defend them or stop making them. In particular, your “cluster sampling critique” from your original “Bogus Lancet Study” post. I’ve asked you three times now whether you are going to stop claiming that “a cluster sample has a very high chance of exaggerating the number of deaths” and I think I deserve an answer.
3. The answer to your sophomoric “challenge” is simple. The simple information that “The postinvasion management of Iraq is going badly wrong and currently there are more people dying than there were when Saddam was in power” is valuable to a decision maker, because it gives him the information that there is a serious problem to be dealt with. Such a decision maker could then commission more information and start to make a plan. The value of such “alarm bell” studies is, frankly, well known in epidemiology; Medecins Sans Frontieres have produced many excellent ones, including a recent one which estimated mortality in Western Darfur.
4. The charge of your posts never changes; it is always that the Lancet team were “scientifically dishonest” and “represent corruption of our scientific institutions”. This is the fourth time you’ve made this charge. But the grounds always change. First it was that cluster sampling was “bogus”. Then it was that the summary was confusing to you. Then it was that they had reported an outlier in a way that we later established was the correct way to do so. Now it’s because the study, in your unsupported opinion “wouldn’t be useful”. You’ve been wrong every time, as you changed your story, but the one constant has been that you are angry that a study was published which had the effect of making George W Bush’s policy in Iraq look like a failure. Can I be forgiven for suspecting that this is actually what you are angry about?
The Lancet team are a credit to science; you, on the other hand, are fast giving political “sound science” hackery a bad name.
Yes,
The results of the “study” could be used to elect a weaselly “surrender monkey” to the American Presidency who would cut our military budget and tie our hands with a “global test”.
Once global Dhimmitude was complete and the caliphate extended beyond the New Jersey Mosques and Toronto Ghettos, the “White Satan” would no longer be able to kill innocent “minutemen” indiscriminately without suffering “collateral damage” in its restaurants, shopping malls and unholy gambling centers.
The Baath party, newly restored with Soros’ funding, would then return to its holy, humanitarian subsidization of baby formula for all of Saddam’s needy children with the help of Kojo Annan, whom the caliphate will elevate to Secretary-General of the U.N. for his “undying supplication to the justice and love of global Islam.”
Any regrettable “disappearances” of “Jewish-pig sympathizers” into “accidental excavations” in the desert would pale in comparison to the awful killing by Bush-Hitler and the neocons so clearly demonstrated by the Lancet study.
Saddam’s media handlers would be sure to explain to Eason Jordan’s “journalists” that the new sand mounds are newly discovered Pre-Iron Age ruins deserving demarcation on a map of tourist sites in the “Greater Arabic” region of the global caliphate.
This is how the Lancet study could “make changes to policy, strategy or tactics that would have saved lives in Iraq.”
I’m surprised you couldn’t see it, Shannon.
-Steve
Minor comment on dsquared’s 11:57am #4:
> Then it was that they had reported an outlier in a way that **we later established** was the correct way to do so. [bold added]
If the bolded phrase means “dsquared & his allies have claimed,” then all is well. If the missing antecedent to “we” is to be taken as “dsquared and other ChicagoBoyz commenters,” then this recent entry and comment thread stand as a rebuttal.
I’ve replied in that thread. I am not aware of anyone having made any specific complaint about the handling of that outlier other than ones which proved unsustainable.
ahhh, now we’re on to the fifth grounds for accusing the authors of scientific dishonesty!! Productivity or what? A pedant would say that you would be better off correcting your serious calculation error, but hey, it’s Thursday!
What data provided by the study was so important that it was worth the risk of conducting the study under adverse conditions and getting a wildly inaccurate result? What data had to be provided in Oct 2004? What data couldn’t wait?
The data that couldn’t wait was that there was a prima facie case that “not doing body counts” was resulting in the coalition missing a potentially very serious problem. That’s why the authors recommended, at length and with (IMO erroneous) references to the Geneva Convention, that the coalition should start “doing body counts” immediately.
If you find that 100,000 people have died unnecessarily in 18 months, then you can assume that 5,555 people are dying unnecessarily every month. Very few things “can wait” when the cost of waiting is so high. The US elections do not constitute a reason to delay.
And in any case, this is pure ad hominem. The study could have been carried out by John Kerry and Howard Dean, but it would not have changed a single standard deviation in the data. Shannon appears to have given up the losing battle on the science, and switched to pure and simple attacks on the authors. It really takes the biscuit to look at evidence that 100,000 people have died and say “well obviously this has to be delayed or else it might affect the elections!”
dsquared,
Thank you for demonstrating my main point so elegantly! (see #3)
1) So they either rounded up from 98,000 (+2,000) or down from 101,030 (−1,030). This in no way changes the validity of my criticism. I am pointing out that they rounded to the more marketable number of 100,000 instead of providing the actual number, whatever that was. There was no technical reason to round at all.
Moreover, if they did calculate the figure as you suggest, the researchers were incredibly sloppy. Remember, this is a cluster study. Each cluster represents a unique area of Iraq calculated by the demographic software. If a cluster is removed, the remaining clusters must be recalculated to stretch over the area of the missing cluster.
I suppose it is possible that they recalculated everything and ended up at exactly 100,000, but what are the odds of that?
2) I am working on a post explaining the cluster sampling critique. This time I’m going to use pictures! In any case, my most recent posts have not concerned methodology, i.e. how the data was collected and processed, but just presentation. Honest communication of results is just as important to the integrity of scientific institutions as methodology. So I haven’t pounced on any issues of methodology you have raised.
3) “The post-invasion management of Iraq is going badly wrong…”
Honestly, was there any possibility that the study would return a finding that you personally would not use to confirm your preexisting bias that the post-invasion period was mismanaged? Again, what specific information supplied by the study did you have to get to reach your conclusion? If the study had not been done, how would your current view differ?
More to the point, “the post-invasion management of Iraq is going badly wrong” compared to what other real world event? How does this study tell us what could have been done better and how?
“there are more people dying than there were when Saddam was in power”
Again, you needed the study to tell you this? Everybody who has studied armed conflict at all knows that you will see a spike in deaths. Did we need the study to tell us this?
Frankly, this is just the kind of vague response I expected. You provide no concrete example of how this study might be used in a real-world fashion. This study provided you personally with no new information and changed your assessment not at all. Rather, it confirmed your prejudices and gave you a rhetorical stick to beat people up with. Which I contend was the goal of the study in the first place. This study was always about politics. It’s corrupt.
4) “But the grounds always change”
No, I keep adding new grounds. The base methodology was risky. The on-the-ground conditions made the study prone to error and subversion. The presentation was deceitful. Now, I have learned that key data was left out of the published paper. I’m sure something else will turn up tomorrow.
I am not shifting the grounds of my critique. Instead the reasons to question the study keep piling up.
Again, thanks a lot for proving my point. The study has no practical utility.
dsquared,
So the practical utility that you think the study proves is the need to do body counts?
Given that in the “conservative estimate” most people died of illness, accident or non-Coalition violence, how would body counts help? The remainder of the deaths occurred from airstrikes, the results of which are already assessed as standard protocol.
Moreover, what evidence does the study present that the military is not already taking the maximum amount of care humanly possible to prevent the deaths of non-combatants? Given that the conservative estimate says most deaths from violence are due to non-Coalition actors, couldn’t making changes to Coalition tactics that make those non-Coalition actors harder to neutralize actually cost lives in the long run?
The fact that the researchers chose to conduct the study under difficult conditions (which would degrade their data) and rush it into print, even though the study produced no time-dependent results, strongly supports the idea that they created the study for purely political reasons. They never expected it to have any other positive effects.
So they either rounded up from 98,000 (+2,000) or down from 101,030 (−1,030). This in no way changes the validity of my criticism. I am pointing out that they rounded to the more marketable number of 100,000 instead of providing the actual number
Has anyone ever read any criticism more idiotic than this one?
Although it is a perfect match for such an imbecilic challenge. According to Love, the Lancet study is good or bad depending on whether politicians would implement any changes based on its conclusions. Accordingly, Dr. Love will soon propose that quantum mechanics is wrong beyond repair, likely a product of some corrupt physicists.
Unfuckingbelievable.
Palo,
Loose analogies and obscenities say more about your position than you might wish.
S. Love,
I am untroubled by Roberts’ use of “100,000” instead of 98,000, given the 95% confidence interval (8,000-194,000). This estimate has zero significant figures (E5), or one significant figure (100,000) at most.
Everybody who has studied armed conflict at all knows that you will see a spike in deaths
On the other hand, anyone who has studied a chart of deaths in Iraq over the eighteen months between the invasion and September 2004 will know that there was no such “spike”, in the ex-Fallujah clusters, just a gradual increase in the death rate which got worse toward the middle of 2004. Luckily, there is exactly such a chart on page 5 of the Lancet report; perhaps you have forgotten as it must be at least an hour since you last read it looking for a new argument.
By the way, here is a source of a picture that you might want to include in this wonderful forthcoming “cluster sampling post”.
I have 2 questions for the stats gurus, if anyone can help out here. Since the 95% confidence interval refers to the 8,000-194,000 band, what confidence could be assigned to a somewhat narrower band surrounding 100,000, like 90-110k?
Also, what would the inclusion of Falluja data do to the confidence interval? Any ideas?
Telluride,
See dsquared’s answer to your (and my) including-Fallujah confidence interval question in the comments to the “Fisking Falluja” post. It is one-third down the thread, at 3/23 2:53pm.
Okay, because you called me a statistical guru, I’ll answer your questions.
1.) If the estimate is 98,000 and you want a confidence interval from 88,000 to 108,000, the confidence level is going to be about 16%.
2.) Including Fallujah should make the point estimate (the 98,000 number) more extreme and the confidence interval broader.
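(That 16% figure can be sanity-checked under the normal approximation implied by the published 95% interval. A rough sketch, assuming the interval is symmetric and normal, which is only approximately true; scipy’s norm.cdf stands in for a normal table:)

```python
from scipy.stats import norm

mean = 98_000
sd = (194_000 - 8_000) / (2 * 1.96)   # SD implied by the 95% CI: ~47,450

# Probability mass in the 88,000-108,000 band around the point estimate
p = norm.cdf(108_000, mean, sd) - norm.cdf(88_000, mean, sd)
print(f"{p:.0%}")                     # ~17%, i.e. "about 16%"
```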
That was an interesting response from dsquared. It sounded very much like “I have no idea and I’m frankly scared to ask.” Of course we know SOMEWHAT what it would do to the confidence interval, since it would massively blow out the standard deviation of the results used to compute it.
But if we’re treating Falluja and ex-Falluja as completely distinct sets, why are they being conflated again and again? If “there is no very good way to make general statements about percentile confidence levels of this sort of distribution,” why is the study useful?
If the estimate is 98,000 and you want a confidence interval from 88,000 to 108,000, the confidence level is going to be about 16%.
Right, so referring to 100,000 as the “most likely” value means next to nothing. Just checking.
Telluride said:
“Right, so referring to 100,000 as the “most likely” value means next to nothing. Just checking.”
Well, this stuff is complicated and cannot be boiled down to one simple thing. It’s a rough estimate with a range. And that’s informative. I’ll quote one of the study authors, Dr. Gilbert Burnham, from his interview in The New Republic one more time below:
“Now, you can argue, is this increased mortality rate 70,000? Is it 60,000, is it 150,000, is it 200,000? Our best guess, on a conservative side, is 100,000. But it could be less and it could be more. Because just by the statistical nature of this thing, the kind of zone around this number where we are sure this answer truly lies is fairly broad. It’s a national survey, it’s a massive survey, but it’s not a national census.”
It is what it is. Not perfect and also not useless.
Well, so far my challenge isn’t netting much input. dsquared seems to think that the study has practical utility because body counting will miraculously reduce non-combatant deaths; at least, I think that is what he is arguing, but nobody else is biting.
Anybody else, please?
Amac,
“I am untroubled by Roberts’ use of “100,000” instead of 98,000, given the 95% confidence interval (8,000-194,000). This estimate has zero significant figures (E5), or one significant figure (100,000) at most.”
Yes, and if they had not made sweeping claims of mass murder, and had not spun like used-car salesmen in the presentation of the paper (including apparently hiding data), I would give them a pass. But since it appears that the entire paper and the publicity effort around it are anchored on the 100,000 claim, I think I will choose to call them to account.
Well, I don’t think that you can expect the scientists to come up with perfect policy answers. Their thing is to collect and analyze data, which they did.
And that’s really important. I mean how can you possibly know that there aren’t many civilian deaths when you are not even attempting to count them? The main conclusion of this paper is that there is a need for more research into what’s going on, research that may answer Shannon’s additional questions about person, place and time.
This preliminary data makes it appear that there’s a good chance that more civilians are being killed than our government assumed were being killed. I really don’t know what number of civilian deaths the administration deems acceptable (for all I know it could be way, way higher than the 100,000 number). But if it turns out that your weapons and tactics are killing more civilians than is acceptable, then you’ve got to reevaluate your tactics.
Do you think America is fighting the perfect war and that there’s no room for improvement? In everything, whether it be business, technology, warfare, personal fitness goals, etc. there’s room for improvement.
dsquared seems to think that the study has practical utility because body counting will miraculously reduce non-combatant deaths
This is, of course, untrue, though I doubt any neutral observers were fooled. If Shannon decides to declare victory on this account he/she is a bigger fool than I thought, which is quite some fool.
Telluride: I can’t at this late hour be bothered calculating the 90-110K confidence interval but the Excel spreadsheet function NORMSDIST will do it for you. The cum-Fallujah confidence interval would give a negative number of excess deaths if calculated crudely using a normal distribution, but this is clearly an indication that the calculation of the CI is wrong; taking a distribution with a positive 95% confidence level and adding an observation which is much higher shouldn’t make you think that the true value is lower (for example, would you say that the 1987 crash is a good reason to believe that the London market often goes up by 20% in a day?). To be honest, confidence intervals aren’t the be-all and end-all; sometimes you just have to accept that some things can’t be summarised with a single number.
By the way, can we get the charge clear; Shannon is still saying (falsely) that the authors made “sweeping claims of mass murder”, but appears to have dropped the accusation that they did so specifically in order to provide propaganda for Iraqi fascists? I only ask for the benefit of the libel lawyers who I still hope will take an interest in this series.
Right, so referring to 100,000 as the “most likely” value means next to nothing
No, it means that, ex-Fallujah and speaking loosely, the true value is equally likely to be higher than 100K or lower than 100K. It is 58% likely to be higher than 90k, etc etc up to the result that it is 97.5% likely to be more than 8000 and extremely unlikely to be less than zero.
With Fallujah, the distribution is bimodal (a fifty cent word meaning that there are two peaks; one associated with the middle of the ex-Fallujah distribution and one with the Fallujah distribution, rather like a distribution of heights has a male peak and a female peak), so the confidence interval is difficult to calculate, but you can tell that, because you’re adding more probability mass on the higher numbers, the 50/50 figure is going to be higher than 100K.
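(Those percentages follow from the same rough normal approximation, taking 100K loosely as the centre. A sketch of the calculation, not the paper’s own computation:)

```python
from scipy.stats import norm

centre = 100_000                      # speaking loosely, the 50/50 point
sd = (194_000 - 8_000) / (2 * 1.96)   # ~47,450, implied by the 95% CI

for threshold in (90_000, 8_000):
    p = 1 - norm.cdf(threshold, centre, sd)
    print(f"P(more than {threshold:,} excess deaths) = {p:.1%}")
# P(more than 90,000 ...) = 58.3%; P(more than 8,000 ...) = 97.4%
```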
“Taking a distribution with a positive 95% confidence level and adding an observation which is much higher” results in a statistically meaningless amalgam, as you know quite well. You cannot selectively introduce or allude to the Falluja cluster while ignoring its likely effects on the confidence interval.
With Fallujah, the distribution is bimodal
from one outlier you conclude the distribution is bimodal? The authors do not claim it is bimodal, or asymmetrically leptokurtotic, or heavily skewed. I would think this assumption would make the confidence intervals considerably more complicated to calculate.
Give over, Telluride. If you have 32 observations which are reasonably normally distributed between 30 and 50, and then one observation of 200, does that make you think that the true value might be minus 20? I don’t want to start appealing to authority, but “common sense” points in the same direction as the underlying mathematics in this case.
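(A quick simulation of that example, with made-up uniform draws since the 32 observations are hypothetical anyway:)

```python
import numpy as np

rng = np.random.default_rng(0)
obs = rng.uniform(30, 50, size=32)    # 32 observations between 30 and 50
with_outlier = np.append(obs, 200)    # observation 33 comes in at 200

print("mean:  ", obs.mean().round(1), "->", with_outlier.mean().round(1))
print("median:", np.median(obs).round(1), "->", np.median(with_outlier).round(1))
# The mean jumps up by roughly 5 and the median barely moves;
# no measure of central tendency moves toward minus 20.
```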
from one outlier you conclude the distribution is bimodal?
No, that’s a modelling assumption made by me, not a conclusion. The underlying DGP might be a fat-tailed distribution, or it might be a genuine low-p event, or whatever. But given the non-data information that we have that the majority of Iraq saw a small rise in deaths, but some part of Iraq saw very heavy violence, I think the bimodal assumption is defensible, not that anything in the Lancet paper depends on anything like it.
“I don’t want to start appealing to authority, but “common sense” points in the same direction as the underlying mathematics in this case.”
Well here’s an example that might appeal to your authority. An option market with heavily skewed outlier calls says one thing about the underlying market: it is more likely to go down than up. Crazy, huh? Counterintuitive but true.
The Lancet editors are not claiming there is a bimodal or skewed or leptokurtotic distribution. They (and you) are selectively adding and subtracting the outlier value to the ex-outlier confidence interval based on a normal symmetrical distribution, resulting in a conceptual mishmash.
Is there some reason to restrict the discussion to lessons which relate to “policy, strategy or tactics that would have or will save lives in Iraq?”
Is it not of some importance to avoid making horrible mistakes in Iran, Syria and other countries where regime change may be contemplated?
Frankly, if you really can’t see why there is merit in studies which seek to quantify the consequences of waging wars of choice, it is hard to figure out how to communicate with you.
I hope for your sake that you are just changing the subject out of (understandable) embarrassment at your failure to find some significant flaw in the Lancet study. To bite off more than you can chew in a technical debate merely makes you a hothead. To really believe that a matter of life and death isn’t worth study is something else. Can you really not see why it matters?
BTW, assuming either an asymmetrically leptokurtotic or a positively skewed distribution shifts the mode to the left, and I think the mode is what is interesting in this instance, is it not? You’re “the authority” here so do fill us in.
Telluride, as I say, give over. You’re not responding to my main example; if you have a dataset and then add a large positive outlier, then there is no summary statistic of the central tendency of your data which moves downward. I don’t quite understand what you’re saying about options markets, but since there is no way to hedge death rates in an underlying market, I suspect that this is a disanalogy.
Clearing up, your charge:
The Lancet editors are not claiming there is a bimodal or skewed or leptokurtotic distribution. They (and you) are selectively adding and subtracting the outlier value to the ex-outlier confidence interval based on a normal symmetrical distribution, resulting in a conceptual mishmash.
is wrong. The study’s authors (not the Lancet editors, who are a different group of people) are not “selectively” doing anything. They are reporting the crude numbers including the outlier and the extrapolated numbers excluding it. This is a sensible way to report the data.
and
BTW, assuming either an asymmetrically leptokurtotic or a positively skewed distribution shifts the mode to the left, and I think the mode is what is interesting in this instance, is it not?
looks like logorrhea to me. Different types of distribution can have higher or lower skew and all manner of kurtosis, and all manner of funny things can happen to the mode. For any statement about kurtosis and skew, I can come up with a distribution for which the opposite of what you say about the mode is true. However, it is equally obvious that if you have made a distributional assumption under which the addition of a large positive observation shifts your measure of the central tendency downward, then you have made the wrong assumption.
As I say again; you have 32 observations between 30 and 50, then observation number 33 comes in at 200; would this make you say that the true distribution is centred around zero?
By the way, the mode is not what we are interested in in this instance – for any sensible loss function over deaths, we are interested in the expectation of the number of excess deaths, not the single most likely point estimate.
Kevin Donoghue,
“Is there some reason to restrict the discussion to lessons which relate to “policy, strategy or tactics that would have or will save lives in Iraq?” “
I am attempting to falsify my own contention that the study has no practical value beyond the crudely political, i.e. giving political advantage to one party over the other. For that reason I requested that people try to focus on providing examples of practical, i.e. non-political, uses for the study.
“I hope for your sake that you are just changing the subject…”
This is a large subject with lots of points for debate. When I bring up a new point, that doesn’t automatically mean that I have abandoned all others. By nature I tend to chop arguments into subcomponents when possible and try to test each sub-argument. Each blog post tends to address one particular sub-argument.
My original criticism of the study, that cluster sampling was a poor methodology, has never changed. In addition, when rereading the study just recently, the inconsistent use of the Falluja data leapt out at me, so I did a post on that. I’m adding bricks to the structure of my argument, not thrashing around.
I’m really serious about attempting to uncover some utility beyond the grossly political. If you can provide me with a real use I will back down on my contention that the study is merely a political ploy.
“For any statement about kurtosis and skew, I can come up with a distribution for which the opposite of what you say about the mode is true.”
without exception, for a positively skewed distribution, the mode is to the left of the mean. The more positively skewed the distribution, the further leftward the mode. The mode is the relevant measure since “likelihood” is exactly what is being expressed.
The question has been updated to ask “What data couldn’t wait?” (Maybe the update was there before my previous comment; if so, my bad.)
That gives me a bit more to think about but since it is late here I will just note briefly:
If we could be sure (a) that the US had no plans in the works for further adventures and (b) that any lessons from the study would be inapplicable to operations in Iraq, then off-hand I can see no reason to rush.
Knock out either or both of those assumptions and it is obvious that the findings, if heeded, might lead to better management of operations in Iraq and (especially) elsewhere, thereby saving lives. Simple example: it might abort an invasion of Iran, whose cost in excess deaths would surely be higher.
Good night.
“For that reason I requested that people try to focus on providing examples of practical, i.e. non-political, uses for the study… I’m really serious about attempting to uncover some utility beyond the grossly political.”
I answered this on my comment March 24, 2005 05:44 PM. Sorry to be squeaky here, Shannon, but I thought I had a good answer to your question however I posted it during a flurry of activity here and it might have gotten lost.
Kevin Donoghue,
The points you raise are beyond the scope of this thread but I would point out that for people actually interested in scientific accuracy, a single study is only suggestive.
One of the disturbing things about the sociology of this study is the degree to which many have embraced its findings as revealed truth even though, like any other scientific study, it must be verified through replication before we can confirm its accuracy. (Our arguments over methodology are so vociferous because we don’t have any other means of evaluating the accuracy of the study. Solid science ends arguments, it doesn’t start them.)
If we use a single, unverified study to direct our policy we are not actually basing our decisions on good science.
“we concluded that the civilian death toll was at least around 100,000 and probably higher”
-Garfield & Roberts, in The Independent.
“Probably higher”, from the mouth of a statistician, sounds to me like a positive scientific assessment of probability. Either their assumption is a normal distribution without the Falluja data, or a normal distribution WITH the Falluja data, and the concomitant increase in variability. It is altogether conceivable that zero is encompassed by this vastly larger confidence interval.
Chel,
Your 3/24 5:44pm answer to S. Love’s question struck this reader as honest and frank. I supported OIF as the least-bad of a bad batch of alternative policies, and, yes, any policy modifications that reduce noncombatant casualties are worth striving for in the future. And, yes (as commented on in “Fisking Fallujah”), more research will be needed on the number of Iraqis who died directly and indirectly as a result of the war. And the accompanying number of those who lived as a result.
Telluride,
See Mike’s comments on author interviews near the end of the “Fisking Fallujah” thread.
That would assume that civilian deaths aren’t investigated under current policy and that no attempts to improve medical care and sanitation are made. And that these issues aren’t considered for future planning.
The data doesn’t tell us where we can better focus our efforts beyond what we already do.
To summarize:
ARTHUR: Look, you stupid bastard, you’ve got no arms left.
BLACK KNIGHT: Yes I have.
At least the Black Knight, unlike Shannon, was sufficiently honest only to try and pretend the outcome was a draw.
I think I’ve got it. It doesn’t quite answer Shannon’s question of how the study can help Iraqis, but the study tells us that a small study conducted during a war doesn’t provide us with information useful in making decisions during the war, or numbers that are better than what we would get from studies conducted later. It also raises the question of whether a larger study could be conducted safely and cost-effectively during a conflict to provide real-time data for decision making.
The reason why you want to count the casualties is explained here
That doesn’t address the Lancet study. The value of having good casualty data is apparent; the Lancet study doesn’t provide this.
Urgency, having this data now, is asserted, but not demonstrated.
The Lancet study doesn’t tell us anything we don’t already know.
It doesn’t seem to be in the interest of the military to make these policy changes now. It may want to consider making changes in the future when they can be better evaluated and implemented; the Lancet study doesn’t help in this.
Better data will be available in the future and other parties will do the work. Why would the government do the work at great cost to itself when others will do it eventually? This sense of urgency is unfounded; it seems more like they’re lobbying the military to do the work they want done for them.
To sum up, you are arguing that in addition to being an effective propaganda tool for our enemies, the study is a valuable political tool for pressing the military into making rushed policy decisions.
Why thank you AMac!
Tim Lambert,
Thank you for your comment; however, the page you linked to does not answer the specific question of how this particular study was useful and why it was so useful (see update in parent) that it had to be done under difficult conditions that conceivably could lead to significant error.
More generally, the comments you link to comprise a mere unsupported argument from authority by some arbitrary clump of “health experts” asserting without proof or even example that body counts would help save lives. For example, do we have a study comparing a conflict where there was a body count versus one where there was not? I don’t find such unsupported arguments from authority compelling.
I realize that it would seem common sense that body counts would save lives, but nobody has demonstrated that the tactics currently used by the military do not represent the best possible practices resulting in the minimum number of casualties. Warfare is too fluid and each conflict too unique for the tools developed for epidemiology to provide much help to those trying to minimize casualties.
I think my point still stands. This study has no utility beyond trying to influence whether one politician gets elected over the other.
Amac, thanks for the reference. My comment addresses the authors’ misstatement, the Lancet’s deceptive synopsis, and dsquared’s persistent partial use of Falluja by way of “intuitive” appeals that are actually incoherent.
If the distribution is presumed to be normal, the inclusion of the outlier data set would certainly widen the confidence interval. Since they dare not present that new [ridiculously wide] confidence interval, the authors/editors allude darkly to the excluded figure, appealing as dsquared does above to “common sense” instead of statistical rigor. This approach is incoherent: if the outlier is excluded, it adds no information to the probability or confidence interval of the “true” results. If it is included (assuming a normal distribution), it should blow the confidence interval out beyond repair, and that new confidence interval needs to be presented as well. Assuming a skewed distribution, accommodating Falluja would raise the mean, but it may alter the mode and median not at all, as dsquared acknowledges, and would also have counterintuitive implications for the position of the confidence interval [the mode representing the “most likely” individual value, not the raw average].
To take dsquared’s example, adding 200 to his hypothetical distribution changes the likelihood of randomly selecting “41” not very much at all. This is the figure people mean when they say “most likely” – NOT simply the mean.
By grafting the outlier value onto the framework of the ex-Falluja data, “probably higher” is probabilistically incoherent and deceptive. It smuggles the outlier value in front of the jury without acknowledging any new blown-out standard deviation. Under all scenarios, alluding to the “true result” of the Lancet survey as “probably higher” due to Falluja represents a FALSE quantitative, statistical judgment, especially when voiced ex cathedra by a scientist or statistician.
Shannon, Let me get this straight, you admit that it is common sense that body counts save lives but this is not sufficient — you want a controlled study comparing a conflict with a count with one without. But to do such a comparison you need to do a body count in both conflicts because otherwise you can’t measure whether lives are saved or not.
Anyway, at a minimum your own suggestion that a controlled study would be needed satisfies your challenge, since conducting such a study would, according to you, serve a useful purpose.
telluride (10:01am),
I don’t see how one can know the nature of the distribution of violent deaths in postwar Iraq, unless one has examples of similar wars whose deaths have been thoroughly investigated. However, one can make some common-sense statements about it. (Note to those who claim simple statistical concepts are not amenable to common-sense interpretation: skip this comment.)
By far the most common value for “# killed in household?” is zero. Zero is also the most common value if Cluster is taken as the unit to be plotted (18 of 33). So, whatever the distribution is, it is severely skewed, i.e. it’s not Gaussian or some other symmetric curve. Seems to me the best any authors can do is make some reasonable guesses and do some reasonable transforms of the data. The question would be how sensitive the results’ midpoint and CI bounds would be to these analytical assumptions and procedures.
Roberts reported data that required the removal of one cluster prior to analysis. That left them to make their numerical analysis on violent deaths in Iraq from 21 reports (Table 2). Common sense (that term again) suggests that a very heterogeneous distribution of 90 events (21 violent) in 32 clusters will lead to a very rough estimate of Iraq’s death rate.
Roberts’ treatment of those 32 remaining clusters gave an excess-deaths figure that was at the high end of plausibility (98,000) and with a very, very wide CI (8,000-194,000). If the body-count approach is at all meaningful, we already know 8,000 is lower than the actual number of dead, and can confidently assert that it is lower than the number of excess dead.
I take this as strong circumstantial evidence that Roberts didn’t “cheat” on the primary data. If they were going to succumb to that temptation, they could have (a) not had to exclude Fallujah and (b) come up with a CI that was tighter, and more amenable to Les Roberts’ political agenda (see the Les Roberts interview in the Chronicle of Higher Ed., linked earlier).
It’s too bad that the Lancet’s editors didn’t require that Roberts submit their detailed breakdown as “supplemental online information.” A single Excel spreadsheet would seem to have sufficed. Then, people who know about such things could trace Roberts’ analytical steps, and criticize or support their analysis, as the case may be.
Having designed their study and then picked Fallujah, Roberts were in a bind, as their study design did not take such an outlier into account. Common sense, ack, says the design of a study of violence known to be highly geographically heterogeneous should have anticipated this unlikely but not implausible datum. After the fact, the best they could do is apply Kruskal’s and others’ maxims on “wild data”. The extent to which they did so is the subject of some previous comments.
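(To make the heterogeneity point concrete, here is a toy bootstrap over invented cluster counts. Invented, because the cluster-level breakdown was not published; the numbers below merely echo the summary figures above: 18 zero clusters, 21 violent deaths spread across the other 14 ex-Fallujah clusters, and a made-up Fallujah-like value of 50.)

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical cluster-level violent-death counts, for illustration only.
ex_fallujah = np.array([0]*18 + [1]*9 + [2]*3 + [3]*2)   # 32 clusters, 21 deaths
with_fallujah = np.append(ex_fallujah, 50)               # invented Fallujah-like outlier

def bootstrap_ci(clusters, n_boot=10_000):
    """Percentile 95% CI for mean violent deaths per cluster."""
    means = [rng.choice(clusters, size=clusters.size, replace=True).mean()
             for _ in range(n_boot)]
    return np.percentile(means, [2.5, 97.5]).round(2)

print("ex-Fallujah:  ", bootstrap_ci(ex_fallujah))
print("with Fallujah:", bootstrap_ci(with_fallujah))
# Adding the one wild cluster raises the point estimate and blows the
# interval out, which is the sense in which the design couldn't absorb it.
```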
Note to those who claim simple statistical concepts are not amenable to common-sense interpretation: skip this comment
haha. As Amac’s comment shows, score ten points for common sense and minus a million for statistical rigour here. Telluride is using a concept of “skewness” under which a biased coin that comes up with a value of 2 90% of the time and 1 10% of the time has zero “skewness”; this is why super-purists like me don’t like it when people use “skew” to refer to the third moment of a distribution.
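(For the record, the third-moment skewness of that coin works out like this; a quick sketch:)

```python
# Third-moment skewness of the biased coin: 2 with probability 0.9,
# 1 with probability 0.1.
values, probs = [2, 1], [0.9, 0.1]

mean = sum(v * p for v, p in zip(values, probs))
var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
skew = sum(p * (v - mean) ** 3 for v, p in zip(values, probs)) / var ** 1.5

print(round(mean, 2), round(var, 2), round(skew, 2))   # 1.9 0.09 -2.67
# By the third-moment definition this two-point distribution is strongly
# negatively skewed, though intuitively it has no "tail" at all.
```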
I continue to disagree with any interpretation that puts too much weight on modes because I think there is good reason to believe that the underlying distribution is bimodal; when you’re dealing with the estimated excess deaths number it seems clear to me that you care about the expected value. Perhaps people who say that the point estimate is “the single most likely” number are guilty of talking slightly loosely, but it seems a bit much to beat up on them when the context is a debate with people who think that any point in the CI is as likely as any other!
I maintain that there’s really not so much you can do with data that’s intrinsically heterogeneous and volatile; the only really satisfactory way to deal with it would be to do a much, much bigger study, which is why I’d encourage everyone to sign up at http://www.countthecasualties.org.uk
>haha. As Amac’s comment shows, score ten points for common sense and minus a million for statistical rigour here…
Ambiguous sarcasm (?), followed by a detailed statistical response to telluride. May I suggest 10 points for common sense where applicable and 25 points for statistical rigor applied where applicable.
Sorry, I can see how that didn’t come out too clearly; I meant that common sense was delivering a sensible assessment of the data, and rigorous analysis was getting us miles up a blind alley of modes and skews.
telluride is on the right track. The shape of the data distribution is uncertain, so it’s unwise to rely on analysis techniques that assume normality. This is another way of saying that the study should be replicated using a larger sample and better sampling techniques.
dsquared wrote:
That’s right, the best course of action is to do additional studies based on much larger data samples and better survey procedures. dsquared could have written this about 100 comments ago and saved himself a lot of work. (I see no reason to trust the countthecasualties people, BTW, as they seem interested mainly in using casualty data to further their anti-war political agenda. But if one thinks civilian-casualty information is valuable it makes sense to do additional surveys.)
While I don’t like the Lancet study, how about the following report by Human Rights Watch:
http://www.hrw.org/reports/2003/iraq1003/
They present a figure of 94 civilians killed by US forces under questionable circumstances in Baghdad over 5 months in 2003. This may not be a good figure for propaganda purposes, nor is the revelation that most of those killed were young men shot by coalition forces.
But they present criticisms and recommendations that are actually useful. They point to the marking of checkpoints, teaching hand signals, improvements in search techniques and improved language and cultural skills. They document why these are important. They document real tragedies that have occurred due to the negligence of US soldiers and insufficient care and resources and training. They’ve got interviews from a range of witnesses of the same event, photographs. These are well-researched cases. The Lancet Report, by sharp contrast, bases its main claimed conclusion (that the coalition should take greater care with the use of air power) on exceedingly poorly documented and researched cases, in particular, two claimed bombing incidents outside of Fallujah.
Human Rights Watch also clearly condemns insurgent actions:
http://hrw.org/english/docs/2004/08/05/iraq9195.htm
Thank you Heiko & Jonathan.
They also give recommendations as to how to improve the compensation scheme, which, I might add, seems to pay far too little, even considering that Iraq is not a wealthy country.
$11,000 is a very small amount, when the mistake in question is the shooting of a car that was at a standstill far from a checkpoint and in no way threatening, and the woman, who gave birth to a son a week later and who survived the massacre together with a daughter, lost 3 of her children and her husband.
You could call me biased in my praise, because most of what the report recommends, I was in favour of before I read it. I’ve read many very credible accounts by soldiers and Iraqis from blogs which give the same picture as this report, and which have led me to similar conclusions.
The idea, promulgated by the Lancet, that air strikes are the key issue needing further attention seems quite wrong to me. But then again, call me a cynic: it seems to me the Lancet authors may have been overinfluenced by a desire to produce information packaged in a way to be politically damaging to Bush, rather than information that was accurate or useful.
I’ll see if I can find it again, but the WHO also had estimates for 2003.
While I appreciate the argument over the validity of the estimates of the Lancet study, a larger question is determining if the evidence actually supports the findings of the study. Now, reviewing the news coming out of Iraq, it is clear that bombings, insurgent attacks, etc. are publicized, and every misstep of the coalition forces is heavily covered. Evidence: the bomb going off in Baghdad early in the conflict, and the dozen military-age young men that were killed at a “wedding” at 4 AM near the Syrian border. So… if insurgent or coalition activities that kill between 5 and 30 people are significantly covered, we can be confident that if actions like this were occurring on a regular basis, then we would know about it. The bombings that killed over 100 people are definitely covered. So Lancet would have us believe that 5500 people are being killed in a violent manner every day, and there is no reporting on at least 5200+ EVERY SINGLE DAY? This defies logic and reason, and requires a full dose of self-deception. Where are the dead? The fact that 80% of violence in Iraq occurs in 4 of the provinces should make those absolute war zones on par with WWII, yet the evidence is clear that this is not the case. If the Lancet study was accurate, there should be dozens of mass graves left over from such fighting (such as what Saddam’s regime left behind), but alas for Lancet, they don’t exist. This is a simple case of a flawed and biased “study” that is an embarrassment to the scientific community. The second question of the honesty of the researchers, the timing of the release, etc. is something else entirely. I think the evidence speaks for itself – and I think it is abundantly clear what Lancet was trying to do with a high degree of confidence – and determining the truth was definitely two standard deviations out from the norm!
Brian,
I’ll have to bat for the “wrong side” here for a moment ;-)
5500, 5200 per day?
And otherwise your argument is fine, I don’t think lots of large bombings are being “overlooked” by the media.
WHO has the death rate for 2003 at 8 per 1,000. Here is data from 1970 to 2000 from the World Bank.
Tim Lambert,
“you admit that it is common sense that body counts save lives”
Ummm, no. What I said was, “I realize that it would seem common sense that body counts would save lives but” I think you misread me.
Actually, I asserted that the “health experts” were making their determination based on their own hunch, not any existing scientific evidence. I said they would need to conduct studies to make their assertions anything more than guesswork.
Again, I am trying to test whether this particular study had utility at the time it was performed and published. Whether such studies would be useful in general is another topic.
“This study has no utility beyond trying influence whether one politician gets elected over the other.”
Suspend disbelief for a moment and suppose this statement to be true. Well, what of it? Surely it is a matter of some importance to ensure that politicians whose decisions are disastrous are removed from office?
As Edmund Burke very wisely said: “The power of bad men is no indifferent thing.” He also had many useful things to say about men whose judgement is deficient.
In reality of course any study which explores the consequences of a drastic action is certain to have wider uses.
“I am trying to test whether this particular study had utility at the time it was performed and published.”
But you are explicitly limiting consideration to Iraq, where the damage was already done, whereas the main value was likely to be in discouraging further adventures of the same sort. That was the most urgent matter, given that what happened in Iraq might be about to happen on a larger scale in Iran.
In other words, you are framing your enquiry so as to get the result you want – precisely the sort of thing you accuse the Lancet team of doing.
Heiko’s and Brian’s latest comments sent me back to Table 2. Here are the results of some simple arithmetic.
I took the Total columns (children+women+men+elderly), not counting the Fallujah cluster, and split Violence into not Coalition and by Coalition (12 and 9 deaths respectively, from somewhere in a recent thread).
For Preinvasion, I annualized the raw death numbers by multiplying by 82.2% (12mo/14.6mo).
For Postinvasion, I annualized by multiplying by 67.4% (12mo/17.8mo).
Here are the numbers to one decimal.
B.I., Before Invasion, A.I., After Invasion
B.I. A.I. Category
9.0 12.1 Heart attack/Stroke
9.0 7.4 Other chronic disorder
0.8 3.4 Infectious disease
4.9 6.7 Neonatal/Unexplained infant
9.9 8.1 Other
3.3 8.8 Accident
0.8 8.1 Violence not Coalition
0.0 6.1 Violence by Coalition
37.8 60.7 Totals
The total excess deaths, annualized, would be approximated by 60.7 – 37.8 = 22.9.
Violence by Coalition excess deaths, annualized, would be 6.1, or 27% of total excess deaths.
Roberts’ central estimate of 98,000 excess deaths would ascribe about 26,500 deaths to Coalition violence in the 17.8-month post-invasion period, or 50 a day.
Here are the rest of these cocktail-napkin calculations of excess deaths.
In 17.8 mo, then expressed per-day
13,000 25 Heart attack/Stroke
-7,000 -13 Other chronic disorder
11,000 21 Infectious disease
8,000 14 Neonatal/Unexplained infant
-8,000 -14 Other
24,000 43 Accident
31,000 58 Violence not Coalition
26,000 48 Violence by Coalition
98,000 181 Total
This is arithmetic, not statistics. But, as far as I know, these are the only numbers (aside from cluster distribution data) on which Roberts’ EFC statistical analysis is based. So I would expect (common-sensically) that the statistically-derived central tendencies shouldn’t deviate from these by more than a factor of, say, 2.
I would collate the numbers this way:
In 17.8 mo
18,000 Diseases and Infant Mortality
24,000 Accident
31,000 Violence not Coalition
26,000 Violence by Coalition
Roberts’ statistically-derived numbers aren’t presented in their paper.
The Roberts Summary and the final four paragraphs of their Discussion read a bit funny when accompanied by these derivative tables.
Separately, my tables themselves read funny in these comments, because Movable Type (or whatever) collapses multiple spaces into a single one. It’s the best I can manage.
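(A short script reproducing the cocktail-napkin arithmetic may be easier to follow than the collapsed tables; same numbers and steps as above:)

```python
# Annualized deaths per category, before invasion (B.I.) and after (A.I.).
rates = {
    "Heart attack/Stroke":         (9.0, 12.1),
    "Other chronic disorder":      (9.0, 7.4),
    "Infectious disease":          (0.8, 3.4),
    "Neonatal/Unexplained infant": (4.9, 6.7),
    "Other":                       (9.9, 8.1),
    "Accident":                    (3.3, 8.8),
    "Violence not Coalition":      (0.8, 8.1),
    "Violence by Coalition":       (0.0, 6.1),
}

excess_annual = sum(after - before for before, after in rates.values())
print(f"annualized excess: {excess_annual:.1f}")   # ~23, cf. 60.7 - 37.8 = 22.9

coalition_share = 6.1 / excess_annual              # Coalition-violence share
print(f"Coalition share: {coalition_share:.0%}")   # ~27%

total_excess = 98_000                              # the study's central estimate
days = 17.8 * 365.25 / 12                          # the 17.8-month window, in days
coalition = total_excess * coalition_share
print(f"Coalition excess: {coalition:,.0f}, ~{coalition / days:.0f} a day")
# ~26,000, roughly 48-50 a day, matching the figures above
```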
Hello Gerhauser,
My apologies for not being clear. If 5500 are being killed every day, and we hear *at the very most* of a maximum of 300 (the bombings of the Shia mosques last year), then there are at least 5200+ dying every day that we are not hearing about. Either there is a massive coverup, a massive failure by our journalists, or they simply don’t exist. My mind was going faster than my writing :-) My error :-)
AMac – thanks – good posting! That helps frame things a bit. Did they happen to differentiate between insurgents killed by the Coalition and civilians? At that point in time, StrategyPage.com (very reputable) was carrying about 17,000 insurgents KIA. If they are blended into the “Violence by Coalition”, that produces a significant skew. Also, did they indicate where the Iraqi security forces were being tracked in the stats? Thanks!
Kevin Donoghue wrote:
Good point. We should expand the study to Iran, which would allow us to weigh possible tens, hundreds or even thousands of civilian deaths in a preventive U.S. attack, vs. likely hundreds of thousands or millions of civilian deaths if the Iranian regime uses nuclear bombs against Israel (as it has stated it will do), Europe or the U.S.
Perhaps the main difference between us is that you are most interested in discouraging the U.S. whereas I think it’s more important to discourage the Iranian mullahs. I leave it to readers to decide which of us has the more realistic appreciation of current geopolitical risks.
Hi Brian,
I still don’t get it. 5000 per day would be 100,000 in 20 days, 300,000 in 60 days. You are talking about some theoretical upper confidence interval including Fallujah here?
Jonathan: “We should expand the study to Iran, which would allow us to weigh possible tens, hundreds or even thousands of civilian deaths in a preventive U.S. attack….”
In the light of the Lancet study the phrase “even thousands of civilian deaths” suggests you haven’t given any real thought to the loss of life your proposed policy entails. Think hundreds of thousands. Airstrikes would not suffice and Iran would be a much more troublesome country to invade and occupy than Iraq.
This is the point which Shannon Love is missing: the main value of the study is that it is helping, in its own small way, to restore sanity to policy discussions.
Of course, in order to be unaffected, one only has to cover one’s ears and shout “Bad sample! Bad sample!” That, in a nutshell, is the Love critique and I have wasted enough time on it.
Farewell, and thank you for your hospitality.
Hi Kevin,
Iran is a completely different case. It won’t be invaded unless a city gets nuked somewhere, or a nuclear bomb intercepted a few miles off New York, or something of the sort.
It won’t be attacked, because the cost would obviously be much higher (in terms of civilian lives lost and/or American soldiers, depending on the tactics; if carpet bombing was employed, the place could be bombed back to the stone age even without using nuclear weapons, and the likely US death toll would be nearly zero – with the proviso, of course, that this would only be politically possible if it was in clear self-defense) and the benefits (Iran is autocratic, but so much better governed and less aggressive than Iraq in so many ways) much lower than in the case of Iraq.
Kevin Donoghue wrote:
To a less blindered person it might suggest that I do not accept the Lancet study’s conclusions. Since I have already made clear that I reject those conclusions, your response is an evasion.
A reasonable person might also conclude that my use of the phrase “even thousands of civilian deaths” indicates that I think the alternatives are worse, which of course was my point. And of course you ignore this point, because to acknowledge that the Teheran regime is capable of killing millions of people is to admit that maybe the U.S. war effort is justified after all.
Your premises appear to be 1) that the Lancet article’s assertions about civilian deaths are important in themselves, even if they are unsupported by data and 2) that if I agree with one of your stipulations I must therefore also agree with your conclusions. Both of these premises are non sequiturs.
That is indeed the gist of Shannon’s argument. And again your position appears to be that the anti-war case is intrinsically so important that we should overlook the inadequacy of the data supporting it. That’s absurd.
We really have arrived at po-mo science.
Brian (3/25 5:15pm):
>Did Roberts differentiate insurgents killed by the Coalition from civilians?
Table 2 apparently breaks down to 9 killed by coalition excluding the Fallujah cluster (EFC). Roberts says (p. 7 col. 2) that 3 of 61 IFC killings were by US ground forces; one might have been an insurgent and two were mistakes. All the rest were from airstrikes.
> Did Roberts track Iraqi security forces in the stats?
As far as I can tell, no.
> At that point in time, StrategyPage.com was carrying about 17,000 insurgents KIA.
Well, like Shannon Love, I find that every time I re-read the Roberts paper it looks weird in a new and different way. Unlike Love, I accept the honesty of the raw data, mainly because as a whole it’s so “bad” that I can’t see anybody massaging it to get it to this point. I can’t say that the data collection was done correctly; it just seems to be honest.
Having to exclude the Fallujah cluster is really embarrassing, in my opinion, despite what dsquared, T. Lambert, K. Donoghue, disputo, and others say. This was a study of postwar violence in Iraq, known a priori to be highly geographically heterogeneous. Fallujah was known to be the most violent place, followed by (off the top of my head) Ramadi, Tikrit, the Syrian border, Najaf, the Salman Pak area, and Sadr City [not in any order]. When their cluster-assignment process gives them Fallujah, their analysis turns out to be based on algorithms that can’t handle it.
Re-read Kruskal’s “wild data” essay. He’s talking about unexpected things causing outliers: measurement failures, fins falling off bombs, data-entry errors. Here, we are talking about an outlier that is caused by a seemingly correct tabulation of the very thing that the authors knowingly set out to study. Common sense says: bad design. If I were to guess, it would be (1) failure to segregate known high-violence areas, and (2) insufficient power overall, i.e. 33 clusters were far too few to produce a usefully tight confidence interval.
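To make the power problem concrete, here is a minimal simulation sketch, with entirely invented numbers (my illustration, not a re-analysis of the survey’s data), of how a rare Falluja-like cluster whipsaws a 33-cluster average:

```python
# Illustrative only: invented numbers, not the survey's data.
import random

random.seed(1)

def one_survey(n_clusters=33, typical=3, hotspot=50, p_hotspot=0.03):
    # Each cluster reports a death count; a few clusters are hotspots.
    return [hotspot if random.random() < p_hotspot else typical
            for _ in range(n_clusters)]

means = sorted(sum(s) / len(s) for s in (one_survey() for _ in range(1000)))
print("2.5th percentile: ", means[25])
print("median:           ", means[500])
print("97.5th percentile:", means[975])
```

The specific numbers are meaningless; the shape is the point. When hotspots are rare but huge, the survey-to-survey spread of the estimate is driven almost entirely by whether a hotspot lands in the sample.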
More later.
An update on the CIA data (not that I find their reply very useful):
Thank you for your inquiry. I am the analyst for Iraq here in the International Programs Center and have been asked to respond to your inquiry. While Census Bureau’s International Data Base, which feeds into the CIA factbook, is updated annually, we sometimes have not completed analysis of new information at the time of an update, and this is the case for Iraq. We have begun a revision to the IDB mortality estimates for Iraq but have not completed that revision at this time. I expect our new estimates will be more in line with the UNICEF estimates than those currently shown in the IDB and in the Factbook, but until we complete this update I can’t be more specific.
Sincerely,
–
Dr. Thomas M. McDevitt
Chief, Population Studies Branch
International Programs Center
Population Division
U.S. Census Bureau
ph: 301-763-1356
FAX 301-457-1539
email: thomas.m.mcdevitt@census.gov
The Unicef estimate of infant mortality is nearly twice that of the CIA factbook. Oh, how I hate this uncertainty, and the apparent unwillingness of the IPC and Unicef to publish their methods clearly and in an easily accessible place.
Shannon, let me get this straight: you admit that it is common sense that body counts save lives, but say this is not sufficient; you want a controlled study comparing a conflict with a count to one without. But to make such a comparison you need to do a body count in both conflicts, because otherwise you can’t measure whether lives are saved or not.
One possibility that comes to mind is that attempts to minimise civilian casualties can be counter-productive: it may mean that the conflict lasts longer than it would otherwise, and that the combatants may become so powerful that certain cities become no-drive zones for coalition forces – meaning that fighting the combatants may involve air strikes, which may increase the possibility of civilian deaths.
If this is so, and body counts cause the armed forces to take actions that attempt to avoid civilian casualties, then it’s possible that body counts may actually increase civilian casualties.
Roberts’ Data and Violence-by-Coalition Deaths
Table 2 highlights that excess deaths in a country of 25 million are being projected from very small numbers. Total postwar dead counted: 90 EFC, 142 IFC.
Common sense says that when one breaks a total into smaller subsets, the relative CI on each piece gets wider. Estimates that start from very small numbers of deaths will yield CIs that are very broad.
Roberts’ best-estimate of excess deaths and 95% CI from 90 total deaths: 98,000; 8,000-194,000
Roberts’ best-estimate of excess deaths from 9 Violence-by-Coalition deaths: not stated.
OK, here’s my arithmetic-based estimate of Roberts’ estimate of Violence-by-Coalition deaths: 26,000 (see my handling of Table 2 data earlier on this thread, 3/25 5:07pm).
What’s the 95% CI on Violence-by-Coalition deaths? It has to be, proportionally, much wider than the one given for all deaths. I’ll guess it has the same shape as the overall CI but twice its relative width (this is not rigorous; for purposes of discussion only).
Violence-by-Coalition deaths: 26,000; 1,000-103,000.
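My guess rests on the usual small-count intuition: even ignoring the cluster design effect (which makes everything worse), the relative uncertainty on a count scales roughly like one over the square root of the count. For example:

```python
# Rough Poisson-style intuition, ignoring the cluster design effect,
# which would widen these intervals further. Not the paper's method.
import math

for deaths in (90, 9):
    rel_halfwidth = 1.96 / math.sqrt(deaths)
    print(f"{deaths} deaths: roughly +/- {rel_halfwidth:.0%} relative uncertainty")
```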
The authors can make statements about Violence-by-Coalition deaths in the paper. The Lancet’s editors can make statements about Violence-by-Coalition deaths in editorials. Journalists and bloggers can make statements about Violence-by-Coalition deaths in the wider world.
However, as they relate to the Roberts mortality analysis, such statements are based on an unstated number (~26,000?) that would be accompanied by a very broad 95% CI (1,000??-103,000??).
Where is the real statistical-analysis-derived estimate for Violence by Coalition deaths? Where is the calculated 95% CI for that number?
In their absence, all Roberts-based discussion of Violence-by-Coalition deaths rests on nine raw deaths and guesswork. That’s it.
What Roberts says in the Summary and the Discussion has already been reviewed in these threads, notably the habit of segueing, unannounced, from EFC analysis to IFC raw numbers.
Here’s what Lancet editor Richard Horton wrote in the accompanying editorial “The war in Iraq: civilian casualties, political responsibilities” (Lancet v. 364, #9448, p. 1831, 20 Nov 2004 [same date as Roberts]):
Here is how Bushra Ibrahim Al-Rubeyi begins his invited Comment, “Mortality before and after the invasion of Iraq in 2003” (Lancet v. 364, # 9448, p. 1834, 20 Nov 2004):
I hope these editorial comments on Roberts’ paper, released just in time for the US Presidential election, were due to incompetence in interpreting the article. The alternative is that they were sound-bites designed to mislead readers as to what conclusions Roberts’ data actually supported.
This discussion is germane to Shannon Love’s assertion that the Lancet has abandoned standards of objectivity in its handling of Roberts. If this is how the chief editor behaves, it’s easy to imagine how the Lancet’s opaque peer-review process could be corrupted. The Lancet’s “brand reputation” – its prestige and its citation index – means very little.
What this journal publishes is not to be trusted.
By the way, here is a Guardian interview of Lancet editor Richard Horton on the subject of the Roberts paper from early November, 2004. He sure thinks he read the paper carefully and understands it well.
Thursday, March 24, 2005 6:25 AM
Thank you for taking action at http://www.countthecasualties.org.uk
Here is a copy of your email to Jack Straw:
Dear Foreign Secretary,
I am writing to ask the government to commission a survey to determine how many Iraqis have died or been injured since the March 2003 invasion – and the cause of those casualties.
I believe that Iraqi casualty figures could provide useful decision-making information. If a good count can be made, it may also be useful in dispelling the propaganda of the biased parties which often do their own shoddy research.
Yours sincerely,
Aaron Chmielewski
Michigan, US
As far as I can tell, this is (more or less) how the study arrived at the 100,000 “additional” death figure (please feel free to correct me if I’m wrong or to point out any flaws in this conclusion). I am almost certainly missing a few minor details, but I believe the general method I am presenting here is accurate.
Could anyone provide me with the equation the study used to come up with the range of 8,000 to 194,000? Thanks.
—Ray D.
Ray: they used the commonly available statistics program EpiInfo; I’d guess that the specific equations for estimating the standard errors would be in the EpiInfo manual.
Quick question, and a relevant one in light of the very small sample sizes here.
What steps were taken to ensure that deaths due to the actions of Saddam’s security forces, which may not show up in formal death statistics and may not end up in morgues, were included in the study?
What measures were taken, if any, to ensure that this item would be adequately sampled and accounted for, especially given that such deaths tended to happen in remote locations that were not towns, and that the victims may have been taken there before the period studied?
Enquiring minds want to know.
I have posted a comment at Tim Lambert’s site Deltoid, explaining why I think the Roberts paper has five, maybe six, severe problems that should have–and could have–been fixed prior to publication.
As close readers of these threads know, I have become reasonably confident that the raw data were honestly collected and collated, so Roberts’ harshest critics will find no satisfaction there.
This hyperlink will take you directly to the comment (timestamped 28/3 13:48), on the 25 March post “Lancet Links.”
Above post’s five, maybe six severe problems
Oops, make that four, maybe five severe problems
The Lancet Study: A Closer Look
Well, with all of the argument surrounding it, I decided to have a look at the Lancet Study myself. Here is what I concluded after reading it:
The Lancet Study is based on a survey conducted in 33 “clusters” throughout Iraq in which 988 households containing 7868 residents served as the basis for the study’s findings. Each cluster contained 30 households and an average of 238.4 residents. One of the clusters just happened to end up in Falluja, where an overwhelmingly disproportionate number of deaths had occurred. For that reason it was not considered in the calculations that led to the famous 100,000 figure because, according to the authors, “the Falluja cluster is an obvious outlier and might not belong with the others.”
So for its findings the study relied on 32 clusters with (if we take the averages) 960 households and around 7630 residents. In these 32 clusters, 46 deaths were reported in a 14.6 month (or 442 day) period from January 1, 2002 to March 18, 2003 leading up to the war, which yields an average of 3.15 deaths per month. This period is referred to as the “pre-invasion” period by the study’s authors. In the same clusters, 89 deaths (21 of which were violent) were reported in the 17.8 month (or 548 day) period from March 19, 2003 to September 16, 2004, which yields an average of 5 deaths per month. This period, which includes the major combat operations phase of the Iraq conflict, is referred to as the “post-invasion” period by the study’s authors.
The numbers above serve as the fundamental basis for the Lancet Study’s estimate of “100,000 excess deaths” in Iraq, which the study claims were mainly attributable “to coalition forces.” Here is essentially how they used the numbers they provided to get that result:
First, because the two periods under consideration are different (14.6 months or “110,538 person-months of residency” versus 17.8 months or “138,439 person months of residency”), one has to adjust for time. One of the ways this can be done is by simply taking the average number of recorded deaths per month “pre-invasion” (3.15) and multiplying it by 17.8 months. That gives us right about 56 deaths. Remember for the “post-invasion” period, we had 89 deaths in 17.8 months. So now, after adjusting for time, we still have 33 “additional” deaths (89 minus 56) in the “post-invasion” period in our 32 clusters.
You may be scratching your head at this point and wondering…how did they get 100,000+ “additional” Iraqi deaths from 33? Well, this is how:
First you take the estimated population of Iraq: The survey uses 24.4 million. You divide the total population of Iraq by the overall number of people included in the 32 clusters surveyed (7630) for a result of 3198. Then you take your 33 “additional” deaths and multiply them by 3198…voila…there you have your 100,000+ “additional” Iraqi deaths. Of course there are numerous different ways one can plug in the numbers that will lead to slight variations, but the end result will always be right around 100,000.
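For anyone who wants to check the arithmetic, here is the back-of-envelope version in a few lines of Python. To be clear, this is my reconstruction, not the study’s actual regression-and-bootstrap procedure (on which see below):

```python
# Back-of-envelope reconstruction of the headline figure (my arithmetic,
# not the study's actual method).
pre_deaths, pre_months = 46, 14.6
post_deaths, post_months = 89, 17.8

expected_post = pre_deaths / pre_months * post_months  # about 56 deaths
excess_in_sample = post_deaths - expected_post         # about 33 deaths

iraq_population, sample_population = 24_400_000, 7630
scale = iraq_population / sample_population            # about 3198

print(round(excess_in_sample * scale))                 # about 105,000
```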
For the record, I attempted to obtain from the study’s leader, L. Roberts, the exact methods and computations used to reach the 98,000 figure cited by the study for the 32 clusters that served as the basis for its 100,000 estimate. His reply:
“Dear Ray,
I am sorry, but I have 67 new e-mails today and I cannot get into this, especially the 98,000 since it was based on 32 regression lines with a boot-strapped confidence interval.”
The study then used a further procedure to calculate its “confidence interval” of 8,000 to 194,000 “additional” deaths.
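Since Roberts mentions a bootstrapped confidence interval over the 32 clusters, here is a minimal sketch of what a cluster bootstrap looks like in general. The per-cluster numbers are invented placeholders, and the paper’s actual procedure (a regression line per cluster) is more involved:

```python
# Minimal cluster-bootstrap sketch. The per-cluster excess-death counts
# below are invented placeholders, NOT the survey's data.
import random

random.seed(0)

cluster_excess = [2, 0, 1, -1, 3, 0, 2, 1, 0, 4, 1, 0, 2, 1, 3, 0,
                  1, 2, 0, 1, 5, 0, 2, 1, 0, 3, 1, 2, 0, 1, 2, 0]  # 32 clusters
scale = 24_400_000 / 7630  # crude person-based extrapolation factor

estimates = []
for _ in range(10_000):
    # Resample whole clusters with replacement, then extrapolate.
    resample = random.choices(cluster_excess, k=len(cluster_excess))
    estimates.append(sum(resample) * scale)

estimates.sort()
print("bootstrap 95% CI:", round(estimates[249]), "to", round(estimates[9749]))
```

The point is the mechanism (resampling whole clusters rather than individuals); the specific numbers here mean nothing.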
That’s right, ladies and gentlemen: this entire study is based on a few dozen deaths. To be more precise, the entire study and its 100,000 “additional” death estimate are based on 89 deaths in a 17.8-month period during and after the war versus 46 deaths in a 14.6-month period before the war, in a 32-cluster sample group. In other words, if even one cluster was disproportionately affected by death (or a lack thereof) either before, during or after the war, the results of the study would be dramatically off base. If even just a few deaths were incorrectly recorded or invented, that too would have dramatically changed the study’s results. If even just one or two of the surveyed families was the unfortunate victim of a particularly violent bombing incident or terror attack, the entire survey could be way off base and fatally skewed. In fact, a change of just 10 deaths, for any reason, would have the effect of throwing this survey off by around plus or minus 30,000 deaths. The conclusions built on this data are like houses built on quicksand, because all of the results and assumptions rely on a sample group that is simply too small to give us any reliable or relevant data whatsoever.
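The plus-or-minus 30,000 figure follows directly from the same crude scale factor used above:

```python
# Each sample death extrapolates to roughly 3,200 deaths nationwide,
# so shifting 10 deaths moves the estimate by about 32,000.
print(round(10 * 24_400_000 / 7630))  # about 31,980
```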
How could anyone claim that this sample group was large enough to be taken seriously when debating an issue of such enormous gravity? How could anyone rely on a survey that makes such heavy claims of 100,000 “additional” dead in Iraq when it is based on a difference of around 33 deaths (time adjusted)? This is explainable only if we consider the worldview and ideology of those who have exhibited a burning desire to believe the results, whether true or not.
IBC vs. Lancet
The Iraqi Body Count is another left-leaning project aimed at recording the number of deaths in Iraq during and after major combat operations. Unlike the Lancet study, which relied on a mathematical estimate derived from a small sample group, the Iraqi Body Count is dependent primarily on the media for its casualty tally, but has also included information from the Iraqi government and in a few cases from NGOs.
Even assuming that all of the media sources and all of the other data sources the IBC relies upon are trustworthy and reliable (they include Al-Jazeera), the maximum casualty count according to the site is just over 19,500. That would mean, if we believe the Lancet study’s estimate of 100,000 “post invasion” deaths to be true, that over 80,000 Iraqi deaths, caused “mainly by the coalition” have gone completely unnoticed by the international media, Iraqi authorities and NGOs. That is more than 80% of the deaths the Lancet study estimates to have occurred.
Infant Mortality and the Lancet Study
The study’s “pre-invasion” infant mortality rate was also derived by combining a tiny sample group with a flawed assumption. Here again, the study relies on a minute number of actual recorded infant deaths to reach its reported “pre-invasion” infant mortality rate of 29 deaths per every 1000 livebirths for all Iraq. To be more precise, 8 actual infant deaths serve as the basis for this statistic that is purported to accurately reflect the infant mortality rate in a nation of over 24 million.
Here is how the authors came up with the figure: the study recorded 275 births and 8 infant deaths in the 14.6 months before the war. From those figures, the authors derived the infant mortality rate of 29 deaths per 1000 livebirths, which they then accepted as an accurate estimate for the entire nation.
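That rate is a one-liner to check:

```python
print(8 / 275 * 1000)  # about 29.1 infant deaths per 1000 livebirths
```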
The study defends the 29 figure by noting that “the preconflict infant mortality rate (29 deaths per 1000 livebirths) we recorded is similar to estimates from neighbouring countries.” In so doing it blatantly ignores a number of key facts. First, other “neighbouring countries” were not subject to sanctions or Saddam Hussein’s reign of tyranny in the same period. Secondly, a UNICEF study conducted in Iraq for the period from 1994 to 1999 came up with an infant mortality figure of 108 per 1000 livebirths. That would mean, if the UNICEF numbers are accurate, that the infant mortality rate would have had to fall by a factor of more than three and a half within three years in an Iraq under sanctions and Hussein’s rule.
The “58-Fold” Canard
Some particularly outspoken critics of the Iraq war have pointed with horror and outrage to the Lancet study’s finding that: “Violence-specific mortality rate went up 58-fold during the period after invasion.”
One problem with the 58-fold figure is that it is derived using all 33 clusters, including the data gathered in Falluja. The authors came to the figure using this data: “After the invasion, 142 deaths (73 of them violent) were reported in 138,439 person-months of residency. Before the invasion, respondent households reported 46 deaths (1 of them violent) during 110,538 person-months of residency.”
So again, you adjust for exposure time by dividing 110,538 person-months by 138,439 person-months, which gives 0.79846. Then you multiply that by the ratio of violent deaths “post-invasion” to violent deaths “pre-invasion,” which is 73 to 1. And 73 multiplied by 0.79846 gives us 58.3.
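Equivalently, as a ratio of violent-death rates per person-month:

```python
# The 58-fold figure as a rate ratio.
post_rate = 73 / 138_439  # violent deaths per person-month, post-invasion
pre_rate = 1 / 110_538    # pre-invasion: a single violent death
print(post_rate / pre_rate)  # about 58.3
```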
But get this: if even one more violent death had been reported for the “pre-invasion” period, it would have cut the “58-fold” figure in half, to 29! In fact, the “58-fold” figure depends entirely on there being just one reported violent death in the “pre-invasion” period.
Here again, the data is fatally flawed and rendered useless by the fact that the sample group is simply far, far too small to produce meaningful results.
Falluja Not a Risk???
In the “Methods” section of the Lancet study, the authors changed the locations of some of the clusters they planned to visit. The reason, according to them was the following:
“During September 2004, many roads were not under the control of the Government of Iraq or coalition forces. Local police checkpoints were perceived by team members as target identification screens for rebel groups. To lessen risks to investigators, we sought to minimise travel distances and the number of Governorates to visit, while still sampling from all regions of the country. We did this by clumping pairs of Governorates.”
In other words, some of the clusters were moved in order to (in the words of the authors) “lessen risks to investigators.” Yet the same study authors who were so worried about risks had no problem allowing their “investigators” to drive right into Falluja in September 2004 and conduct their survey while the city was still under the control of violent insurgents and subject to a Coalition siege featuring almost daily bombardment from the air and ground. Let’s not forget about the kidnapping victims who ended up at Al-Qaeda headquarters in Falluja during this same period.
Now, how, exactly, is that consistent? If the study leaders were as concerned about safety as they claim they were, why would they allow their investigators to drive right into Falluja at a time like that? What could have possibly been their motivation? Does that not seem just the slightest bit contradictory? All things considered, one has to wonder about the logic that went into selecting the other 32 clusters chosen as well.
Saddam’s Long Rule of Violence Not Accurately Reflected by Study
And how can we say that the 14.6 months preceding the war are an accurate reflection of Saddam Hussein’s 24-year regime and the violence wrought by that regime over the years? The year and a half before the war happened to be one of the less violent periods of Hussein’s rule. But is this period an accurate reflection of the average number of deaths suffered at the hands of the Baathist tyranny over the years? Of course not! And there is no guarantee that Saddam and his vicious sons would not have carried out future massacres had they been left in power. Their history (they were responsible for killing hundreds of thousands) certainly would not have made such an event unlikely.
Again, for the reasons given above, the Lancet study is fatally flawed. Above all, the data derived from the study is useless because the group sampled is too small and too subject to radical change through small anomalies.
One last point: Tim has mentioned the death certificate issue. He has pointed out that in the 78 households where documentation of death was requested (or, in the words of the authors, where “confirmations were attempted”), it was provided in 63 cases (81% of the time). However, let’s keep in mind that the study recorded a total of 231 deaths. That would indicate that, unless all 231 deaths in fact occurred in just 78 households, the study did not even request proof of death in many cases of reported death. Why don’t you deal with these issues, Tim, dsquared and the other Lancet defenders???
—Ray D.
Because most of them are rehashes of critiques already dealt with.
If even just one or two of the surveyed families was the unfortunate victim of a particularly violent bombing incident or terror attack, the entire survey could be way off base and fatally skewed.
Yes, but how likely is this? If “particularly violent bombing incidents” were rare, then it is very unlikely that this would bias the results. On the other hand, if “particularly violent bombing incidents” were common, then the number of deaths is likely to be high. This is, to be frank, why random sampling works, and the fact that you don’t seem to understand this suggests that you’re not really familiar with your subject matter. The sample isn’t “ridiculously small”; they sampled 7800 individuals in 33 clusters. If you find 33 deaths in your sample when you only expected to find 19, then that’s a significant result.
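To put a rough number on that, treat deaths as independent for a moment (the cluster design violates this, so take the result as an indication only, not as the paper’s analysis):

```python
# Chance of seeing 33 or more deaths if the true expectation were 19,
# under a naive Poisson model.
import math

expected, observed = 19, 33
p_tail = 1 - sum(math.exp(-expected) * expected**k / math.factorial(k)
                 for k in range(observed))
print(p_tail)  # roughly 0.002, i.e. hard to explain as chance
```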
Nevertheless, Ray has clearly done some work here and I respect that; it’s a good faith critique and much better than most. Answering those points that I consider to be reasonably new:
[On the IBC versus Lancet theme …]
That would mean, if we believe the Lancet study’s estimate of 100,000 “post invasion” deaths to be true, that over 80,000 Iraqi deaths, caused “mainly by the coalition” have gone completely unnoticed by the international media, Iraqi authorities and NGOs
No. Firstly the statement “mainly by the coalition” does not appear in the Lancet study in that form and does not accurately summarise its findings. But more importantly, IBC has a tight definition of what it will add to its count (a major source of downward bias: when IBC gets a news report of “a family” being killed, it increments the counter by 4 to be conservative, despite the fact that an average non-extended family in Iraq is more like 6). IBC also does not make anything like the sweeping claims you make on its behalf about the comprehensiveness of its coverage. IBC does not endorse the use of its number as a stick to beat the Lancet survey.
[on infant mortality]
Secondly, a UNICEF study conducted in Iraq for the period from 1994 to 1999 came up with an infant mortality figure of 108 per 1000 livebirths. That would mean, if the UNICEF numbers are accurate, that the infant mortality rate would have had to fall by a factor of more than three and a half within three years in an Iraq under sanctions and Hussein’s rule.
The UNICEF study for 1999 was based on fieldwork carried out in 1998. It therefore does not cover the oil-for-food period under which sanctions were relaxed. Oil-for-food had a very significant effect on infant malnutrition, and infant malnutrition is closely correlated with infant mortality.
Some particularly outspoken critics of the Iraq war
The Lancet team are not responsible for what “particularly outspoken critics of the Iraq war” say; they presented the data they had.
[on cluster selection; this is the only genuinely new critique I have seen]
Now, how, exactly, is that consistent? If the study leaders were as concerned about safety as they claim they were, why would they allow their investigators to drive right into Falluja at a time like that? What could have possibly been their motivation? Does that not seem just the slightest bit contradictory? All things considered, one has to wonder about the logic that went into selecting the other 32 clusters chosen as well.
The clusters were selected randomly, so “logic” does not enter into it. Since Fallujah is representative of other high-violence areas of Iraq, it would not have been honest to simply pretend that it was not there when the random sampler selected a cluster there. The survey team minimised their risks by not also sampling Najaf, Samarra and Ramadi.
moving on:
And how can we say that the 14.6 months preceding the war are an accurate reflection of Saddam Hussein’s 24-year regime and the violence wrought by that regime over the years?
This is clearly not a critique of the study itself; if you want to argue this separate point it would be best to do so outside the context of the Lancet study.
That would indicate that, unless all 231 deaths in fact occurred in just 78 households, the study did not even request proof of death in many cases of reported death.
The study gives its reasons why it did not request death certificates in every case; it requested two death certificates per cluster (except, obviously, in those clusters with no deaths) and you give no reason to believe that this wasn’t a satisfactory way to deal with the problem.
@ dsquared:
I found this point interesting:
“The UNICEF study for 1999 was based on fieldwork carried out in 1998. It therefore does not cover the oil-for-food period under which sanctions were relaxed. Oil-for-food had a very significant effect on infant malnutrition, and infant malnutrition is closely correlated with infant mortality.”
Even assuming that there was a significant improvement in nutrition due to oil-for-food, it would still be very difficult to argue that the rate in areas under Saddam dropped from an estimated 108 to 29. Even in autonomous Kurdish areas of Iraq not affected by sanctions or Saddam, the infant mortality rate was recorded to be 59 per 1000 livebirths for 1994 to 1999, more than double the figure the Lancet study gives us for all of Iraq in 2002.
Anyway dsquared, if the infant mortality number is an indicator of the accuracy of this study, then I think you are in a lot of trouble…
“The study gives its reasons why it did not request death certificates in every case; it requested two death certificates per cluster (except, obviously, in those clusters with no deaths) and you give no reason to believe that this wasn’t a satisfactory way to deal with the problem.”
With all due respect, dsquared, I think it is a problem, because we are clearly relying on a lot of data that is unproven and undocumented. If even just a few people made up deaths or failed to report them, the results would have been changed dramatically, due to the small number of deaths actually reported (46 vs 89/142).
BTW, for the record, I would like to make a correction to my first post above, actually the study recorded a total of 188 deaths (if we count the Falluja data), not 231. I added 142 and 89 when I should have added 142 and 46. Just want to be clear on that. The original point I made remains though.
In terms of your other points, I don’t think they really challenge my arguments.
—Ray D.
@ dsquared:
Just wanted to address one more point you made:
“No. Firstly the statement “mainly by the coalition” does not appear in the Lancet study in that form and does not accurately summarise its findings.”
Well, here is what the study itself says, I will let the readers decide:
I am baffled how the study authors can make that statement if they exclude the Falluja data. I have to believe that they are counting the Falluja data in making this statement. Otherwise, the data does not support saying that.
This points to another problem with the study: whether the Falluja data is included or excluded is not always made clear.
—Ray D.
@ dsquared:
You question how likely it is that one of the families in the survey could be disproportionately impacted by death (or a lack thereof) through an event like a bombing. Well, there were 988 households surveyed. Do you find it unlikely that even one or two of them might have been unusually hard hit by a bombing or act of terror during or after the war? Again, if even just one or two families were especially hard hit, it would have dramatically impacted the data. That is why I am saying the sample group is too small, and also why I am saying all of the conclusions built on this data are like a house built on quicksand.
—Ray D.
Do you find it unlikely that even one or two of them might have been unusually hard hit by a bombing or act of terror during or after the war?
Well no, but that’s because I believe that there were a lot of fatalities due to bombing and acts of terror after the war. It’s not true that “just one or two” households could have made a major difference to the results, by the way; excluding the violent deaths in the Fallujah cluster, you’re going from 36 deaths to 60 deaths (on an annualised basis, multiplying the prewar deaths by 12/14 and the postwar deaths by 12/18). That’s 24 deaths; the average household size in the survey was 8 and the survey didn’t count households in which everyone was dead (obviously).
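(Spelling that annualisation out with the survey’s exact 14.6- and 17.8-month windows, rather than my rounded fractions, gives roughly 38 and 60, a gap of about 22 rather than 24; the point stands either way.)

```python
# Annualised death counts, excluding the Falluja cluster.
pre_annual = 46 * 12 / 14.6    # about 38 deaths per year pre-war
post_annual = 89 * 12 / 17.8   # about 60 deaths per year post-war
print(round(pre_annual), round(post_annual), round(post_annual - pre_annual))
```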
@ dsquared:
I have to start with a correction to an earlier post: to my knowledge, Kurdish areas were also subject to sanctions but not entirely subject to Saddam’s control. I still find the presumption that infant mortality fell by a factor of nearly four in three or four years highly questionable.
Now: It is absolutely likely that a few deaths could have dramatically changed the results. Why? All of the data depends on 46 pre-war deaths in 14.6 months versus 89 (or 142 if we count Falluja) “post-invasion” deaths in 17.8 months. It doesn’t take a rocket scientist to figure out that even a difference of 10 deaths in either period would give us very different results.
You write:
“the average household size in the survey was 8 and the survey didn’t count households in which everyone was dead (obviously).”
Honestly, I’m not sure it is so obvious. Here is what the study says:
It seems to me that they did try to take that factor into account, dsquared. I don’t think I’ve misinterpreted the study, either. Let me know if you disagree, and why.
—Ray D.