Tuesday, July 01, 2008

Obermeyer, Murray and Gakidou

Thanks to Donald Johnson writing at Deltoid for the pointer to this article from the British Medical Journal, "Fifty years of violent war deaths from Vietnam to Bosnia: analysis of data from the world health survey programme" by Ziad Obermeyer, Christopher J L Murray, and Emmanuela Gakidou. Below are my comments, expanded from my initial thoughts at Deltoid. I refer to the authors by their initials: OMG.

OMG's key claim, for purposes of Lancet aficionados, is that "media estimates capture on average a third of the number of deaths estimates from a population based surveys." This matters because Les Roberts has been running around for years claiming that passive surveillance (as Iraq Body Count uses) is a horrible method of estimating mortality and never (except possibly in places like Bosnia) captures more than a small percentage of all deaths. (In fairness to Roberts, this is a new article and so, perhaps, his previous claims were justified by the research he had access to at the time.)

Will Roberts now acknowledge this? Time will tell.

The paragraph most directly relevant to disputes over Iraq mortality is:

As a final point of comparison, we applied our correction method, derived from the comparison of survey estimates with Uppsala/PRIO data, to data from the Iraq Body Count project’s most recent report of 86,539 (the midpoint of the 82,772 to 90,305 range reported in April 2008) dead in Iraq since 2003. Our adjusted estimate of 184,000 violent deaths related to war falls between the Iraq Family Health Survey estimate of 151,000 (104,000 to 223,000) and the 601,000 estimate from the second Iraq mortality survey by Burnham and colleagues. [footnotes omitted]


1) Tim Lambert enjoys making a listing of various (reputable) estimates of mortality in Iraq. Example here. Now, I might argue that, given all the problems that ORB has had, its estimate does not belong in Tim's collection. But there can be no doubt that OMG's estimate does belong. Will Tim add it?

2) If your main interest is judging the quality of L2, then a better comparison would have used the IBC numbers to July 2006 (mid-point 47,668), thus covering the same time period as L2 and IFHS. I am not sure what the exact formula is that allows OMG to go from 86,539 to 184,000. Assume that we can just apply this ratio (184,000/86,539 = 2.13) to the IBC estimate of 47,668. That would yield a violent death estimate of roughly 101,000. Recall that, 2 years ago, Jon Pedersen estimated violent deaths at 100,000. Moreover, the IFHS estimate would be lowered from 151,000 to around 100,000 if you removed the "arbitrary fudge factor" (in Debarati Guha-Sapir's marvelous phrasing) that IFHS employs.
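
The scaling in point 2 can be checked in a few lines. This is only a sketch of the shortcut described there (a simple proportional adjustment), which is an assumption of the post and not OMG's published correction formula; the function name `scale_estimate` is made up for illustration.

```python
# Figures quoted in the post. The proportional shortcut is the post's own
# simplifying assumption, not OMG's actual correction method.
IBC_APRIL_2008 = 86_539   # IBC midpoint that OMG adjusted
OMG_ADJUSTED = 184_000    # OMG's adjusted estimate for the same period
IBC_JULY_2006 = 47_668    # IBC midpoint to July 2006


def scale_estimate(count, adjusted, baseline):
    """Scale `count` by the ratio adjusted / baseline (hypothetical helper)."""
    return count * adjusted / baseline


ratio = OMG_ADJUSTED / IBC_APRIL_2008
scaled_2006 = scale_estimate(IBC_JULY_2006, OMG_ADJUSTED, IBC_APRIL_2008)

print(f"implied ratio: {ratio:.2f}")
print(f"scaled estimate to July 2006: {scaled_2006:,.0f}")
```

With the unrounded ratio the result comes out a little over 101,000, which is in line with the approximately 100,000 figure discussed here.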

Call me crazy, but I would say that the emerging scientific consensus is that approximately 100,000 excess war-related violent deaths had occurred in Iraq through June 2006. This is 1/3 lower than the 150,000 (0 -- 500,000) estimate that I was comfortable with earlier this year. Given all the new research since then, I update my estimate to 100,000 with a 95% confidence interval of (0 -- 300,000).

[Those who think that a lower bound of zero is too low should remember that the definition of "excess" implies a comparison to what would have happened in a counterfactual world without a US invasion/occupation. Although I do not think that it is likely that Saddam would have engaged in substantial internal (against Kurds/Shia) or external (against Iran/Kuwait) aggression, it is not impossible that he would have. Those (possible) violent deaths were prevented by the war. If the comparison is against mortality in Iraq in 2002, then the lower bound should be raised substantially.]

So, if IFHS and IBC+OMG are consistent with each other and with the opinions of informed observers like Pedersen and Guha-Sapir, why does L2 estimate violent mortality approximately 6 times higher? I think that the raw data underlying L2 is not reliable.


Anonymous joshd said...

Hi Dave,

Interesting points. Though there are a couple problems. First a correction for you: "Les Roberts has been running around for years claiming that passive surveillance (as Iraq Body Count uses)..."

That is not what Iraq Body Count uses. IBC uses primarily conflict reporting. That it (and therefore IBC) is "passive surveillance" is another falsehood that Roberts invented for his purposes. The term has been used in some medical literature prior, but in a very different context. Roberts has begun applying it very differently to things like IBC for rhetorical purposes (the word "passive" helps load the discussion with negative connotations toward the "passive" sources being argued against). Reporters are out there in the line of fire seeking out events and witnesses and getting killed in the course of their, um, "passive reporting". Does that make any sense at all? It only makes sense as a rhetorical term of art.

That Obermeyer et al carelessly adopt this "passive" fabrication from Roberts is one of several indications that they really treat all of this "passive" data very carelessly and with little serious thought or understanding. Most of the problems I see revolve around carelessness or ignorance about the issues involved in all the different sources they lump into this rhetorical category of "passive" material. I doubt the authors know much about this kind of material, and the peer-reviewers probably less. These kinds of sources are outside their sample survey universe. They don't know much about them, their differences, or the issues involved, so they make many mistakes and the peer-reviewers (drawn from the same universe) don't have the ability to notice or correct the mistakes.

For example, you note that the "key finding" is that "media estimates capture on average a third of the number of deaths estimates from a population based surveys."

But this does not even hold up in the set of cases they examine. 5 of the 13 cases go in the opposite direction, and very few cases come down to "a third" (incidentally, only one, Georgia, comes out with anything close to an L2-type disparity).

That "key finding" may make for a catchy headline, but it's such an extremely crude generalization that it just doesn't have much value. The way they reach it is also kind of suspect. If you do a straight average of the (positive and negative) disparities over the 13 conflicts it comes out to more like a factor of 2, not 3. They inflate it up to 3 by using a particular averaging scheme that weights some conflicts more than others. (And indeed the formula they build at the end uses a 2 factor, not 3.)

Another problem is that the data the Obermeyer thing is using is from PRIO: http://www.prio.no/CSCW/Datasets/Armed-Conflict/Battle-Deaths/The-Battle-Deaths-Dataset-version-20/

For some really strange reason (and despite the long list of footnotes) they never directly cite the "passive" data they are using as the basis of their comparisons, even as this is the most important citation of all. But it has to be this PRIO data. One key point about this data is that the PRIO approach is to evaluate multiple sources and arrive at a best estimate of "battle deaths" in each conflict. "Battle deaths" is a subset of violent deaths, so the Obermeyer piece is comparing their estimate of total violent deaths to the PRIO estimate of a subset of violent deaths. PRIO gives some explanation of this in here:

For example, the PRIO data considers IBC as one of several sources for the current Iraq war (it also considers the "passive reporting" of L1 as one of its sources). But it uses a figure of "5,514" for IBC for 2005. IBC's figure for 2005 is over 14,000. It attributes similarly lowered figures to IBC for 2003 and 2004. It doesn't say how it arrived at these lower figures from IBC, but apparently PRIO decided that 5,514 was the number of "battle deaths" within IBC's 2005 figures (see pages 196-198 in pdf), and only considers that number from IBC.

The final PRIO "best estimates" for Iraq, based on consideration of many sources, wind up lower than IBC in each year. But there are big differences between the two sources. PRIO is supposed to be just for "battle deaths", but on the other hand, PRIO includes coalition forces deaths and Iraqi combatants in their estimates. This undermines the premise of the Obermeyer paper that the PRIO relationships they find can be used as a proxy for every other "media" or "passive" source. Every such source can have substantial differences in approach which would mean the PRIO ratios would not apply.

Perhaps the biggest problem is another underlying premise: that wherever a disparity exists between their estimates and those in PRIO, this is the failing of PRIO. In reality, the disparity indicates only that there is error either in PRIO, or in their own estimates, or in both. Two particular cases which, to me, directly contradict the premise they chose are Bosnia and Guatemala.

As noted in the Science article on this:
'Nils Petter Gleditsch, the editor of the Journal of Peace Research in Oslo, Norway, a co-author on an earlier analysis, says the team's analysis of Bosnian fatalities "is most concerning" because that conflict has been so thoroughly studied. The discrepancy with Bosnian studies "undermines the credibility of their entire study," he says.'

This is exactly right. That they came to a much higher figure for Bosnia than the RDC should have raised doubt about their own approach and conclusions, not the other way around. It should indicate, first, that the central estimates they arrive at have great uncertainty, and that they have the potential to exaggerate.

In the opposite direction is their estimate for Guatemala, which comes out extremely low compared to a lot of others, Patrick Ball in particular. Recall his Guatemala claims of extremely high "unreported" deaths were used in L2. Well, the Obermeyer study suggests Ball's Guatemala figures are inflated about ten-fold. I see some reason to believe Ball's figures are inflated in what I've read from his Guatemala stuff, but the 10-fold inflation proposed in this new thing is pretty extreme. Can we also just assume here that they now have the true figure and Ball's was a wild inflation, as they do?

Your points on their Iraq estimate are good. While they come to - at least - a plausible estimate, they do present it in a rather confused and misleading way. If you follow the method they present in their appendices, and the footnote, it is clear they are adjusting April 2008 IBC figures up. So their estimate is covering a much longer time-frame than IFHS or L2. Yet they present this as "falling between" IFHS and L2, when in fact, as you correctly note, it actually falls below IFHS when properly compared (or at least, it falls below the upwardly "fudged" IFHS). While it certainly looks to confirm the ILCS/IFHS kind of range, which may seem desirable, I wouldn't place too much confidence in it for the reasons above. It's based too much on extremely crude generalization from other conflicts and other sources lumped into an arbitrary category with IBC.

"I think that the raw data underlying L2 is not reliable."

Gee, ya think. ;-) But I think that was pretty clear before this.

7:37 AM  
Anonymous joshd said...

Sorry David, I got one thing confused above. The way they arrive at the 3 vs. 2 factors I got backwards. They use two different approaches to get these averages, one for the media headlines, and another for their substantive calculations.

1) For the headlines of "media estimates capture on average a third" they add up the 13 ratios and divide by 13. In this scheme every conflict is given an equal weighting no matter how few or how many deaths each contributes to the totals. Averaging them all this way produces a factor close to 3.

2) However, when doing substantive calculations, they use a different averaging scheme which produces a different result.

If you look at their Table 3, the total WHS deaths are 5,393,000 and the total PRIO deaths are 2,784,000. This gives a ratio around 1.9. It is this ratio that they basically wind up using for their formulas and in Figure 5 (though they do some other adjusting on top that changes the trend-line slightly).

In Table 3 you can see that the totals are there ready to give the 1.9 ratio: 5,393,000 vs. 2,784,000. Their final row gives these totals and if you calculate like all the above rows the number in the ratio column should be 1.9.

Instead you get... (*Asterisk) 3.0

But then 3.0 is never used for anything, while the 1.9 is used.

There could be arguments for or against either approach, as both are basically arbitrary averaging schemes. One way (3.0) weights each conflict as equal regardless of whether the conflict has an estimate of 35,000 deaths (Georgia) or 3.8 million (Vietnam). The other way (1.9) weights bigger conflicts more heavily than smaller ones.

So toss a coin or take your pick. And they do. They pick the 3.0 for the media headlines and they pick 1.9 for their actual calculations and formulas.

A problem with method 1 would be that - particularly with small numbers of conflicts, as here - an outlier will skew the results a lot. In this case Georgia, even though very small, is a big outlier in its high ratio of 12. This high Georgia ratio of 12 gets equal weight to Vietnam's 1.8 and this inflates the final ratio up to 3. If you toss out Georgia, you're basically back to 2.

A problem with method 2 is that one or two really big conflicts can swamp all the smaller ones, and the final ratio will mostly just reflect the ratio of the big ones. So the final 1.9 here winds up closely mirroring their estimates for Vietnam.
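
To make the difference between the two schemes concrete, here is a rough sketch using only numbers quoted in this thread. It is not a reproduction of Table 3: only Georgia and Vietnam are included, their media-side counts are back-computed from the quoted survey estimates and ratios (an assumption), and the function names are made up for illustration.

```python
def mean_of_ratios(pairs):
    """Method 1: average the per-conflict ratios, every conflict weighted equally."""
    return sum(survey / media for survey, media in pairs) / len(pairs)


def ratio_of_totals(pairs):
    """Method 2: total survey deaths divided by total media deaths."""
    return sum(s for s, _ in pairs) / sum(m for _, m in pairs)


# Table 3 totals as quoted above: WHS total vs. PRIO total.
table3_ratio = 5_393_000 / 2_784_000

# (survey estimate, media estimate). The media figures are NOT from the paper;
# they are back-computed here from the quoted ratios of 12 and 1.8.
georgia = (35_000, 35_000 / 12)
vietnam = (3_800_000, 3_800_000 / 1.8)
pairs = [georgia, vietnam]

equal_weight = mean_of_ratios(pairs)    # the small outlier counts as much as Vietnam
death_weight = ratio_of_totals(pairs)   # Vietnam swamps Georgia

# Effect of dropping Georgia's ratio of 12 from a 13-conflict mean of 3.0.
mean_without_georgia = (3.0 * 13 - 12) / 12

print(f"Table 3 totals ratio: {table3_ratio:.1f}")
print(f"two-conflict mean of ratios: {equal_weight:.1f}")
print(f"two-conflict ratio of totals: {death_weight:.1f}")
print(f"13-conflict mean without Georgia: {mean_without_georgia:.2f}")
```

Even with just these two conflicts the equal-weight average lands at 6.9 while the ratio of totals stays near Vietnam's 1.8, and taking Georgia out of a 13-conflict mean of 3.0 leaves 2.25, which is the "basically back to 2" point.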

So which is better? The authors apparently decided that method 1 was better for what they wanted media headlines to say, while method 2 was better for any serious calculations they do, and wrote up the paper accordingly.

9:43 AM  
