IFHS As An Overestimate?
I have spent so much time and energy fighting the claim, made by Lancet defenders, that IFHS is an underestimate of violent mortality in Iraq, that I had never really considered the other side. Might IFHS be an overestimate? (Thanks to a reader for pointing this out.)
Consider one of the adjustments made in IFHS:
To estimate the most probable rate of violent deaths after the invasion and the range of uncertainty, we performed Monte Carlo simulations that took into account the survey sampling errors that were estimated with the use of the jackknife procedure and uncertainty regarding the missing cluster-adjustment factors, the level of underreporting, and the projected population numbers. We assumed that the level of underreporting was 35% (95% uncertainty range, 20 to 50), and its uncertainty was normally distributed.
I had initially misread this as referring to the issue of clusters that the interviewers were unable to visit because they were too dangerous. Looking more closely, I now see that this "underreporting" has nothing to do with missing clusters. Instead, I think, the concern is that some households might have, for whatever reason, failed to tell interviewers about violent deaths. I also think that "underreporting" covers the concern that entire households may have been killed, or that families with higher-than-average mortality would have been more likely to leave the country. In either case, no household would be left to interview, thereby leading to underreporting.
But 35% is a big number! Where does it come from? I can't find any discussion of it in the paper or its references. Why not use 5% or 200%? Note that the Lancet papers make no adjustment for underreporting, although they do discuss the issue. The Lancet authors use concerns about underreporting to argue, reasonably enough, that their estimates are "conservative."
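For concreteness, here is a minimal sketch of the kind of Monte Carlo adjustment the quoted passage describes. This is my reconstruction, not the IFHS code: I crudely treat the Table 3 confidence interval (0.81 to 1.50, discussed below) as symmetric, and I assume that an underreporting level u means only a fraction (1 - u) of deaths shows up in the survey. The actual simulation also folded in the missing-cluster factors and projected population numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims = 100_000

# Reported violent death rate per 1,000 person-years (IFHS Table 3),
# crudely treating the quoted 95% CI of 0.81 to 1.50 as symmetric.
rate_mean = 1.09
rate_sd = (1.50 - 0.81) / (2 * 1.96)

# Assumed underreporting: 35%, with a normally distributed 95%
# uncertainty range of 20% to 50%, as the quoted passage states.
under_mean = 0.35
under_sd = (0.50 - 0.20) / (2 * 1.96)

rate_draws = rng.normal(rate_mean, rate_sd, n_sims)
under_draws = rng.normal(under_mean, under_sd, n_sims)

# If a fraction u of deaths goes unreported, only (1 - u) of them show
# up in the survey, so the corrected rate divides the reported rate
# by (1 - u).
adjusted = rate_draws / (1 - under_draws)

print(f"median adjusted rate: {np.median(adjusted):.2f}")  # close to 1.67
print(f"95% range: {np.percentile(adjusted, 2.5):.2f} "
      f"to {np.percentile(adjusted, 97.5):.2f}")
```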
But that means that we should not be comparing the 151,000 violent deaths estimate from IFHS with the 601,000 violent deaths from L2! The first number adjusts for underreporting while the second does not. This is an apples-versus-oranges comparison.
We can get a sense of the magnitude of this issue by comparing Tables 3 and 4 in IFHS. In Table 3, the overall violent mortality rate is 1.09 (0.81 -- 1.50) per 1,000 per year, without any adjustment for underreporting but with adjustments for the missing clusters. In Table 4, on the other hand, we have these estimates for the three years after the invasion: 1.77, 1.56 and 1.67. The key point, as the legend indicates, is that these numbers are adjusted for "underreporting," unlike those in Table 3.
In fact, the main results section of the paper makes clear that this is a big issue.
Interviewers visited 89.4% of 1086 household clusters during the study period; the household response rate was 96.2%. From January 2002 through June 2006, there were 1325 reported deaths. After adjustment for missing clusters, the overall rate of death per 1000 person-years was 5.31 (95% confidence interval [CI], 4.89 to 5.77); the estimated rate of violence-related death was 1.09 (95% CI, 0.81 to 1.50). When underreporting was taken into account, the rate of violence-related death was estimated to be 1.67 (95% uncertainty range, 1.24 to 2.30). This rate translates into an estimated number of violent deaths of 151,000 (95% uncertainty range, 104,000 to 223,000) from March 2003 through June 2006.
The adjustment for underreporting raises the violent death rate by about 50%, from 1.09 to 1.67. Both those numbers include the adjustment for missing clusters. That 50% increase does not obviously match the 35% figure quoted above (but see the first update below), and in any case there are a lot of messy details to consider, so we are safe in assuming that the 1.09-to-1.67 increase is accurate.
So, back of the envelope, the appropriate number to compare to the 601,000 excess violent deaths from L2 is not the 151,000 cited by IFHS. Instead, we should use 100,000 or so, the excess death estimate implied by a violent death rate of 1.09. [1.09 divided by 1.67, times 151,000, equals roughly 98,600, but I am rounding up to be "conservative."] In other words, once we strip out the underreporting adjustment, we see that the L2 estimate is six times larger than the one provided by IFHS, not merely four times larger.
(Of course, one could also adjust the L2 number up to account for underreporting. Assuming an adjustment factor consistent with IFHS yields an excess violent death estimate of around 900,000.)
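Laying the arithmetic out explicitly (a quick sketch in Python, using the rates discussed above):

```python
ifhs_adjusted = 151_000   # IFHS estimate, with the underreporting adjustment
l2_estimate = 601_000     # L2 estimate, no underreporting adjustment

rate_adjusted = 1.67      # IFHS rate after the underreporting adjustment
rate_unadjusted = 1.09    # IFHS rate before the underreporting adjustment

# Strip the adjustment out of IFHS...
ifhs_unadjusted = ifhs_adjusted * rate_unadjusted / rate_adjusted
print(round(ifhs_unadjusted))     # about 98,600; call it 100,000

# ...or apply the same factor to L2.
l2_adjusted = l2_estimate * rate_adjusted / rate_unadjusted
print(round(l2_adjusted))         # about 921,000; call it 900,000

# Either way, the ratio is about six.
print(round(l2_estimate / ifhs_unadjusted, 1))   # 6.1
print(round(l2_adjusted / ifhs_adjusted, 1))     # 6.1
```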
The main point is that we want to compare L2 with IFHS on an equal footing, either with the adjustment or without it. The current convention --- of which I am as guilty as anyone --- of comparing 601,000 (L2) with 151,000 (IFHS) and then concluding that the L2 estimate is 4 times higher is fatally flawed. To compare L2 and IFHS sensibly, we must either adjust L2 upward (so that it, too, accounts for underreporting) or adjust IFHS downward (so that it does not). Whichever you prefer, the conclusion is the same: the L2 estimate of violent deaths is six times larger than the IFHS estimate.
The next step in the analysis is to do the same for the issue of missing clusters. Both IFHS and L2 have missing clusters, clusters that were too dangerous for the interviewers to enter. In such cases, it is reasonable to believe that mortality might be higher in those clusters than in ones that the interviewers were able to visit. (But see here for a contrary view.) IFHS adjusts for this bias, but L2 does not. So, to compare the two surveys on an equal footing, we need to either remove this adjustment from IFHS or add it to L2. Doing so will make L2 an even greater multiple of IFHS than it already is.
UPDATE: Note that one can think about this change as either an increase of 53% (1.09 to 1.67) or a decrease of 35% (1.67 to 1.09). I have also been told that Debarati Guha-Sapir refers to this as an "arbitrary fudge," but I am still tracking down that reference.
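A quick check confirms that these are the same adjustment viewed from opposite directions:

```python
print(1.67 * (1 - 0.35))   # 1.0855, i.e. back to roughly 1.09
print(1 / (1 - 0.35) - 1)  # 0.538..., i.e. about a 53% increase
```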
UPDATE II: Here is the quote in context:
Now comes the WHO survey. Conducted with the help of the Iraqi government, it is by far the most comprehensive mortality assessment to date. Interviewers visited 9345 homes in more than 1000 clusters. But its estimate of 151,000 violent deaths has come in for some criticism, too. Unlike other Iraq casualty surveys, this one includes an upward adjustment of 35% to account for “underreporting” of deaths due to migration, memory lapse, and dishonesty. “That is really an arbitrary fudge factor,” says Debarati Guha-Sapir, an epidemiologist at the WHO Collaborating Centre for Research on the Epidemiology of Disasters in Brussels, Belgium. But the number falls squarely within the range produced by a meta-analysis of all available mortality studies by Guha-Sapir and fellow centre epidemiologist Olivier Degomme. The Johns Hopkins figure is an outlier, she says.
UPDATE III: L2 also had a problem with clusters that were too violent to visit. From the paper:
The survey was done between May 20 and July 10, 2006. Only 47 of the sought 50 clusters were included in this analysis. On two occasions, miscommunication resulted in clusters not being visited in Muthanna and Dahuk, and instead being included in other Governorates. In Wassit, insecurity caused the team to choose the next nearest population area, in accordance with the study protocol. Later it was discovered that this second site was actually across the boundary in Baghdad Governorate.
So, we know that there was at least one cluster that was too violent to visit, and we know that the team had a "protocol" to deal with this problem. (It is unclear how many such clusters there were.) This means that a fair comparison between L2 and IFHS would need to either adjust both for this problem or adjust neither. The problem --- and, again, I am as guilty as anyone --- is that the standard 151,000 (IFHS) versus 601,000 (L2) comparison adjusts the former but not the latter.
Fortunately, Table 4 in the Supplementary material (pdf) shows that the post-invasion violent mortality rate unadjusted for missing clusters is 0.80 (0.63 -- 1.03). So, we need to cut the 151,000 IFHS estimate for excess deaths (which is based on a mortality rate of 1.67) by more than half in order to provide a fair comparison to the 601,000 figure from L2. Scaling appropriately, the IFHS estimate would be about 72,000.
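Spelling out the scaling (same back-of-envelope approach as above; I am reading the 0.80 figure as the rate with neither adjustment applied, as the summary below assumes):

```python
rate_full = 1.67    # IFHS rate with both adjustments applied
rate_raw = 0.80     # rate without the missing-cluster adjustment
                    # (Supplementary Table 4)

ifhs_raw = 151_000 * rate_raw / rate_full
print(round(ifhs_raw))               # about 72,300; call it 72,000
print(round(601_000 / ifhs_raw, 1))  # 8.3
```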
Summary: Using the same assumptions for L2 and IFHS (no adjustments for underreporting or for clusters that could not be visited) generates estimates that differ by more than a factor of 8: 601,000 to 72,000.