More Quotes From Burnham
Previously, I commented on aspects of Gilbert Burnham's talk at MIT. Now, I want to record some key quotes.
Around 18:00, Burnham discusses the violent deaths that occurred before the war started. Note that this discussion concerns Lancet I.
Now, one of the big differences was the number of deaths due to violence. Now, our definition of violence wasn't car crashes or roof-falling-in-on-the-house, but these were intentional violence. So, they would be things like gun shots, missiles and so forth. And there were a few deaths that were before the invasion that were violent, and many of these were probably --- not "many," there were only a couple --- but these were associated probably with incidents around the No Fly zones, the process of radar units locking on to air planes and then firing back on the artillery and missile sites and so forth.
I previously covered this topic in the context of Lancet II. The topic of Falluja and confidence intervals starts around 18:30.
Around 20:30 we have this.
I might add that, in this second survey, I intentionally asked the team to oversample in Falluja, to take three samples, if they chose Falluja in the random selection, which indeed happened. And so we did three samples, three clusters instead of the one which the population would have merited. And then we randomly selected one of the three, which turned out to be the one that had the lowest mortality in the three clusters. But the experience in the first one was really true to what we found in the subsequent one, that there was very, very widespread destruction in Falluja, nearly every household had lost somebody in that survey.
Around 21:30 (starting discussion of Lancet II):
In 2006 (in working with John [Tirman of MIT]) we went back to look at things again from May to July of 2006. And, as John knows, we had great intention to get this out as early as we could before the elections, so we would not be again accused of trying to do something with the elections. But, one thing and another, movement of money, ethical approvals and so forth, we didn't get it done quite as soon as we would have liked.
It is still somewhat of a mystery why there were no clusters in Muthanna and Dahuk. The paper refers to "miscommunication." Burnham says (around 22:30), referring to a slide which shows the distribution of clusters by governorate, "the last two, which probably should have gotten included, because of some mix ups in communication, they did not get sampled."
My guess is that the US authors wanted at least one cluster in each of the 18 governorates. Their technical document (one hopes!) specified the procedure by which this would occur. (Presumably, this involved assigning one cluster to each governorate and then allocating the remaining 32 clusters in proportion to governorate population.) Unfortunately, the US authors did not (?) actually do any of the sampling. Instead, the Iraqi interviewers performed the random allocation but did not follow the stated procedure.
Yet this leaves the mystery of which two clusters were discarded as a result of the mistake. That is, the US authors wanted a cluster in each of Muthanna and Dahuk, yet they did not get them. But how does anyone know which of the 50 clusters that were actually sampled were the "mistakes," the ones used in place of those two? How could anyone know which clusters to discard? I would really like to see the data from the two discarded clusters (as well as the one boundary mistake from Wassit).
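For concreteness, here is a minimal sketch of the allocation rule I am guessing at above: one cluster guaranteed to each governorate, with the remaining clusters split in proportion to population. Everything in it is an assumption on my part. The function name `allocate_clusters`, the largest-remainder rounding, and the made-up populations for governorates `G01` to `G18` are purely illustrative, and the real procedure may well have been a random draw weighted by population rather than a fixed split.

```python
def allocate_clusters(populations, total_clusters=50, minimum=1):
    """Give each governorate `minimum` clusters, then split the rest by
    population share using largest-remainder rounding (an assumed rule)."""
    names = list(populations)
    remaining = total_clusters - minimum * len(names)
    if remaining < 0:
        raise ValueError("not enough clusters to give every governorate the minimum")

    total_pop = sum(populations.values())
    # Fractional share of the remaining clusters for each governorate.
    quotas = {n: remaining * populations[n] / total_pop for n in names}

    # Start with the guaranteed minimum plus the whole part of each quota.
    allocation = {n: minimum + int(quotas[n]) for n in names}

    # Hand out clusters still unassigned to the largest fractional parts.
    leftover = total_clusters - sum(allocation.values())
    by_remainder = sorted(names, key=lambda n: quotas[n] - int(quotas[n]), reverse=True)
    for n in by_remainder[:leftover]:
        allocation[n] += 1
    return allocation


# Hypothetical populations, NOT real Iraqi figures: 18 governorates of increasing size.
populations = {f"G{i:02d}": 100_000 * i for i in range(1, 19)}
allocation = allocate_clusters(populations)
for name, clusters in allocation.items():
    print(name, clusters)
```

Under a rule like this, no governorate can end up with zero clusters, which is why the absence of Muthanna and Dahuk suggests that whatever was done in the field was something else.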
I love this description (around 24:00) of the process.
They [the interviewers] went out house to house in their white coats so that they couldn't be mistaken for being somebody else. They, first off, rounded up the children to explain what this survey was about, sent out the children to the households to explain to the neighbors what was going on and so forth, to try and reduce the risks that were involved.
What possible bias could this cause?!
Around 25:00 we get:
So we had pairs [of interviewers], male and female. Generally speaking, women will speak to women. So, if the head of the household who is home, or the person who is home, was a woman, the male interviewer would often go to the next house and get started there. So, with these two pair of interviewers for each of the clusters of 40 households, we were able to finish a cluster in a day. One of the things that we did not want to do was to come back the next day.
On forgetting to ask (at 26:00).
So this time our intent was to ask every household where a death was reported for a death certificate. Now, if you have done survey work you know you don't always get all the answers you want. And, in this case, in 13% of cases the interviewers forgot to ask for the death certificate.
There is more good stuff. I hope to post more in a few weeks.
UPDATE: Here are more highlights.
Around 26:30 there is a description of the sampling methodology that is not totally consistent with other descriptions. There are also handy slides providing a visual guide to the procedure.
Around 28:30, Burnham notes that 11 of the clusters were in the same geographic area (town or village) as in the first survey but never in the exact same location. It would be interesting to see how that subset matches up between 2004 and 2006.
Around 30:00, Burnham shows an interesting histogram of the rate of violent death (post-invasion?) in each cluster. He speculates that Falluja and Diyala are the two biggest outliers but notes that the distribution looks pretty normal. In other words, violent deaths are not that clumped together. He also shows an interesting plot of death by age. It looks reasonable for women (high for babies, then lower, then high for the elderly). The pattern for males, however, is very different: the highest mortality is for fighting-age men. This is good evidence that there is a war going on, as we know.
[Side comment: But doesn't this dispute the central claim that lots of civilians are getting killed? One storyline is that US forces are dropping lots of bombs and lots of car bombs are going off in market places. In such a world, one would expect lots of female deaths as well as male deaths. Why would men be that much more likely than women to die in their house from an air strike or in the market from a car bomb? But, if most of the male deaths are "combatants" (take a look at the deaths from gunfire to cross check), then this is precisely the pattern we would expect to see. (Of course, many/most of the male victims of death squads and whatnot are not combatants as we would use the term.)]
He shows the mortality by governorate and mentions that some of these estimates are imprecise because there is only one or a small number of clusters in the governorate. I ought to look into this more closely. Is there any statistical significance to the three groupings that they use to classify governorates? I have my doubts. This isn't really on point but it might make for a pretty graphic.
Around 39:00 talks about "deaths among civilians." False! They never let up with that canard. They measure deaths. They have no idea who the dead individuals were (civilians, insurgents, Iraqi army, et cetera).
Some discussion of survivor bias. Notes that there was one neighborhood in Falluja (from 2006 study?) in which all the members of seven families had been killed. Reports that it was common to have one or two such families in other clusters. I do not think that any of this is mentioned in the paper or distributed as part of the data.
Responds to MSB critique at 43:00.
A group of physicists in Oxford complained that our sampling method favored the principal streets where they interpreted that most of the killings may have occurred. In fact, we went out of our way to try to include all the streets in the sampling frame. And, of course, the other thing we found is most killings occurred away from home anyway. And so that probably didn't add much to bias, if it had even existed.
I love the discussion about sampling Falluja at around 54:00. But the important quote is above. Guy paying for survey says, "If you happen to randomly choose Falluja (and I hope you do!), let's do three samples." Guy getting paid to do the survey says, "Guess what! We randomly selected Falluja." Guy with the money says, "That makes me happy!" Burnham also mentions the problems of estimating population in Falluja. The commonly accepted number for before the war is 500,000. After US military action, the number was probably at 200,000.
Around 57:00, Burnham gets called out by a questioner about inappropriate use of the term "civilians" in describing the victims. He fesses up that he should use the term "Iraqis."
Around 1 hour, he tells the story of a former student at USAID who reports that, when the survey came out, some unnamed office at the White House sent over a copy for them to try to discredit. They tried to but couldn't come up with anything (the project ruined their Friday afternoon) and so reported back that the study could not be discredited.
Detailed discussion of the randomization procedure around 1:07:00.
They [the Iraqi interviewers] did it on pieces of paper, they'd write down street numbers [or street names?] of pieces of paper. And then they would randomly pick out one of those from the hat, as it were. The things that we were doing in the 1960s, they did that. We got criticism because at the end of the process they destroyed all the little pieces of paper which, you know, I think that most of us would have done anyway.
...
Once they selected the street, then they numbered the houses, from that street, from 1 to wherever the end of the street was. And then they randomly, using serial numbers on money, they randomly selected a start number, to start from that house, and then from there they went to the nearest front door, nearest front door, until they had a total of 40 houses.
...
If the neighborhood was too dangerous, then you could either come back or you could select another one. Now, where this was a real problem was in Basrah. And they had to go to Basrah three or four times before it was finally safe enough to go. But they decided that they were going to go to the neighborhood that was selected even if it was a bit dangerous at the moment, they'd come back and do that one; they wanted to stick with the one that was selected.
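To make the house-selection step described in the quote concrete, here is a rough sketch of one way to read it: pick a random start house on the selected street, then keep moving to the nearest front door not yet visited until there are 40 households. This is only my interpretation. The name `select_cluster`, the toy grid of house coordinates, the straight-line distance used for "nearest," and the seeded `random.Random` standing in for serial numbers on money are all assumptions, not anything taken from the survey's technical document.

```python
import math
import random


def select_cluster(houses, street_ids, cluster_size=40, rng=None):
    """Pick a random start house on the chosen street, then repeatedly walk to
    the nearest unvisited house (straight-line distance) until the cluster is full.

    houses     -- dict mapping house id -> (x, y) position of its front door
    street_ids -- ids of the houses on the randomly selected street
    """
    rng = rng or random.Random()
    if cluster_size > len(houses):
        raise ValueError("not enough houses to fill a cluster")

    current = rng.choice(street_ids)  # stand-in for the banknote serial number draw
    cluster = [current]
    unvisited = set(houses) - {current}

    while len(cluster) < cluster_size:
        here = houses[current]
        # Nearest front door not yet visited; ties broken by house id.
        current = min(unvisited, key=lambda h: (math.dist(here, houses[h]), h))
        unvisited.remove(current)
        cluster.append(current)
    return cluster


# Toy neighborhood (hypothetical): 10 parallel streets with 12 houses each.
houses = {
    street * 100 + n: (n * 10.0, street * 30.0)
    for street in range(10)
    for n in range(1, 13)
}
street_ids = [300 + n for n in range(1, 13)]  # pretend street 3 came out of the hat

cluster = select_cluster(houses, street_ids, rng=random.Random(0))
print(cluster)
```

Even in this toy version, a street of 12 houses cannot supply 40 households, so the walk has to spill onto neighboring streets. How the real teams handled that spill is exactly the kind of detail the technical document would settle.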
A discussion of the Wassit neighborhood problem follows. Burnham also mentions "rules" about the process for selecting a different neighborhood if the chosen one was too dangerous. They were supposed to look for a nearby neighborhood which was "similar" to the one selected.
It sure would be nice to have a sense of how often this happened and to see a copy of the technical document which guided the survey teams.