Friday, March 16, 2007

Burnham Presentation: Notes and Comments

I attended Gilbert Burnham's recent presentation at MIT. Tim Lambert provides a link to the video. I have not reviewed the video, but here are some notes that I made during the talk. (These might be incorrect; double-check the video.)

  1. The work of the authors was split up about as you might expect. Burnham organized; Lafta organized the survey; Doocy did the statistics; Roberts wrote the article. None of the authors besides Lafta were ever in Iraq.

  2. He mentioned something about the confidence intervals of the first survey going to infinity if the Fallujah data were included. I am not sure if I understood this point correctly. Recall that everyone assumes that the results of Lancet I would have been even stronger if the Fallujah data were included. Thus, excluding Fallujah was "conservative." But perhaps this only applies to the mean of the distribution. It could be (?) that including an outlier widens the confidence intervals so much that one can no longer reject the null hypothesis of no increase in excess mortality. I think that I or someone else made this point during the discussion 2 years ago, but I am too lazy to look for a citation. If this is correct, then it strikes me as something that the readers of Lancet I deserved to know. I should come back to this point later. My best guess now is that I just misheard Burnham and he was just talking about the mean estimate going very high, i.e., to infinity.

  3. Burnham mentioned that he wanted to oversample Fallujah if it were chosen again. And, mirabile dictu, it was chosen. But isn't that a little suspicious? Again, my basic hypothesis is not so much that the surveyors were malicious as that they gave the Americans what they thought the Americans wanted. So, they knew that Burnham wanted to check out Fallujah again so they ensured (?) that Fallujah was "randomly" selected for Lancet II. (Does this mean that all the sampling was done by the Iraqis?) Also, if Fallujah were purposely oversampled, then one would need to adjust the estimates for this fact. Not easy to do! Did Doocy do this? How? Given that Burnham chose Fallujah to be oversampled because he knew it was much more violent than other parts of Iraq, one would need to do some adjustment. I guess that Doocy could use the extra samples just to estimate Anbar more accurately and then combine the Anbar estimate with the rest of the country. But I don't recall any discussion of this in the paper.

  4. The authors wanted to get the results of the survey published well before the election in order to avoid the controversy which plagued Lancet I, but they weren't able to do so because of fundraising and dealing with ethical reviews and the like. This makes little sense to me. Raising money and dealing with IRBs is tough and time consuming, but the survey work was complete by July 10, 2006. So, by that date, all (?) the issues of money and ethics were done. Lancet I went from survey completion to publication in one month. At the same rate, Lancet II could have been published in August. Now, one month is a very quick time from survey to publication, so no one would expect such a result. Indeed, getting to press within 3 months is still unusual. But I still suspect that Lancet editor Richard Horton likes October, even-year publication dates for these articles.

  5. All the interviewers were from (the same, I think) community medicine center in Baghdad.

  6. Tim Lambert comments: "The IBC made vociferous attacks on the studies because they want to defend their methods, and Les Roberts suggests that IBC are trying to stop the donations from drying up." I thought that this was a low blow, and an unusual comment in an academic seminar. But, it could also be true. Burnham made clear that this was Roberts's opinion, not his.

  7. There was useful background on the conduct of the survey. Once the team had picked the street (house?) to start at, it would tell all the children in the neighborhood (who had gathered to see the strangers in their white coats) what the survey was about. The children would spread the word around the neighborhood, alleviating suspicion and making everyone comfortable.

  8. Tim Lambert comments: "They will soon release the data (with identifying material removed) to other researchers." I hope so! There still seemed (to me) to be some hedging on this, that only "qualified" researchers would be allowed to see the data, that they (or their institutions) would be expected to testify somehow to data security. With luck, this won't be a problem, but I have my doubts.

  9. Burnham claimed that the reason that the interviewers only asked for death certificates in 87% of the reported deaths was that they "forgot." This strikes me as an implausible (but testable) claim. Some blogospherians had speculated that the 13% where this wasn't done were purposeful choices made to avoid danger or trauma. Could you ask for a death certificate with a grief-stricken mother wailing on the front step? The "forgot" claim is testable because we should see a pattern in the cases where no certificate was asked for. They should be concentrated in one or two teams and should occur early in the surveys rather than later. Checking this is an example of the analysis we will be able to do once the data is available. Related to this is the issue of the survey form itself. Has this been made publicly available? It should be! Also, any competently designed form would feature a check box for this (and every other question). If it is on the form, then how could one forget 13% of the time?

  10. One of the biggest surprises (to me) was Burnham's admission that the interviewers operated independently at times. I think that the drill was that a team of 4 would go to a cluster, pick a main street, then a side street, then a house. Two sub-teams of two would start knocking on doors. So each sub-team would have 20 houses to finish this cluster. But, Burnham said, these two-person sub-teams would sometimes (how often?) operate independently. In other words, as one person "finished up" an interview in one house, her partner would leave and start on the next house himself. This procedure helps to alleviate concerns that there wasn't "enough time" to do the surveys, but it does mean that if only one of the eight interviewers were malicious, she would have the opportunity to make her results whatever she wanted. This is another reason for looking closely at the raw data. Mortality rates should be similar for all four members of the one team. (The other team, operating in different clusters, might have different results.)

  11. He mentioned the difficulty of getting good population estimates for Fallujah. He mentioned 500,000 as being a number for before the war but 200,000 after all the US military activity.

  12. The slides make clear that the main-street bias (MSB) issue is less serious than one might suppose (and that the methodology write-up in the paper is misleading, though unintentionally, I think). But the whole issue is quite complex and I hope to return to it later.
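
A rough illustration of the worry in point 2, using made-up cluster values (these are not the Lancet numbers) and a plain normal-approximation interval, which is my simplification and not necessarily what the authors did. It only shows that one extreme cluster can raise the mean while pushing the lower bound of the interval below zero.

```python
import math
import statistics


def mean_ci(values, z=1.96):
    """Return (mean, lower, upper) for a normal-approximation 95% interval."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return m, m - z * se, m + z * se


# Hypothetical per-cluster excess death rates; NOT survey data.
ordinary_clusters = [1.0, 3.0, 2.0, 4.0, 0.0, 5.0, 2.0, 3.0, 1.0, 4.0]
# Hypothetical stand-in for one extremely violent cluster.
outlier_cluster = 60.0

without_outlier = mean_ci(ordinary_clusters)
with_outlier = mean_ci(ordinary_clusters + [outlier_cluster])

for label, (m, lo, hi) in [("without", without_outlier), ("with", with_outlier)]:
    print(f"{label:>7} outlier: mean={m:.2f}, interval=({lo:.2f}, {hi:.2f})")
```

With these invented values the mean roughly triples when the outlier is added, yet the interval now includes zero. Whether anything like this happens with the real Lancet I data depends on the actual cluster counts and the authors' actual method.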

Again, these are from my messy notes. Check the video for what Burnham actually said.
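
On the oversampling question in point 3, here is a minimal sketch of the kind of adjustment I have in mind: estimate each stratum separately, then weight by population share rather than pooling all households. Every number and name below is hypothetical, and I do not know whether the authors did anything like this.

```python
def pooled_rate(strata):
    """Naive rate: lump all sampled deaths and person-years together."""
    deaths = sum(s["deaths"] for s in strata)
    person_years = sum(s["person_years"] for s in strata)
    return deaths / person_years


def weighted_rate(strata):
    """Stratified rate: weight each stratum's rate by its population share."""
    total_pop = sum(s["population"] for s in strata)
    return sum(
        (s["population"] / total_pop) * (s["deaths"] / s["person_years"])
        for s in strata
    )


# Hypothetical strata; NOT figures from either Lancet paper.
strata = [
    # A violent area sampled far more heavily than its population share.
    {"name": "oversampled area", "population": 1_000_000,
     "person_years": 4_000, "deaths": 80},
    {"name": "rest of country", "population": 25_000_000,
     "person_years": 16_000, "deaths": 160},
]

naive = pooled_rate(strata)
adjusted = weighted_rate(strata)
print(f"pooled: {naive:.4f}  population-weighted: {adjusted:.4f}")
```

In this toy setup the pooled rate overstates the national rate because the violent area makes up a fifth of the sample but under 4% of the population. The weighted version is the "combine the Anbar estimate with the rest of the country" idea.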


Blogger jbd4020 said...

"The slides make clear that the main-street bias (MSB) issue is less serious than one might suppose"

You may want to look at what they're giving you on their slides much more closely than this, David.

From what I can tell, Burnham was giving CDRs for both studies, and they look sorta similar, but that's misleading.

It conflates violent/non-violent deaths while MSB says violent deaths will be inflated by their procedures.

The CDR stuff evades the whole point, and an anomaly with non-violent deaths in L2 (they went down during the L1 period) makes for a counter-intuitive calculation that makes the CDRs look similar.

From the numbers in the report it looks like about 140,000 violent deaths were recorded in L2 for the L1 period. L1 had about 58,000 violent deaths.

At the same time, L1 had non-violent deaths going up by about 40,000.

L2 had them going *down* by about the same amount for the L1 period.

Conflating these creates similar CDRs, and an illusion that the two studies recorded essentially the same phenomena. They did not.

What the two actually recorded is something like this:

L1 = 60,000v + 40,000nv = 100,000
L2 = 140,000v - 30,000nv = 110,000

That's what they seem to be showing you on the graphs, but they're not telling you the slippery process of this calculation, which creates an illusory similarity, and completely evades the MSB argument. They cherry-pick this manner of showing you a comparison of two studies because it misleadingly makes the findings look very similar, while they are anything but.
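
To spell the arithmetic out, here is a small sketch using only the rounded figures above (my approximations from the report, not exact values). The variable names are just for illustration.

```python
# Rounded excess-death figures for the L1 period, as approximated above.
l1 = {"violent": 60_000, "non_violent": 40_000}
l2 = {"violent": 140_000, "non_violent": -30_000}  # non-violent went DOWN in L2


def total_excess(study):
    """Combined excess deaths, which is what a single CDR comparison reflects."""
    return study["violent"] + study["non_violent"]


total_ratio = total_excess(l2) / total_excess(l1)
violent_ratio = l2["violent"] / l1["violent"]

print(f"combined excess: L1={total_excess(l1):,}  L2={total_excess(l2):,}")
print(f"L2/L1 ratio, combined: {total_ratio:.2f}")
print(f"L2/L1 ratio, violent only: {violent_ratio:.2f}")
```

The combined totals differ by only about 10%, while the violent-death figures differ by more than a factor of two. That gap is what the CDR comparison hides.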

5:02 AM  
