Thursday, February 26, 2009

Burnham Sanctioned

Two weeks ago, I claimed that a partial victory for us Lancet skeptics would come with the "Censure of Roberts/Burnham by Johns Hopkins." I declare victory!


Iraq Researcher Sanctioned

Associated Press
Tuesday, February 24, 2009; Page A04

The Johns Hopkins Bloomberg School of Public Health is sanctioning the lead author of a 2006 study that suggested massive civilian deaths in Iraq.

The school announced yesterday that it is barring Gilbert M. Burnham from serving as a principal investigator on projects involving human subjects, saying he violated school policies by collecting the names of those interviewed.

The school completed an internal review of the study, which estimated that nearly 655,000 Iraqis had died because of the U.S.-led invasion and war in Iraq. The review found that inclusion of identifiers did not affect the results of the study.

The school says the paper published in the Lancet medical journal incorrectly stated that identifying data were not collected. A correction will be submitted.


Now, "sanction" is not the same thing as "censure," but it is close enough for blog-work. The original Hopkins news release is here. (Hat tip to Tim Lambert.) Key sections:


The Bloomberg School of Public Health’s IRB acted properly in determining that the original study protocol was exempt from review by the full IRB under federal regulations. The original protocol explicitly stated that no names of study participants or living household members would be collected. The protocol also included an appropriate script to secure verbal consent from study participants, rather than a written consent process that would have included participants’ signatures.


Note that the interests of Hopkins and Gilbert Burnham are not necessarily aligned, although both wish the whole debate would go away. Hopkins wants to ensure that the Federal Government isn't angry with it since Hopkins is so dependent on federal money. Hopkins' key concern is to demonstrate that the university did nothing wrong. If that means throwing Burnham under the bus, so be it.


An examination was conducted of all the original data collection forms, numbering over 1,800 forms, which included review by a translator.

...

A review of the original data collection forms revealed that researchers in the field used data collection forms that were different from the form included in the original protocol. The forms included space for the names of respondents or householders, which were recorded on many of the records.


If the interviewers weren't even using the correct form (and did not visit the correct governorates), then why should we believe that they did anything else correctly?

Left unexplained is whose fault this is. Did Burnham hand Lafta the wrong form? On purpose? By mistake? Or did Burnham give Lafta the correct form and then Lafta just used the wrong one? On purpose or by mistake? Recall (pdf) that the process included two face-to-face meetings.


American and Iraqi team members met twice across the border in Jordan, first to plan the survey and later to analyze the findings.


My guess is that Burnham did everything correctly (he seems like a careful and professional guy to me) but then Lafta just screwed it all up, either out of sloppiness or because field work is hard, even for the most diligent of researchers. And then Burnham got in real trouble because, after the fact, he tried to cover for Lafta. It is always the cover-up, never the crime.

And that is the real issue for Burnham's credibility. If he had just fessed up immediately, then that would be one thing. But, instead, he made a series of misleading claims about what happened. Consider just six examples:

1) Burnham's February 2007 talk at MIT (video here).


There were limitations to record keeping; we had some criticism that we could not produce a record showing which households were visited and who names of people were, and so forth.

We intentionally did not record that, because we felt that if the team were stopped at a checkpoint, of which there are lots of checkpoints, and the records were gone through, some of you may have had this experience, where you stop at a checkpoint, people go through all your papers, read everything, and they find certain neighborhoods. That might have increased risk, which we didn't want to do.


That's not true. The survey forms included names and Burnham knew that at the time he gave his talk to MIT. Note that these comments were part of the main presentation, something that Burnham had probably given on multiple occasions, not an off-the-cuff answer to a question from the audience. If Burnham said this at MIT, he probably said it elsewhere.

2) Burnham wrote to blogger Tim Lambert in February 2008:


As far as the survey forms, we have all the original field survey forms. Immediately following the study we met up with Riyadh (in this very hotel I am now in) and Shannon, Riyadh and I went through the data his team had computer entered, and verified each entry line-by-line against the original paper forms from the field. We rechecked each data item, and went through the whole survey process cluster-by-cluster. We considered each death, and what the circumstances were and how to classify it. Back in Baltimore as we were in the analysis we checked with Riyadh over any questions that came up subsequently. We have the details on the surveys carried at each of the clusters. We do not have the unique identifiers as we made it clear this information was not to be part of the database for ethical reasons to protect the participants and the interviewers.


Again, this statement is incorrect. Burnham did have the full names of at least some (or many? or most?) of the survey participants.

3) Burnham in an interview with the National Journal.


Hopkins's guidelines say that substantial modification of survey procedures without approval by the university's IRB can be deemed a serious violation. "All intentional noncompliance is considered to be serious," the guidelines state.

The survey "was carried out as we designed it," Burnham told National Journal.


Not true. The survey design called, explicitly, for the names of participants to not be collected. But the names were collected. Even if one tries to defend Burnham (see below) by insisting that he did not realize, until recently, that full names were collected, we still have the problem that Burnham has always known that Lafta used the wrong forms, forms with a space for the names of the participants. Using the wrong form is, obviously, not carrying out the survey "as we designed it."

4) A description in Johns Hopkins Magazine.


Concern for the safety of interviewers and respondents alike produced two more decisions. First, they would not record identifiers like the names and addresses of people interviewed. Burnham feared retribution if a hostile militia at a checkpoint found a record of households visited by the Iraqi survey teams.


Now, a Burnham defender might argue that a) Burnham is not quoted, so this misstatement is not his fault and b) there is no misstatement here since "would not" is not the same as "did not." Perhaps. But the article clearly featured extensive cooperation from Burnham (and Doocy). They have some responsibility to make sure that the author describes their work accurately. And, to the extent that a mistake is made, they have an obligation to correct it. If Johns Hopkins Magazine cannot trust professors from Johns Hopkins to describe their research clearly, then what is the world coming to?

The point of this example is not so much that it is damning in and of itself. It isn't. The point is that it is part of a pattern of Burnham giving his listeners a false impression of whether or not names were collected. He makes everyone think that names were not collected even though they were.

5) The official Q&A from the Hopkins page, since removed (along with other material) for obvious reasons.


Approval specified that no unique identifiers would be collected from households visited by researchers, including complete names, addresses, telephone numbers or other information which could potentially put the households at risk. While household demographics were collected in both Iraq studies, personal information such as the date, location and cause of death was collected only for deceased household members. Research regulations do not consider a dead person to be a human subject and informed consent is not required for uniquely identifying information on the characteristics and circumstances of death. Informed consent was obtained from the principal respondent in each household before interviews were conducted.


"Personal information" was, we now know, collected for some living household members. Although Burnham's name does not appear on this Q&A, he was and is the Director of Center for Refugee and Disaster Response. It is hard to believe that he did not sign off on the document before it was posted.

6) "The Human Cost of the War in Iraq" (pdf), the official companion piece to Burnham et al (2006).


The survey was explained to the head of household or spouse, and their consent to participate was obtained. For ethical reasons, no names were written down, and no incentives were provided to participate.


Since names were written down, this statement is also not true. One might defend Burnham's misstatement at the MIT lecture as being an honest mistake, made while speaking off-the-cuff. One might argue that an informal e-mail to a blogger like Tim Lambert does not require perfect accuracy. One might believe that interviewers from the National Journal and Johns Hopkins Magazine were mistaken. One might assume that Burnham never saw/approved that Q&A on the homepage of the Center that he heads.

But here we have a formal report with a statement that Burnham knew was both untrue and important. (If it were a trivial issue, then Hopkins would not have sanctioned him and no correction to the Lancet would be necessary.) How many other untrue statements are there?

Even worse, all the other L2 authors are also authors of this paper. Now, it could be that some of those authors (like Les Roberts) never saw the actual forms and so might be guilty of nothing more than an honest mistake. But Shannon Doocy saw the forms. She knew that this statement was not true.

Is Burnham telling the truth even now? Good question! Consider this article from the Baltimore Sun:


The Johns Hopkins University has disciplined the lead author of a widely publicized study that reported widespread civilian deaths in Iraq as a result of the U.S. invasion.


Discipline, sanction, censure. Whatever. They all work for me!


Because of the difficulty of carrying out research in Iraq during the war, Burnham and his team partnered with Iraqi doctors at a university in Iraq. Burnham, working out of Jordan, said he made it clear to the doctors that they could collect the first names of children and adults, to help keep the information straight, but that last names could not be collected.


Huh? This seems really fishy to me. I assume that collecting first names was not a part of the protocol that Burnham submitted to Hopkins. I have certainly never heard of anything like this. (Counter-examples welcome!) You either write down someone's name or you don't. So, Burnham knew, before the interviewers went out, that they were using a form with a spot for names? And he didn't say, "Wait! That's not the form you are supposed to use." The whole thing makes no sense. Unless Burnham is trying to cover for Lafta, trying his best to make his actions in recording names less damning.

And that is key to defending the results of the study. If Lafta screwed this up, then there is every reason to think that he screwed up other stuff, either on purpose or by mistake. If he wrote down names when Burnham told him not to, then how does anyone know if he followed the assigned sampling plan? We don't.

But if Burnham can take the blame himself, can claim that he told/allowed Lafta to write down partial names, then, perhaps, the rest of Lafta's work can still be trusted. I do not believe that Burnham really told Lafta that he could write down partial names.

You disagree? Fine. Let's ask Shannon Doocy. She was with Burnham in Jordan. She was a co-author of the study. She knows Lafta. Presumably, she was in the room when Burnham/Lafta discussed the planning for the study. I bet that she can't/won't back up Burnham on this point. She's not 66 and tenured . . .


When the surveys came back to him in Jordan, it appeared that some had last names. Many were in Arabic. Burnham said he asked his Iraqi partners and was told that the names were not complete, which he accepted. But Hopkins, in its investigation, found that the data form used in the surveys was different from what was originally proposed, and included space for names of respondents. Hopkins found that full names were collected.


Oh, what a tangled web we weave . . .

If I were Baltimore Sun reporter Stephen Kiehl, I would ask some follow up questions:

1) How many of the 1,800+ forms had names? How many of those names were full names (as opposed to just first names)? Were those names all in Arabic, all in English or a mixture? How many of the names were in English?

"Many" might be 20 or 100 or 500. But, given that the forms featured lots of English words (cause of death and so on), I would bet that many (hundreds?) names were in English. So, Burnham looked at a name like "Nouri al-Maliki" or "Iyad Allawi" and said, "Sure. I accept that these names are not complete." Unlikely! Again, it would be nice to ask Shannon Doocy some questions as well.

2) What is Burnham's take on the fact that his "Iraqi partners" misled him? (Was this just Lafta or Lafta and some other interviewers? My understanding had been that it was just Lafta who brought the forms to Burnham/Doocy in Jordan. And note that Lafta is not just a "partner," he is an official co-author on the study.) Here we have documented proof that one author of the study misled another author of the study. How much faith does Burnham think we should have in a study in which the authors are lying to each other?

3) Why did Burnham mislead so many news organizations? Consider his (and Roberts') letter (also here) to the National Journal:


In the ethical review process conducted with the Bloomberg School of Public Health's Institutional Review Board, we indicated that we would not record unique identifiers, such as full names, street addresses, or any data (including details from death certificates) that might identify the subjects and put them at risk.


Oh, snap! Did you catch that? I thought that this was just going to be one of many examples where Burnham lied, after the fact, about not collecting names. But note that he is not lying! Sneaky! He and Roberts do not claim (falsely) to not have collected names. They claim (truthfully) to have told Hopkins that they "would not" record names. Hah! I did not catch that the first time around because I assumed a basic level of honesty from Burnham. My mistake! (Further discussion here.) Comments:

1) This is circumstantial evidence that, at least by this point, Roberts was in on the lie.

2) I suspect that "full names" rather than just "names" is used to preserve wiggle room on various dimensions.

Tuesday, February 24, 2009

Likely Correction?

The Johns Hopkins statement includes:


The paper in The Lancet incorrectly stated that identifying data were not collected. An erratum will be submitted to The Lancet to correct the text of the 2006 paper on this point.


I am confused about how this is going to work. The only relevant statement that I could find in the paper was:


The survey purpose was explained to the head of household or spouse, and oral consent was obtained. Participants were assured that no unique identifiers would be gathered.


For all we know, this is a true statement. Perhaps this is what participants were told, albeit incorrectly. If that is so, then a hard core Lancet defender, like Les Roberts, might maintain that no correction is necessary. In fact, don't all the authors of a paper need to agree on any correction? I suppose that a single author, like Burnham, can say whatever he wants, but I wouldn't think that a single author has the right to make an official change unless the other authors agree. At most, a single author can just demand that his name be removed from the publication. And, say what you will about Les Roberts, but he is a head-strong fellow. Not that there is anything wrong with that! What if Burnham wants to make a correction but Roberts doesn't?

Perhaps there is some other reference in the paper that I have missed.

Moreover, wouldn't this also be a good time to correct the mistaken description in the paper of the sampling scheme? After all, even Roberts/Burnham admit that it is not accurate.

Anyway, I don't have a strong opinion on what will happen. I am just curious what others predict . . .

Misleading Statements on Unique Identifiers

I had assumed that it would be easy to find places where Roberts/Burnham claimed that names were not collected. Turns out that it isn't. Looks like they knew that this fact might eventually come out so they tried to mislead without lying. Consider their letter to Science.


Those who work in conflict situations know that checkpoints often scrutinize written materials carried by those stopped, and their purpose may be questioned. Unique identifiers, such as neighborhoods, streets, and houses, would pose a risk not only to those in survey locations, but also to the survey teams. Protection of human subjects is always paramount in field research. Not including unique identifiers was specified in the approval the study received from the Johns Hopkins Bloomberg School of Public Health Committee on Human Research. At no time did the teams “destroy” details, as Bohannon contends. Not recording unique identifiers does not compromise the validity of our results.


Were Burnham/Roberts lying? Tough to say! I (and everyone else who read those words two years ago) assumed that the names of the participants were not collected. (We now know that at least some full names were recorded.) But the above passage is written in such a way as to give the impression that no names were collected --- "Not including unique identifiers was specified in the approval the study received" --- without explicitly stating that this had been done. Clever!

Recall how the paper itself handled the issue.


The survey purpose was explained to the head of household or spouse, and oral consent was obtained. Participants were assured that no unique identifiers would be gathered.


Again, this could be a truthful statement. Who knows what the interviewers said to the participants? But it completely misleads the reader into thinking that names had not been collected when, in fact, they had been collected.

So, hard core Lancet defenders will insist that no correction is necessary. Then why is Johns Hopkins insisting that one be made?


The paper in The Lancet incorrectly stated that identifying data were not collected. An erratum will be submitted to The Lancet to correct the text of the 2006 paper on this point.


If you agree with Hopkins that a correction is necessary, then you should also believe that Roberts/Burnham owe an apology/correction to, among others, the readers of Science. They were just as likely to have been misled.

But also consider this June 2007 letter to Science from Burnham, Roberts and Doocy.


At the time the study was published, Iraqi colleagues requested that we delay release, as they were very fearful that somehow streets, houses, and neighborhoods might be identified through the data with severe consequences. We agreed to wait for 6 months and have now made the data available.

From the beginning, we have taken the position that protecting participants is paramount, and thus we will not be releasing any identifiers below the level of governorate. The demand by some for street names seems to arise from the erroneous belief that only main streets were sampled, when in fact, where there were residential streets that did not intersect with the selected commercial street, these too were included in the sampling frame for identification of the start house. In any event, our interviewers reported that most, although not all, violent deaths occurred away from the location of residence.


Does this make much sense? To the extent that their "Iraqi colleagues" were "very fearful," it would not be because of specific neighborhoods being identified via statistical magic but because the full names of participants could be released. That's the real problem. By pretending otherwise, Burnham/Roberts/Doocy continued to mislead all of us into thinking that the worst conceivable data release (from the point of view of participant safety) might allow identification of a specific street when, in fact, the worst possible release would involve the actual names of participants. Again, there is nothing in the above that is literally a lie --- Who knows what their "Iraqi colleagues" told them? --- but there is no doubt that readers were misled, again, into thinking that names had not been recorded.

Thursday, February 19, 2009

L2 Sampling Details

This post serves to collect information from various sources about the details of the L2 sampling procedure.

Summary: L2 authors have given different (and conflicting) accounts of exactly what the interviewers did and/or were supposed to do. They have made no final statement about what sampling plan was followed. Anyone who claims to "know" what the sampling plan in L2 was is lying.

Let's begin with the description from the paper itself.


As a first stage of sampling, 50 clusters were selected systematically by Governorate with a population proportional to size approach, on the basis of the 2004 UNDP/Iraqi Ministry of Planning population estimates (table 1). At the second stage of sampling, the Governorate’s constituent administrative units were listed by population or estimated population, and location(s) were selected randomly proportionate to population size. The third stage consisted of random selection of a main street within the administrative unit from a list of all main streets. A residential street was then randomly selected from a list of residential streets crossing the main street. On the residential street, houses were numbered and a start household was randomly selected. From this start household, the team proceeded to the adjacent residence until 40 households were surveyed. For this study, a household was defined as a unit that ate together, and had a separate entrance from the street or a separate apartment entrance.


The problem with this description, as highlighted by the "main street bias" work in Johnson et al (2008), is that there are houses/streets that are not included in the sampling frame. Consider this figure from Johnson et al.



It is obvious that some houses are not within the sampling frame.

Now, just because the stated procedure has this problem does not invalidate it. Given the time/resource constraints faced by L2, this approach is perfectly reasonable. The critical thing is not so much the exact approach used as it is transparency on the part of the L2 authors as to their procedure.
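Whatever the field teams actually did, the procedure as published is concrete enough to sketch in code. Here is a toy version of the quoted stages; the governorate populations, street names, and house counts below are all invented for illustration, not taken from L2.

```python
import random

# A sketch of the multi-stage sampling scheme described in the L2 paper.
# All populations, streets, and house numbers here are made up.
random.seed(0)

governorates = {"Baghdad": 6_000_000, "Basra": 2_000_000, "Ninewa": 2_000_000}

def pps_allocate(pops, n_clusters):
    """Stage 1: allocate clusters proportional to population size,
    using largest-remainder rounding."""
    total = sum(pops.values())
    raw = {g: n_clusters * p / total for g, p in pops.items()}
    alloc = {g: int(r) for g, r in raw.items()}
    remainders = sorted(raw, key=lambda g: raw[g] - alloc[g], reverse=True)
    for g in remainders[: n_clusters - sum(alloc.values())]:
        alloc[g] += 1
    return alloc

def select_start_house(main_streets, cross_streets_of, houses_on):
    """Stages 3-5: pick a random main street, then a residential street
    crossing it, then a numbered start house on that street."""
    main = random.choice(main_streets)
    residential = random.choice(cross_streets_of[main])
    start = random.choice(houses_on[residential])
    return main, residential, start

alloc = pps_allocate(governorates, 50)

# A tiny fictional administrative unit:
main_streets = ["Main-1", "Main-2"]
cross_streets_of = {"Main-1": ["R-1", "R-2"], "Main-2": ["R-3"]}
houses_on = {"R-1": list(range(1, 31)),
             "R-2": list(range(1, 21)),
             "R-3": list(range(1, 41))}

print(alloc)
print(select_start_house(main_streets, cross_streets_of, houses_on))
```

Note that `select_start_house` makes the Johnson et al complaint mechanical: any residential street that crosses no main street never appears in `cross_streets_of` and so has exactly zero probability of selection.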

The best single source for an overview of L2 sampling issues is Mike Spagat's "Ethical and Data-Integrity Problems in the Second Lancet Survey of Mortality in Iraq" (pdf), forthcoming in Defence and Peace Economics. Below are some of the key sections:


The authors of L2 have still not fully disclosed their sample design (Bohannon, 2008, Spagat, 2007). Gilbert Burnham and Les Roberts have stated frequently that the L2 field teams did not follow the sampling methodology that was published in the Lancet but they have not supplied a viable alternative. Burnham and Roberts have also issued a series of contradictory statements about their sampling procedures and have either destroyed or not collected evidence necessary to evaluate these procedures.


See Spagat for citations.


These lists of main streets are at the core of the claimed sampling methodology. Yet, the L2 authors have refused to provide these lists or even clarify where they came from.


The footnote associated with this sentence is particularly damning.


For example, Seppo Laaksonen, a professor of survey methodology in Helsinki, requested and was denied any information on main streets, even the average number of main streets per cluster (Laaksonen, 2008).


This is the sort of behavior that makes me (and most statisticians) incredibly suspicious of L2. It is fine, perhaps, for the L2 authors to ignore me. I am just some random guy on the internet. It is fine, perhaps, for them to ignore Spagat et al since Spagat has been so critical of their work. But to refuse to answer a question from a professor of survey methodology with no particular axe to grind is just pathetic. If you are an academic, you answer questions from your fellow academics.

Back to Spagat:


Gilbert Burnham did make aspects of the sampling methodology fairly concrete in Biever (2007), an interview with the New Scientist.


“The interviewers wrote the principal streets in a cluster on pieces of paper and randomly selected one. They walked down that street, wrote down the surrounding residential streets and randomly picked one. Finally, they walked down the selected street, numbered the houses and used a random number table to pick one. That was our starting house, and the interviewers knocked on doors until they’d surveyed 40 households…. The team took care to destroy the pieces of paper which could have identified households if interviewers were searched at checkpoints.” (Biever, 2007, emphasis added.)


Whatever its strengths or weaknesses, this does seem to be a procedure that can be followed in the field. The L2 authors may no longer be able to specify their sample design since these pieces of paper have been destroyed. But they should be able to supply lists of principal streets or at least specify how many such streets there were per governorate.

Burnham explains that the sampling information was destroyed to protect the identities of respondents, but this explanation is inadequate. Pieces of paper with lists of principal streets and surrounding streets would be of no use for identifying households included in the survey. Even lists of all of the households on a street that was actually sampled would not be usable for identifying particular L2 respondents. On the other hand, the L2 data-entry form that Riyadh Lafta submitted to the WHO contains spaces for listing the name of each head of household in addition to names of people who died or were born during the L2 sampling period. If the field teams could travel around with pieces of paper containing the names of their respondents plus many of their family members then they did not have to destroy lists of streets. Finally, as noted above in section 2, the lists of L2’s respondents would have been widely known at the local level in any case.

The L2 authors have often dismissed the possibility of sampling bias by stating that they did not actually follow the sampling procedures that they claimed to have followed in their Lancet publication. For example, Burnham and Roberts (2006a) write that they had removed the following sentence from their description of their sampling methodology at the suggestion of peer reviewers and the editorial staff at the Lancet:


"As far as selection of the start houses, in areas where there were residential streets that did not cross the main avenues in the area selected, these were included in the random street selection process, in an effort to reduce the selection bias that more busy streets would have." (Burnham and Roberts, 2006a)


Combining this with Gilbert Burnham’s New Scientist interview already quoted (Biever, 2007) would imply that at each location:

A. Field teams wrote names of main streets on pieces of paper and selected one street at random.

B. The field teams then walked down this street writing down names of cross streets on pieces of paper and selected one of these at random.

C. The field teams then became aware of all other streets in the area that did not cross the main avenues and may have selected one of these instead of one of the cross streets written on pieces of paper. This wide selection was done according to an undisclosed procedure.

The Biever (2007) description of Burnham does outline a sampling procedure that could have been followed and is broadly consistent with the published methodology. If other types of streets, beyond those that would be covered by the published methodology, were included in the sampling procedures then the authors need to specify how these streets were included. More fundamentally, how did the field teams discover the existence of such streets that could not be seen by walking down principal streets as described by Burnham in Biever (2007)? The L2 field teams would not have brought detailed street maps with them into each selected area or else it would not have been necessary to walk down selected principal streets writing down names of surrounding streets on pieces of paper. We can also rule out the possibility that the teams completely canvassed entire neighborhoods and built up detailed street maps from scratch in each location. Developing such detailed street maps would have been very time consuming and the L2 field teams had to follow an extremely compressed schedule that required them to perform forty interviews in a day (Hicks, 2006).

In Giles (2007), an article in Nature, Burnham and Roberts suggested one possible explanation on how the field teams had managed to augment their street lists beyond streets that could be seen by walking down a main street but this suggestion was rejected by an L2 field-team member interviewed by Nature:


“But again, details are unclear. Roberts and Gilbert Burnham, also at Johns Hopkins, say local people were asked to identify pockets of homes away from the centre; the Iraqi interviewer says the team never worked with locals on this issue.” (Giles, 2007)


Even if locals had identified such “pockets of homes away from the centre” the authors still would have to specify how these were included in the randomization procedures. Indeed, involving local residents in selecting the streets to be sampled would seem to be at odds with random selection of households. Locals could, for example, lead the survey teams to particularly violent areas.

Burnham and Roberts have induced further confusion about their sample design by issuing a series of contradictory statements.


"The sites were selected entirely at random, so all households had an equal chance of being included." (Burnham et al, 2006b, emphasis added)

"Our study team worked very hard to ensure that our sample households were selected at random. We set up rigorous guidelines and methods so that any street block within our chosen village had an equal chance of being selected." (Burnham and Roberts, 2006b, emphasis added)

“… we had an equal chance of picking a main street as a back street.” (The National Interest, 2006).


These statements contradict each other and the methodology published in the Lancet. Some streets are much longer than others. Some streets are much more densely populated than others. Such varied units cannot all have equal probability of selection. If, for example, every street block had an equal chance of selection then households on densely populated street blocks would have lower selection probabilities than households on sparsely populated street block. If main streets are more densely populated on average than back streets are and main streets and back streets have equal selection probabilities then households on main streets would have lower selection probabilities than households on back streets.
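Spagat's arithmetic here is easy to check with a toy simulation (the street sizes are invented for illustration): if a main street and a back street are chosen with equal probability but hold different numbers of households, the households cannot all have equal probability.

```python
import random
from collections import Counter

# Toy model with invented numbers: two streets, each selected with
# probability 1/2, but with very different household counts.
random.seed(1)

streets = {"main": 100, "back": 10}  # households per street

counts = Counter()
trials = 100_000
for _ in range(trials):
    street = random.choice(list(streets))      # each street: p = 1/2
    house = random.randrange(streets[street])  # then one house on it
    counts[street, house] += 1

p_main = counts["main", 0] / trials  # ~ 0.5 / 100 = 0.005
p_back = counts["back", 0] / trials  # ~ 0.5 / 10  = 0.05
print(p_main, p_back)
```

A household on the sparse back street is roughly ten times more likely to be sampled than one on the dense main street, which is exactly why "every street had an equal chance" and "every household had an equal chance" cannot both be true.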




All this is bad enough. Yet Lancet defenders assert that these contradictions are unimportant, that such discrepancies are inevitable when researchers try to explain their work to the public, and that much of the "conflict" is simply the result of a reasonable effort by the Lancet authors to simplify.

And that is a reasonable defense. But, for it to work, there must come a point at which we can point to a specific statement that describes the sampling plan actually used. What is the truth? Normally, the truth would be whatever is written in the published paper, but the Lancet authors now claim that the published description is not accurate. (And they have failed to publish a correction in the Lancet.) The truth could also be the official sampling plan that the authors created before starting the project. They must have written up a plan ahead of time; if they were to release it now, they could point to that.

So, if the truth is not in the paper, or in a correction to the paper or in an official document released by the authors, where is it? Good question! No one knows.

Burnham gave a presentation at MIT in February 2007. (See here for video, transcript and extensive commentary.) Let me quote my prior commentary.


Burnham claims that they did not restrict the sample to streets that crossed their main streets. Instead, they made a list of "all the residential streets that either crossed it or were in that immediate area." This is just gibberish.

First, if this was what they actually did, why didn't they describe it that way in the article? Second, given the time constraints, the teams could not possibly have had enough time to list all such side streets. Third, even if the interviewers did work this way, the problem of Main Street Bias would still exist; it would simply become a Center-of-Town Bias. Some side streets are in the "immediate area" of just one main street (or often of none), while other side streets (especially those toward the center of a town or neighborhood) are near more than one. The latter are much more likely to be included in the sample.
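The Center-of-Town effect can be illustrated with a hypothetical layout (the street names and adjacency counts below are invented): if one main street is drawn at random and every side street in its "immediate area" becomes eligible, then a side street's inclusion probability grows with the number of main streets it sits near.

```python
# Illustrative sketch, not the study's actual geography. Each side
# street is adjacent to some number of the town's main streets.
num_main_streets = 4

side_streets = {
    "outskirts lane": 0,    # near no main street: never eligible
    "residential road": 1,  # near exactly one main street
    "central avenue": 3,    # central location, near three main streets
}

# If one main street is chosen uniformly at random and all side
# streets adjacent to it become eligible, a side street's inclusion
# probability is (adjacent main streets) / (total main streets).
for name, adjacent in side_streets.items():
    p = adjacent / num_main_streets
    print(f"{name}: inclusion probability {p:.2f}")
```

Under these invented numbers, the central street is three times as likely to be eligible as the peripheral one, and the outskirts street can never be sampled at all.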


Again, it is not unreasonable for there to be some confusion in the immediate aftermath of publication about exactly what the interviewers did. My point here is not to attack Burnham for an inartful phrase like "equal chance" used in an interview. But, given that this was such an important matter of public concern, I would expect Burnham to have had his story straight by the time he spoke at MIT, four months after publication.

But things are even worse than that!

Consider the Q&A that the Lancet authors posted in early 2008 on the Johns Hopkins web page. Note the description of the sampling:


Sampling for the 2006 study was designed to give all households in Iraq an equal chance of being included.


This statement cannot possibly be true. How can the authors, more than a year after the study was published, assert such an obvious falsehood about the single most important criticism of the study?

But, even though this is a false statement, it is at least a clear one. Want to know what the official sampling plan was? Don't check the paper. (It's wrong.) Don't look for corrections. (None were made.) Don't look for the original planning documents. (They were never released.) But at least there was an official statement on the Johns Hopkins website. When Tim Lambert and other Lancet defenders wanted to claim that the sampling plan was such-and-such, they could link there as evidence.

But then the Lancet authors deleted that page! (This was in conjunction with various other false statements made on the web page, as well as other unexplained deletions.) So the official word is that there is no official word. The last clear statement made by the Lancet team --- a statement that, one assumes, supersedes the published paper and the authors' interviews --- was removed from the web without any explanation.

Anyone who claims to know the sampling plan used in L2 is lying.

Sunday, February 15, 2009

Pope of Debunkers

Robert Shone's blog, Dissident 93, covers a variety of Lancet topics. Here is a good summary of why so much of the criticism of the Main Street Bias work is incoherent. His blog also points to this useful Science article.


www.sciencemag.org SCIENCE VOL 319 18 JANUARY 2008 273
CREDIT: ADAPTED FROM C. A. BROWNSTEIN ET AL., NEJM 358, 5 (2008), MASSACHUSETTS MEDICAL SOCIETY
NEWS OF THE WEEK

A team led by the World Health Organization (WHO) has produced a new estimate of the number of Iraqis who died violently in the first 40 months following the U.S.-led invasion: between 104,000 and 223,000. This figure, published online last week by the New England Journal of Medicine, hews close to some other attempts to quantify the toll but comes in far below a controversial 2006 study led by researchers at Johns Hopkins University in Baltimore, Maryland. That group estimated approximately 600,000 violent deaths during the same period. The discrepancy has prompted critics to renew their charge that the Johns Hopkins results are not credible.

Data from a war zone are never fully reliable; the best researchers can hope for is "getting the numbers roughly right," says Fritz Scheuren, a statistician at the University of Chicago in Illinois and past president of the American Statistical Association. Escalating violence in Iraq after 2003 put a limit on quality control, but researchers do have a quantitative starting point: the casualty tally made by Iraq Body Count, a nonprofit advocacy group based in London. By controlling for multiple accounts of the same car bombs and shootings, the group estimates from media reports that between 81,000 and 88,000 violent deaths have occurred in Iraq since the invasion. The figure is useful as "a lower bound on the true number," says Jon Pedersen, a statistician at the Fafo Institute for Applied International Studies in Oslo, Norway.

To get the upper bound, says Pedersen, you have to knock on doors in what is known as a two-stage cluster survey. That's the method used by the WHO and Johns Hopkins teams, among others. Researchers divide the country into regions and then sample clusters of households within each. Finally, they extrapolate mortality rates from those clusters to the total population.

Epidemiologists Les Roberts and Gilbert Burnham of Johns Hopkins published the first Iraq cluster study in November 2004 in The Lancet. They used data collected by Roberts and an Iraqi team, which, in September 2004, surveyed 988 households in 33 clusters across the country. They arrived at a figure of 98,000 "extra" deaths since the invasion, about half due to violence. Soon after this, a team led by Pedersen and the United Nations Development Programme, which had used a much larger sample of 21,668 households in 2200 clusters, produced an estimate for roughly the same period of about 25,000 violent deaths.

As the invasion gave way to occupation and insurgency, Roberts and Burnham mounted another study. This time they left the surveying entirely to the Iraqi team, communicating from abroad. Published in October 2006 in The Lancet, the second survey—based on 1849 households in 47 clusters—estimated that 601,000 Iraqis died violent deaths between the 2003 invasion and July 2006. To many, the number seemed unrealistically high. Some also faulted the authors for not fully answering questions about the survey's methods (Science, 20 October 2006, p. 396).

Now comes the WHO survey. Conducted with the help of the Iraqi government, it is by far the most comprehensive mortality assessment to date. Interviewers visited 9345 homes in more than 1000 clusters. But its estimate of 151,000 violent deaths has come in for some criticism, too. Unlike other Iraq casualty surveys, this one includes an upward adjustment of 35% to account for "underreporting" of deaths due to migration, memory lapse, and dishonesty. "That is really an arbitrary fudge factor," says Debarati Guha-Sapir, an epidemiologist at the WHO Collaborating Centre for Research on the Epidemiology of Disasters in Brussels, Belgium. But the number falls squarely within the range produced by a meta-analysis of all available mortality studies by Guha-Sapir and fellow centre epidemiologist Olivier Degomme. The Johns Hopkins figure is an outlier, she says.

Why the Hopkins study came up with such a high figure is not clear. Criticism of the study has in fact intensified since Burnham and Roberts released a data set to selected peers last year. "It did not include the standard kinds of data," says Seppo Laaksonen, a statistician at the University of Helsinki in Finland and a specialist in survey methodology. For example, he says, it was impossible "to check the objectivity and randomness of cluster selection." Scheuren, who also received the data, wanted to compare results obtained by different interviewers to "get a handle on noise" and check for fabrication by surveyors. Roberts declined to provide all the details, according to Scheuren, saying that he was concerned that this would risk the safety of the interviewers.

Burnham told Science, however, that the Johns Hopkins team does not have such detailed information. "Our goal was to reduce any type of risk to the community and the participants," says Burnham. "While we have much of the raw data, we requested that anything designating the interviewers or the location of the neighborhoods visited not be sent to us." Laaksonen responds that he would not have published "any figures for the country" if he didn't have direct access to such raw information from surveyors.

Burnham is not retreating. Because the WHO survey was conducted by Iraqi government personnel, "people may have been hesitant to answer honestly," he says. He claims that unlike those in the WHO study, nearly all of the deaths tallied by the 2006 Lancet study were verified with death certificates. Even if the debate may be drawing to a close about whether the number of violent deaths in postinvasion Iraq could be as high as 600,000, the argument about methods is clearly far from settled.

–JOHN BOHANNON
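The two-stage cluster method the article describes can be sketched roughly as follows; the cluster counts, death tallies, and population figure below are invented for illustration and do not come from any of the surveys discussed.

```python
# Each sampled cluster: (households surveyed, persons surveyed,
# deaths recorded during the recall period). Invented numbers.
clusters = [
    (40, 280, 3),
    (40, 252, 5),
    (40, 301, 2),
]

total_persons = sum(c[1] for c in clusters)
total_deaths = sum(c[2] for c in clusters)

# Stage 2: the crude death rate observed within the sampled clusters.
death_rate = total_deaths / total_persons

# Stage 1 extrapolation: apply the sampled rate to the whole
# population (hypothetical population figure).
population = 27_000_000
estimated_deaths = death_rate * population

print(f"sampled death rate: {death_rate:.4%}")
print(f"extrapolated deaths: {estimated_deaths:,.0f}")
```

This also shows why the method is so sensitive to which clusters get sampled: a single unusually violent cluster (a Fallujah) moves the extrapolated total dramatically.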

Wednesday, February 11, 2009

End Game

How will the controversy over the Lancet articles end? Good question. I can't predict the future, but here are some benchmarks for defining "victory," at least for us Lancet critics.

1) Both Roberts et al (2004) and Burnham et al (2006) are withdrawn by the Lancet. This is the primary goal that many of us are aiming for. If scientists don't police the accuracy of the scientific literature, then who will?

2) The Lancet authors are forced to be more transparent in their research. There is still some (small) chance that the Lancet papers are accurate, or at least not fraudulent. If the authors were to provide access to their data/methods to all researchers, then that would be a victory for replication and the scientific process. My primary motivation from the start has been to insist on the importance of scientific work that is open to replication. That is the primary norm that I want to defend. If the Lancet authors don't have to share their data or explain their statistics, then why should any other scientist have to?

3) Censure of Roberts/Burnham by Johns Hopkins or (in the case of Roberts) Columbia. If leading research universities don't uphold the norms of open and transparent scientific inquiry, who will? The most interesting aspect of the recent censure of Burnham by AAPOR was news of an open investigation by Johns Hopkins.


In a highly unusual rebuke, the American Association for Public Opinion Research today said the author of a widely debated survey on "excess deaths" in Iraq had violated its code of professional ethics by refusing to disclose details of his work. The author's institution later disclosed to ABC News that it, too, is investigating the study.

AAPOR, in a statement, said that in an eight-month investigation, Gilbert Burnham, a faculty member at the Johns Hopkins Bloomberg School of Public Health, "repeatedly refused to make public essential facts about his research on civilian deaths in Iraq."

Hours later, the school itself disclosed its own investigation of the Iraq casualties report "to determine if any violation of the school's rules or guidelines for the conduct of research occurred." It said the review "is nearing completion."

...

The inquiry by the Johns Hopkins Bloomberg School was disclosed in an e-mail from Tim Parsons, the school's public affairs director, as follows:

"The level of civilian mortality in Iraq is a controversial subject. Questions have been raised regarding the findings and methodology of the 2006 Iraq mortality study conducted by Dr. Gilbert Burnham and published in The Lancet. The Johns Hopkins Bloomberg School of Public Health takes any allegation of scientific or professional misconduct very seriously. It believes that the correct forum for discussing the reported findings of the Lancet study and the general methodology that led to those findings is in the regular exchange of views in the scientific literature. The Bloomberg School of Public Health has undertaken a review of the study to determine if any violation of the school's rules or guidelines for the conduct of research occurred in the conduct of the study. That review is nearing completion and the school is unable to discuss the results at this time."


I hesitate to speculate on what Hopkins will conclude.

4) A change in policy by the Lancet to require authors to allow for replication of their work. More and more journals are moving in this direction and I expect the Lancet to follow suit at some point.

Will any of these come to pass? I don't know. But, heading into year 5 of this controversy, I am pleased with the progress that we have made so far.

Friday, February 06, 2009

Tiny Chance

The question of how the 2004 Lancet authors viewed the Fallujah "outlier" is often of interest. Consider this letter to the editor of The Independent.


You reported ("Polish hostage held in Iraq is released unharmed", 21 November) the Foreign Secretary's response to our study published in The Lancet of civilian deaths in Iraq. It is heartening that Jack Straw has addressed the topic in such detail. However, his response includes an apparent misreading of our results.

Our study found that violence was widespread and up 58-fold after the invasion; that from 32 of the neighbourhoods we visited we estimated 98,000 excess deaths; and that from the sample of the most war-torn communities represented by 30 households in Fallujah more people had probably died than in all of the rest of the country combined.

Fallujah is the only insight into those cities experiencing extreme violence (i.e. Ramadi, Tallafar, Fallujah, Najaf); all the others were passed over in our sample by random chance. If the Fallujah cluster is representative, there were about 200,000 excess deaths above the 98,000.

Perhaps Fallujah is so unique that it represents only Fallujah, implying that it represents only 50-70,000 additional deaths. There is a tiny chance that the neighborhood we visited in Fallujah was worse than the average experience, and only corresponds with a couple of tens of thousands of deaths. We also explain why, given study limitations, our estimate is likely to be low. Therefore, when taken in total, we concluded that the civilian death toll was at least around 100,000 and probably higher, not between 8,000 and 194,000 as Mr. Straw states. While far higher than the Iraq Ministry of Health surveillance estimates, on 17 August the minister himself described surveillance in Iraq as geographically incomplete, insensitive and missing most health events.

We, the occupying nations, should aspire to acknowledge the dignity of every life lost, and to monitor trends and causes of deaths to better serve the Iraqis, and in doing so, sooner end this deadly occupation.

Les Roberts, Gilbert Burnham Centre for International Emergency, Disaster and Refugee Studies, Johns Hopkins Bloomberg School of Public Health, Baltimore; Richard Garfield School of Nursing, Columbia University, New York, USA


I would not be surprised if the initial version of the paper submitted to the Lancet included the Fallujah data without much, if any, discussion of what an outlier it was. The authors clearly believed that the data from Fallujah were representative of what was happening in large parts of Iraq. Perhaps the reviewers made a stink, thus leading to the sometimes-Fallujah-in-sometimes-Fallujah-out character of the final paper.

UPDATE: Whoops! I blogged about this letter before.