### Lies, Damned Lies, and Lefty Studies Bereft of Statistical Methodology

About that Lancet study claiming 655,000 Iraqi deaths:

I do this sort of thing for a living, and one point it is absolutely essential non-statisticians understand is that the trickiest part of sampling is to make sure that the sample is truly representative of the population. It's hard to do because you aren't measuring the whole population---that's the point of sampling.

When the sampling involves people rather than, say, bottles of Coca-Cola rolling down a plant line, it gets trickier still. People lie. People sometimes refuse to answer. People avoid survey takers. Samples in that case must be very much larger to try to reduce the impact of this error.
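To make the non-response point concrete, here is a toy simulation (all rates invented for illustration) in which the people most affected by an event are also the most likely to talk to the survey taker. The naive estimate computed from respondents alone overstates the true rate:

```python
import random

random.seed(0)

# Hypothetical population of 100,000 households; 10% experienced the event
# of interest. Assume (arbitrarily) that affected households are twice as
# likely to answer the survey taker as unaffected ones.
population = [1] * 10_000 + [0] * 90_000

def responds(affected):
    # Affected households respond with probability 0.9, others with 0.45.
    return random.random() < (0.9 if affected else 0.45)

drawn = random.sample(population, 2_000)          # simple random sample
sample = [x for x in drawn if responds(x)]        # but only respondents count
estimate = sum(sample) / len(sample)

print(f"true rate: 0.10, naive estimate from respondents: {estimate:.3f}")
```

With these made-up response rates, the respondent-only estimate lands near 0.18, nearly double the true 0.10, and nothing in the raw survey data warns you of it.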

Another problem is frame of reference and interaction between variables of interest. The raw count of deaths in Iraq since 2003 is of little interest---people die all the time. Attributing these deaths specifically to the American invasion is a fool's errand---everything depends on the definition used to bucket deaths into invasion-related and non-invasion-related. Serious statisticians use tools like measurement system analysis to evaluate the measurement error introduced by such bucketing, which is usually quite significant. They also evaluate for covariation, to sift signal from noise where variables thought to be distinct turn out in fact to be subsets of one another.
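A toy calculation, with entirely made-up misclassification rates, shows how quickly bucketing error distorts a count even when coders are mostly right:

```python
# Suppose (hypothetically) 20% of all deaths are truly invasion-related,
# but coders mislabel 10% of unrelated deaths as related (false positives)
# and 15% of related deaths as unrelated (false negatives).
true_related = 0.20
false_positive = 0.10
false_negative = 0.15

# Share of deaths that end up *coded* as invasion-related:
observed_related = (true_related * (1 - false_negative)
                    + (1 - true_related) * false_positive)

print(f"true share: {true_related:.0%}, observed share: {observed_related:.0%}")
```

Under these assumed rates a true 20% is reported as 25%, a quarter overstatement from classification error alone, before any sampling error enters the picture.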

It's a heady, boring business which often results in inconclusive data. In fact, the way you can tell whether statistical work in sociological studies is done honestly is whether or not its findings are inconclusive. Folks like Charles Murray spend years trying to sift one or two nuggets out of the mass of sociological data using very creative methods; you can tell they're honest because their findings are so non-controversial in statistical circles, and to informed laymen appear to be almost tautologies in hindsight.

Whenever some scribblers claim to discover startling results from statistical analysis, check your wallet---chances are you're being sold flim-flammery.

Now let me see if I’ve got this straight. The JHBSPH study attempts to calculate the number of civilian deaths “above what would have occurred without conflict.” I wonder, therefore, whether the survey group took into account the effects of United Nations sanctions on Iraq prior to the invasion — which, had the conflict not occurred, would logically still be in place. According to U.N. studies using methodologies similar to those utilized by JHBSPH, roughly 150,000 civilians, more than half of them children, were dying every year as a direct result of U.N. sanctions. Since the sanctions ended in May 2003 after the fall of Saddam Hussein’s regime, roughly 525,000 lives were spared in the 3.5 years since then. If we compare that number with the JHBSPH’s estimate of 600,000 lives lost as a result of the conflict, we’re led to conclude that George W. Bush’s decision to oust Saddam has cost roughly 75,000 Iraqi civilian lives. But the JHBSPH researchers acknowledge a huge margin for error; their low-end estimate is 426,369. That means Bush’s decision to invade may actually have saved almost 100,000 lives.
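The back-of-envelope arithmetic in that paragraph can be reproduced in a few lines. The figures are the post's own (and the U.N. sanctions estimate it cites), not independently verified:

```python
# Figures as stated in the paragraph above; none independently verified.
sanctions_deaths_per_year = 150_000
years_since_sanctions_ended = 3.5
lives_spared = sanctions_deaths_per_year * years_since_sanctions_ended

jhbsph_point_estimate = 600_000
jhbsph_low_end = 426_369

net_cost_point = jhbsph_point_estimate - lives_spared  # vs. point estimate
net_saved_low = lives_spared - jhbsph_low_end          # vs. low-end estimate

print(f"lives spared by ending sanctions: {lives_spared:,.0f}")
print(f"net cost (point estimate):        {net_cost_point:,.0f}")
print(f"net saved (low-end estimate):     {net_saved_low:,.0f}")
```

This yields 525,000 lives spared, a net cost of 75,000 against the point estimate, and a net saving of 98,631 against the low-end estimate, matching the paragraph's figures.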

What was it Mark Twain once said about “lies, damn lies and statistics”?

The manipulation of information is part of the fog of war, of course. That’s a point worth remembering whenever you’re confronted with ludicrous numbers like those put out by the JHBSPH. Indeed, that’s a point worth remembering whenever anyone starts talking about the civilian body count in Iraq. Agendas abound. On the political Left, peace activists and opportunistic Democratic politicians invariably cite high-end statistics in order to justify their attacks on the Bush administration. For that reason, they favor U.N. figures. According to the U.N., for example, 3,009 Iraqi civilians were killed in August 2006 (the last month for which the U.N. has published data) in the ongoing terrorist insurgency, down from 3,590 in July.

Skeptical American military commanders point out that the U.N. numbers are based on combined reporting from the Iraqi health ministry, which is controlled by supporters of anti-American Shiite cleric Moqtada al-Sadr, and Baghdad’s central morgue, which apparently designates every unidentified corpse as a victim of the war. Indeed, the U.N. numbers feel inflated, since 3,000 killings per month averages out to over 100 every day — a total that far exceeds daily media accounts. By contrast, Reuters reports the civilian toll in August at 769, down from 1,065 in July. (In the interest of fairness, it should be noted that Reuters reports the body count in September as 1,089, a sharp rise in fatalities from August.) The Reuters figures are derived from combined data provided by the Iraqi ministries of health, interior, and defense — dubious sources, to be sure — but not data from the Baghdad morgue.

## 3 Comments:

This is good stuff. As a yute I balked at statistics, and consequently am operationally ignorant. Older and wiser now, I keep my old Snedecor and Cochran, *Statistical Methods*, at my bedside. It helps during those long nights of insomnia.

LT

I did statistical analysis for a living for seven years. You are exactly right. Everything is dependent on how the sample is chosen. You can use statistics to prove anything you want just by manipulating the data sampling method. And when you take a very small sample of a very large population, margins of error become so large as to render results meaningless (the basic flaw in television's Nielsen ratings, btw).
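As a rough illustration of the commenter's margin-of-error point: the textbook margin of error for a simple random sample shrinks only with the square root of the sample size, so small samples carry wide error bands regardless of how large the population is. (Real surveys like the JHBSPH's use cluster designs, whose effective sample size is smaller still.)

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    # 95% confidence margin of error for a simple random sample,
    # worst case at p = 0.5: z * sqrt(p * (1 - p) / n).
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: +/- {margin_of_error(n):.1%}")
```

A sample of 100 gives roughly +/- 9.8 points; getting that down to about 1 point requires 10,000 respondents, a hundredfold increase for a tenfold improvement.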

I wonder if the question is all wrong. I think it should be: How many deaths have occurred since the takedown of Saddam? Not necessarily the War bearing full blame.

Once Saddam was out of power, a struggle would naturally take place between the factions, so it's an indirect relation to the War, as opposed to a direct outcome... does that make sense?
