[Update] The CDC is lying to you

See added notes at end of article.


I’ve mentioned the problem of widely differing firearm homicide numbers in the FBI UCR vs. the CDC’s WISQARS. For 2017, the UCR claims 10,982 firearm homicides, while the CDC says 14,542, which is 32% higher. Part of that is reporting.
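
For anyone who wants to check the 32% figure, here is a quick sketch of the arithmetic. The two counts are the ones quoted above; the function and variable names are mine, chosen only for illustration.

```python
def percent_higher(base: float, other: float) -> float:
    """Return how much larger `other` is than `base`, as a percentage of `base`."""
    return (other - base) / base * 100.0


ucr_2017 = 10_982   # FBI UCR firearm homicides, 2017 (figure quoted above)
cdc_2017 = 14_542   # CDC WISQARS firearm homicides, 2017 (figure quoted above)

gap = cdc_2017 - ucr_2017
pct = percent_higher(ucr_2017, cdc_2017)
print(f"gap: {gap:,} deaths; CDC is {pct:.1f}% higher than UCR")
```

Note that the percentage uses the UCR number as the base. Measured the other way, the UCR figure is about 24% lower than the CDC figure, so it matters which number you call the baseline.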

The UCR is based on reported numbers from law enforcement agencies. But not all report.

The CDC, though…

The CDC numbers are based on emergency department reports, using ICD-10 codes. But, like the FBI, they don’t…use data from every hospital. In fact, they pick out just 60 hospitals and use their reports as a proxy for the country. It’s rather like Rasmussen pseudo-randomly surveying 1,000 people in hopes of picking a representative sample of all Americans, and extrapolating from there.

And that’s a huge problem.

If more of the sample hospitals are in places like Chicago, Saint Louis, or Baltimore, it skews the results, because those locations have a disproportionate number of firearms homicides compared to Alamogordo, New Mexico. If you assume everywhere has a firearms homicide rate like Baltimore, you’re going to extrapolate an unrealistically high number. Maybe even 32% higher than what the FBI says.
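
To make the skew concrete, here is a minimal sketch using a toy population. Only the 6,210 and 60 come from this article; the split between high-volume and low-volume hospitals and the per-hospital case counts are invented for illustration, and the extrapolation is a plain unweighted scale-up. It is not a description of how any agency actually selects or weights its sample.

```python
TOTAL_HOSPITALS = 6_210    # hospital count quoted in this article
SAMPLE_SIZE = 60           # sample size quoted in this article

# Everything below is invented for illustration only.
N_HIGH = 310               # assumed number of high-volume urban hospitals (about 5%)
N_LOW = TOTAL_HOSPITALS - N_HIGH
CASES_HIGH = 40            # assumed yearly cases at each high-volume hospital
CASES_LOW = 1              # assumed yearly cases at every other hospital

true_total = N_HIGH * CASES_HIGH + N_LOW * CASES_LOW


def extrapolate(high_in_sample: int) -> float:
    """Scale the sample's per-hospital average up to all hospitals, unweighted."""
    low_in_sample = SAMPLE_SIZE - high_in_sample
    sample_cases = high_in_sample * CASES_HIGH + low_in_sample * CASES_LOW
    return sample_cases / SAMPLE_SIZE * TOTAL_HOSPITALS


proportional = extrapolate(3)   # 3 of 60 is 5%, matching the toy population
skewed = extrapolate(6)         # twice as many high-volume hospitals as there should be

print(f"true total:          {true_total:,}")
print(f"proportional sample: {proportional:,.0f}")
print(f"skewed sample:       {skewed:,.0f}")
```

In this toy setup, three extra high-volume hospitals in a sample of 60 are enough to push the estimate roughly two thirds above the true total. A real survey design may apply weights to correct for this; the sketch shows only what happens when it doesn't.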

Problem, right?

I’ve barely started.

First, a 60-hospital sample is ridiculous when there are 6,210 hospitals in the US.

CDC samples fewer than 1% of hospitals.

Second, there is no good reason to do a 60-hospital sample. Or a 600-hospital sample.

To comply with the federal HIPAA law, since October 1, 2015, every HIPAA-covered entity — every hospital — in the nation reports every single gunshot wound, by ICD-10 code, to the government. All 6,210. For Every. Single. Patient. ICD-10 is just the latest iteration. They have been collecting this data for years.

The CDC doesn’t need to sample, then guess at the total number. The total number for every hospital in the country is already at their disposal. At most, they might have to make extremely minor adjustments for occasional coding errors. But since Medicare/Medicaid and insurance payments are based on the reported codes, the existing system already checks for coding errors. Damned few should slip past insurance companies dead set on paying out the least they can.

In fact, they have more data than just “gunshot injury.” ICD-10 breaks it down by intent (accidental/self, accidental/other, suicide, homicide) and weapon (machinegun, rifle, shotgun, handgun, other). There is a separate code for each possible combination. More codes if multiple weapons. More codes for where on the body the injury is. The admission data (which they get) includes age, race (with more choices than the 6 given in WISQARS), gender. They have the hospital location for geographic distribution of injuries.

And it isn’t just fatal injuries; all those code options are there for nonfatal injuries, too.

With the available data, the CDC can sort for “white males, 18-24, shot in lower back, in Kalamazoo, fatal and non-fatal” and give you the exact numbers.
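
As a rough illustration of what that kind of query looks like, here is a sketch over a handful of made-up records. The `InjuryRecord` fields and the `count_by_outcome` helper are hypothetical: real records would carry ICD-10 codes and admission data rather than these tidy, already-decoded labels, and the rows below are not real data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class InjuryRecord:
    # Hypothetical, already-decoded fields. Real records would carry ICD-10
    # codes plus admission data, not these tidy labels.
    sex: str
    race: str
    age: int
    body_region: str
    city: str
    fatal: bool


# Made-up rows purely to exercise the query; not real data.
records = [
    InjuryRecord("M", "white", 19, "lower back", "Kalamazoo", False),
    InjuryRecord("M", "white", 23, "lower back", "Kalamazoo", True),
    InjuryRecord("M", "white", 31, "lower back", "Kalamazoo", False),  # outside age range
    InjuryRecord("F", "white", 20, "lower back", "Kalamazoo", False),  # different sex
    InjuryRecord("M", "white", 22, "chest", "Kalamazoo", True),        # different body region
]


def count_by_outcome(rows, *, sex, race, age_range, body_region, city):
    """Count matching records, split into fatal and nonfatal."""
    lo, hi = age_range
    counts = Counter(fatal=0, nonfatal=0)
    for r in rows:
        if (r.sex == sex and r.race == race and lo <= r.age <= hi
                and r.body_region == body_region and r.city == city):
            counts["fatal" if r.fatal else "nonfatal"] += 1
    return counts


result = count_by_outcome(
    records, sex="M", race="white", age_range=(18, 24),
    body_region="lower back", city="Kalamazoo",
)
print(dict(result))
```

The point of the sketch is only that, once every record is in hand, the answer is a count, not an estimate.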

I’ve played with the WHO ICD database, and the available data is amazing.

So why isn’t the CDC simply using the raw data, instead of sampling and extrapolating? Is it too difficult to get the data for research purposes?

No. ICD is designed for researchers to use, by intent. Medical people hate it because, to cover any possible injury/illness in which a researcher might someday be interested, there are upwards of 150,000 different codes to choose from. Want to know how many people are bitten by large dogs vs. small? It’s there.

There’s only one reason for the CDC to forgo using the entire database as intended, and cherry-pick a handful of “representative” hospitals.

Because the raw data doesn’t support the laws the victim-disarming gun controllers want.

The raw data would tell us who is getting shot. With what. What the victims’ demographic and geographic distribution is. Combined with the UCR, it would tell gun controllers which criminals to target, and how. All the things the CDC pretends it can’t do.

The CDC has to lie about injury reports to rationalize targeting honest gun owners who don’t commit the crimes.

Added: In a comment below, MC notes that WISQARS Fatal and Nonfatal Injury reports come from differing datasets.

WISQARS Fatal is sourced from the NCHS Vital Statistics System, using ICD codes.

WISQARS Nonfatal is sourced from the NEISS All Injury Program run by the Consumer Product Safety Commission.

NEISS does base their estimate on the sampling I speak of. NCHS appears to use the full data, so WISQARS Fatal should be accurate.

This still leaves the question of why the available full dataset is not used for both. I suspect it is a matter of bureaucratic empire building: the CPSC started doing injury reports through its system a long time ago, and doesn’t want to relinquish it.

I thank MC for clarifying the difference between the two data sources.

[Permission to republish this article is granted so long as it is not edited and the author and The Zelman Partisans are credited.]

Carl is an unpaid TZP volunteer. If you found this post useful, please consider dropping something in his tip jar. He could really use the money, what with truck repairs and recurring bills. And the rabbits need feed. Truck insurance, lest I be forced to sell it. Click here to donate via PayPal.

11 thoughts on “[Update] The CDC is lying to you”

    1. but WHICH hospitals, and WHERE they are located, can HUGELY skew the dataset to support or destroy a predetermined agenda, which is PRECISELY Bear’s point here. Since the data ARE so thorough, and readily available, WHY NOT use the entire available sample set? And how do we know the CDC is not cherry-picking to “direct” the conclusions to suit their predetermined outcome?

      As Sam Clemens so famously declared: there are lies, damn lies, and statistics.
      I studied statistics in college, and learned that by setting up the “study” I can easily select the desired outcome beforehand, then select the sample populations and sample size, limit the variables, and “prove” that whatever my predetermined outcome was is supported by “data”, “research” and “facts”. Yet the TRUTH is not served.

      CDC, being a FedGov agency (with NO Constitutional basis for existing), can be expected to behave exactly as we see here. They, being gummit, are NOT neutral. Thus it should be disbanded, and all the suckers at the public teat need to be dismissed, and go find some sort of PRODUCTIVE work to feed themselves and their (hopefully small) families.

      1. I’m not saying the CDC’s methodology isn’t flawed, or can’t be made to be flawed. I’m simply observing that Carl used as a simile a statistical methodology that sounds intuitively invalid to Joe Sixpack, but is in fact quite valid. In other words, the comparison is flawed.

        1. OK, time to step in. I am NOT claiming a 60 of 6,210 sample is flawed. I merely noted that sampling hospitals is essentially the same thing as sampling people in a phone survey.

          My point, which I clearly failed to express properly, is that since they have the complete dataset of all hospitals, throwing out 99% of it and extrapolating back is freaking stupid, if honest data is their goal.

    2. Dear Bog…. SciAm…

      1,004 soil samples of a suspected contaminated area: fine.
      1,004 product samples from an assembly line: fine.
      1,004 samples of people’s political opinions across a continent-wide nation of 3,142 counties (and equivalents), with widely varying culture and demographics: BS.

      The best election poll I ever saw was from a professor trying to teach his students why careful sampling is important. He randomly picked 1 person from every county in his example area (4 states, I think). His result was within 2% of the election results. None of the “official” polls came close with predictions across the map.

      Added: What a 1,000 person survey IS good for is tracking TRENDS on a particular topic. And best if it’s the same people: a longitudinal study.

  1. The CDC reports fatal injury data and non-fatal injury data. You’ve confused the source for the two. The mortality or fatal injury data accessible through WISQARS is not based on a survey of hospitals; it is based on death certificates and the ICD-10 mortality codes.

    Non-fatal injury data available through WISQARS is skewed because that is based on a survey.

    The 538 article you’ve cited is exclusively about non-fatal injury data. Fatal injury data reported by the CDC through WISQARS – which uses the mortality codes – would be the same data reported to the WHO.

  2. *** Completely Off Topic ***

    It doesn’t surprise me that CDC would “cook the books” but I see gun guys bending the truth with homicide statistics when they favor our cause. The problem with that strategy is that homicides (or murders) require that someone dies. No death, no homicide. So crowing about declining homicide numbers is disingenuous if not deceptive, because homicide numbers over time are meaningless unless balanced against any change in the number of firearm assaults in the same period. The rate of deaths from GSWs will always tend to decline over time (especially in a period of war) because GSW trauma care continually improves.

    So it rubs me the wrong way when gun guys talk about a decline in homicide rates like that means something by itself (without mention of any corresponding change in rate of shootings) because it shows that either they’re naïve or they think I am.

    1. Not off topic at all. Long term, we do have numbers showing all violent crime is way down since the ’90s, including firearms homicide, so it isn’t just improved treatment.

      For the short term, if we look at the WISQARS data, such as it is, that could be what we’re seeing.

      https://webappa.cdc.gov/sasweb/ncipc/mortrate.html
      Fatal Firearm Homicide
      2016: 14,415
      2017: 14,542

      https://webappa.cdc.gov/sasweb/ncipc/nfirates.html
      Nonfatal Firearm Assault
      2016: 88,679
      2017: 107,002

      Fatal didn’t increase near as much as nonfatal. But… Do we know it’s because trauma care improved? Maybe marksmanship got worse. Maybe most of the shootings were non-life-threatening (the majority of shootings I see in the news are minor wounds).

      And that brings me back to my main point about ICD codes. The dataset is there, but we aren’t seeing it. I mentioned there are codes for self or other inflicted, accident/intentional, and so forth. There are also codes for degree of seriousness from mild to life-threatening. If they used those codes for nonfatal, we’d be able to answer your question.

  3. I checked the FBI UCR for 2016 and that data indicate 11,004 total firearm homicides. The territories of Puerto Rico and Guam are not included, and I know PR has more than a few every year. I’m not familiar with Guam’s crime rates. I doubt though that PR and Guam have 3,411 firearm homicides a year.

    WISQARS shows 549 fatal legal interventions for 2016. The Washington Post Fatal Force database indicates 963 police shootings for 2016.

    All in all none of these databases inspire a great deal of confidence.
