Methodology & Data Sources
Data Sources
PlainHealth is built on one authoritative federal dataset and the CDC system that supplies its age-adjusted rates:
- NCHS Leading Causes of Death: United States (1999-2017): Final CDC data with age-adjusted death rates, organized by ICD-10 leading-cause category and state (CDC Socrata dataset bi63-dtpu).
- CDC WONDER, Underlying Cause of Death: The source of the age-adjusted rates, standardized to the year 2000 US standard population.
We deliberately use only finalized data. CDC final mortality data lags the calendar by roughly two years; provisional figures for later years lack the complete, comparable age-adjusted rates this site depends on, so those years are excluded rather than shown half-filled.
Coverage
PlainHealth covers 19 years of mortality data (1999-2017) for all 50 US states and the District of Columbia, across the 10 leading causes of death. The grid is complete: 51 jurisdictions x 10 causes x 19 years is exactly 9,690 records, each with a death count and an age-adjusted rate.
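The completeness claim above is simple to check mechanically. The following is a minimal sketch, not the site's actual code: it assumes records can be reduced to (state, cause, year) tuples, and the names `expected_record_count` and `missing_cells` are hypothetical.

```python
from itertools import product

JURISDICTIONS = 51         # 50 states plus the District of Columbia
CAUSES = 10                # leading causes of death
YEARS = range(1999, 2018)  # 1999-2017 inclusive, 19 years


def expected_record_count():
    """Size of a complete jurisdiction x cause x year grid."""
    return JURISDICTIONS * CAUSES * len(YEARS)


def missing_cells(records, states, causes, years=YEARS):
    """Return every (state, cause, year) combination absent from records."""
    present = set(records)
    return [cell for cell in product(states, causes, years) if cell not in present]
```

An empty list from `missing_cells`, together with a record count equal to `expected_record_count()`, would indicate a grid with no gaps and no duplicates.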
Age-Adjusted Rates
Age-adjusted mortality rates use the year 2000 US standard population as the reference, following CDC methodology. Age adjustment removes the effect of differing age distributions between states, enabling fair comparisons. A state with an older population would naturally have higher crude death rates; age-adjusted rates correct for this.
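To make the idea concrete, here is a small sketch of direct standardization, the general technique behind age-adjusted rates. PlainHealth does not compute these rates itself; it reproduces the ones CDC publishes. The three age groups, the weights, and the two example states below are invented for illustration and are not CDC figures or the actual year 2000 standard population weights.

```python
def crude_rate(deaths, population):
    """Deaths per 100,000 across all age groups combined."""
    return sum(deaths) / sum(population) * 100_000


def age_adjusted_rate(deaths, population, standard_weights):
    """Direct standardization: weight each age-specific rate by the
    standard population's share of that age group."""
    total_weight = sum(standard_weights)
    return sum(
        (d / p * 100_000) * (w / total_weight)
        for d, p, w in zip(deaths, population, standard_weights)
    )


# Illustrative numbers only: three made-up age groups, not CDC data.
weights = [0.5, 0.3, 0.2]  # hypothetical standard population shares
younger_state = {"deaths": [10, 30, 100], "population": [60_000, 30_000, 10_000]}
older_state = {"deaths": [5, 30, 300], "population": [30_000, 30_000, 30_000]}

crude_younger = crude_rate(**younger_state)
crude_older = crude_rate(**older_state)
adjusted_younger = age_adjusted_rate(**younger_state, standard_weights=weights)
adjusted_older = age_adjusted_rate(**older_state, standard_weights=weights)
```

Both invented states have identical death rates within each age group, so their age-adjusted rates match, while the older state's crude rate is much higher simply because more of its residents are in the oldest group.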
ICD-10 Classification
Causes of death are classified using the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), the international standard used by the CDC and all US vital statistics reporting. PlainHealth groups related causes into ICD-10 chapters (e.g., all heart-related deaths under "Diseases of the circulatory system") for easier browsing.
Suppressed Data
CDC WONDER suppresses death counts under 10 to protect privacy. PlainHealth works at the level of 10 broad leading-cause chapters by state and year, where every cell clears that threshold, so no values are suppressed and the dataset has no gaps to fill.
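A check of this kind can be expressed in a few lines. This is a sketch under assumptions: the record shape (a dict with a `deaths` key) and the function name `suppressed_cells` are hypothetical, chosen only for illustration.

```python
SUPPRESSION_THRESHOLD = 10  # counts under 10 are suppressed by CDC WONDER


def suppressed_cells(records):
    """Return the records whose death count falls under the threshold."""
    return [r for r in records if r["deaths"] < SUPPRESSION_THRESHOLD]
```

For a dataset with no suppressed values, `suppressed_cells` would return an empty list.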
Processing Pipeline
- The NCHS Leading Causes of Death dataset (1999-2017) is downloaded from CDC. Each record carries state, year, ICD-10 leading-cause chapter, a death count, and an age-adjusted rate.
- Records are organized into the 10 leading-cause chapters (heart disease, cancer, and so on), the same rollup CDC uses, so cause groupings are consistent across all 19 years.
- State-level time series are assembled for each cause, enabling 1999-2017 trend analysis for every state and DC.
- National aggregates and state rankings are pre-computed for the latest complete year (2017).
- All data is loaded into a structured SQLite database serving state profiles, cause-of-death pages, trend visualizations, and ranking tables.
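The loading, time-series, and ranking steps above can be sketched with Python's built-in sqlite3 module. This is an illustration rather than the production pipeline: the table name `mortality`, its column names, and the helper functions are assumptions, and national aggregates are left out.

```python
import sqlite3


def build_database(rows):
    """Load (state, year, cause, deaths, aadr) tuples into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE mortality (
               state  TEXT    NOT NULL,
               year   INTEGER NOT NULL,
               cause  TEXT    NOT NULL,
               deaths INTEGER NOT NULL,
               aadr   REAL    NOT NULL,  -- age-adjusted death rate
               PRIMARY KEY (state, year, cause)
           )"""
    )
    conn.executemany("INSERT INTO mortality VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def state_series(conn, state, cause):
    """Time series of (year, rate) for one state and cause, oldest first."""
    cur = conn.execute(
        "SELECT year, aadr FROM mortality "
        "WHERE state = ? AND cause = ? ORDER BY year",
        (state, cause),
    )
    return cur.fetchall()


def rank_states(conn, cause, year=2017):
    """Rank states by age-adjusted rate for one cause, highest rate first."""
    cur = conn.execute(
        "SELECT state, aadr FROM mortality "
        "WHERE cause = ? AND year = ? ORDER BY aadr DESC",
        (cause, year),
    )
    return [(rank, state, aadr) for rank, (state, aadr) in enumerate(cur, start=1)]
```

The composite primary key on (state, year, cause) makes a duplicate record fail at insert time, which is one way to enforce a grid with exactly one row per cell.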
Data Vintage and Update Frequency
Final mortality data from NCHS typically becomes available 11-15 months after the end of the calendar year. PlainHealth uses only this finalized, age-adjusted series, which currently runs through 2017. We do not substitute provisional figures for years that CDC has not yet finalized; instead, the site is extended when CDC publishes additional finalized years.
Accuracy Commitment
PlainHealth reproduces CDC and NCHS mortality data exactly as published. Death counts and age-adjusted rates are presented without modification. Because the dataset covers 10 broad leading-cause chapters by state and year, every cell clears CDC's privacy threshold, so no values are suppressed or estimated. All age-adjusted rates use the year 2000 U.S. standard population, consistent with CDC methodology.
Limitations
- The series ends in 2017, the latest year of finalized NCHS leading-cause data, so it does not cover the COVID-19 era or more recent years.
- The 10 leading-cause chapters are a deliberately coarse rollup, not the full ICD-10 code set; they capture the major mortality drivers but not every specific condition.
- ICD-10 coding practices have evolved over time, and changes in coding guidance may affect long-term trend comparisons for some causes of death.
- Age-adjusted rates for small states or rare causes may have wide confidence intervals that are not displayed on PlainHealth.
- This data is for informational purposes only and does not constitute medical advice. PlainHealth is not affiliated with the CDC, NCHS, or any government agency.
Editorial Workflow
Content on PlainHealth is built from official source data by a documented pipeline. Raw mortality data from the CDC NCHS Leading Causes of Death dataset is ingested programmatically by our ETL pipeline, with age-adjusted rates from CDC WONDER. Every death count and rate is read directly from that source record at build time; no figure is hand-entered. Our editorial team sets the rules the pipeline applies: which datasets to use, how each metric is defined and labeled, how derived comparisons are computed, what the guides and explainers say, and what we will not publish. The pipeline then applies those decisions uniformly across every state and cause. We do not accept payment for coverage, placement, or rankings; all rankings and comparisons are computed directly from the CDC source data. See our editorial & corrections policy for how to report a figure that looks wrong.
Frequently Asked Questions
Where does PlainHealth's mortality data come from?
All cause-of-death figures come from the CDC National Center for Health Statistics (NCHS) Leading Causes of Death dataset (1999-2017 final data with age-adjusted rates), with rates standardized through CDC WONDER. Both are publicly available on data.cdc.gov and wonder.cdc.gov.
How often is the data updated?
CDC final mortality data is typically published 11-15 months after the end of the calendar year. PlainHealth uses the finalized NCHS Leading Causes of Death series, which currently runs through 2017; the site is refreshed when CDC extends the finalized, age-adjusted series. We intentionally do not display provisional later years, which lack the complete comparable rates this site relies on.
How accurate are the age-adjusted rates?
Age-adjusted rates on PlainHealth are reproduced from CDC NCHS publications without modification. They use the year 2000 US standard population as the reference, following official CDC methodology. Age adjustment removes the effect of differing age distributions between states, enabling fair comparisons. A state with an older population would naturally have higher crude death rates; age-adjusted rates correct for this.
What are the limitations of this data?
State-level data masks significant variation within states (urban vs. rural). The series ends in 2017, the latest year of finalized NCHS leading-cause data, so it does not cover the COVID-19 era. Death certificates list a single underlying cause, which may not capture the full picture for individuals with multiple contributing conditions, and the 10 leading-cause chapters are a deliberately coarse rollup rather than the full ICD-10 code set.
Contact
Questions about our methodology? Contact us.
Verify the data yourself
Every figure on PlainHealth can be traced back to the official CDC source. These are the primary references for the mortality data and methodology on this site:
- CDC WONDER - the CDC's official query system for U.S. mortality data, and the source of the age-adjusted rates shown here.
- NCHS Leading Causes of Death (data.cdc.gov) - the finalized 1999–2017 leading-cause dataset this site is built from.
- National Center for Health Statistics (NCHS) - the CDC center that compiles and publishes U.S. vital statistics.
- National Vital Statistics System (NVSS) - the program behind U.S. death-certificate data and cause-of-death coding.
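For readers who want to spot-check a figure, the sketch below builds a query URL for the Socrata dataset identifier given under Data Sources (bi63-dtpu). It assumes the standard Socrata endpoint pattern on data.cdc.gov and assumes the dataset exposes fields named `state` and `year`; confirm the field names on the dataset page before relying on them. The function name `build_query_url` is hypothetical, and the code only constructs the URL without making a network request.

```python
from urllib.parse import urlencode

DATASET_ID = "bi63-dtpu"  # identifier from the Data Sources section
BASE_URL = f"https://data.cdc.gov/resource/{DATASET_ID}.json"


def build_query_url(state, year, limit=50):
    """Build a URL that filters the dataset by state and year.

    The `state` and `year` field names are assumptions about the dataset schema.
    """
    params = {"$limit": limit, "state": state, "year": year}
    # Keep "$" unescaped so the Socrata-style parameter stays readable.
    return f"{BASE_URL}?{urlencode(params, safe='$')}"
```

Opening the resulting URL in a browser, or fetching it with any HTTP client, should return JSON rows that can be compared against the figures shown on the matching state page.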