Data Validation Report

Every query result on this site is validated against published statistics from the CDC, NCHS, and CMS. This report shows our automated test suite results.

56/56 Checks Passed

Test Cases

Datasets

Last Run: Mar 11, 2026

Methodology

Each test case compares a result from our data against a published value from an official CDC, NCHS, or CMS source. We run two independent layers of validation for every test:

Layer 1: Gold SQL

A hand-written SQL query is executed directly against the DuckDB database on Railway. This tests whether the data itself reproduces published statistics, independent of the AI layer. If Layer 1 fails, the data or our understanding of the codebook is wrong.

Layer 2: NL Query

A natural language question is sent through the full production pipeline: the question goes to our API, Claude generates SQL, Railway executes it, and the result is checked. This tests the end-to-end system that users interact with. If Layer 2 fails but Layer 1 passes, the AI is misinterpreting the question or generating incorrect SQL.
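The two layers can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the site's actual harness: Python's built-in sqlite3 stands in for DuckDB, the `discharges` table and its rows are made up, and `generate_sql` is a hypothetical stub in place of the real model call.

```python
import sqlite3

# In-memory SQLite stands in for the production DuckDB database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE discharges (hospital TEXT, drg INTEGER, n INTEGER)")
conn.executemany(
    "INSERT INTO discharges VALUES (?, ?, ?)",
    [("A", 871, 120), ("B", 871, 80), ("A", 291, 60)],  # made-up rows
)

# Layer 1: a hand-written "gold" query run directly against the data.
GOLD_SQL = "SELECT SUM(n) FROM discharges WHERE drg = 871"


def generate_sql(question: str) -> str:
    """Hypothetical stub for the NL-to-SQL step; the real pipeline calls a model."""
    return "SELECT SUM(n) AS total FROM discharges WHERE drg = 871"


def run_scalar(sql: str) -> float:
    """Execute a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]


def passes(result: float, published: float, tolerance: float = 0.0) -> bool:
    """A check passes when the result is within tolerance of the published value."""
    return abs(result - published) <= tolerance


published = 200  # made-up "published" value for the toy table

# Layer 1 checks the data; Layer 2 checks the generated SQL end to end.
layer1_ok = passes(run_scalar(GOLD_SQL), published)
layer2_ok = passes(run_scalar(generate_sql("How many discharges for DRG 871?")), published)
print(layer1_ok, layer2_ok)
```

Running the same comparison on both queries is what makes the layers comparable: any disagreement between them points at the generated SQL rather than the data.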

BRFSS Results

11 tests

Behavioral Risk Factor Surveillance System — self-reported survey data, 400K+ respondents/year. Values are weighted prevalence percentages using CDC's _LLCPWT survey weights.
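A weighted prevalence is the sum of survey weights for respondents with the condition divided by the sum of weights for all respondents. The sketch below illustrates the arithmetic on made-up records; the function name and the sample data are hypothetical, and a real BRFSS query would also need to exclude missing or refused responses.

```python
def weighted_prevalence(records):
    """Return the weighted prevalence in percent.

    records: list of (has_condition, weight) pairs, where weight plays
    the role of a survey weight such as _LLCPWT.
    """
    total_weight = sum(weight for _, weight in records)
    if total_weight == 0:
        raise ValueError("total weight is zero")
    case_weight = sum(weight for has_condition, weight in records if has_condition)
    return 100.0 * case_weight / total_weight


# Made-up respondents: two with the condition, two without.
sample = [(True, 1200.0), (False, 800.0), (False, 2500.0), (True, 500.0)]

weighted = weighted_prevalence(sample)                           # 1700 / 5000
unweighted = 100.0 * sum(1 for f, _ in sample if f) / len(sample)
print(f"weighted: {weighted:.1f}%, unweighted: {unweighted:.1f}%")
```

In this toy sample the unweighted share is 50% while the weighted share is 34%, which is why omitting the weights would not reproduce published figures.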

Statistic	Year	Published	Gold SQL	Gold SQL Dev (pp)	NL Query	NL Query Dev (pp)	Source
Adult obesity (national)	2017	30.1%	30.1%	0.0	30.1%	0.0	CDC Obesity Maps
Adult obesity (national)	2018	30.9%	30.9%	0.0	30.9%	0.0	CDC Obesity Maps
Adult obesity (West Virginia)	2018	39.5%	39.5%	0.0	39.5%	0.0	CDC State Data
Adult obesity (Colorado)	2018	22.9%	22.9%	0.0	22.9%	0.0	CDC State Data
Current smoking	2018	15.5%	15.5%	0.0	15.5%	0.0	CDC Tobacco Data
Adult obesity (national)	2020	31.9%	31.9%	0.0	31.9%	0.0	CDC BRFSS Overweight and Obesity Dataset
Diagnosed diabetes	2018	10.9%	11.4%	+0.5	11.8%	+0.9	CDC Chronic Disease Indicators — Diabetes
Current asthma	2018	9.2%	9.2%	0.0	9.2%	0.0	CDC Asthma
Physical inactivity	2018	24.5%	24.5%	0.0	24.5%	0.0	CDC PCD
Adult obesity (national)	2023	34.3%	32.8%	-1.5	32.8%	-1.5	CDC Newsroom
Lifetime depression diagnosis (national)	2020	18.5%	18.8%	+0.3	18.8%	+0.3	CDC MMWR 72(24), June 2023

NHANES Results

8 tests

National Health and Nutrition Examination Survey (2021–2023 cycle) — clinical exams + lab measurements. Values are weighted prevalence percentages using WTMEC2YR exam weights.

Statistic	Year	Published	Gold SQL	Gold SQL Dev (pp)	NL Query	NL Query Dev (pp)	Source
Obesity overall (BMI≥30)	2021–23	40.3%	40.3%	0.0	39.8%	-0.5	NCHS Brief #508
Obesity, men (BMI≥30)	2021–23	39.2%	39.2%	0.0	38.7%	-0.5	NCHS Brief #508
Obesity, women (BMI≥30)	2021–23	41.3%	41.3%	0.0	40.8%	-0.5	NCHS Brief #508
Total diabetes (incl. undiagnosed)	2021–23	15.8%	13.8%	-2.0	13.8%	-2.0	NCHS Brief #516
High cholesterol (≥240 mg/dL)	2021–23	11.3%	11.4%	+0.1	11.1%	-0.2	NCHS Brief #515
Hypertension (measured + Dx)	2021–23	47.7%	50.0%	+2.3	50.0%	+2.3	NCHS Brief #511
Severe obesity (BMI≥40)	2021–23	9.4%	9.4%	0.0	9.3%	-0.1	NCHS Brief #508
Depression (PHQ-9≥10)	2021–23	13.1%	12.6%	-0.5	12.6%	-0.5	NCHS Brief #527

Medicare Inpatient (Part A) Results

4 tests

Medicare Inpatient Prospective Payment System (IPPS) — hospital discharges by DRG, ~2M rows across 11 years (2013–2023). Values are counts from the CMS Provider Summary PUF, which only includes hospitals with ≥11 discharges per DRG.

Statistic	Year	Published	Gold SQL	Gold SQL Dev (%)	NL Query	NL Query Dev (%)	Source
IPPS hospitals	2023	3,100	2,941	-5.1	2,941	-5.1	CMS IPPS PUF
Distinct DRG codes	2023	600	534	-11.0	534	-11.0	CMS FY 2023 IPPS Rule
Top DRG: Septicemia (871)	2023	561,177	561,177	0.0	561,177	0.0	CMS IPPS PUF
#2 DRG: Heart Failure (291)	2023	319,367	319,367	0.0	319,367	0.0	CMS IPPS PUF

Medicare Part D Results

5 tests

Medicare Part D Prescribers by Provider and Drug — 276M rows across 11 years (2013–2023). Published values are aggregate totals from the CMS Public Use File. Prescriber-drug combinations with fewer than 11 claims are suppressed by CMS before release.

Statistic	Year	Published	Gold SQL	NL Query	Source
Unique prescribers	2023	1,104,162	1,104,162	1,104,162	CMS Part D PUF
Total claims	2023	1,393,568,104	1,393,568,104	1,393,568,104	CMS Part D PUF
Total drug cost	2023	$212.7B	$212.7B	$212.7B	CMS Part D PUF
Unique prescribers	2019	985,533	985,533	985,533	CMS Part D PUF
Total drug cost	2019	$137.0B	$137.0B	$137.0B	CMS Part D PUF

Notes

Tolerance thresholds

Each test has a predefined tolerance (typically 1–2 percentage points for BRFSS and 1.5–5 percentage points for NHANES). These tolerances account for differences in survey weight versions, age cutoffs, and rounding. A deviation within tolerance counts as a pass.
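The pass rule can be sketched as a simple comparison of the absolute deviation against the tolerance. The result and published values below are taken from the BRFSS table; the per-test tolerances (1.0 and 2.0 percentage points) are assumptions chosen from within the stated 1–2 point range, since the report does not list them per test.

```python
def deviation(result: float, published: float) -> float:
    """Deviation in percentage points, rounded to one decimal as in the tables."""
    return round(result - published, 1)


def within_tolerance(result: float, published: float, tolerance: float) -> bool:
    """A check passes when the absolute deviation does not exceed the tolerance."""
    return abs(deviation(result, published)) <= tolerance


# (label, result, published, assumed tolerance in percentage points)
checks = [
    ("Diagnosed diabetes 2018, Gold SQL", 11.4, 10.9, 1.0),
    ("Diagnosed diabetes 2018, NL Query", 11.8, 10.9, 1.0),
    ("Adult obesity 2023, Gold SQL", 32.8, 34.3, 2.0),
]

for label, result, published, tolerance in checks:
    status = "pass" if within_tolerance(result, published, tolerance) else "fail"
    print(f"{label}: dev {deviation(result, published):+.1f} pp -> {status}")
```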

BRFSS vs NHANES obesity gap

BRFSS reports ~31–33% obesity; NHANES reports ~40%. This is not an error. BRFSS relies on self-reported height and weight (respondents tend to underreport weight), whereas NHANES uses clinical measurements. The gap is well documented in the epidemiological literature.

CMS Public Use File suppression

Medicare PUF data suppresses all provider-level rows with fewer than 11 claims, beneficiaries, or discharges. This means aggregate totals from the PUF are systematically lower than universe totals. For Medicare Inpatient, hospital and DRG counts are ~5–15% below CMS-reported totals. For Part D, published values are computed directly from the PUF, so Gold SQL matches exactly.
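The effect of suppression on aggregates can be illustrated with a toy example. The rows below are made up and the helper names are hypothetical; only the threshold of 11 and the two Medicare Inpatient comparisons at the end come from this report.

```python
SUPPRESSION_THRESHOLD = 11  # rows below this count are withheld from the PUF


def relative_deviation(observed: float, reference: float) -> float:
    """Relative deviation in percent, rounded to one decimal."""
    return round(100.0 * (observed - reference) / reference, 1)


# Made-up universe of (hospital, DRG, discharges) rows.
universe = [
    ("A", 871, 40),
    ("A", 291, 7),   # suppressed
    ("B", 871, 10),  # suppressed
    ("B", 291, 25),
    ("C", 871, 3),   # suppressed, so hospital C vanishes entirely
]

released = [row for row in universe if row[2] >= SUPPRESSION_THRESHOLD]

universe_total = sum(n for _, _, n in universe)
released_total = sum(n for _, _, n in released)
universe_hospitals = {h for h, _, _ in universe}
released_hospitals = {h for h, _, _ in released}

print(universe_total, released_total)                      # totals shrink
print(len(universe_hospitals), len(released_hospitals))    # so do entity counts
print(relative_deviation(released_total, universe_total))

# The same formula reproduces the Dev values in the Medicare Inpatient table.
print(relative_deviation(2941, 3100))  # IPPS hospitals
print(relative_deviation(534, 600))    # distinct DRG codes
```

Because suppression only ever removes rows, the released totals and distinct counts can match or fall below the universe values but never exceed them.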

What each layer catches

Layer 1 failures indicate data issues: wrong codebook interpretation, missing survey weights, incorrect variable coding. Layer 2 failures (with Layer 1 passing) indicate AI issues: the NL-to-SQL model is generating incorrect queries. Both layers passing means the data is correct and the AI can reproduce results from plain English questions.
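The interpretation above amounts to a small decision rule. A minimal sketch, with a hypothetical function name and labels:

```python
def diagnose(layer1_pass: bool, layer2_pass: bool) -> str:
    """Map the two layer outcomes to the likely source of a problem."""
    if not layer1_pass:
        # The gold SQL itself does not reproduce the published value.
        return "data issue"
    if not layer2_pass:
        # The data is fine, so the generated SQL is the likely culprit.
        return "AI issue"
    return "pass"


for outcome in [(True, True), (True, False), (False, False)]:
    print(outcome, "->", diagnose(*outcome))
```

Checking Layer 1 first reflects the ordering in the text: a Layer 2 result is only informative about the AI once the underlying data is known to be correct.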

Open Health Data Hub
