Long-term Exposure to Air Pollutants and Alzheimer’s Disease Dementia Prevalence Across the Contiguous United States: An Explainable Machine Learning Analysis

Published: 19 Aug 2025, Last Modified: 12 Oct 2025, Venue: BHI 2025, License: CC BY 4.0
Confirmation: I have read and agree with the IEEE BHI 2025 conference submission's policy on behalf of myself and my co-authors.
Keywords: Air Pollutants, Alzheimer’s disease dementia, Geographically weighted random forest, Spatial machine learning
TL;DR: We applied spatial machine learning to model Alzheimer’s dementia prevalence across the U.S. and found PM10 to be the most impactful air pollutant, especially in urban areas.
Abstract: A growing number of studies have examined the relationship between environmental factors and Alzheimer’s disease (AD) dementia prevalence. However, long-term exposure to air pollutants at the county level across the United States has rarely been examined with spatial machine learning. We compiled long-term data for six air pollutants (PM2.5, PM10, NO2, CO, O3, and SO2) from 1999 to 2020 and examined their relationship with AD dementia prevalence using global Random Forest, global XGBoost, geographically weighted random forest (GWRF), and local XGBoost models. The models were evaluated using R2, RMSE, and AIC. Gini feature importance and SHAP values were used to assess the relative contribution of each pollutant and to interpret model outputs. The GWRF model outperformed the other local and global models, with an R2 of 54.38% and the best fit observed in the Northeast and West Coast regions. Gini feature importance identified PM10 as the most influential predictor, followed by NO2, O3, and PM2.5. PM10 was also the primary variable in 25.31% of counties (n=786), while SO2 and CO played smaller roles. Our results suggest that PM10 may play a more significant role in AD dementia prevalence in the US than previously recognized, especially in urban areas.
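The abstract names R2, RMSE, and AIC as evaluation metrics but does not say how they were computed. The following is a minimal sketch of one common way to obtain all three from a set of predictions. It assumes a least-squares AIC of the form n * ln(RSS / n) + 2k, which may differ from the formulation the authors used. The function name `evaluate`, the parameter count `n_params`, and the toy values are hypothetical and are not taken from the paper.

```python
import math

def evaluate(y_true, y_pred, n_params):
    """Return R2, RMSE, and a least-squares AIC for one set of predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    # Residual and total sums of squares
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    tss = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - rss / tss
    rmse = math.sqrt(rss / n)
    # Assumed AIC form (Gaussian errors, additive constants dropped):
    # n * ln(RSS / n) + 2k, where k is the number of model parameters
    aic = n * math.log(rss / n) + 2 * n_params
    return {"r2": r2, "rmse": rmse, "aic": aic}

# Hypothetical prevalence values (percent) for four counties, illustration only
observed = [10.0, 12.0, 11.0, 13.0]
predicted = [10.5, 11.5, 11.0, 12.5]
metrics = evaluate(observed, predicted, n_params=2)
print(metrics)
```

For tree ensembles such as random forests, the effective number of parameters k is not well defined, so any AIC comparison depends on the convention chosen for `n_params`. Lower RMSE and AIC and higher R2 indicate a better fit.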
Track: 5. Public Health Informatics
Registration Id: 76NLWBMXPLC
Submission Number: 305