The Health-Wealth Gradient in Labor Markets: Integrating Health, Insurance, and Social Metrics to Predict Employment Density
University of Pennsylvania · Boston University · +2 more institutions
Indexed in: Crossref, DOAJ
Abstract
Methods
We constructed a multi-source longitudinal dataset (2014–2024) by combining county-level Quarterly Census of Employment and Wages (QCEW) data with County Health Rankings and aggregating both to the state level. Using a time-aware split to evaluate performance across the COVID-19 structural break, we compared LASSO, Random Forest, and regularized XGBoost models and used SHAP values for interpretability.
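The aggregation and time-aware split described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the field names (`employment`, `population`, `health_score`), the population weighting of the health measure, and the 2019 cutoff year are all assumptions made for the example, since the abstract does not specify them.

```python
from collections import defaultdict


def aggregate_to_state(county_rows):
    """Roll county-year records up to state-year records.

    Assumption: employment is summed and the health score is
    population-weighted; the abstract does not state the actual rule.
    """
    acc = defaultdict(lambda: {"employment": 0.0, "pop": 0.0, "weighted": 0.0})
    for r in county_rows:
        cell = acc[(r["state"], r["year"])]
        cell["employment"] += r["employment"]
        cell["pop"] += r["population"]
        cell["weighted"] += r["health_score"] * r["population"]
    return [
        {
            "state": state,
            "year": year,
            "employment": c["employment"],
            "health_score": c["weighted"] / c["pop"],
        }
        for (state, year), c in sorted(acc.items())
    ]


def time_aware_split(rows, last_train_year):
    """Train on years up to the cutoff, test on strictly later years.

    No shuffling: every test observation is later in time than every
    training observation, so the model is scored on a true forecast.
    """
    train = [r for r in rows if r["year"] <= last_train_year]
    test = [r for r in rows if r["year"] > last_train_year]
    return train, test


# Hypothetical county records (illustrative values only).
counties = [
    {"state": "PA", "year": 2019, "employment": 100.0, "population": 10.0, "health_score": 0.5},
    {"state": "PA", "year": 2019, "employment": 300.0, "population": 30.0, "health_score": 0.75},
    {"state": "PA", "year": 2020, "employment": 350.0, "population": 40.0, "health_score": 0.7},
]
states = aggregate_to_state(counties)
# Assumed cutoff: 2019, so the test period starts with the COVID-19 break.
train, test = time_aware_split(states, last_train_year=2019)
```

Placing the cutoff just before the structural break means the test score reflects how well a model fitted on pre-2020 data carries over to the disrupted period, which a random split would hide.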
Results
The tuned, regularized XGBoost model achieved strong out-of-sample performance (Test R² = 0.800). A leakage-safe stacked Ridge ensemble yielded comparable performance (Test R² = 0.827), while preserving the interpretability of the underlying tree model used for SHAP analysis.
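One common way to make a stacked ensemble leakage-safe is to train the meta-learner only on out-of-fold predictions, so that no base model has seen the rows it predicts. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation: the base learners are one-feature least-squares fits standing in for the tree models, the folds are forward-chaining by year, the Ridge step is a closed-form two-column solve, and the panel is synthetic. All function and field names are hypothetical.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """One-feature least squares; a stand-in for a real base learner."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - slope * mx, slope


def fit_ridge2(p1, p2, y, alpha=1.0):
    """Closed-form Ridge on two prediction columns (intercept unpenalized)."""
    m1, m2, my = mean(p1), mean(p2), mean(y)
    z1 = [v - m1 for v in p1]
    z2 = [v - m2 for v in p2]
    yc = [v - my for v in y]
    a = sum(v * v for v in z1) + alpha
    d = sum(v * v for v in z2) + alpha
    b = sum(u * v for u, v in zip(z1, z2))
    c1 = sum(u * v for u, v in zip(z1, yc))
    c2 = sum(u * v for u, v in zip(z2, yc))
    det = a * d - b * b
    w1 = (d * c1 - b * c2) / det
    w2 = (a * c2 - b * c1) / det
    return my - w1 * m1 - w2 * m2, w1, w2


def out_of_fold(rows, min_train_years=2):
    """Forward-chaining OOF predictions: year t is predicted from years < t only."""
    years = sorted({r["year"] for r in rows})
    oof = []
    for t in years[min_train_years:]:
        past = [r for r in rows if r["year"] < t]
        y_past = [r["y"] for r in past]
        a1, b1 = fit_line([r["x1"] for r in past], y_past)
        a2, b2 = fit_line([r["x2"] for r in past], y_past)
        for r in rows:
            if r["year"] == t:
                oof.append((a1 + b1 * r["x1"], a2 + b2 * r["x2"], r["y"]))
    return oof


def stacked_test_r2(train, test, alpha=1.0):
    """Fit the meta-learner on OOF predictions, refit bases on all of train, score on test."""
    oof = out_of_fold(train)
    c, w1, w2 = fit_ridge2([o[0] for o in oof], [o[1] for o in oof],
                           [o[2] for o in oof], alpha)
    y_tr = [r["y"] for r in train]
    a1, b1 = fit_line([r["x1"] for r in train], y_tr)
    a2, b2 = fit_line([r["x2"] for r in train], y_tr)
    preds = [c + w1 * (a1 + b1 * r["x1"]) + w2 * (a2 + b2 * r["x2"]) for r in test]
    return r2_score([r["y"] for r in test], preds)


# Synthetic state-year panel (hypothetical values, not the paper's data).
rows = [
    {"state": s, "year": yr, "x1": float(yr - 2014 + i), "x2": float((3 * yr + 7 * i) % 5)}
    for i, s in enumerate(["PA", "MA", "OH"])
    for yr in range(2014, 2022)
]
for r in rows:
    r["y"] = 3.0 + 2.0 * r["x1"] + 0.5 * r["x2"]
train = [r for r in rows if r["year"] <= 2019]
test = [r for r in rows if r["year"] > 2019]
score = stacked_test_r2(train, test)
```

Because the meta-learner only combines base predictions, the base tree model can still be inspected on its own (for example with SHAP), which is consistent with the interpretability claim above.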
Citation impact
- Total citations: 13
- FWCI: 355.35
- Percentile: 100%
- References: 17
Authors: 3
Topics & keywords
Keywords
- Interpretability
- Census
- Random forest
- Workforce
- Population
- Ridge
- Geocoding
UN Sustainable Development Goals
- Decent work and economic growth