How much missing data is too much to impute for longitudinal health indicators? A preliminary guideline for the choice of the extent of missing proportion to impute with multiple imputation by chained equations
Post Graduate Institute of Medical Education and Research
Abstract
Multiple imputation by chained equations (MICE) is a widely used approach for handling missing data. However, its robustness, especially at high missing proportions in health indicators, is under-researched. The study aimed to provide a preliminary guideline for choosing how large a missing proportion can be imputed in longitudinal health-related data using the MICE method.
The study obtained complete data on five mortality-related health indicators for 100 countries (2015-2019) from the Global Health Observatory. Nine incomplete datasets with missing rates from 10% to 90% were generated and imputed using MICE. The robustness of MICE was assessed through three approaches: comparison of means using repeated-measures analysis of variance; estimation of evaluation metrics (root mean square error, mean absolute deviation, bias, and proportionate variance); and visual inspection of box plots of imputed and non-imputed data.
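The evaluation loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the helper names (`make_missing`, `mean_impute`, `evaluate`) are hypothetical, values are removed completely at random, and simple mean imputation stands in for MICE so that the example stays dependency-free. Proportionate variance is omitted because the abstract does not define it. The sketch only shows how nine incomplete datasets at 10% to 90% missingness might be generated and how RMSE, mean absolute deviation, and bias could be computed on the imputed cells.

```python
import math
import random


def make_missing(values, rate, seed=0):
    """Return a copy of values with round(rate * n) entries set to None.

    Assumption: values are removed completely at random.
    """
    rng = random.Random(seed)
    n_missing = round(rate * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def mean_impute(incomplete):
    """Stand-in imputer (NOT MICE): fill gaps with the mean of observed values."""
    observed = [v for v in incomplete if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in incomplete]


def evaluate(truth, incomplete, imputed):
    """Compute RMSE, mean absolute deviation, and bias on the imputed cells only."""
    errors = [imp - t for t, inc, imp in zip(truth, incomplete, imputed)
              if inc is None]
    n = len(errors)
    return {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mad": sum(abs(e) for e in errors) / n,
        "bias": sum(errors) / n,
    }


# Synthetic "complete" indicator: 500 values, mirroring 100 countries x 5 years.
data_rng = random.Random(42)
truth = [data_rng.gauss(50.0, 10.0) for _ in range(500)]

# Nine incomplete datasets with missing rates from 10% to 90%.
results = {}
for pct in range(10, 100, 10):
    incomplete = make_missing(truth, pct / 100, seed=pct)
    imputed = mean_impute(incomplete)
    results[pct] = evaluate(truth, incomplete, imputed)

for pct, m in results.items():
    print(f"{pct}% missing: RMSE={m['rmse']:.2f} "
          f"MAD={m['mad']:.2f} bias={m['bias']:.2f}")
```

To approximate the study's setup more closely, `mean_impute` would be replaced by a chained-equations imputer from a statistics library, with the amputation and metric steps left unchanged.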
Citation impact
- FWCI: 102.47
- Percentile: 100%
- References: 40
Authors
- K. P. Junaid (corresponding author)
Post Graduate Institute of Medical Education and Research
- Tanvi Kiran
Post Graduate Institute of Medical Education and Research
- Madhu Gupta
Post Graduate Institute of Medical Education and Research
- Kamal Kishore
Post Graduate Institute of Medical Education and Research
- Sujata Siwatch
Post Graduate Institute of Medical Education and Research
Topics & keywords
- Missing data
- Imputation (statistics)
- Medicine
- Biostatistics
- Health services research
- Public health
- Guideline
- Statistics