Article · Computer Methods and Programs in Biomedicine · Mar 31, 2022 · Hybrid OA

Diabetes mellitus prediction and diagnosis from a data preprocessing and machine learning perspective

University of the West of England · Bristol Robotics Laboratory

Indexed in: Crossref, PubMed

Abstract

Methods

This paper proposes a robust framework for building a diabetes prediction model to aid the clinical diagnosis of diabetes. The framework uses Spearman correlation for feature selection and polynomial regression for missing-value imputation, each applied in a way intended to strengthen its performance. Three supervised machine learning models are then proposed for classification: a random forest (RF), a support vector machine (SVM), and our twice-growth deep neural network (2GDNN). The models' hyperparameters are tuned using grid search with repeated stratified k-fold cross-validation, and the models are evaluated for their ability to scale to the prediction problem.
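
The three preprocessing and tuning steps named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the correlation threshold, the polynomial degree, the hyperparameter grid, or the fold counts, so every such value here is hypothetical, as are the helper names `impute_with_polynomial` and `select_by_spearman`. The data are synthetic, and a random forest stands in for the classifier because the 2GDNN architecture is not described in this abstract.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold


def impute_with_polynomial(target, predictor, degree=2):
    """Fill NaNs in `target` using a polynomial fitted on the observed rows."""
    target = np.asarray(target, dtype=float).copy()
    predictor = np.asarray(predictor, dtype=float)
    missing = np.isnan(target)
    if missing.any():
        coeffs = np.polyfit(predictor[~missing], target[~missing], deg=degree)
        target[missing] = np.polyval(coeffs, predictor[missing])
    return target


def select_by_spearman(X, y, threshold=0.1):
    """Return indices of columns whose |Spearman rho| with y meets the threshold."""
    keep = []
    for j in range(X.shape[1]):
        rho, _ = spearmanr(X[:, j], y)
        if abs(rho) >= threshold:
            keep.append(j)
    return keep


# Synthetic stand-in data (assumption: NOT the PIMA Indian or LMCH datasets).
rng = np.random.default_rng(0)
n = 200
x0 = rng.normal(size=n)                       # informative feature
x1 = x0 ** 2 + rng.normal(scale=0.1, size=n)  # depends on x0; gets missing values
x2 = rng.normal(size=n)                       # pure noise
y = (x0 + 0.3 * rng.normal(size=n) > 0).astype(int)

# Step 1: knock out some values, then impute them by polynomial regression on x0.
x1_missing = x1.copy()
x1_missing[rng.choice(n, size=20, replace=False)] = np.nan
x1_filled = impute_with_polynomial(x1_missing, x0, degree=2)

# Step 2: keep features whose rank correlation with the label is large enough.
X = np.column_stack([x0, x1_filled, x2])
selected = select_by_spearman(X, y, threshold=0.1)

# Step 3: grid search scored by repeated stratified k-fold cross-validation.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    scoring="accuracy",
    cv=cv,
)
search.fit(X[:, selected], y)
print(search.best_params_, round(search.best_score_, 3))
```

Stratified folds keep the class ratio roughly constant in each split, which matters for imbalanced clinical labels, and repeating the split reduces the variance of the score used to rank hyperparameter settings.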

Results

In experiments with the proposed 2GDNN model, precision, sensitivity, F1-score, train accuracy, and test accuracy reach 97.34%, 97.24%, 97.26%, 99.01%, and 97.25% on the PIMA Indian dataset and 97.28%, 97.33%, 97.27%, 99.57%, and 97.33% on the LMCH diabetes dataset.
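
For readers unfamiliar with the reported metrics, the sketch below shows how precision, sensitivity, F1-score, and accuracy follow from the four confusion-matrix counts of a binary classifier. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the abstract does not say whether its figures are per-class or averaged across classes.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)      # share of predicted positives that are correct
    sensitivity = tp / (tp + fn)    # share of actual positives that are found (recall)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Illustrative counts only (assumption: not results from the paper).
scores = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in scores.items():
    print(f"{name}: {value:.2%}")
```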

Citation impact

Total citations: 238
FWCI: 56.69
Percentile: 100%
References: 49

Authors

3

Topics & keywords

Keywords
  • Artificial intelligence
  • Machine learning
  • Random forest
  • Diabetes mellitus
  • Computer science
  • Missing data
  • Feature selection
  • Support vector machine