Article · BMC Medical Research Methodology · Dec 1, 2014 · Gold OA

Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints

Medisch Centrum Alkmaar · University of Toronto · +1 more institution

Indexed in: Crossref, DOAJ, PubMed

Abstract

Background

Modern modelling techniques may provide more accurate predictions of binary outcomes than classical techniques. We aimed to study the predictive performance of different modelling techniques in relation to the effective sample size ("data hungriness").

Methods

We performed simulation studies based on three clinical cohorts: 1282 patients with head and neck cancer (46.9% 5-year survival), 1731 patients with traumatic brain injury (22.3% 6-month mortality) and 3181 patients with minor head injury (7.6% with CT scan abnormalities). We compared three relatively modern modelling techniques, support vector machines (SVM), neural nets (NN) and random forests (RF), with two classical techniques, logistic regression (LR) and classification and regression trees (CART). We created three large artificial databases with 20-fold, 10-fold and 6-fold replication of subjects, in which we generated dichotomous outcomes according to different underlying models. We applied each modelling technique to increasingly large development parts (100 repetitions). The area under the ROC curve (AUC) indicated the performance of each model in the development part and in an independent validation part. Data hungriness was defined by plateauing of AUC and small optimism (difference between the mean apparent AUC and the mean validated AUC [...] 200 events per variable.
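The procedure described above (generate dichotomous outcomes from a known model, fit on development parts of increasing size, compare apparent and validated AUC) can be illustrated with a minimal sketch. It is not the authors' code. It assumes synthetic Gaussian predictors in place of the clinical cohorts, hypothetical true coefficients, and a plain gradient-descent logistic regression as the only technique. The function names `auc`, `fit_logistic` and `optimism_by_size` are made up for this example, and no optimism threshold is applied because none is stated in the text above.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear(w, x):
    # Linear predictor; w[0] is the intercept.
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def auc(scores, labels):
    # Area under the ROC curve via the rank-sum (Mann-Whitney) identity,
    # using average ranks so that tied scores count as 0.5.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs both outcome classes")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def fit_logistic(X, y, n_iter=100, lr=0.5):
    # Unpenalised logistic regression by batch gradient descent (illustrative only).
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(linear(w, xi)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def optimism_by_size(dev_sizes, n_rep=20, n_val=1000, seed=1):
    rng = random.Random(seed)
    true_w = [-1.0, 0.8, 0.5, 0.3]  # hypothetical underlying model (assumption)
    n_pred = len(true_w) - 1

    def draw(n):
        # Assumption: fresh Gaussian predictors instead of replicated cohort subjects.
        X = [[rng.gauss(0, 1) for _ in range(n_pred)] for _ in range(n)]
        y = [1 if rng.random() < sigmoid(linear(true_w, x)) else 0 for x in X]
        return X, y

    X_val, y_val = draw(n_val)  # independent validation part
    results = {}
    for n in dev_sizes:
        apparent, validated, events = [], [], []
        for _ in range(n_rep):
            X_dev, y_dev = draw(n)
            while len(set(y_dev)) < 2:  # redraw if only one outcome class occurs
                X_dev, y_dev = draw(n)
            w = fit_logistic(X_dev, y_dev)
            apparent.append(auc([linear(w, x) for x in X_dev], y_dev))
            validated.append(auc([linear(w, x) for x in X_val], y_val))
            events.append(sum(y_dev))
        mean_app = sum(apparent) / n_rep
        mean_val = sum(validated) / n_rep
        results[n] = {
            "epv": sum(events) / n_rep / n_pred,  # mean events per variable
            "apparent": mean_app,
            "validated": mean_val,
            "optimism": mean_app - mean_val,  # mean apparent minus mean validated AUC
        }
    return results


if __name__ == "__main__":
    for n, r in optimism_by_size([50, 200, 800], n_rep=10).items():
        print(n, {k: round(v, 3) for k, v in r.items()})
```

Reading the output by development size shows how apparent and validated AUC relate as events per variable grow. Swapping `fit_logistic` for another learner would be the natural way to compare techniques under the same scheme.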

Citation impact

  • Total citations: 789
  • FWCI: 5.80
  • Percentile: 100%
  • References: 34

Authors

3

Topics & keywords

Keywords
  • Support vector machine
  • Logistic regression
  • Statistics
  • Random forest
  • Sample size determination
  • Artificial intelligence
  • Receiver operating characteristic
  • Regression
UN Sustainable Development Goals
  • Good health and well-being