Article · Ecography · Dec 8, 2016 · Closed access

Cross‐validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure

University of Freiburg · Wright State University · +6 more institutions

Indexed in Crossref · DOAJ

Abstract

Ecological data often show temporal, spatial, hierarchical (random effects), or phylogenetic structure. Modern statistical approaches are increasingly accounting for such dependencies. However, when performing cross‐validation, these structures are regularly ignored, resulting in serious underestimation of predictive error. One cause for the poor performance of uncorrected (random) cross‐validation, noted often by modellers, is that dependence structures in the data persist as dependence structures in model residuals, violating the assumption of independence. Even more concerning, because often overlooked, is that structured data also provides ample opportunity for overfitting with non‐causal predictors. This…
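To make the contrast concrete, here is a minimal sketch of how fold assignment differs when structure is ignored versus respected. It is an illustration under stated assumptions, not the procedure from the paper: the data (6 sites with 5 observations each) and the helper names `random_folds`, `block_folds`, and `blocks_split_across_folds` are hypothetical, and holding out whole groups is only one common way to respect dependence structure.

```python
import random
from collections import defaultdict


def random_folds(n_samples, k, seed=0):
    """Assign each sample to one of k folds at random, ignoring any structure."""
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)
    fold_of = [0] * n_samples
    for position, sample in enumerate(order):
        fold_of[sample] = position % k
    return fold_of


def block_folds(block_ids, k):
    """Assign whole blocks to folds so that no block is split across folds."""
    blocks = sorted(set(block_ids))
    fold_of_block = {block: j % k for j, block in enumerate(blocks)}
    return [fold_of_block[block] for block in block_ids]


def blocks_split_across_folds(block_ids, fold_of):
    """Count blocks whose samples land in more than one fold."""
    folds_seen = defaultdict(set)
    for block, fold in zip(block_ids, fold_of):
        folds_seen[block].add(fold)
    return sum(1 for folds in folds_seen.values() if len(folds) > 1)


# Hypothetical data: 6 sites (blocks) with 5 observations each.
block_ids = [site for site in range(6) for _ in range(5)]

random_assignment = random_folds(len(block_ids), k=3)
blocked_assignment = block_folds(block_ids, k=3)

# With random folds, observations from one site usually end up in both
# training and test sets; with blocked folds this cannot happen.
n_split_random = blocks_split_across_folds(block_ids, random_assignment)
n_split_blocked = blocks_split_across_folds(block_ids, blocked_assignment)
print("sites split across folds (random): ", n_split_random)
print("sites split across folds (blocked):", n_split_blocked)
```

When a site is split across folds, a model can be tested on observations that are correlated with its training data, which is one way the predictive error ends up looking smaller than it is.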

Citation impact

  • Total citations: 2,317
  • FWCI: 82.06
  • Percentile: 100%
  • References: 106

Authors

14

Topics & keywords

Keywords
  • Overfitting
  • Random forest
  • Computer science
  • Cross-validation
  • Econometrics
  • Autoregressive model
  • Contrast (vision)
  • Extrapolation