An analysis of four missing data treatment methods for supervised learning
Indexed in: Crossref, DOAJ
Abstract
Missing data is a significant problem in data quality. Although the problem is both frequent and important, many machine learning algorithms handle missing data in a rather naive way. Missing data should be treated carefully; otherwise, bias might be introduced into the induced knowledge. In this work, we analyze the use of the k-nearest neighbor algorithm as an imputation method. Imputation denotes a procedure that replaces the missing values in a data set with plausible values. One advantage of this approach is that the missing data treatment is independent of the learning algorithm used. This allows the user to select the most suitable imputation method…
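As an illustration of the idea described in the abstract, the following is a minimal sketch of k-nearest neighbor imputation for numeric attributes. It is not taken from the paper: the function name `knn_impute`, the use of `None` to mark missing values, the Euclidean distance normalized over the attributes observed in both rows, and the use of the neighbors' mean as the imputed value are all assumptions made for this example.

```python
import math


def knn_impute(rows, k=3):
    """Return a copy of `rows` with None entries filled in.

    Hypothetical helper: each missing value is replaced by the mean of that
    attribute over the k nearest rows that have the attribute observed.
    """
    imputed = [list(r) for r in rows]  # work on a copy; input is not mutated
    for i, row in enumerate(rows):
        missing = [j for j, v in enumerate(row) if v is None]
        for j in missing:
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue  # a neighbor must supply a value for attribute j
                # Compare only attributes observed in both rows (assumption).
                shared = [
                    c for c in range(len(row))
                    if row[c] is not None and other[c] is not None
                ]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:  # leave the gap if no usable neighbor exists
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


data = [
    [1.0, 2.0],
    [1.1, None],   # second attribute is missing
    [5.0, 10.0],
    [1.2, 2.2],
]
filled = knn_impute(data, k=2)
print(filled[1])
```

Because the imputation runs as a preprocessing step, the completed data set can then be passed to any learning algorithm, which is the independence the abstract refers to. In practice a library implementation such as scikit-learn's `KNNImputer` may be preferable to a hand-written loop.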
Citation impact
- Total citations: 885
- FWCI: 16.86
- Percentile: 100%
- References: 10
Authors: 2
Topics & keywords
Keywords
- Missing data
- Imputation (statistics)
- Computer science
- Data mining
- k-nearest neighbors algorithm
- Data set
- Machine learning
- Artificial intelligence
UN Sustainable Development Goals
- Quality Education