Informative missingness and its implications in semi-supervised learning
Indexed in: arXiv, Crossref, DataCite
Abstract
<p>Semi-supervised learning (SSL) constructs classifiers using both labelled and unlabelled data. It leverages information from labelled samples, whose acquisition is often costly or labour-intensive, together with unlabelled data to enhance prediction performance. This defines an incomplete-data problem, which statistically can be formulated within the likelihood framework for finite mixture models that can be fitted using the expectation-maximisation (EM) algorithm. Ideally, one would prefer a completely labelled sample, as one would anticipate that a labelled observation provides more information than an unlabelled one. However, when the mechanism governing label absence depends on the observed…
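The mixture-model formulation mentioned in the abstract can be illustrated with a minimal sketch of EM for a univariate Gaussian mixture fitted to partially labelled data. This is not the authors' method: the function name `ssl_em_gmm`, the convention that `-1` marks an unlabelled observation, and the Gaussian component densities are all assumptions made for illustration. The sketch also ignores the missing-label mechanism entirely, so it is only appropriate if labels are missing completely at random, which is exactly the assumption the paper calls into question.

```python
import numpy as np

def ssl_em_gmm(x, labels, n_classes=2, n_iter=200, tol=1e-8):
    """EM for a univariate Gaussian mixture with partially observed labels.

    labels[i] is a class index in {0, ..., n_classes-1}, or -1 if unlabelled
    (hypothetical convention). Assumes every class has at least one labelled
    point and that the missingness mechanism can be ignored.
    """
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    n = x.size
    lab_idx = np.flatnonzero(labels >= 0)   # labelled observations
    unl_idx = np.flatnonzero(labels < 0)    # unlabelled observations

    # Responsibilities: fixed one-hot rows for labelled points.
    resp = np.full((n, n_classes), 1.0 / n_classes)
    resp[lab_idx] = np.eye(n_classes)[labels[lab_idx]]

    # Initialise parameters from the labelled subsample.
    pi = np.full(n_classes, 1.0 / n_classes)
    mu = np.array([x[labels == k].mean() for k in range(n_classes)])
    var = np.full(n_classes, x.var())

    prev_ll = -np.inf
    ll = prev_ll
    for _ in range(n_iter):
        # E-step: log(pi_k * f_k(x_i)) for every point and class.
        log_joint = (np.log(pi)
                     - 0.5 * np.log(2.0 * np.pi * var)
                     - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_marg = np.logaddexp.reduce(log_joint, axis=1)
        # Only unlabelled points get posterior class probabilities.
        resp[unl_idx] = np.exp(log_joint[unl_idx] - log_marg[unl_idx, None])

        # Observed-data log-likelihood at the current parameters:
        # labelled points contribute log(pi_k f_k), unlabelled the mixture.
        ll = log_joint[lab_idx, labels[lab_idx]].sum() + log_marg[unl_idx].sum()

        # M-step: weighted maximum-likelihood updates.
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return pi, mu, var, ll
```

Calling `ssl_em_gmm(x, labels)` returns the estimated mixing proportions, component means, component variances, and the last evaluated log-likelihood. Handling label absence that depends on the observed features would require an additional model for the missingness indicator, which this sketch deliberately leaves out.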
Citation impact
- Total citations: 5
- FWCI: 116.53
- Percentile: 100%
- References: 0
Authors
3
Topics & keywords
Topics
Keywords
- Missing data
- Classifier (UML)
- Inference
- Class (philosophy)
- Mechanism (biology)
- Statistical model
- Imputation (statistics)
- Statistical inference