Conditional likelihood maximisation: a unifying framework for information theoretic feature selection
Abstract
We present a unifying framework for information theoretic feature selection, bringing almost two decades of research on heuristic filter criteria under a single theoretical interpretation. This is in response to the question: “what are the implicit statistical assumptions of feature selection criteria based on mutual information?” To answer this, we adopt a different strategy than is usual in the feature selection literature: instead of trying to define a criterion, we derive one, directly from a clearly specified objective function: the conditional likelihood of the training labels. While many hand-designed heuristic criteria try to optimize a definition of feature ‘relevancy’ and ‘redundancy’, our approach…
Citation impact
- Total citations: 1,057
- FWCI: 58.84
- Percentile: 100%
- References: 39
Authors
4
Topics & keywords
Topics
Keywords
- Feature selection
- Heuristics
- Computer science
- Heuristic
- Markov blanket
- Mutual information
- Feature (linguistics)
- Redundancy (engineering)