Feature selection for high-dimensional data: a fast correlation-based filter solution

Yu, Lei; Liu, Huan

Article · Aug 21, 2003 · Closed access

Abstract

Feature selection, as a preprocessing step to machine learning, has been effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving comprehensibility. However, the recent increase in the dimensionality of data poses a severe challenge to many existing feature selection methods with respect to efficiency and effectiveness. In this work, we introduce a novel concept, predominant correlation, and propose a fast filter method which can identify relevant features as well as redundancy among relevant features without pairwise correlation analysis. The efficiency and effectiveness of our method are demonstrated through extensive comparisons with other methods using…
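
The idea in the abstract, keeping features that correlate with the class and dropping those made redundant by a stronger feature, can be illustrated with a minimal sketch. The abstract does not say which correlation measure is used, so this sketch assumes symmetrical uncertainty over discrete features. The function names, the `threshold` parameter, and the toy data are hypothetical and are not taken from the record above.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(xs, ys):
    """Assumed correlation measure: 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(xs, ys)))
    return 2.0 * mutual_info / (hx + hy)


def correlation_filter(features, labels, threshold=0.0):
    """Return feature names that are relevant and not redundant.

    features: dict mapping a feature name to a list of discrete values.
    threshold: hypothetical relevance cut-off; features at or below it are dropped.
    """
    # Step 1: score each feature by its correlation with the class.
    relevance = {
        name: symmetrical_uncertainty(column, labels)
        for name, column in features.items()
    }
    ranked = sorted(
        (name for name in relevance if relevance[name] > threshold),
        key=lambda name: relevance[name],
        reverse=True,
    )

    # Step 2: walk the ranking; a candidate is compared only against
    # features already kept, not against every other feature.
    selected = []
    for name in ranked:
        redundant = any(
            symmetrical_uncertainty(features[kept], features[name]) >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Toy data (made up for illustration only).
labels = [0, 0, 1, 1, 0, 1, 1, 0]
features = {
    "signal": list(labels),             # perfectly predicts the class
    "signal_copy": list(labels),        # duplicates "signal"
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the class
}
selected = correlation_filter(features, labels)
print(selected)
```

In this toy run the duplicated feature is discarded because its correlation with the already kept feature is at least as high as its correlation with the class, and the independent feature is discarded for having no measured relevance. Continuous features would need to be discretized before using a sketch like this one.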

Citation impact

Total citations: 2,213

FWCI: 16.43
Percentile: 100%
References: 26

Authors: 2

Topics & keywords

Keywords

Feature selection
Computer science
Minimum redundancy feature selection
Curse of dimensionality
Pairwise comparison
Redundancy (engineering)
Preprocessor
Artificial intelligence