Article · Jul 28, 2003 · Closed access
Document clustering based on non-negative matrix factorization
Indexed in Crossref
Abstract
In this paper, we propose a novel document clustering method based on the non-negative factorization of the term-document matrix of the given document corpus. In the latent semantic space derived by the non-negative matrix factorization (NMF), each axis captures the base topic of a particular document cluster, and each document is represented as an additive combination of the base topics. The cluster membership of each document can be easily determined by finding the base topic (the axis) with which the document has the largest projection value. Our experimental evaluations show that the proposed document clustering method surpasses the latent semantic indexing and the spectral clustering methods not only in…
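The idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption rather than the authors' implementation: the abstract does not name a solver, so the sketch uses the standard multiplicative update rules for the Frobenius-norm objective; the function name `nmf_cluster`, the column rescaling step, and the toy term-document matrix `X` are hypothetical and chosen only for illustration.

```python
import numpy as np

def nmf_cluster(X, k, n_iter=500, seed=0, eps=1e-9):
    """Factor X (terms x documents) as U @ V.T with non-negative U and V,
    then label each document by its largest coordinate in V.

    Illustrative sketch only; the solver and scaling are assumptions.
    """
    rng = np.random.default_rng(seed)
    n_terms, n_docs = X.shape
    U = rng.random((n_terms, k)) + eps   # columns act as base topics
    V = rng.random((n_docs, k)) + eps    # rows are document coordinates
    for _ in range(n_iter):
        # Multiplicative updates for minimizing ||X - U V^T||_F^2;
        # they keep every entry of U and V non-negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    # Rescale so each topic vector has unit Euclidean length, moving the
    # scale into V so that the product U @ V.T is unchanged.
    norms = np.linalg.norm(U, axis=0) + eps
    U /= norms
    V *= norms
    # Cluster membership: the axis with the largest projection value.
    labels = V.argmax(axis=1)
    return U, V, labels

# Hypothetical toy corpus: rows are terms, columns are documents.
# Documents 0-2 use only terms 0-2; documents 3-5 use only terms 3-5.
X = np.array([
    [3, 2, 4, 0, 0, 0],
    [2, 3, 3, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 4, 3, 2],
    [0, 0, 0, 2, 2, 3],
    [0, 0, 0, 3, 4, 1],
], dtype=float)

U, V, labels = nmf_cluster(X, k=2)
print(labels)
```

Because both factors are non-negative, each document is a purely additive mix of the topic columns in `U`, which is what makes the `argmax` over a row of `V` a natural cluster assignment. The cluster numbering itself is arbitrary and depends on the random initialization.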
Citation impact
1,866 total citations
- FWCI: 18.32
- Percentile: 100%
- References: 18
Authors
3
Topics & keywords
Keywords
- Document clustering
- Cluster analysis
- Non-negative matrix factorization
- Computer science
- Matrix decomposition
- Information retrieval
- Artificial intelligence
- Data mining