Article
May 1, 2003
Closed access
Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data
Indexed in Crossref
Abstract
Finding clusters in data, especially high dimensional data, is challenging when the clusters are of widely differing shapes, sizes, and densities, and when the data contains noise and outliers. We present a novel clustering technique that addresses these issues. Our algorithm first finds the nearest neighbors of each data point and then redefines the similarity between pairs of points in terms of how many nearest neighbors the two points share. Using this definition of similarity, our algorithm identifies core points and then builds clusters around the core points. The use of a shared nearest neighbor definition of similarity alleviates problems with varying densities and high dimensionality, while the use of…
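The steps named in the abstract (nearest-neighbor lists, shared nearest neighbor similarity, core points, clusters grown around core points) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract is truncated, so the function names and the parameters `k`, `eps`, and `min_pts` are assumptions introduced here, and the DBSCAN-style expansion is one plausible reading of "builds clusters around the core points". The sketch also uses brute-force distances and does not require points to appear in each other's neighbor lists, which a fuller treatment might.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (self excluded)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def snn_similarity(nn):
    """sim[i, j] = number of nearest neighbors that points i and j share."""
    n = nn.shape[0]
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], nn] = 1  # member[i, j] = 1 if j is in i's list
    sim = member @ member.T
    np.fill_diagonal(sim, 0)
    return sim


def snn_cluster(X, k=5, eps=2, min_pts=3):
    """Hypothetical SNN-style clustering; returns labels, with -1 for unassigned."""
    sim = snn_similarity(knn_indices(X, k))
    strong = sim >= eps                    # pairs that share enough neighbors
    core = strong.sum(axis=1) >= min_pts   # assumed core-point criterion
    labels = np.full(len(X), -1)
    cluster = 0
    # Grow one cluster per connected group of core points.
    for seed in np.flatnonzero(core):
        if labels[seed] != -1:
            continue
        labels[seed] = cluster
        stack = [seed]
        while stack:
            p = stack.pop()
            for q in np.flatnonzero(strong[p] & core):
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    # Attach each non-core point to its most similar core point, if any.
    for i in np.flatnonzero(~core):
        candidates = np.flatnonzero(strong[i] & core)
        if candidates.size:
            labels[i] = labels[candidates[np.argmax(sim[i, candidates])]]
    return labels
```

Calling `snn_cluster(X, k=8, eps=5, min_pts=3)` on an array `X` of shape `(n_points, n_features)` returns one integer label per point. Points left at `-1` are those that are neither core points nor strongly similar to one, which is how this sketch treats noise and outliers.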
Citation impact
- Total citations: 706
- FWCI: 25.31
- Percentile: 100%
- References: 34
Authors: 3
Topics & keywords
Topics
Keywords
- DBSCAN
- Cluster analysis
- Computer science
- Curse of dimensionality
- Similarity (geometry)
- Data point
- Outlier
- Data mining