Fast R Functions for Robust Correlations and Hierarchical Clustering

Langfelder, Peter; Horvath, Steve

doi:10.18637/jss.v046.i11

Article · Journal of Statistical Software · Jan 1, 2012 · Diamond OA

Indexed in: Crossref, DOAJ

Abstract

Many high-throughput biological data analyses require the calculation of large correlation matrices and/or clustering of a large number of objects. The standard R function for calculating Pearson correlation handles data without missing values efficiently, but is inefficient when applied to data sets with a relatively small number of missing values. We present an implementation of Pearson correlation calculation that can lead to substantial speedup on data with a relatively small number of missing entries. Further, we parallelize all calculations and thus achieve further speedup on systems where parallel processing is available. A robust correlation measure, the biweight midcorrelation, is implemented…
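To make the missing-data problem concrete, here is a minimal sketch of a pairwise-complete Pearson correlation, roughly what R's `cor(x, y, use = "pairwise.complete.obs")` computes for two vectors. It is written in plain Python for illustration only and is not the paper's implementation. The names `pearson_pairwise` and `_is_missing` are hypothetical, and representing missing entries as `None` or NaN is an assumption.

```python
import math


def _is_missing(value):
    """Treat None and NaN as missing entries (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def pearson_pairwise(x, y):
    """Pearson correlation using only positions where both x and y are observed."""
    # Drop every position where either value is missing.
    pairs = [(a, b) for a, b in zip(x, y)
             if not _is_missing(a) and not _is_missing(b)]
    n = len(pairs)
    if n < 2:
        return float("nan")  # not enough complete pairs

    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n

    # Centered sums of squares and cross-products over the complete pairs.
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)

    if sxx == 0 or syy == 0:
        return float("nan")  # a constant vector has undefined correlation
    return sxy / math.sqrt(sxx * syy)


# Only positions 0, 1, and 4 are complete in both vectors.
x = [1.0, 2.0, None, 4.0, 5.0]
y = [2.0, 4.0, 6.0, float("nan"), 10.0]
r = pearson_pairwise(x, y)
```

Because the set of complete pairs can differ for every pair of variables, the means and variances have to be recomputed pair by pair. This is one plausible reason why even a few missing entries prevent a correlation matrix from being computed as a single fast matrix product.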

Citation impact

Total citations: 1,214

FWCI: 3.36
Percentile: 100%
References: 0

Authors: 2

Topics & keywords

Keywords

Speedup
Cluster analysis
Computer science
Hierarchical clustering
Data mining
Measure (data warehouse)
Correlation
Pearson product-moment correlation coefficient