Article | Dec 1, 2009 | Closed access

PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations

Carnegie Mellon University

Indexed in Crossref

Abstract

In this paper, we describe PEGASUS, an open source peta-scale graph mining library which performs typical graph mining tasks such as computing the diameter of the graph, computing the radius of each node, and finding the connected components. As the size of graphs reaches several giga-, tera- or peta-bytes, the necessity for such a library grows too. To the best of our knowledge, PEGASUS is the first such library, implemented on top of the HADOOP platform, the open source version of MAPREDUCE. Many graph mining operations (PageRank, spectral clustering, diameter estimation, connected components, etc.) are essentially a repeated matrix-vector multiplication. In this paper we describe a very important primitive for…
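
The claim that several graph mining operations reduce to repeated matrix-vector multiplication can be illustrated with a small single-machine sketch. It is not the PEGASUS implementation and does not use HADOOP; the helper names (`generalized_matvec`, `connected_components`, `pagerank`) and the edge-list representation are assumptions made for illustration only. The same loop over matrix entries serves both tasks; only the "multiply" and "sum" operators change.

```python
def generalized_matvec(edges, vec, multiply, reduce_fn):
    """One matrix-vector product over a sparse edge list.

    edges: iterable of (dst, src, weight), i.e. matrix entry M[dst][src] = weight.
    multiply / reduce_fn: stand-ins for the usual '*' and '+' (hypothetical names).
    Returns a dict mapping each dst that received at least one term to its value.
    """
    out = {}
    for dst, src, weight in edges:
        term = multiply(weight, vec[src])
        out[dst] = reduce_fn(out[dst], term) if dst in out else term
    return out


def connected_components(n, undirected_edges):
    """Label each node with the smallest node id in its component."""
    edges = []
    for u, v in undirected_edges:
        edges.append((u, v, 1))  # store both directions of each undirected edge
        edges.append((v, u, 1))
    labels = list(range(n))
    while True:
        # 'multiply' passes the neighbour's label through, 'sum' becomes min
        prod = generalized_matvec(edges, labels, lambda w, x: x, min)
        new_labels = [min(labels[i], prod.get(i, labels[i])) for i in range(n)]
        if new_labels == labels:  # converged: no label changed
            return labels
        labels = new_labels


def pagerank(n, directed_edges, damping=0.85, iters=50):
    """Power iteration; assumes every node has at least one outgoing edge."""
    out_deg = [0] * n
    for u, _ in directed_edges:
        out_deg[u] += 1
    # column-normalised adjacency: rank flows from u to v with weight 1/out_deg[u]
    edges = [(v, u, 1.0 / out_deg[u]) for u, v in directed_edges]
    rank = [1.0 / n] * n
    for _ in range(iters):
        prod = generalized_matvec(
            edges, rank, lambda w, x: w * x, lambda a, b: a + b
        )
        rank = [(1 - damping) / n + damping * prod.get(i, 0.0) for i in range(n)]
    return rank


if __name__ == "__main__":
    print(connected_components(5, [(0, 1), (1, 2), (3, 4)]))
    print(pagerank(3, [(0, 1), (1, 2), (2, 0)]))
```

In this sketch each iteration touches every edge exactly once, which is why the pattern maps naturally onto a distributed pass over an edge file; how PEGASUS actually organizes that pass is described in the paper itself, not here.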

Citation impact

Total citations: 637
FWCI: 40.02
Percentile: 100%
References: 51

Authors

3

Topics & keywords

Keywords
  • Computer science
  • Tera-
  • PageRank
  • Graph
  • Parallel computing
  • Multiplication (music)
  • Matrix multiplication
  • Cluster analysis

Funding