Article | Bioinformatics | Oct 11, 2012 | Bronze OA

A high-performance computing toolset for relatedness and principal component analysis of SNP data

University of Washington

PubMed
Indexed in: Crossref, DOAJ, PubMed

Abstract

Summary: Genome-wide association studies are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed gdsfmt and SNPRelate (R packages for multi-core symmetric multiprocessing computer architectures) to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent measures. The kernels of our algorithms are written in C/C++ and highly optimized. Benchmarks show the uniprocessor implementations of PCA and identity-by-descent are ∼8–50 times faster than the implementations provided in the popular EIGENSTRAT (v3.0) and PLINK (v1.07) programs, respectively, and can be…
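To make the PCA computation named in the summary concrete, the following is a minimal pure-Python sketch of PCA on a small genotype matrix. It is an illustration only and not the SNPRelate implementation, whose kernels are optimized C/C++. The function names (`standardize`, `genetic_covariance`, `top_component`), the toy genotype matrix, the allele-frequency standardization, and the use of power iteration are all assumptions chosen for readability.

```python
import math
import random


def standardize(genotypes):
    """Return SNP columns scaled as (g - 2p) / sqrt(2p(1 - p)).

    genotypes: list of samples, each a list of allele counts (0, 1 or 2).
    p is the sample allele frequency of each SNP (an assumed, commonly
    used standardization; not taken from the paper).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    cols = []
    for j in range(m):
        p = sum(row[j] for row in genotypes) / (2.0 * n)
        if p <= 0.0 or p >= 1.0:
            continue  # monomorphic SNPs carry no information; skip them
        scale = math.sqrt(2.0 * p * (1.0 - p))
        cols.append([(row[j] - 2.0 * p) / scale for row in genotypes])
    return cols


def genetic_covariance(cols, n):
    """Sample-by-sample covariance matrix averaged over the SNP columns."""
    m = len(cols)
    cov = [[0.0] * n for _ in range(n)]
    for col in cols:
        for a in range(n):
            for b in range(a, n):
                cov[a][b] += col[a] * col[b]
    for a in range(n):
        for b in range(a, n):
            cov[a][b] /= m
            cov[b][a] = cov[a][b]  # fill the lower triangle by symmetry
    return cov


def top_component(cov, iters=500, seed=0):
    """Approximate the leading eigenvector of cov by power iteration."""
    rng = random.Random(seed)
    n = len(cov)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical toy data: 6 samples x 6 SNPs, two groups of three samples.
genotypes = [
    [0, 0, 1, 2, 2, 0],
    [0, 1, 0, 2, 2, 1],
    [0, 0, 0, 2, 1, 0],
    [2, 2, 1, 0, 0, 1],
    [2, 1, 2, 0, 0, 0],
    [2, 2, 2, 0, 1, 1],
]

cols = standardize(genotypes)
cov = genetic_covariance(cols, len(genotypes))
pc = top_component(cov)
print([round(x, 3) for x in pc])
```

In this toy matrix the first three and the last three samples sum to opposite signs on the first component, which is the kind of group structure PCA is used to expose in SNP data. The covariance loop costs on the order of samples squared times SNPs, which suggests why optimized and parallel kernels matter at genome-wide scale.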

Citation impact

Total citations: 2,743
FWCI: 14.92
Percentile: 100%
References: 14

Authors

6

Topics & keywords

Keywords
  • Principal component analysis
  • Computer science
  • Uniprocessor system
  • Implementation
  • Computation
  • Component (thermodynamics)
  • Data mining
  • Parallel computing