Article · BMC Bioinformatics · Jul 9, 2013 · Gold OA

TCC: an R package for comparing tag count data with robust normalization strategies

The University of Tokyo · Kanazawa University · +1 more institution

Indexed in: Crossref, DOAJ, PubMed

Abstract

Background

Differential expression analysis based on "next-generation" sequencing technologies is a fundamental means of studying RNA expression. We recently developed a multi-step normalization method (called TbT) for two-group RNA-seq data with replicates and demonstrated that the statistical methods available in four R packages (edgeR, DESeq, baySeq, and NBPSeq), together with TbT, can produce a well-ranked gene list in which true differentially expressed genes (DEGs) are top-ranked and non-DEGs are bottom-ranked. However, the advantages of the current TbT method come at the cost of a long computation time. Moreover, these R packages did not have normalization methods based on such a multi-step strategy.

Results

TCC (an acronym for Tag Count Comparison) is an R package that provides a series of functions for differential expression analysis of tag count data. The package incorporates multi-step normalization methods, whose strategy is to remove potential DEGs before performing the data normalization. The normalization function based on this DEG elimination strategy (DEGES) includes (i) the original TbT method based on DEGES for two-group data with or without replicates, (ii) much faster methods for two-group data with or without replicates, and (iii) methods for multi-group comparison. TCC provides a simple unified interface to perform such analyses with combinations of functions provided by edgeR, DESeq, and baySeq. Additionally, TCC provides a function for generating simulation data under various conditions, as well as alternative DEGES procedures built from functions in the existing packages. Bioinformatics scientists can use TCC to evaluate their methods, and biologists familiar with other R packages can easily learn what is done in TCC.
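The DEG elimination strategy described above can be illustrated with a minimal sketch. This is not TCC's API or its actual algorithm: TCC is an R package that combines normalization and testing functions from edgeR, DESeq, and baySeq, whereas the sketch below swaps in plain library-size scaling for the normalization step and a simple log2 fold-change cutoff for the DEG identification step. The function names (`size_factors`, `putative_degs`, `deges_factors`) and the `lfc_cutoff` parameter are hypothetical, chosen only for illustration, and the sketch assumes two-group data with nonzero library sizes.

```python
from math import log2


def size_factors(counts, keep=None):
    """Library-size scaling factors (mean 1) over the genes listed in `keep`.

    counts: list of gene rows, each a list of per-sample counts.
    Stand-in for the normalization step; not the method TCC uses.
    """
    rows = counts if keep is None else [counts[g] for g in keep]
    totals = [sum(col) for col in zip(*rows)]
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]


def putative_degs(counts, group, factors, lfc_cutoff=1.0):
    """Flag genes whose normalized group means differ by >= lfc_cutoff (log2).

    Stand-in for a statistical test; assumes exactly two group labels.
    """
    label_a, label_b = sorted(set(group))
    degs = set()
    for g, row in enumerate(counts):
        norm = [c / f for c, f in zip(row, factors)]
        mean_a = sum(x for x, lab in zip(norm, group) if lab == label_a) / group.count(label_a)
        mean_b = sum(x for x, lab in zip(norm, group) if lab == label_b) / group.count(label_b)
        # Add 1 to both means so that zero counts do not break the logarithm.
        if abs(log2((mean_a + 1) / (mean_b + 1))) >= lfc_cutoff:
            degs.add(g)
    return degs


def deges_factors(counts, group, iterations=1, lfc_cutoff=1.0):
    """Normalize, drop putative DEGs, then re-normalize on the remaining genes."""
    factors = size_factors(counts)
    for _ in range(iterations):
        degs = putative_degs(counts, group, factors, lfc_cutoff)
        keep = [g for g in range(len(counts)) if g not in degs]
        if not keep:  # nothing left to normalize on; keep the previous factors
            break
        factors = size_factors(counts, keep)
    return factors


# Toy data: 8 unchanged genes and 2 genes that are 4x higher in group 2.
counts = [[100, 100, 100, 100]] * 8 + [[100, 100, 400, 400]] * 2
group = [1, 1, 2, 2]
naive = size_factors(counts)            # skewed by the two up-regulated genes
refined = deges_factors(counts, group)  # recomputed after removing them
```

In this toy example the naive factors differ between the two groups only because of the two up-regulated genes, while the factors recomputed on the remaining genes are equal across samples. The sketch also hints at a limitation of such a crude stand-in: if the first-pass normalization is distorted enough, the cutoff can flag the wrong genes, which is one reason a real implementation relies on more robust normalization and proper statistical tests.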

Citation impact

Total citations: 677
FWCI: 7.33
Percentile: 100%
References: 28

Authors

4

Topics & keywords

Keywords
  • Normalization (sociology)
  • R package
  • Database normalization
  • Count data
  • Computer science
  • Data mining
  • Bioconductor
  • DNA microarray

Funding