Article · Bioinformatics · Jul 9, 2019 · Green OA

Cooler: scalable storage for Hi-C data and other genomically labeled arrays

Massachusetts Institute of Technology

Indexed in: Crossref, DOAJ

Abstract

Motivation: Most existing coverage-based (epi)genomic datasets are one-dimensional, but newer technologies probing interactions (physical, genetic, etc.) produce quantitative maps with two-dimensional genomic coordinate systems. Storage and computational costs mount sharply with data resolution when such maps are stored in dense form. Hence, there is a pressing need to develop data storage strategies that handle the full range of useful resolutions in multidimensional genomic datasets by taking advantage of their sparse nature, while supporting efficient compression and providing fast random access to facilitate development of scalable algorithms for data analysis.

Results: We developed a file format…
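The cost argument in the Motivation can be made concrete with a small sketch. It is only an illustration of why dense storage scales poorly and how a sparse, upper-triangle list of nonzero cells avoids that; it is not the file format described in the paper. The genome length, the bin sizes, the toy matrix, and the helper names `dense_cells` and `to_sparse_upper` are all assumptions introduced for this example.

```python
def dense_cells(genome_length_bp: int, bin_size_bp: int) -> int:
    """Number of cells in a dense square matrix over fixed-size genomic bins."""
    n_bins = -(-genome_length_bp // bin_size_bp)  # integer ceiling division
    return n_bins * n_bins


def to_sparse_upper(dense):
    """Keep only the nonzero upper-triangle cells of a symmetric matrix
    as (row_bin, col_bin, value) triples."""
    return [
        (i, j, value)
        for i, row in enumerate(dense)
        for j, value in enumerate(row)
        if j >= i and value != 0
    ]


# Illustrative genome length (roughly human-sized); an assumption, not a figure from the paper.
GENOME_LENGTH_BP = 3_000_000_000

# A 10x finer bin size means 100x more dense cells.
for bin_size in (1_000_000, 100_000, 10_000, 1_000):
    cells = dense_cells(GENOME_LENGTH_BP, bin_size)
    print(f"{bin_size:>9} bp bins -> {cells:,} dense cells")

# Hypothetical symmetric toy map in which most cells are zero.
toy = [
    [5, 2, 0, 0],
    [2, 7, 1, 0],
    [0, 1, 4, 0],
    [0, 0, 0, 3],
]
pixels = to_sparse_upper(toy)
print(len(pixels), "stored cells instead of", len(toy) ** 2)
```

The dense cell count grows with the square of the number of bins, while the sparse list grows only with the number of nonzero cells actually observed, which is the property the abstract refers to as the sparse nature of these maps.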

Citation impact

Total citations: 1,827
FWCI: 40.48
Percentile: 100%
References: 26

Authors

2

Topics & keywords

Keywords
  • Scalability
  • Computer science
  • Computer data storage
  • Operating system
  • Database

Funding