Reactome pathway analysis: a high-performance in-memory approach
Open Targets · European Bioinformatics Institute · +8 more institutions
Abstract
Reactome aims to provide bioinformatics tools for visualisation, interpretation and analysis of pathway knowledge to support basic research, genome analysis, modelling, systems biology and education. Pathway analysis methods have a broad range of applications in physiological and biomedical research; from a performance point of view, one of the main problems for these methods is the constantly increasing size of the data samples.
Here, we present a new high-performance in-memory implementation of the well-established over-representation analysis method. To achieve this, the over-representation analysis method is divided into four steps and, for each of them, specific data structures are used to improve performance and minimise the memory footprint. The first step, finding out whether an identifier in the user's sample corresponds to an entity in Reactome, is addressed using a radix tree as a lookup table. The second step, modelling the proteins, chemicals, their orthologues in other species and their composition in complexes and sets, is addressed with a graph. The third and fourth steps, which aggregate the results and calculate the statistics, are solved with a double-linked tree.
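The four steps can be illustrated with a minimal Python sketch. It is not the Reactome implementation: a plain character trie stands in for the radix tree, a dictionary from entity to pathways stands in for the graph, a child-to-parent map stands in for the double-linked tree, and the hypergeometric tail probability is assumed as the statistic because the abstract does not name one. All identifiers, entities and pathway names are hypothetical toy data.

```python
from math import comb


class Trie:
    """Plain character trie (a radix tree would also merge single-child chains)."""

    def __init__(self):
        self.children = {}
        self.entity = None  # set only on nodes that end a known identifier

    def insert(self, identifier, entity):
        node = self
        for ch in identifier:
            node = node.children.setdefault(ch, Trie())
        node.entity = entity

    def lookup(self, identifier):
        node = self
        for ch in identifier:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.entity


def with_ancestors(pathway, parent):
    """Yield a pathway, then every ancestor above it in the hierarchy."""
    while pathway is not None:
        yield pathway
        pathway = parent.get(pathway)


def hypergeom_tail(k, N, K, n):
    """P(X >= k) for n draws without replacement from N items, K of them in the pathway."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def analyse(sample, index, entity_pathways, parent):
    background = len(entity_pathways)  # assumed background: all known entities
    # Step 1: keep only identifiers that resolve to a known entity.
    found = {index.lookup(i) for i in sample} - {None}
    # Steps 2 and 3: map entities to pathways and aggregate up the hierarchy.
    members = {}
    for entity, pathways in entity_pathways.items():
        for leaf in pathways:
            for pathway in with_ancestors(leaf, parent):
                members.setdefault(pathway, set()).add(entity)
    # Step 4: one p-value per pathway that contains at least one sample entity.
    result = {}
    for pathway, entities in members.items():
        hits = entities & found
        if hits:
            result[pathway] = hypergeom_tail(len(hits), background, len(entities), len(found))
    return result


# Hypothetical toy data: six entities, three leaf pathways, one parent pathway.
index = Trie()
for name in "ABCDEF":
    index.insert("ID-" + name, name)
entity_pathways = {"A": {"P1"}, "B": {"P1"}, "C": {"P2"},
                   "D": {"P3"}, "E": {"P3"}, "F": {"P3"}}
parent = {"P1": "Top", "P2": "Top"}  # child pathway -> parent pathway

result = analyse(["ID-A", "ID-B", "UNKNOWN"], index, entity_pathways, parent)
for pathway in sorted(result):
    print(pathway, result[pathway])
```

The sketch rebuilds pathway membership on every call for brevity; an in-memory service would build the lookup, the mapping and the hierarchy once and reuse them across requests.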
Citation impact
- FWCI: 20.63
- Percentile: 100%
- References: 23
Authors (9)
- Antonio Fabregat
Open Targets, European Bioinformatics Institute
- Konstantinos Sidiropoulos
European Bioinformatics Institute
- Guilherme Viteri
European Bioinformatics Institute
- Oscar Forner
European Bioinformatics Institute
- Pablo Marín-García
Universitat de València, Instituto de Medicina Genómica, INCLIVA Health Research Institute
Topics & keywords
- Computer science
- DNA microarray
- Computational biology
- Biology
- Genetics
- Gene
- Gene expression