Identification of mobile genetic elements with geNomad
Lawrence Berkeley National Laboratory · Joint Genome Institute · Los Alamos National Laboratory
Abstract
Identifying and characterizing mobile genetic elements in sequencing data is essential for understanding their diversity, ecology, biotechnological applications and impact on public health. Here we introduce geNomad, a classification and annotation framework that combines information from gene content and a deep neural network to identify sequences of plasmids and viruses. geNomad uses a dataset of more than 200,000 marker protein profiles to provide functional gene annotation and taxonomic assignment of viral genomes. Using a conditional random field model, geNomad also detects proviruses integrated into host genomes with high precision. In benchmarks, geNomad achieved high classification performance for…
Citation impact
- FWCI: 190.24
- Percentile: 100%
- References: 90
Authors (9)
- Antônio Pedro Camargo (corresponding author)
Lawrence Berkeley National Laboratory, Joint Genome Institute
- Simon Roux
Lawrence Berkeley National Laboratory, Joint Genome Institute
- Frederik Schulz
Lawrence Berkeley National Laboratory, Joint Genome Institute
- Michal Babinski
Los Alamos National Laboratory
- Yan Xu
Los Alamos National Laboratory
Topics & keywords
- Annotation
- Mobile genetic elements
- Genome
- Identification (biology)
- Scalability
- Computational biology
- Biology
- Computer science
Funding
- U.S. Department of Energy. Awards: -AC02-05CH11231, AC05-00OR22725, 76RL01830, AC05-76RL01830, 05CH11231, DE-AC05-76RL01830, AC02-05CH11231, DE-AC02, 17-SC-20-SC, DE-AC02-05CH11231, DE-AC05, 89233218CNA000001, 00OR22725, DE-AC02-
- Joint Genome Institute. Awards: DE-AC02-05CH11231, AC02-05CH11231
- Office of Science. Awards: DE-AC05-00OR22725, 89233218CNA000001, AC02-05CH11231, -AC02-05CH11231, DE-AC02, DE-AC05-76RL01830, 17-SC-20-SC, AC05-00OR22725
- National Nuclear Security Administration. Awards: AC02-05CH11231, DE-AC02-05CH11231, 17-SC-20-SC, DE-AC05-00OR22725, DE-AC05-76RL01830, 89233218CNA000001
- Biological and Environmental Research. Awards: 05CH11231, DE-AC05-00OR22725, AC05-76RL01830, 00OR22725, 89233218CNA000001, 76RL01830, DE-AC05-76RL01830, DE-AC02-05CH11231, AC02-05CH11231