Rapid and sensitive detection of genome contamination at scale with FCS-GX
National Institutes of Health · National Center for Biotechnology Information
Abstract
Assembled genome sequences are being generated at an exponential rate. Here we present FCS-GX, part of NCBI's Foreign Contamination Screen (FCS) tool suite, optimized to identify and remove contaminant sequences in new genomes. FCS-GX screens most genomes in 0.1-10 min. Testing FCS-GX on artificially fragmented genomes demonstrates high sensitivity and specificity for diverse contaminant species. We used FCS-GX to screen 1.6 million GenBank assemblies and identified 36.8 Gbp of contamination, comprising 0.16% of total bases, with half from 161 assemblies. We updated assemblies in NCBI RefSeq to reduce detected contamination to 0.01% of bases. FCS-GX is available at https://github.com/ncbi/fcs/ or…
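The scale figures in the abstract can be cross-checked with simple arithmetic. The sketch below is not part of FCS-GX; it only recombines the rounded numbers quoted above (36.8 Gbp, 0.16%, 161 assemblies). The variable and function names are illustrative, and treating "half" as exactly 50% is an assumption, so the results are approximate.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# All inputs are the rounded values from the abstract; names are illustrative.

CONTAMINATION_BP = 36.8e9        # 36.8 Gbp of detected contamination
CONTAMINATION_FRACTION = 0.0016  # 0.16% of total screened bases
TOP_ASSEMBLIES = 161             # assemblies holding about half the contamination


def implied_total_bases(contam_bp: float, fraction: float) -> float:
    """Total screened bases implied by a contamination amount and its share."""
    return contam_bp / fraction


total_bp = implied_total_bases(CONTAMINATION_BP, CONTAMINATION_FRACTION)

# Assumption: "half" is taken as exactly 50% of the contamination.
half_bp = CONTAMINATION_BP / 2
mean_per_top_assembly = half_bp / TOP_ASSEMBLIES

print(f"Implied total bases screened: {total_bp / 1e12:.1f} Tbp")
print(f"Mean contamination in the top {TOP_ASSEMBLIES} assemblies: "
      f"{mean_per_top_assembly / 1e6:.1f} Mbp each")
```

Under these assumptions the screened GenBank assemblies total roughly 23 Tbp, and the 161 most contaminated assemblies carry on the order of 114 Mbp of contaminant sequence each, which illustrates how strongly the contamination is concentrated in a small number of assemblies.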
Citation impact
- FWCI: 81.79
- Percentile: 100%
- References: 40
Authors (19)
- Alexander Astashyn (corresponding author)
National Institutes of Health, National Center for Biotechnology Information
- Eric S. Tvedte
National Institutes of Health, National Center for Biotechnology Information
- Deacon Sweeney
National Institutes of Health, National Center for Biotechnology Information
- Victor Sapojnikov
National Institutes of Health, National Center for Biotechnology Information
- Nathan Bouk
National Institutes of Health, National Center for Biotechnology Information
Topics & keywords
- Biology
- Human genetics
- Genome
- Contamination
- Computational biology
- Scale (ratio)
- Genome Biology
- Genetics