NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy
National Center for Biotechnology Information · National Institutes of Health
Abstract
The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of genomic, transcript and protein sequence records. These records are selected and curated from public sequence archives and represent a significant reduction in redundancy compared to the volume of data archived by the International Nucleotide Sequence Database Collaboration. The database includes over 16,000 organisms, 2.4 × 10^6 genomic records, 13 × 10^6 proteins and 2 × 10^6 RNA records spanning prokaryotes, eukaryotes and viruses (RefSeq release 49, September 2011). The RefSeq database is maintained by a combined approach of automated analyses, collaboration and manual curation to generate an…
Citation impact
- FWCI: 57.60
- Percentile: 100%
- References: 15
Authors
- Kim D. Pruitt (corresponding author)
National Center for Biotechnology Information
- Tatiana Tatusova
National Center for Biotechnology Information, National Institutes of Health
- Garth Brown
National Institutes of Health, National Center for Biotechnology Information
- Donna Maglott
National Center for Biotechnology Information, National Institutes of Health
Topics & keywords
- RefSeq
- Annotation
- Ensembl
- Biology
- Genome project
- Genome
- Reference genome
- Sequence database
- Partnerships for the goals