The Sorghum bicolor reference genome: improved assembly, gene annotations, a transcriptome atlas, and signatures of genome organization
Texas A&M University · HudsonAlpha Institute for Biotechnology · +1 more institution
Abstract
Sorghum bicolor is a drought-tolerant C4 grass used for the production of grain, forage, sugar, and lignocellulosic biomass, and a genetic model for C4 grasses due to its relatively small genome (approximately 800 Mbp), diploid genetics, diverse germplasm, and colinearity with other C4 grass genomes. In this study, deep sequencing, genetic linkage analysis, and transcriptome data were used to produce and annotate a high-quality reference genome sequence. Reference genome sequence order was improved, 29.6 Mbp of additional sequence was incorporated, the number of genes annotated increased 24% to 34 211, average gene length and N50 increased, and error frequency was reduced 10-fold to 1 per 100 kbp. Subtelomeric…
Citation impact
- FWCI: 30.16
- Percentile: 100%
- References: 93
Authors
- 15
Topics & keywords
- Biology
- Genome
- Sequence assembly
- Genetics
- De novo transcriptome assembly
- Reference genome
- Indel
- Transcriptome
Funding
- U.S. Department of Energy. Awards: FC02-07ER64494, -AC02-05CH11231, 07ER64494, DE-FC02-07ER64494, 05CH11231, DE‐SC0012629, DE‐AR0000596, AC02-05CH11231, DE-SC0012629, DE-AC02, BER DE-FC02-07ER64494, DE‐FC02‐07ER64494, DE-AC02-05CH11231, DE-AC02-
- Joint Genome Institute. Awards: DE-AC02-05CH11231, AC02-05CH11231
- Office of Science. Awards: BER DE-FC02-07ER64494, DE-FC02-07ER64494, FC02-07ER64494, AC02-05CH11231, -AC02-05CH11231, DE-AC02
- Great Lakes Bioenergy Research Center. Awards: BER DE-FC02-07ER64494, DE-FC02-07ER64494, DE-AC02-05CH11231