PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions
Broad Institute · Massachusetts Institute of Technology
Abstract
MOTIVATION: As high-throughput transcriptome sequencing provides evidence for novel transcripts in many species, there is a renewed need for accurate methods to classify small genomic regions as protein coding or non-coding. We present PhyloCSF, a novel comparative genomics method that analyzes a multispecies nucleotide sequence alignment to determine whether it is likely to represent a conserved protein-coding region, based on a formal statistical comparison of phylogenetic codon models.

RESULTS: We show that PhyloCSF's classification performance in 12-species Drosophila genome alignments exceeds that of all other methods we compared in a previous study. We anticipate that this method will be widely applicable as the…
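The "formal statistical comparison" of two models mentioned in the abstract can be illustrated, in general terms, with a log-likelihood ratio. The sketch below is only a toy: it assumes that per-site likelihoods under a coding and a non-coding model are already available, and the function names, the threshold, and the numbers are all hypothetical. It does not reproduce PhyloCSF's phylogenetic codon models or its software interface.

```python
import math


def log_likelihood_ratio(site_probs_coding, site_probs_noncoding):
    """Return the log-likelihood ratio (in nats) of a coding vs. a non-coding model.

    Each argument is a list of per-site likelihoods (hypothetical inputs);
    sites are treated as independent, so log-likelihoods are summed.
    """
    if len(site_probs_coding) != len(site_probs_noncoding):
        raise ValueError("both models must score the same number of sites")
    ll_coding = sum(math.log(p) for p in site_probs_coding)
    ll_noncoding = sum(math.log(p) for p in site_probs_noncoding)
    return ll_coding - ll_noncoding


def classify(llr, threshold=0.0):
    """Call a region 'coding' when the ratio exceeds an (assumed) threshold."""
    return "coding" if llr > threshold else "non-coding"


# Made-up per-site likelihoods for a three-site alignment.
coding_probs = [0.020, 0.015, 0.030]
noncoding_probs = [0.010, 0.015, 0.010]

llr = log_likelihood_ratio(coding_probs, noncoding_probs)
print(classify(llr))
```

A positive ratio means the data are more probable under the coding model. In practice the threshold would be chosen from labelled training regions rather than fixed at zero.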
Citation impact
- FWCI: 15.56
- Percentile: 100%
- References: 42
Authors: 3
Topics & keywords
- ENCODE
- Computational biology
- Biology
- Genome
- Comparative genomics
- Genomics
- Phylogenetic tree
- Coding region
- Life in Land