Article · Bioinformatics · Nov 13, 2014 · Hybrid OA

UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches

SIB Swiss Institute of Bioinformatics · European Bioinformatics Institute · +3 more institutions

Indexed in: Crossref · DOAJ · PubMed

Abstract

Motivation: UniRef databases provide full-scale clustering of UniProtKB sequences and are utilized for a broad range of applications, particularly similarity-based functional annotation. Non-redundancy and intra-cluster homogeneity in UniRef were recently improved by adding a sequence length overlap threshold. Our hypothesis is that these improvements would enhance the speed and sensitivity of similarity searches and improve the consistency of annotation within clusters.

Results: Intra-cluster molecular function consistency was examined by analysis of Gene Ontology terms. Results show that UniRef clusters bring together proteins of identical molecular function in more than 97% of the clusters,…


Funding