DeepLoc 2.0: multi-label subcellular localization prediction using protein language models
Indian Institute of Technology Madras · University of Copenhagen · +5 more institutions
Abstract
The prediction of protein subcellular localization is of great relevance for proteomics research. Here, we propose an update to the popular tool DeepLoc with multi-localization prediction and improvements in both performance and interpretability. For training and validation, we curate eukaryotic and human multi-location protein datasets with stringent homology partitioning and enriched with sorting signal information compiled from the literature. We achieve state-of-the-art performance in DeepLoc 2.0 by using a pre-trained protein language model. It has the further advantage that it uses sequence input rather than relying on slower protein profiles. We provide two means of better interpretability: an attention…
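To make the multi-label setting concrete: a single-label classifier picks exactly one compartment, whereas a multi-label predictor scores each compartment independently and reports every one that passes a threshold. The following is a minimal sketch of that decision step only, not DeepLoc 2.0's actual implementation. The label names, the logit values, and the 0.5 threshold are illustrative assumptions and are not taken from the paper.

```python
import math

# Hypothetical label set for illustration; not taken from the paper.
LOCATIONS = ["Cytoplasm", "Nucleus", "Mitochondrion", "Extracellular"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_locations(logits, threshold=0.5):
    """Return every location whose per-label probability reaches the threshold.

    Each label is scored independently, so zero, one, or several
    locations may be returned for the same protein.
    """
    if len(logits) != len(LOCATIONS):
        raise ValueError("expected one logit per location")
    probs = [sigmoid(z) for z in logits]
    return [loc for loc, p in zip(LOCATIONS, probs) if p >= threshold]


# Made-up logits standing in for the output of a sequence model.
example_logits = [2.1, 0.3, -1.7, -3.0]
print(predict_locations(example_logits))  # ['Cytoplasm', 'Nucleus']
```

Because the labels do not compete with one another, raising the threshold trades recall for precision on a per-compartment basis; a softmax over the same logits would instead force a single winner.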
Citation impact
- FWCI: 56.91
- Percentile: 100%
- References: 39
Authors
- Vineet Thumuluri
Indian Institute of Technology Madras
- José Juan Almagro Armenteros
University of Copenhagen, Novo Nordisk Foundation, Stanford University
- Alexander Rosenberg Johansen
Stanford University
- Henrik Nielsen (corresponding author)
Technical University of Denmark
- Ole Winther
University of Copenhagen, Copenhagen University Hospital, Rigshospitalet, Technical University of Denmark
Topics & keywords
- Interpretability
- Biology
- Sorting
- Web server
- Proteomics
- Computational biology
- Computer science
- Subcellular localization