MetFrag relaunched: incorporating strategies beyond in silico fragmentation
Leibniz Institute of Plant Biochemistry · Swiss Federal Institute of Aquatic Science and Technology
Abstract
The in silico fragmenter MetFrag, launched in 2010, was one of the first approaches combining compound database searching and fragmentation prediction for small molecule identification from tandem mass spectrometry data. Since then many new approaches have evolved, as has MetFrag itself. This article details the latest developments to MetFrag and its use in small molecule identification since the original publication.
MetFrag has gone through algorithmic and scoring refinements. New features include the retrieval of reference, data source and patent information via ChemSpider and PubChem web services, as well as InChIKey filtering to reduce candidate redundancy due to stereoisomerism. Candidates can be filtered or scored differently based on criteria such as the occurrence of certain elements and/or substructures prior to fragmentation, or presence in so-called "suspect lists". Retention time information can now be calculated within MetFrag, given a sufficient number of user-provided retention times, or incorporated separately as "user-defined scores" to be included in candidate ranking. The changes to MetFrag were evaluated on the original dataset as well as on a dataset of 473 merged high resolution tandem mass spectra (HR-MS/MS), and compared with another open source in silico fragmenter, CFM-ID. Using HR-MS/MS information only, MetFrag2.2 and CFM-ID had 30 and 43 Top 1 ranks, respectively, using PubChem as a database. Including reference and retention information in MetFrag2.2 improved this to 420 and 336 Top 1 ranks with ChemSpider and PubChem (89 and 71 %), respectively, and even up to 343 Top 1 ranks (PubChem) when combined with CFM-ID. The optimal parameters and weights were verified using three additional datasets of 824 merged HR-MS/MS spectra in total. Further examples are given to demonstrate the flexibility of the enhanced features.
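The two ideas in the abstract that most affect ranking, InChIKey filtering and the combination of several score terms, can be illustrated with a minimal Python sketch. This is not MetFrag's implementation: the candidate dictionaries, the function names, the placeholder InChIKeys and the normalise-by-maximum weighted sum are all assumptions made for illustration. The only chemistry fact relied on is that the first 14-character block of an InChIKey encodes the molecular skeleton, so stereoisomers share it.

```python
def first_block(inchikey: str) -> str:
    """Return the first InChIKey block (connectivity only, no stereochemistry)."""
    return inchikey.split("-")[0]


def dedupe_by_first_block(candidates, term="fragmenter"):
    """Keep one candidate per InChIKey first block.

    Assumption for this sketch: among stereoisomers, the candidate with
    the highest value of `term` is retained.
    """
    best = {}
    for cand in candidates:
        key = first_block(cand["inchikey"])
        if key not in best or cand["scores"][term] > best[key]["scores"][term]:
            best[key] = cand
    return list(best.values())


def rank(candidates, weights):
    """Sort candidates by a weighted sum of max-normalised score terms.

    Each term is divided by its maximum over all candidates so that terms
    on different scales (e.g. a fragmenter score and a reference count)
    become comparable. This scheme is an assumption of the sketch.
    """
    if not candidates:
        return []
    maxima = {
        term: max(c["scores"].get(term, 0.0) for c in candidates)
        for term in weights
    }

    def combined(cand):
        total = 0.0
        for term, weight in weights.items():
            if maxima[term] > 0:
                total += weight * cand["scores"].get(term, 0.0) / maxima[term]
        return total

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical candidates; the InChIKeys are placeholders, not real compounds.
candidates = [
    {"name": "A", "inchikey": "AAAAAAAAAAAAAA-UHFFFAOYSA-N",
     "scores": {"fragmenter": 0.8, "references": 2}},
    {"name": "A-stereoisomer", "inchikey": "AAAAAAAAAAAAAA-XXXXXXXXSA-N",
     "scores": {"fragmenter": 0.6, "references": 5}},
    {"name": "B", "inchikey": "BBBBBBBBBBBBBB-UHFFFAOYSA-N",
     "scores": {"fragmenter": 1.0, "references": 1}},
]

unique = dedupe_by_first_block(candidates)
by_spectrum = rank(unique, {"fragmenter": 1.0})
with_metadata = rank(unique, {"fragmenter": 1.0, "references": 1.0})
```

In this toy data, ranking by the fragmenter term alone places candidate B first, while adding the reference term moves candidate A to the top. That mirrors, in miniature, the effect the abstract reports when reference and retention information are included alongside the HR-MS/MS score.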
Citation impact
- FWCI: 84.35
- Percentile: 100%
- References: 47
Authors
- Christoph Ruttkies (corresponding author), Leibniz Institute of Plant Biochemistry
- Emma Schymanski, Swiss Federal Institute of Aquatic Science and Technology
- Sebastian I. Wolf, Leibniz Institute of Plant Biochemistry
- Juliane Hollender, Swiss Federal Institute of Aquatic Science and Technology
- Steffen Neumann, Leibniz Institute of Plant Biochemistry
Topics & keywords
- Fragmentation (computing)
- Computer science
- In silico
- Data science
- Chemistry