Spec2Vec: Improved mass spectral similarity scoring through learning of structural relationships.
Huber, Florian; Ridder, Lars; Verhoeven, Stefan; Spaaks, Jurriaan H; Diblen, Faruk; Rogers, Simon; van der Hooft, Justin J J.
Affiliations
  • Huber F; Netherlands eScience Center, Amsterdam, the Netherlands.
  • Ridder L; Netherlands eScience Center, Amsterdam, the Netherlands.
  • Verhoeven S; Netherlands eScience Center, Amsterdam, the Netherlands.
  • Spaaks JH; Netherlands eScience Center, Amsterdam, the Netherlands.
  • Diblen F; Netherlands eScience Center, Amsterdam, the Netherlands.
  • Rogers S; School of Computing Science, University of Glasgow, Glasgow, United Kingdom.
  • van der Hooft JJJ; Bioinformatics Group, Wageningen University, Wageningen, the Netherlands.
PLoS Comput Biol; 17(2): e1008724, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33591968
Spectral similarity is used as a proxy for structural similarity in many tandem mass spectrometry (MS/MS) based metabolomics analyses, such as library matching and molecular networking. Although weaknesses in the relationship between spectral similarity scores and true structural similarities have been described, little development of alternative scores has been undertaken. Here, we introduce Spec2Vec, a novel spectral similarity score inspired by a natural language processing algorithm, Word2Vec. Spec2Vec learns fragmental relationships within a large set of spectral data to derive abstract spectral embeddings that can be used to assess spectral similarities. Using data derived from GNPS MS/MS libraries, including spectra for nearly 13,000 unique molecules, we show that Spec2Vec scores correlate better with structural similarity than cosine-based scores. We demonstrate the advantages of Spec2Vec in library matching and molecular networking. Spec2Vec is computationally more scalable, allowing structural analogue searches in large databases within seconds.
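The embedding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the Spec2Vec package API. It assumes that each peak is turned into a token by rounding its m/z (the hypothetical `peak_to_word` helper), that a spectrum embedding is an intensity-weighted sum of token vectors, and that two embeddings are compared by cosine similarity. The two-dimensional vectors in `toy_vectors` are invented for illustration and stand in for vectors that a Word2Vec-style model would learn from a large spectral library.

```python
import math


def peak_to_word(mz, decimals=2):
    """Turn a peak m/z into a discrete token (hypothetical naming scheme)."""
    return f"peak@{mz:.{decimals}f}"


def spectrum_embedding(peaks, word_vectors):
    """Intensity-weighted sum of token vectors for one spectrum.

    peaks: list of (mz, intensity) pairs.
    word_vectors: dict mapping token -> list of floats (assumed pre-trained).
    Tokens missing from the vocabulary are skipped.
    """
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    for mz, intensity in peaks:
        vec = word_vectors.get(peak_to_word(mz))
        if vec is None:
            continue
        for i, value in enumerate(vec):
            total[i] += intensity * value
    return total


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy vectors: the first two tokens are placed close together,
# as if they frequently co-occurred in training spectra.
toy_vectors = {
    "peak@105.07": [1.0, 0.0],
    "peak@77.04": [0.9, 0.1],
    "peak@150.00": [0.0, 1.0],
}

emb_a = spectrum_embedding([(105.07, 1.0)], toy_vectors)
emb_b = spectrum_embedding([(77.04, 1.0)], toy_vectors)
emb_c = spectrum_embedding([(150.00, 1.0)], toy_vectors)

print(cosine(emb_a, emb_b))  # high, although the spectra share no peak
print(cosine(emb_a, emb_c))  # zero for these orthogonal toy vectors
```

In this toy setup, the first two spectra have no peak in common, so a score based only on matching peaks would rate them as dissimilar, while the embedding-based score rates them as similar because their tokens have nearby vectors. This is the kind of learned relationship the abstract refers to, under the assumptions stated above.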
Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Algorithms / Gene Library / Computational Biology / Tandem Mass Spectrometry / Metabolomics | Study type: Prognostic_studies | Language: En | Journal: PLoS Comput Biol | Journal subject: BIOLOGY / MEDICAL INFORMATICS | Year of publication: 2021 | Document type: Article | Affiliation country: Netherlands
