Prosit Transformer: A transformer for Prediction of MS2 Spectrum Intensities.

Ekvall, Markus; Truong, Patrick; Gabriel, Wassim; Wilhelm, Mathias; Käll, Lukas

Affiliation

Ekvall M; Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, Royal Institute of Technology-KTH, Box 1031, SE-17121 Solna, Sweden.
Truong P; Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, Royal Institute of Technology-KTH, Box 1031, SE-17121 Solna, Sweden.
Gabriel W; Computational Mass Spectrometry, Technical University of Munich (TUM), D-85354 Freising, Germany.
Wilhelm M; Computational Mass Spectrometry, Technical University of Munich (TUM), D-85354 Freising, Germany.
Käll L; Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, Royal Institute of Technology-KTH, Box 1031, SE-17121 Solna, Sweden.

J Proteome Res; 21(5): 1359-1364, 2022 May 6.

Article in English | MEDLINE | ID: mdl-35413196

ABSTRACT

Machine learning has long been an integral part of interpreting data from mass spectrometry (MS)-based proteomics. Relatively recently, a machine-learning architecture that has proven successful in other areas of bioinformatics emerged: Transformers. Furthermore, implementing Transformers within bioinformatics has become relatively convenient thanks to transfer learning, i.e., adapting a network trained for other tasks to new functionality. Transfer learning makes these relatively large networks more accessible because it generally requires less data and substantially shortens training time. We implemented a Transformer based on the pretrained model TAPE to predict MS2 intensities. TAPE is a general model trained to predict missing residues from protein sequences. Although TAPE was trained for a different task, we could modify its behavior by adding a prediction head at the end of the model and fine-tuning it on the spectrum intensities from the training set of the well-known predictor Prosit. We demonstrate that the resulting predictor, which we call Prosit Transformer, outperforms the recurrent neural-network-based predictor Prosit, increasing the median angular similarity on its hold-out set from 0.908 to 0.929. We believe that Transformers will significantly increase prediction accuracy for other types of predictions within MS-based proteomics.
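
The abstract reports performance as a median angular similarity between predicted and observed MS2 intensities. As a minimal sketch of how such a score might be computed, the Python below assumes the normalized spectral angle, 1 - 2*arccos(cosine similarity)/pi, which is a common choice for comparing fragment intensity vectors; the abstract itself does not spell out the formula. The function names and the toy intensity vectors are hypothetical and are not taken from the paper.

```python
import math
import statistics


def angular_similarity(predicted, observed):
    """Normalized spectral angle between two intensity vectors.

    Assumed definition: 1 - 2 * arccos(cosine similarity) / pi.
    For non-negative intensities the result lies in [0, 1],
    where 1 means the two vectors point in the same direction.
    """
    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    if norm_p == 0.0 or norm_o == 0.0:
        # Assumption: an all-zero spectrum is treated as having no similarity.
        return 0.0
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


def median_angular_similarity(pairs):
    """Median score over (predicted, observed) pairs, e.g. a hold-out set."""
    return statistics.median(angular_similarity(p, o) for p, o in pairs)


# Toy usage with hypothetical intensity vectors (not data from the paper).
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # same direction
    ([1.0, 0.0], [1.0, 1.0]),  # 45 degrees apart
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal
]
print(median_angular_similarity(toy_pairs))
```

Because the score depends only on the angle between the vectors, rescaling either spectrum leaves it unchanged, so predicted and observed intensities do not need to share a normalization.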

Subjects

Machine Learning; Neural Networks, Computer; Amino Acid Sequence; Mass Spectrometry; Proteomics

Keywords

MS2 Spectra; Machine Learning; Proteomics; Transformers

Full text: 1 Database: MEDLINE Main subject: Neural Networks, Computer / Machine Learning Language: English Publication year: 2022 Document type: Article