Graphormer-IR: Graph Transformers Predict Experimental IR Spectra Using Highly Specialized Attention.
Stienstra, Cailum M K; Hebert, Liam; Thomas, Patrick; Haack, Alexander; Guo, Jason; Hopkins, W Scott.
Affiliations
  • Stienstra CMK; Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.
  • Hebert L; Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.
  • Thomas P; Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.
  • Haack A; Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.
  • Guo J; Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.
  • Hopkins WS; Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.
J Chem Inf Model ; 64(12): 4613-4629, 2024 Jun 24.
Article in En | MEDLINE | ID: mdl-38845400
ABSTRACT
Infrared (IR) spectroscopy is an important analytical tool in various chemical and forensic domains, and a great deal of effort has gone into developing in silico methods for predicting experimental spectra. A key challenge in this regard is generating highly accurate spectra quickly to enable real-time feedback between computation and experiment. Here, we employ Graphormer, a graph neural network (GNN) transformer, to predict IR spectra using only simplified molecular-input line-entry system (SMILES) strings. Our data set includes 53,528 high-quality spectra, measured in five different experimental media (i.e., phases), for molecules containing the elements H, C, N, O, F, Si, S, P, Cl, Br, and I. When using only atomic numbers for node encodings, Graphormer-IR achieved a mean test spectral information similarity (SISµ) value of 0.8449 ± 0.0012 (n = 5), which surpasses that of the current state-of-the-art model, Chemprop-IR (SISµ = 0.8409 ± 0.0014, n = 5), with only 36% of the encoded information. Augmenting node embeddings with additional node-level descriptors, encoded as learned embeddings generated through a multilayer perceptron, improves scores to SISµ = 0.8523 ± 0.0006, a total improvement of 19.7σ (t = 19). These improved scores show how Graphormer-IR excels in capturing long-range interactions like hydrogen bonding, anharmonic peak positions in experimental spectra, and stretching frequencies of uncommon functional groups. Scaling our architecture to 210 attention heads demonstrates specialist-like behavior for distinct IR frequencies that improves model performance. Our model utilizes novel architectures, including a global node for phase encoding, learned node feature embeddings, and a one-dimensional (1D) smoothing convolutional neural network (CNN). Graphormer-IR's innovations underscore its value over traditional message-passing neural networks (MPNNs) due to its expressive embeddings and ability to capture long-range intramolecular relationships.
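The abstract reports SIS scores without defining the metric. As a rough illustration only, the sketch below assumes the definition commonly used in earlier IR-prediction work: both spectra are Gaussian-broadened, normalized to unit total intensity, compared with a symmetric spectral information divergence (SID), and mapped to SIS = 1 / (1 + SID). The function names, the `sigma` value, the clipping floor `eps`, and the toy spectra are hypothetical and are not taken from the paper.

```python
import math

def gaussian_smooth(spectrum, sigma=2.0):
    """Broaden a spectrum with a Gaussian kernel (sigma in grid points; assumed value)."""
    radius = int(3 * sigma)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    norm = sum(kernel)
    n = len(spectrum)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < n:  # treat points outside the grid as zero intensity
                acc += w * spectrum[j]
        smoothed.append(acc / norm)
    return smoothed

def normalize(spectrum, eps=1e-10):
    """Clip to a small positive floor, then scale so intensities sum to 1."""
    clipped = [max(v, eps) for v in spectrum]
    total = sum(clipped)
    return [v / total for v in clipped]

def sis(pred, target, sigma=2.0):
    """Assumed SIS: 1 / (1 + SID) on smoothed, normalized spectra."""
    p = normalize(gaussian_smooth(pred, sigma))
    q = normalize(gaussian_smooth(target, sigma))
    # Symmetric divergence: sum of p*ln(p/q) + q*ln(q/p) over all grid points.
    sid = sum(pi * math.log(pi / qi) + qi * math.log(qi / pi)
              for pi, qi in zip(p, q))
    return 1.0 / (1.0 + sid)

# Toy spectra on a 12-point grid (hypothetical data).
reference = [0, 0, 1, 5, 1, 0, 0, 0, 2, 0, 0, 0]
shifted = [0, 0, 0, 1, 5, 1, 0, 0, 0, 2, 0, 0]
score_same = sis(reference, reference)
score_shifted = sis(reference, shifted)
```

Under this assumed definition, identical spectra score 1, and the score falls toward 0 as peak positions or relative intensities diverge.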

Full text: 1 | Collections: 01-international | Database: MEDLINE | Main subject: Spectrophotometry, Infrared / Neural Networks, Computer | Language: En | Journal: J Chem Inf Model | Journal subject: MEDICAL INFORMATICS / CHEMISTRY | Year of publication: 2024 | Document type: Article | Country of affiliation: Canada
