RxnScribe: A Sequence Generation Model for Reaction Diagram Parsing.
Qian, Yujie; Guo, Jiang; Tu, Zhengkai; Coley, Connor W; Barzilay, Regina.
Affiliation
  • Qian Y; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.
  • Guo J; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.
  • Tu Z; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.
  • Coley CW; Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.
  • Barzilay R; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.
J Chem Inf Model; 63(13): 4030-4041, 2023 07 10.
Article in En | MEDLINE | ID: mdl-37368970
ABSTRACT
Reaction diagram parsing is the task of extracting reaction schemes from a diagram in the chemistry literature. The reaction diagrams can be arbitrarily complex; thus, robustly parsing them into structured data is an open challenge. In this paper, we present RxnScribe, a machine learning model for parsing reaction diagrams of varying styles. We formulate this structured prediction task with a sequence generation approach, which condenses the traditional pipeline into an end-to-end model. We train RxnScribe on a dataset of 1378 diagrams and evaluate it with cross validation, achieving an 80.0% soft match F1 score, with significant improvements over previous models. Our code and data are publicly available at https://github.com/thomas0809/RxnScribe.
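
The abstract reports an F1 score under a "soft match" criterion but does not define that criterion here. As a rough illustration of how such a score could be computed, the sketch below counts a predicted reaction as correct when its reactants and products agree with a ground-truth reaction, ignoring conditions. The `Reaction` type, the `soft_match` predicate, and the greedy one-to-one matching are all assumptions made for this example, not the authors' evaluation code.

```python
from typing import Callable, FrozenSet, List, NamedTuple


class Reaction(NamedTuple):
    # Hypothetical representation: each role holds a set of entity identifiers.
    reactants: FrozenSet[str]
    conditions: FrozenSet[str]
    products: FrozenSet[str]


def soft_match(pred: Reaction, gold: Reaction) -> bool:
    # Assumed criterion for illustration: compare reactants and products only.
    return pred.reactants == gold.reactants and pred.products == gold.products


def f1_score(
    preds: List[Reaction],
    golds: List[Reaction],
    match: Callable[[Reaction, Reaction], bool],
) -> float:
    # Greedy one-to-one matching: each gold reaction can be claimed once.
    unmatched = list(golds)
    true_positives = 0
    for pred in preds:
        for i, gold in enumerate(unmatched):
            if match(pred, gold):
                true_positives += 1
                del unmatched[i]
                break
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(preds)
    recall = true_positives / len(golds)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [
        Reaction(frozenset({"m1"}), frozenset({"t1"}), frozenset({"m2"})),
        Reaction(frozenset({"m2"}), frozenset({"t2"}), frozenset({"m3"})),
    ]
    pred = [
        # Same molecules as the first gold reaction, but no conditions found.
        Reaction(frozenset({"m1"}), frozenset(), frozenset({"m2"})),
        # Wrong product, so this prediction matches nothing.
        Reaction(frozenset({"m2"}), frozenset({"t2"}), frozenset({"m4"})),
    ]
    print(f1_score(pred, gold, soft_match))
```

A stricter criterion can be substituted by passing a different predicate as `match`, for example one that also compares `conditions`.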

Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Machine Learning Type of study: Prognostic_studies Language: En Journal: J Chem Inf Model Journal subject: Medical Informatics / Chemistry Year: 2023 Type: Article Affiliation country: United States