MolMiner: You Only Look Once for Chemical Structure Recognition.
Xu, Youjun; Xiao, Jinchuan; Chou, Chia-Han; Zhang, Jianhang; Zhu, Jintao; Hu, Qiwan; Li, Hemin; Han, Ningsheng; Liu, Bingyu; Zhang, Shuaipeng; Han, Jinyu; Zhang, Zhen; Zhang, Shuhao; Zhang, Weilin; Lai, Luhua; Pei, Jianfeng.
  • Xu Y; Infinite Intelligence Pharma, Beijing, China 100083.
  • Xiao J; Infinite Intelligence Pharma, Beijing, China 100083.
  • Chou CH; Infinite Intelligence Pharma, Beijing, China 100083.
  • Zhang J; Infinite Intelligence Pharma, Beijing, China 100083.
  • Zhu J; Center for Quantitative Biology, Peking University, Beijing, China 100871.
  • Hu Q; Center for Quantitative Biology, Peking University, Beijing, China 100871.
  • Li H; Infinite Intelligence Pharma, Beijing, China 100083.
  • Han N; Infinite Intelligence Pharma, Beijing, China 100083.
  • Liu B; Infinite Intelligence Pharma, Beijing, China 100083.
  • Zhang S; Infinite Intelligence Pharma, Beijing, China 100083.
  • Han J; Infinite Intelligence Pharma, Beijing, China 100083.
  • Zhang Z; Infinite Intelligence Pharma, Beijing, China 100083.
  • Zhang S; Infinite Intelligence Pharma, Beijing, China 100083.
  • Zhang W; Infinite Intelligence Pharma, Beijing, China 100083.
  • Lai L; Center for Quantitative Biology, Peking University, Beijing, China 100871.
  • Pei J; BNLMS, Peking-Tsinghua Center for Life Sciences at the College of Chemistry and Molecular Engineering, Peking University, Beijing, China 100871.
J Chem Inf Model; 62(22): 5321-5328, 2022 Nov 28.
Article in English | MEDLINE | ID: mdl-36108142
Molecular structures are commonly depicted in 2D printed forms in scientific documents such as journal papers and patents. However, these 2D depictions are not machine readable. Due to a backlog of decades and an increasing amount of printed literature, there is a high demand for translating printed depictions into machine-readable formats, a task known as Optical Chemical Structure Recognition (OCSR). Most OCSR systems developed over the last three decades use a rule-based approach, which vectorizes the depiction based on the interpretation of vectors and nodes as bonds and atoms. Here, we present a practical software called MolMiner, which is primarily built using deep neural networks originally developed for semantic segmentation and object detection to recognize atom and bond elements from documents. These recognized elements can be easily connected as a molecular graph with a distance-based construction algorithm. MolMiner gave state-of-the-art performance on four benchmark data sets and a self-collected external data set from scientific papers. As MolMiner performed similarly well in real-world OCSR tasks with a user-friendly interface, it is a useful and valuable tool for daily applications. Free download links for the Mac and Windows versions are available at https://github.com/iipharma/pharmamind-molminer.
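The abstract does not spell out the distance-based construction step. The following is only a minimal sketch of one way such a step could work, not MolMiner's actual implementation: it assumes each detected bond is reduced to the center of its bounding box and is linked to the two nearest detected atoms. The names Atom, Bond, and build_graph, and the toy coordinates, are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Atom:
    symbol: str
    x: float  # center of the detected atom box (assumed representation)
    y: float


@dataclass(frozen=True)
class Bond:
    order: int
    x: float  # center of the detected bond box (assumed representation)
    y: float


def build_graph(atoms, bonds):
    """Link each detected bond to its two nearest detected atoms.

    Returns a list of (atom_index_a, atom_index_b, order) edges.
    This is an illustrative heuristic, not the published algorithm.
    """
    edges = []
    if len(atoms) < 2:
        return edges  # a bond needs two atoms to connect
    for bond in bonds:
        # Rank atom indices by Euclidean distance to the bond center.
        ranked = sorted(
            range(len(atoms)),
            key=lambda i: hypot(atoms[i].x - bond.x, atoms[i].y - bond.y),
        )
        a, b = sorted(ranked[:2])
        edges.append((a, b, bond.order))
    return edges


# Toy detections for a C-C=O fragment (pixel coordinates are invented).
atoms = [Atom("C", 0, 0), Atom("C", 10, 0), Atom("O", 20, 0)]
bonds = [Bond(1, 5, 0), Bond(2, 15, 0)]
edges = build_graph(atoms, bonds)
print(edges)
```

A real system would likely also need a distance threshold and handling for crowded or overlapping detections; this sketch omits both.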
Subject(s)

Full text: 1 Database: MEDLINE Main subject: Algorithms / Software Language: English Year: 2022 Document type: Article
