SELFIES and the future of molecular string representations.
Krenn, Mario; Ai, Qianxiang; Barthel, Senja; Carson, Nessa; Frei, Angelo; Frey, Nathan C; Friederich, Pascal; Gaudin, Théophile; Gayle, Alberto Alexander; Jablonka, Kevin Maik; Lameiro, Rafael F; Lemm, Dominik; Lo, Alston; Moosavi, Seyed Mohamad; Nápoles-Duarte, José Manuel; Nigam, AkshatKumar; Pollice, Robert; Rajan, Kohulan; Schatzschneider, Ulrich; Schwaller, Philippe; Skreta, Marta; Smit, Berend; Strieth-Kalthoff, Felix; Sun, Chong; Tom, Gary; Falk von Rudorff, Guido; Wang, Andrew; White, Andrew D; Young, Adamo; Yu, Rose; Aspuru-Guzik, Alán.
Affiliation
  • Krenn M; Max Planck Institute for the Science of Light (MPL), Erlangen, Germany.
  • Ai Q; Department of Chemistry, Fordham University, The Bronx, NY, USA.
  • Barthel S; Department of Mathematics, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands.
  • Carson N; Syngenta Jealott's Hill International Research Centre, Bracknell, Berkshire, UK.
  • Frei A; Department of Chemistry, Imperial College London, Molecular Sciences Research Hub, White City Campus, Wood Lane, London, UK.
  • Frey NC; Massachusetts Institute of Technology, Cambridge, MA, USA.
  • Friederich P; Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany.
  • Gaudin T; Institute of Nanotechnology, Karlsruhe Institute of Technology, Eggenstein-Leopoldshafen, Germany.
  • Gayle AA; Department of Computer Science, University of Toronto, Toronto, ON, Canada.
  • Jablonka KM; IBM Research Europe, Zürich, Switzerland.
  • Lameiro RF; Sapporo, Japan.
  • Lemm D; Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Sion, Valais, Switzerland.
  • Lo A; Medicinal and Biological Chemistry Group, São Carlos Institute of Chemistry, University of São Paulo, São Paulo, Brazil.
  • Moosavi SM; Faculty of Physics, University of Vienna, Vienna, Austria.
  • Nápoles-Duarte JM; Department of Computer Science, University of Toronto, Toronto, ON, Canada.
  • Nigam A; Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany.
  • Pollice R; Facultad de Ciencias Químicas, Universidad Autónoma de Chihuahua, Chihuahua, Mexico.
  • Rajan K; Department of Computer Science, Stanford University, Stanford, CA, USA.
  • Schatzschneider U; Department of Computer Science, University of Toronto, Toronto, ON, Canada.
  • Schwaller P; Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, ON, Canada.
  • Skreta M; Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller Universität Jena, Jena, Germany.
  • Smit B; Institut für Anorganische Chemie, Julius-Maximilians-Universität Würzburg, Würzburg, Germany.
  • Strieth-Kalthoff F; IBM Research Europe, Zürich, Switzerland.
  • Sun C; Laboratory of Artificial Chemical Intelligence (LIAC), Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
  • Tom G; National Centre of Competence in Research (NCCR) Catalysis, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
  • Falk von Rudorff G; Department of Computer Science, University of Toronto, Toronto, ON, Canada.
  • Wang A; Vector Institute for Artificial Intelligence, Toronto, ON, Canada.
  • White AD; Laboratory of Molecular Simulation (LSMO), Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Sion, Valais, Switzerland.
  • Young A; Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, ON, Canada.
  • Yu R; Department of Computer Science, University of Toronto, Toronto, ON, Canada.
  • Aspuru-Guzik A; Department of Computer Science, University of Toronto, Toronto, ON, Canada.
Patterns (N Y) ; 3(10): 100588, 2022 Oct 14.
Article in En | MEDLINE | ID: mdl-36277819
ABSTRACT
Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings; most pertinently, most combinations of symbols lead to invalid results with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELF-referencing embedded string (SELFIES). SELFIES has since simplified and enabled numerous new applications in chemistry. In this perspective, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete future projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science.

Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: Patterns (N Y) Year: 2022 Document type: Article Affiliation country: Germany
