Using SMILES strings for the description of chemical connectivity in the Crystallography Open Database.
Quirós, Miguel; Grazulis, Saulius; Girdzijauskaite, Saule; Merkys, Andrius; Vaitkus, Antanas.
Affiliation
  • Quirós M; Departamento de Química Inorgánica, Universidad de Granada, 18071, Granada, Spain. mquiros@ugr.es.
  • Grazulis S; Institute of Biotechnology, Vilnius University, Sauletekio al. 7, 10257, Vilnius, Lithuania.
  • Girdzijauskaite S; Faculty of Mathematics and Informatics, Vilnius University, Naugarduko st. 24, 03225, Vilnius, Lithuania.
  • Merkys A; Faculty of Mathematics and Informatics, Vilnius University, Naugarduko st. 24, 03225, Vilnius, Lithuania.
  • Vaitkus A; Institute of Biotechnology, Vilnius University, Sauletekio al. 7, 10257, Vilnius, Lithuania.
J Cheminform; 10(1): 23, 2018 May 18.
Article in English | MEDLINE | ID: mdl-29777317
Computer descriptions of chemical molecular connectivity are necessary for searching chemical databases and for predicting chemical properties from molecular structure. This article reports the ongoing work to describe, in SMILES format, the chemical connectivity of the entries contained in the Crystallography Open Database (COD). This collection of SMILES is publicly available for chemical (substructure) search or for any other purpose on an open-access basis, as is the COD itself. The conventions followed for representing compounds that do not fit into valence bond theory are outlined for the most frequently found cases. The procedure for extracting SMILES from the CIF files starts by checking whether the atoms in the asymmetric unit are a chemically acceptable image of the compound. When they are not (molecule on a symmetry element, disorder, polymeric species, etc.), the previously published cif_molecule program is used to obtain such an image in many cases. The program package Open Babel is then applied to generate SMILES strings from the CIF files (either those taken directly from the COD or those produced by cif_molecule, when applicable). The results are then checked and/or fixed by a human editor, in a computer-aided task that at present still consumes a great deal of human time. Although the procedure still needs to be improved to make it more automatic (and hence faster), it has already yielded more than 160,000 curated chemical structures. The purpose of this article is to announce the existence of this work to the chemical community and to spread the use of its results.
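As a rough illustration of the Open Babel step only, the following Python sketch converts one CIF file to SMILES by calling the obabel command-line tool. It is not the authors' pipeline: the function names are hypothetical, the file name in the usage note is a placeholder, and the sketch assumes obabel is installed and on the PATH. The cif_molecule preprocessing and the human curation described above are not shown.

```python
import subprocess
from pathlib import Path


def obabel_command(cif_path):
    """Build an Open Babel command line that converts one CIF file to SMILES."""
    # -icif / -osmi select the input and output formats; with no -O option,
    # obabel writes the converted structures to standard output.
    return ["obabel", "-icif", str(Path(cif_path)), "-osmi"]


def parse_smiles_output(text):
    """Return the SMILES strings found in Open Babel's .smi output.

    Each non-empty line holds a SMILES string, optionally followed by
    whitespace and a title; only the SMILES string is kept.
    """
    smiles = []
    for line in text.splitlines():
        fields = line.split()
        if fields:
            smiles.append(fields[0])
    return smiles


def cif_to_smiles(cif_path):
    """Run obabel on a CIF file (assumed to be installed) and return its SMILES."""
    result = subprocess.run(
        obabel_command(cif_path), capture_output=True, text=True, check=True
    )
    return parse_smiles_output(result.stdout)
```

A call such as cif_to_smiles("entry.cif") would return a list of strings, one per structure that Open Babel reports. The output would still need the checking by a human editor that the abstract describes.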

Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: J Cheminform Year: 2018 Document type: Article Affiliation country: Spain Country of publication: United Kingdom