Results 1 - 12 of 12
1.
Angew Chem Int Ed Engl ; 60(4): 2074-2077, 2021 Jan 25.
Article in English | MEDLINE | ID: mdl-32986914

ABSTRACT

The generated databases (GDBs) enumerate billions of possible molecules following simple rules of chemical stability and synthetic feasibility. Exploring the GDBs shows that many chiral, 3D-shaped ring systems, often containing quaternary centers, have never been exploited for drug design. Shown herein is that such ring systems can be useful for medicinal chemistry by using the example of the enantioselective synthesis of triquinazine, a novel chiral piperazine analogue derived from angular triquinane. It is used to design a nanomolar and selective inhibitor of Janus Kinase 1 and is related to the marketed drug Tofacitinib, which is useful for treating autoimmune diseases.


Subject(s)
Janus Kinases/antagonists & inhibitors , Piperazine/analogs & derivatives , Protein Kinase Inhibitors/pharmacology , X-Ray Crystallography , Factual Databases , Drug Design , Humans , Molecular Structure , Protein Kinase Inhibitors/chemistry , Stereoisomerism
2.
J Chem Inf Model ; 60(12): 5918-5922, 2020 Dec 28.
Article in English | MEDLINE | ID: mdl-33118816

ABSTRACT

In the past few years, we have witnessed a renaissance of the field of molecular de novo drug design. The advancements in deep learning and artificial intelligence (AI) have triggered an avalanche of ideas on how to translate such techniques to a variety of domains including the field of drug design. A range of architectures have been devised to find the optimal way of generating chemical compounds by using either graph- or string (SMILES)-based representations. With this application note, we aim to offer the community a production-ready tool for de novo design, called REINVENT. It can be effectively applied on drug discovery projects that are striving to resolve either exploration or exploitation problems while navigating the chemical space. It can facilitate the idea generation process by bringing to the researcher's attention the most promising compounds. REINVENT's code is publicly available at https://github.com/MolecularAI/Reinvent.


Subject(s)
Artificial Intelligence , Drug Design , Drug Discovery
3.
Chimia (Aarau) ; 74(4): 241-246, 2020 Apr 29.
Article in English | MEDLINE | ID: mdl-32331540

ABSTRACT

Drug discovery is in constant need of new molecules to develop drugs addressing unmet medical needs. To assess the chemical space available for drug design, our group investigates the generated databases (GDBs) listing all possible organic molecules up to a defined size, the largest of which is GDB-17 featuring 166.4 billion molecules up to 17 non-hydrogen atoms. While known drugs and bioactive compounds are mostly aromatic and planar, the GDBs contain a plethora of non-aromatic 3D-shaped molecules, which are very useful for drug discovery since they generally have more desirable absorption, distribution, metabolism, excretion and toxicity (ADMET) properties. Here we review GDB enumeration methods and the selection and synthesis of GDB molecules as modulators of ion channels. We summarize the constitution of GDB subsets focusing on fragments (FDB17), medicinal chemistry (GDBMedChem) and ChEMBL-like molecules (GDBChEMBL), and the ring system database GDB4c as a rich source of novel 3D-shaped chiral molecules containing quaternary centers, such as the recently reported trinorbornane.


Subject(s)
Drug Discovery , Factual Databases , Stereoisomerism
4.
J Cheminform ; 12(1): 38, 2020 May 29.
Article in English | MEDLINE | ID: mdl-33431013

ABSTRACT

Molecular generative models trained with small sets of molecules represented as SMILES strings can generate large regions of the chemical space. Unfortunately, due to the sequential nature of SMILES strings, these models are not able to generate molecules given a scaffold (i.e., partially-built molecules with explicit attachment points). Herein we report a new SMILES-based molecular generative architecture that generates molecules from scaffolds and can be trained from any arbitrary molecular set. This approach is possible thanks to a new molecular set pre-processing algorithm that exhaustively slices all possible combinations of acyclic bonds of every molecule, combinatorically obtaining a large number of scaffolds with their respective decorations. Moreover, it serves as a data augmentation technique and can be readily coupled with randomized SMILES to obtain even better results with small sets. Two examples showcasing the potential of the architecture in medicinal and synthetic chemistry are described: First, models were trained with a training set obtained from a small set of Dopamine Receptor D2 (DRD2) active modulators and were able to meaningfully decorate a wide range of scaffolds and obtain molecular series predicted active on DRD2. Second, a larger set of drug-like molecules from ChEMBL was selectively sliced using synthetic chemistry constraints (RECAP rules). In this case, the resulting scaffolds with decorations were filtered only to allow those that included fragment-like decorations. This filtering process allowed models trained with this dataset to selectively decorate diverse scaffolds with fragments that were generally predicted to be synthesizable and attachable to the scaffold using known synthetic approaches. In both cases, the models were already able to decorate molecules using specific knowledge without the need to add it with other techniques, such as reinforcement learning. 
We envision that this architecture will become a useful addition to the already existent architectures for de novo molecular generation.
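
The exhaustive slicing step described in this abstract can be illustrated with a minimal sketch. It assumes a toy molecule given as a plain graph of atom indices and bonds rather than a SMILES string, and it treats a slicing as valid when one fragment (the scaffold) touches every cut bond. The function names `components`, `acyclic_bonds` and `slice_molecule`, the `max_cuts` parameter and the example graph are hypothetical and are not taken from the paper's code.

```python
from itertools import combinations
from typing import FrozenSet, List, Sequence, Set, Tuple

Bond = Tuple[int, int]


def components(n_atoms: int, bonds: Sequence[Bond]) -> List[Set[int]]:
    """Return the connected components of the molecular graph."""
    adjacency = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen: Set[int] = set()
    result: List[Set[int]] = []
    for start in range(n_atoms):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            atom = stack.pop()
            if atom in comp:
                continue
            comp.add(atom)
            stack.extend(adjacency[atom] - comp)
        seen |= comp
        result.append(comp)
    return result


def acyclic_bonds(n_atoms: int, bonds: Sequence[Bond]) -> List[Bond]:
    """Bonds outside any ring: deleting one adds a connected component."""
    base = len(components(n_atoms, bonds))
    return [
        bond for bond in bonds
        if len(components(n_atoms, [x for x in bonds if x != bond])) > base
    ]


def slice_molecule(
    n_atoms: int, bonds: Sequence[Bond], max_cuts: int = 2
) -> List[Tuple[FrozenSet[int], List[FrozenSet[int]]]]:
    """Enumerate (scaffold, decorations) pairs over combinations of cuts."""
    results = []
    candidates = acyclic_bonds(n_atoms, bonds)
    for n_cuts in range(1, max_cuts + 1):
        for cuts in combinations(candidates, n_cuts):
            kept = [bond for bond in bonds if bond not in cuts]
            fragments = components(n_atoms, kept)
            for scaffold in fragments:
                # A scaffold must hold exactly one end of every cut bond,
                # so each decoration ends up with a single attachment point.
                if all((a in scaffold) != (b in scaffold) for a, b in cuts):
                    decorations = [
                        frozenset(f) for f in fragments if f is not scaffold
                    ]
                    results.append((frozenset(scaffold), decorations))
    return results


# Toy graph (assumption): a six-membered ring (atoms 0-5), a one-atom
# substituent (6) on atom 0 and a two-atom chain (7-8) on atom 3.
n_atoms = 9
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6), (3, 7), (7, 8)]
slicings = slice_molecule(n_atoms, bonds, max_cuts=2)
for scaffold, decorations in slicings:
    print(sorted(scaffold), [sorted(d) for d in decorations])
```

A single cut yields two pairs because either fragment can act as the scaffold. A real implementation would work on toolkit molecule objects, mark attachment points explicitly and filter the pairs by size or chemistry.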

5.
Chimia (Aarau) ; 73(12): 1018-1023, 2019 Dec 18.
Article in English | MEDLINE | ID: mdl-31883554

ABSTRACT

Chemical space is a concept to organize molecular diversity by postulating that different molecules occupy different regions of a mathematical space where the position of each molecule is defined by its properties. Our aim is to develop methods to explicitly explore chemical space in the area of drug discovery. Here we review our implementations of machine learning in this project, including our use of deep neural networks to enumerate the GDB13 database from a small sample set, to generate analogs of drugs and natural products after training with fragment-size molecules, and to predict the polypharmacology of molecules after training with known bioactive compounds from ChEMBL. We also discuss visualization methods for big data as means to keep track and learn from machine learning results. Computational tools discussed in this review are freely available at http://gdb.unibe.ch and https://github.com/reymond-group.

6.
Front Pharmacol ; 10: 1303, 2019.
Article in English | MEDLINE | ID: mdl-31749705

ABSTRACT

In recent years, the development of high-throughput screening (HTS) technologies and their establishment in an industrialized environment have given scientists the possibility to test millions of molecules and profile them against a multitude of biological targets in a short period of time, generating data at a much faster pace and with a higher quality than before. Besides the structure-activity data from traditional bioassays, more complex assays such as transcriptomics profiling or imaging have also been established as routine profiling experiments thanks to the advancement of Next Generation Sequencing or automated microscopy technologies. In industrial pharmaceutical research, these technologies are typically established in conjunction with automated platforms in order to enable efficient handling of screening collections of thousands to millions of compounds. To exploit the ever-growing amount of data that are generated by these approaches, computational techniques are constantly evolving. In this regard, artificial intelligence technologies such as deep learning and machine learning methods play a key role in cheminformatics and bio-image analytics fields to address activity prediction, scaffold hopping, de novo molecule design, reaction/retrosynthesis predictions, or high content screening analysis. Herein we summarize the current state of analyzing large-scale compound data in industrial pharmaceutical research and describe the impact it has had on the drug discovery process over the last two decades, with a specific focus on deep-learning technologies.

7.
J Cheminform ; 11(1): 20, 2019 Mar 12.
Article in English | MEDLINE | ID: mdl-30868314

ABSTRACT

Recent applications of recurrent neural networks (RNNs) enable training models that sample the chemical space. In this study we train RNNs on molecular string representations (SMILES) using a subset of the enumerated database GDB-13 (975 million molecules). We show that a model trained with 1 million structures (0.1% of the database) reproduces 68.9% of the entire database after training, when sampling 2 billion molecules. We also developed a method to assess the quality of the training process using negative log-likelihood plots. Furthermore, we use a mathematical model based on the "coupon collector problem" that compares the trained model to an upper bound and thus we are able to quantify how much it has learned. We also suggest that this method can be used as a tool to benchmark the learning capabilities of any molecular generative model architecture. Additionally, an analysis of the generated chemical space was performed, which shows that, mostly due to the syntax of SMILES, complex molecules with many rings and heteroatoms are more difficult to sample.
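
The upper bound mentioned in this abstract can be approximated with a short calculation. The sketch below assumes the bound corresponds to an ideal sampler that draws every GDB-13 molecule with equal probability, with replacement; the paper's exact formulation may differ. The function name `expected_coverage` is hypothetical, while the database size, sample count and 68.9% figure are the ones quoted in the abstract.

```python
import math


def expected_coverage(database_size: int, samples: int) -> float:
    """Expected fraction of distinct items seen after uniform draws.

    A given item is missed by one draw with probability 1 - 1/N, hence by
    all k draws with probability (1 - 1/N)**k. The coverage is one minus
    that value, computed with log1p/expm1 for numerical stability.
    """
    return -math.expm1(samples * math.log1p(-1.0 / database_size))


database_size = 975_000_000   # GDB-13 size quoted in the abstract
samples = 2_000_000_000       # molecules sampled, as quoted
observed = 0.689              # fraction of GDB-13 reproduced, as quoted

ideal = expected_coverage(database_size, samples)
print(f"ideal uniform sampler covers {ideal:.1%} of the database")
print(f"trained model reaches {observed / ideal:.1%} of that bound")
```

Under this assumption the ratio of observed to ideal coverage gives one number for how close a trained model comes to a perfect uniform sampler.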

8.
J Cheminform ; 11(1): 74, 2019 Dec 03.
Article in English | MEDLINE | ID: mdl-33430938

ABSTRACT

Deep learning methods applied to drug discovery have been used to generate novel structures. In this study, we propose a new deep learning architecture, LatentGAN, which combines an autoencoder and a generative adversarial neural network for de novo molecular design. We applied the method in two scenarios: one to generate random drug-like compounds and another to generate target-biased compounds. Our results show that the method works well in both cases. Sampled compounds from the trained model can largely occupy the same chemical space as the training set and also generate a substantial fraction of novel compounds. Moreover, the drug-likeness score of compounds sampled from LatentGAN is also similar to that of the training set. Lastly, generated compounds differ from those obtained with a Recurrent Neural Network-based generative model approach, indicating that both methods can be used complementarily.

9.
J Cheminform ; 11(1): 71, 2019 Nov 21.
Article in English | MEDLINE | ID: mdl-33430971

ABSTRACT

Recurrent Neural Networks (RNNs) trained with a set of molecules represented as unique (canonical) SMILES strings have shown the capacity to create large chemical spaces of valid and meaningful structures. Herein we perform an extensive benchmark on models trained with subsets of GDB-13 of different sizes (1 million, 10,000 and 1000), with different SMILES variants (canonical, randomized and DeepSMILES), with two different recurrent cell types (LSTM and GRU) and with different hyperparameter combinations. To guide the benchmarks, new metrics were developed that define how well a model has generalized the training set. The generated chemical space is evaluated with respect to its uniformity, closedness and completeness. Results show that models that use LSTM cells trained with 1 million randomized SMILES, a non-unique molecular string representation, are able to generalize to larger chemical spaces than the other approaches and represent the target chemical space more accurately. Specifically, a model was trained with randomized SMILES that was able to generate almost all molecules from GDB-13 with a quasi-uniform probability. Models trained with smaller samples show an even bigger improvement when trained with randomized SMILES. Additionally, models were trained on molecules obtained from ChEMBL and illustrate again that training with randomized SMILES leads to models having a better representation of the drug-like chemical space. Namely, the model trained with randomized SMILES was able to generate at least double the number of unique molecules with the same distribution of properties compared to one trained with canonical SMILES.

10.
Chimia (Aarau) ; 72(1): 70-71, 2018 Feb 01.
Article in English | MEDLINE | ID: mdl-29490798
11.
Chimia (Aarau) ; 71(10): 661-666, 2017 Oct 25.
Article in English | MEDLINE | ID: mdl-29070411

ABSTRACT

Chemical space describes all possible molecules as well as multi-dimensional conceptual spaces representing the structural diversity of these molecules. Part of this chemical space is available in public databases ranging from thousands to billions of compounds. Exploiting these databases for drug discovery represents a typical big data problem limited by computational power, data storage and data access capacity. Here we review recent developments of our laboratory, including progress in the chemical universe databases (GDB) and the fragment subset FDB-17, tools for ligand-based virtual screening by nearest neighbor searches, such as our multi-fingerprint browser for the ZINC database to select purchasable screening compounds, and their application to discover potent and selective inhibitors for calcium channel TRPV6 and Aurora A kinase, the polypharmacology browser (PPB) for predicting off-target effects, and finally interactive 3D-chemical space visualization using our online tools WebDrugCS and WebMolCS. All resources described in this paper are available for public use at www.gdb.unibe.ch.
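
Ligand-based virtual screening by nearest neighbor search, as mentioned in this abstract, can be sketched in a few lines. The example assumes fingerprints stored as sets of "on" bit indices and ranks by Tanimoto similarity; the tools described in the abstract may use other fingerprints and distance measures. The compound names, bit sets and function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared bits divided by bits set in either."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_neighbors(
    query: Fingerprint, library: Dict[str, Fingerprint], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k library compounds most similar to the query."""
    scored = [(tanimoto(query, fp), name) for name, fp in library.items()]
    # Sort by descending similarity, breaking ties by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:k]]


# Toy library (assumption): three compounds with made-up fingerprint bits.
library = {
    "cmpd_A": frozenset({1, 4, 7, 9}),
    "cmpd_B": frozenset({1, 4, 8}),
    "cmpd_C": frozenset({2, 3, 5}),
}
query = frozenset({1, 4, 7})
hits = nearest_neighbors(query, library, k=2)
print(hits)
```

A linear scan like this is adequate for small sets only; searching databases of millions to billions of compounds needs indexed or approximate search.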


Subject(s)
Chemical Databases , Drug Discovery
12.
J Chem Inf Model ; 57(11): 2707-2718, 2017 Nov 27.
Article in English | MEDLINE | ID: mdl-29019686

ABSTRACT

Here, we explore the chemical space of all virtually possible organic molecules focusing on ring systems, which represent the cyclic cores of organic molecules obtained by removing all acyclic bonds and converting all remaining atoms to carbon. This approach circumvents the combinatorial explosion encountered when enumerating the molecules themselves. We report the chemical universe database GDB4c containing 916 130 ring systems up to four saturated or aromatic rings and maximum ring size of 14 atoms and GDB4c3D containing the corresponding 6 555 929 stereoisomers. Almost all (98.6%) of these ring systems are unknown and represent chiral 3D-shaped macrocycles containing small rings and quaternary centers reminiscent of polycyclic natural products. We envision that GDB4c can serve to select new ring systems from which to design analogs of such natural products. The database is available for download at www.gdb.unibe.ch together with interactive visualization and search tools as a resource for molecular design.


Subject(s)
Chemical Databases , Informatics/methods , Organic Chemicals/chemistry , Molecular Models , Molecular Conformation , User-Computer Interface