Results 1 - 20 of 253
1.
Cell ; 178(1): 91-106.e23, 2019 06 27.
Article in English | MEDLINE | ID: mdl-31178116

ABSTRACT

Alternative polyadenylation (APA) is a major driver of transcriptome diversity in human cells. Here, we use deep learning to predict APA from DNA sequence alone. We trained our model (APARENT, APA REgression NeT) on isoform expression data from over 3 million APA reporters. APARENT's predictions are highly accurate when tasked with inferring APA in synthetic and human 3'UTRs. Visualizing features learned across all network layers reveals that APARENT recognizes sequence motifs known to recruit APA regulators, discovers previously unknown sequence determinants of 3' end processing, and integrates these features into a comprehensive, interpretable, cis-regulatory code. We apply APARENT to forward engineer functional polyadenylation signals with precisely defined cleavage position and isoform usage and validate predictions experimentally. Finally, we use APARENT to quantify the impact of genetic variants on APA. Our approach detects pathogenic variants in a wide range of disease contexts, expanding our understanding of the genetic origins of disease.


Subject(s)
Deep Learning , Genetic Models , Polyadenylation/genetics , 3' Untranslated Regions/genetics , Base Sequence/genetics , Genetic Databases , Gene Expression/genetics , HEK293 Cells , Humans , Mutagenesis/genetics , RNA Cleavage/genetics , Messenger RNA/genetics , RNA-Seq , Synthetic Biology , Transcriptome
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-39007594

ABSTRACT

Artificial intelligence (AI)-driven methods can vastly improve the historically costly drug design process, with various generative models already in widespread use. Generative models for de novo drug design, in particular, focus on the creation of novel biological compounds entirely from scratch, representing a promising future direction. Rapid development in the field, combined with the inherent complexity of the drug design process, creates a difficult landscape for new researchers to enter. In this survey, we organize de novo drug design into two overarching themes: small molecule and protein generation. Within each theme, we identify a variety of subtasks and applications, highlighting important datasets, benchmarks, and model architectures and comparing the performance of top models. We take a broad approach to AI-driven drug design, allowing for both micro-level comparisons of various methods within each subtask and macro-level observations across different fields. We discuss parallel challenges and approaches between the two applications and highlight future directions for AI-driven de novo drug design as a whole. An organized repository of all covered sources is available at https://github.com/gersteinlab/GenAI4Drug.


Subject(s)
Artificial Intelligence , Drug Design , Proteins , Humans , Computational Biology/methods , Proteins/chemistry
3.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38647154

ABSTRACT

Molecular generative models have exhibited promising capabilities in designing molecules from scratch with high binding affinities in a predetermined protein pocket, offering potential synergies with traditional structure-based drug design strategies. However, the generative processes of such models are random, and the atomic interaction information between ligand and protein is ignored. On the other hand, ligands have a high propensity to bind to residues called hotspots. Hotspot residues contribute the majority of the binding free energy and have been recognized as appealing targets for designed molecules. In this work, we develop an interaction prompt-guided diffusion model, InterDiff, to address these challenges. Four kinds of atomic interactions are involved in our model and represented as learnable vector embeddings. These embeddings serve as conditions on individual residues to guide the molecular generative process. Comprehensive in silico experiments show that our model can generate molecules with desired ligand-protein interactions in a guidable way. Furthermore, we validate InterDiff on two realistic protein-based therapeutic agents. Results show that InterDiff can generate molecules with binding modes better than or similar to those of known targeted drugs.


Subject(s)
Proteins , Proteins/chemistry , Proteins/metabolism , Ligands , Protein Binding , Drug Design , Molecular Models , Algorithms , Binding Sites , Computer Simulation
4.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38980369

ABSTRACT

Recent studies have extensively used deep learning algorithms to analyze gene expression to predict disease diagnosis, treatment effectiveness, and survival outcomes. Survival analysis studies on diseases with high mortality rates, such as cancer, are indispensable. However, deep learning models are plagued by overfitting owing to the limited sample size relative to the large number of genes. Consequently, the latest style-transfer deep generative models have been implemented to generate gene expression data. However, these models are limited in their applicability for clinical purposes because they generate only transcriptomic data. Therefore, this study proposes ctGAN, which enables the combined transformation of gene expression and survival data using a generative adversarial network (GAN). ctGAN improves survival analysis by augmenting data through style transformations between breast cancer and 11 other cancer types. We evaluated the concordance index (C-index) enhancements compared with previous models to demonstrate its superiority. Performance improvements were observed in nine of the 11 cancer types. Moreover, ctGAN outperformed previous models in seven out of the 11 cancer types, with colon adenocarcinoma (COAD) exhibiting the most significant improvement (median C-index increase of ~15.70%). Furthermore, integrating the generated COAD enhanced the log-rank p-value (0.041) compared with using only the real COAD (p-value = 0.797). Based on the data distribution, we demonstrated that the model generated highly plausible data. In clustering evaluation, ctGAN exhibited the highest performance in most cases (89.62%). These findings suggest that ctGAN can be meaningfully utilized to predict disease progression and select personalized treatments in the medical field.


Subject(s)
Deep Learning , Humans , Survival Analysis , Algorithms , Neoplasms/genetics , Neoplasms/mortality , Gene Expression Profiling/methods , Neural Networks (Computer) , Computational Biology/methods , Breast Neoplasms/genetics , Breast Neoplasms/mortality , Female , Neoplastic Gene Expression Regulation
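
The abstract of entry 4 evaluates survival models by their concordance index (C-index). As a minimal sketch of what that metric measures, and not the ctGAN authors' evaluation code, the following Python computes Harrell's C-index for right-censored data. The function name `concordance_index` and the toy inputs are assumptions introduced for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk.

    times  -- observed follow-up times
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    risks  -- predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the subject
        # with the shorter time actually had the event (was not censored).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk failed earlier: correct order
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the second one censored.
toy_times = [5, 8, 12, 3]
toy_events = [1, 0, 1, 1]
toy_risks = [0.8, 0.3, 0.2, 0.9]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a reported increase in C-index means a larger share of patient pairs is ordered correctly.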
5.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38960404

ABSTRACT

Recent advances in microfluidics and sequencing technologies allow researchers to explore cellular heterogeneity at single-cell resolution. In recent years, deep learning frameworks, such as generative models, have brought great changes to the analysis of transcriptomic data. Nevertheless, relying on the latent space of these generative models alone is insufficient to generate biological explanations. In addition, most previous work based on generative models is limited to shallow neural networks with one to three layers of latent variables, which may limit the capabilities of the models. Here, we propose a deep interpretable generative model called d-scIGM for single-cell data analysis. d-scIGM combines sawtooth connectivity techniques and residual networks, thereby constructing a deep generative framework. In addition, d-scIGM incorporates hierarchical prior knowledge of biological domains to enhance the interpretability of the model. We show that d-scIGM achieves excellent performance in a variety of fundamental tasks, including clustering, visualization, and pseudo-temporal inference. Through topic pathway studies, we found that d-scIGM-learned topics are better enriched for biologically meaningful pathways compared to the baseline models. Furthermore, the analysis of drug response data shows that d-scIGM can capture drug response patterns in large-scale experiments, which provides a promising way to elucidate the underlying biological mechanisms. Lastly, in the melanoma dataset, d-scIGM accurately identified different cell types and revealed multiple melanin-related driver genes and key pathways, which are critical for understanding disease mechanisms and drug development.


Subject(s)
Deep Learning , RNA-Seq , Single-Cell Gene Expression Analysis , Humans , Algorithms , Computational Biology/methods , Neural Networks (Computer) , RNA-Seq/methods , Single-Cell Gene Expression Analysis/methods
6.
Proc Natl Acad Sci U S A ; 120(48): e2312848120, 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-37983512

ABSTRACT

The availability of natural protein sequences synergized with generative AI provides new paradigms to engineer enzymes. Although active enzyme variants with numerous mutations have been designed using generative models, their performance often falls short of their wild type counterparts. Additionally, in practical applications, choosing fewer mutations that can rival the efficacy of extensive sequence alterations is usually more advantageous. Pinpointing beneficial single mutations continues to be a formidable task. In this study, using the generative maximum entropy model to analyze Renilla luciferase (RLuc) homologs, and in conjunction with biochemistry experiments, we demonstrated that natural evolutionary information could be used to predictively improve enzyme activity and stability by engineering the active center and protein scaffold, respectively. The success rate to improve either luciferase activity or stability of designed single mutants is ~50%. This finding highlights nature's ingenious approach to evolving proficient enzymes, wherein diverse evolutionary pressures are preferentially applied to distinct regions of the enzyme, ultimately culminating in an overall high performance. We also reveal an evolutionary preference in RLuc toward emitting blue light that holds advantages in terms of water penetration compared to other light spectra. Taken together, our approach facilitates navigation through enzyme sequence space and offers effective strategies for computer-aided rational enzyme engineering.


Subject(s)
Light , Mutation , Renilla Luciferases/genetics , Renilla Luciferases/metabolism , Enzyme Stability
7.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36575569

ABSTRACT

Single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) has been a powerful technology for transcriptome analysis. However, the systematic validation of the diverse computational tools used in scRNA-seq analysis remains challenging. Here, we propose a novel simulation tool, termed Simulation of Cellular Heterogeneity (SimCH), for the flexible and comprehensive assessment of scRNA-seq computational methods. A Gaussian copula framework is used to retain the gene coexpression of experimental data, which has been shown to be associated with cellular heterogeneity. The synthetic count matrices generated by suitable SimCH modes closely match experimental data originating from either homogeneous or heterogeneous cell populations and from either unique molecular identifier (UMI)-based or non-UMI-based techniques. We demonstrate how SimCH can benchmark several types of computational methods, including cell clustering, discovery of differentially expressed genes, trajectory inference, batch correction and imputation. Moreover, we show how SimCH can be used to conduct power evaluation of cell clustering methods. Given these merits, we believe that SimCH can accelerate single-cell research.


Subject(s)
RNA , Single-Cell Analysis , RNA/genetics , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Cluster Analysis , Gene Expression
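
The abstract of entry 7 reports that SimCH uses a Gaussian copula to preserve gene coexpression. A minimal sketch of the general idea follows, assuming just two genes with Poisson marginals; SimCH's actual marginal models and parameters are not given in the abstract, so every name and value here (`poisson_ppf`, `simulate_pair`, `rho`, the Poisson means) is hypothetical. Correlated standard normals are mapped to uniforms with the normal CDF and then pushed through each gene's inverse CDF.

```python
import math
import random
from statistics import NormalDist


def poisson_ppf(u, lam):
    """Inverse Poisson CDF: smallest k whose cumulative probability is >= u."""
    u = min(u, 1.0 - 1e-12)  # guard against u rounding to exactly 1.0
    k = 0
    pmf = math.exp(-lam)     # P(X = 0)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= lam / k       # recurrence P(X = k) = P(X = k - 1) * lam / k
        cdf += pmf
    return k


def simulate_pair(n_cells, rho, lam_a, lam_b, seed=0):
    """Simulate counts for two genes whose dependence follows a Gaussian copula.

    rho is the correlation of the underlying normals (assumed |rho| < 1);
    lam_a and lam_b are the Poisson means of the two genes.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    counts = []
    for _ in range(n_cells):
        # Step 1: draw a correlated pair of standard normals.
        z_a = rng.gauss(0.0, 1.0)
        z_b = rho * z_a + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Step 2: map to uniforms; the dependence structure is kept.
        u_a, u_b = std_normal.cdf(z_a), std_normal.cdf(z_b)
        # Step 3: apply each gene's own inverse CDF to get counts.
        counts.append((poisson_ppf(u_a, lam_a), poisson_ppf(u_b, lam_b)))
    return counts


synthetic_counts = simulate_pair(n_cells=2000, rho=0.8, lam_a=5.0, lam_b=5.0, seed=1)
```

The design point is that the marginal distribution of each gene and the dependence between genes are specified separately, which is what lets a simulator match per-gene count distributions while still reproducing coexpression.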
8.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37670499

ABSTRACT

Proteolysis targeting chimera (PROTAC) has emerged as an effective modality to selectively degrade disease-related proteins by harnessing the ubiquitin-proteasome system. PROTACs are hetero-bifunctional: a linker joins a warhead that binds a protein of interest (POI), conferring specificity, to an E3 ligand that binds an E3 ubiquitin ligase. This can trigger the ubiquitination of the POI and its transport to the proteasome, followed by degradation. Rational PROTAC linker design is challenging because of the relatively large molecular weight of PROTACs and the complexity of maintaining the binding modes of the warhead and the E3 ligand in their respective binding pockets. Conventional linker generation methods can only generate linkers as either 1D SMILES or 2D graphs, without taking into account the information of ternary structures. Here we propose a novel 3D linker generative model, PROTAC-INVENT, which can generate not only the SMILES of a PROTAC but also its putative 3D binding conformation coupled with the target protein and the E3 ligase. The model is trained jointly with a reinforcement learning (RL) approach to bias the generation of PROTAC structures toward pre-defined 2D- and 3D-based properties. Examples are provided to demonstrate the utility of the model for generating reasonable 3D conformations of PROTACs. In addition, our results show that the associated workflow for 3D PROTAC conformation generation can also be used as an efficient docking protocol for PROTACs.


Subject(s)
Learning , Proteasome Endopeptidase Complex , Ligands , Cytoplasm , Proteolysis Targeting Chimera
9.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36869850

ABSTRACT

Alignment is the cornerstone of many long-read pipelines and plays an essential role in resolving structural variants (SVs). However, forced alignment of SVs embedded in long reads, inflexibility in integrating novel SV models, and computational inefficiency remain problems. Here, we investigate the feasibility of resolving long-read SVs with alignment-free algorithms. We ask: (1) Is it possible to resolve long-read SVs with alignment-free approaches? and (2) Does doing so provide an advantage over existing approaches? To this end, we implemented a framework named Linear, which can flexibly integrate alignment-free algorithms, such as a generative model for long-read SV detection. Furthermore, Linear addresses the compatibility of alignment-free approaches with existing software: it takes long reads as input and outputs standardized results that existing software can process directly. We conducted large-scale assessments, and the results show that Linear exceeds alignment-based pipelines in sensitivity and flexibility. Moreover, it is orders of magnitude faster computationally.


Subject(s)
Human Genome , Software , Humans , Algorithms , Sequence Analysis , Statistical Models , DNA Sequence Analysis/methods , High-Throughput Nucleotide Sequencing
10.
Brief Bioinform ; 24(6)2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37756591

ABSTRACT

In drug discovery, a key problem is how to improve biological activity and ADMET properties starting from a specific structure, a process also called structural optimization. Using a deep generative model to generate molecules with desired drug-like properties from a starting scaffold would provide a powerful tool to accelerate structural optimization. However, existing generative models still struggle to extract molecular features efficiently in 3D space in order to generate drug-like 3D molecules. Moreover, most existing ADMET prediction models predict different properties with a single model, which can reduce prediction accuracy on some datasets. To effectively generate molecules from a specific scaffold and provide a basis for structural optimization, we present 3D-SMGE (3-Dimensional Scaffold-based Molecular Generation and Evaluation), which consists of molecular generation and prediction of ADMET properties. For molecular generation, we propose 3D-SMG, a novel deep generative model for the end-to-end design of 3D molecules. In the 3D-SMG model, we designed the cross-aggregated continuous-filter convolution (ca-cfconv), which achieves efficient, low-cost 3D spatial feature extraction while ensuring invariance to rotation of the atomic coordinates. 3D-SMG was shown to generate valid, unique and novel molecules with high drug-likeness. In addition, the proposed data-adaptive multi-model ADMET prediction method outperformed or matched the best evaluation metrics on 24 of 27 ADMET benchmark datasets. 3D-SMGE is anticipated to become a powerful tool for hit-to-lead structural optimization and to accelerate the drug discovery process.
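
The abstract of entry 10 states that ca-cfconv preserves invariance under rotation of the atoms in space. One common way to obtain this property, assumed here purely for illustration because the abstract does not describe how ca-cfconv achieves it, is to compute features from interatomic distances rather than raw coordinates. The sketch below checks numerically that pairwise distances do not change when a toy molecule is rotated; the helper names and coordinates are hypothetical.

```python
import math


def pairwise_distances(coords):
    """Matrix of Euclidean distances between all pairs of 3D points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def rotate_z(coords, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


# Hypothetical three-atom geometry (arbitrary units).
toy_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]
rotated_atoms = rotate_z(toy_atoms, 1.234)

before = pairwise_distances(toy_atoms)
after = pairwise_distances(rotated_atoms)

# The coordinates change, but every interatomic distance is preserved
# up to floating-point error.
max_difference = max(
    abs(a - b) for row_a, row_b in zip(before, after) for a, b in zip(row_a, row_b)
)
```

Any feature extractor that sees only `before` (or a function of it) therefore gives the same output for a molecule and its rotated copy, which is the kind of invariance the abstract refers to.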

11.
Bioinformatics ; 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39172488

ABSTRACT

MOTIVATION: Single-cell RNA sequencing (scRNA-seq) enables comprehensive characterization of the cell state. However, its destructive nature prohibits measuring gene expression changes during dynamic processes such as embryogenesis. Although recent studies integrating scRNA-seq with lineage tracing have provided clonal insights between progenitor and mature cells, challenges remain. Because of their experimental nature, observations are sparse, and cells observed in the early state are not the exact progenitors of cells observed at later time points. To overcome these limitations, we developed LineageVAE, a novel computational methodology that utilizes deep learning based on the property that cells sharing barcodes have identical progenitors. RESULTS: LineageVAE is a deep generative model that transforms scRNA-seq observations with identical lineage barcodes into sequential trajectories toward a common progenitor in a latent cell state space. This method enables the reconstruction of unobservable cell state transitions, historical transcriptomes, and regulatory dynamics at a single-cell resolution. Applied to hematopoiesis and reprogrammed fibroblast datasets, LineageVAE demonstrated its ability to restore backward cell state transitions and infer progenitor heterogeneity and transcription factor activity along differentiation trajectories. AVAILABILITY AND IMPLEMENTATION: The LineageVAE model was implemented in Python using the PyTorch deep learning library. The code is available on GitHub at https://github.com/LzrRacer/LineageVAE/. SUPPLEMENTARY INFORMATION: Available at Bioinformatics online.

12.
Nano Lett ; 24(15): 4447-4453, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38588344

ABSTRACT

Modern microscopy techniques can be used to investigate soft nano-objects at the nanometer scale. However, time-consuming microscopy measurements combined with low numbers of observable polydisperse objects often limit the statistics. We propose a method for identifying the most representative objects from their respective point clouds. These point cloud data are obtained, for example, through the localization of single emitters in super-resolution fluorescence microscopy. External stimuli, such as temperature, can cause changes in the shape and properties of adaptive objects. Due to the demanding and time-consuming nature of super-resolution microscopy experiments, only a limited number of temperature steps can be performed. Therefore, we propose a deep generative model that learns the underlying point distribution of temperature-dependent microgels, enabling the reliable generation of unlimited samples with an arbitrary number of localizations. Our method greatly cuts down the data collection effort across diverse experimental conditions, proving invaluable for soft condensed matter studies.

13.
Small ; : e2402685, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38770745

ABSTRACT

Designing novel materials is greatly dependent on understanding the design principles, physical mechanisms, and modeling methods of material microstructures, requiring experienced designers with expertise and several rounds of trial and error. Although recent advances in deep generative networks have enabled the inverse design of material microstructures, most studies involve property-conditional generation and focus on a specific type of structure, resulting in limited generation diversity and poor human-computer interaction. In this study, a pioneering text-to-microstructure deep generative network (Txt2Microstruct-Net) is proposed that enables the generation of 3D material microstructures directly from text prompts without additional optimization procedures. The Txt2Microstruct-Net model is trained on a large microstructure-caption paired dataset that is extensible using the algorithms provided. Moreover, the model is sufficiently flexible to generate different geometric representations, such as voxels and point clouds. The model's performance is also demonstrated in the inverse design of material microstructures and metamaterials. It has promising potential for interactive microstructure design when associated with large language models and could be a user-friendly tool for material design and discovery.

14.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35039853

ABSTRACT

Deep learning shortens the drug discovery cycle through its success in extracting features of molecules and proteins. Generating new molecules with deep learning methods could enlarge the molecular space and yield molecules with specific properties. However, this is a challenging task because the connections between atoms are constrained by chemical rules. To generate and optimize new valid molecules, this article proposes the Molecular Substructure Tree Generative Model, in which a molecule is generated by adding substructures gradually. The proposed model is based on the Variational Auto-Encoder architecture: an encoder maps molecules to a latent vector space, and an autoregressive generative model serves as the decoder, generating new molecules from a Gaussian distribution. In addition, for the molecular optimization task, a molecular optimization model based on CycleGAN was constructed. Experiments showed that the model could generate valid and novel molecules, and that the optimization model effectively improved molecular properties.


Subject(s)
Drug Design , Molecular Models , Drug Discovery
15.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35724626

ABSTRACT

Deep learning is an artificial intelligence technique in which models express geometric transformations over multiple levels. This method has shown great promise in various fields, including drug development. The availability of public structure databases has prompted researchers to use generative artificial intelligence models to narrow their search of chemical space, a novel approach to chemogenomics and de novo drug development. In this study, we developed a strategy that combines an accelerated LSTM_Chem (long short-term memory for de novo compound generation), a dense fully convolutional neural network (DFCNN), and docking to generate a large number of de novo small-molecule compounds for given targets. To demonstrate its efficacy and applicability, six important targets that account for various human disorders were used as test examples. Moreover, using the M protease as a proof-of-concept example, we find that iteratively training with previously selected candidates can significantly increase the chance of obtaining novel compounds with progressively higher predicted binding affinities. In addition, we examine the potential benefit of MD simulation and metadynamics simulation for obtaining reliable final de novo compounds. The generation of de novo compounds and the discovery of binders against various targets proposed here offer a practical and effective approach. Assessing the efficacy of the top de novo compounds in biochemical studies should help advance related drug development.


Subject(s)
Deep Learning , Artificial Intelligence , Computer Simulation , Drug Design , Humans , Neural Networks (Computer)
16.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35830870

ABSTRACT

We construct a protein-protein interaction (PPI)-targeted drug-likeness dataset and propose a deep molecular generative framework to generate novel drug-like molecules from the features of seed compounds. This framework draws inspiration from published molecular generative models, uses the key features associated with PPI inhibitors as input, and develops deep molecular generative models for de novo molecular design of PPI inhibitors. For the first time, a quantitative estimation index for compounds targeting PPIs was applied to the evaluation of a molecular generation model for de novo design of PPI-targeted compounds. Our results indicate that the generated molecules had better PPI-targeted drug-likeness and drug-likeness. Additionally, our model exhibits performance comparable to several other state-of-the-art molecule generation models. The generated molecules share chemical space with iPPI-DB inhibitors, as demonstrated by chemical space analysis. The peptide characterization-oriented design of PPI inhibitors and the ligand-based design of PPI inhibitors are explored. Finally, we suggest that this framework will be an important step forward for the de novo design of PPI-targeted therapeutics.


Subject(s)
Drug Design , Neural Networks (Computer) , Ligands , Molecular Models
17.
J Magn Reson Imaging ; 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39052258

ABSTRACT

BACKGROUND: There is increasing interest in utilizing AI-generated content for gadolinium-free contrast-enhanced breast MRI. PURPOSE: To develop a generative model for gadolinium-free contrast-enhanced breast MRI and evaluate the diagnostic utility of the generated scans. STUDY TYPE: Retrospective. POPULATION: Two hundred seventy-six women with 304 breast MRI examinations (49 ± 13 years, 243/61 for training/testing). FIELD STRENGTH/SEQUENCE: ZOOMit diffusion-weighted imaging (DWI), T1-weighted volumetric interpolated breath-hold examination (T1W VIBE), and axial T2 3D SPACE at 3.0 T. ASSESSMENT: A generative model was developed to generate contrast-enhanced scans using precontrast T1W VIBE and DWI images. The generated and real images were quantitatively compared using the structural similarity index (SSIM), mean absolute error (MAE), and Dice similarity coefficient. Three radiologists with 8, 5, and 5 years of experience independently rated the image quality and lesion visibility on AI-generated and real images within various subgroups using a five-point scale. Four breast radiologists, with 8, 8, 5, and 5 years of experience, independently and blindly interpreted four reading protocols: unenhanced MRI protocol alone and combined with AI-generated scans, abbreviated MRI protocol, and full-MRI protocol. STATISTICAL ANALYSIS: Results were assessed using t-tests and McNemar tests. Using pathology diagnosis as reference standard, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated for each reading protocol. A P value <0.05 was considered significant. RESULTS: In the test set, the generated images showed similarity to the real images (SSIM: 0.935 ± 0.047 [SD], MAE: 0.015 ± 0.012 [SD], and Dice coefficient: 0.726 ± 0.177 [SD]). No significant difference in lesion visibility was observed between real and AI-generated scans of the mass, non-mass, and benign lesion subgroups. Adding AI-generated scans to the unenhanced MRI protocol slightly improved breast cancer detection (sensitivity: 92.86% vs. 85.71%, NPV: 76.92% vs. 70.00%) and achieved non-inferior diagnostic utility compared to the AB-MRI protocol and the full protocol (sensitivity: 92.86%, 95.24%; NPV: 75.00%, 81.82%). DATA CONCLUSION: AI-generated gadolinium-free contrast-enhanced breast MRI has potential to improve the sensitivity of unenhanced MRI in detecting breast cancer. EVIDENCE LEVEL: 4. TECHNICAL EFFICACY: Stage 3.
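
The abstract of entry 17 reports sensitivity, specificity, PPV, and NPV for each reading protocol, plus MAE and Dice coefficient for image similarity. As a minimal sketch of how these quantities are defined, and not the study's analysis code, the following Python computes them from confusion-matrix counts and from flattened toy arrays. The counts and arrays are invented for illustration and are not taken from the study; SSIM is omitted because it needs a windowed computation beyond the scope of a short example.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # detected cancers / all cancers
        "specificity": tn / (tn + fp),  # correct negatives / all non-cancers
        "ppv": tp / (tp + fp),          # true positives / all positive calls
        "npv": tn / (tn + fn),          # true negatives / all negative calls
    }


def dice_coefficient(mask_a, mask_b):
    """Dice similarity 2|A and B| / (|A| + |B|) for flattened binary masks."""
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    return 2.0 * overlap / total if total else 1.0  # two empty masks agree


def mean_absolute_error(image_a, image_b):
    """Mean absolute intensity difference between two flattened images."""
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)


# Hypothetical counts and arrays, chosen only to make the arithmetic easy.
toy_metrics = diagnostic_metrics(tp=8, fp=3, tn=7, fn=2)
toy_dice = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
toy_mae = mean_absolute_error([0.0, 0.5, 1.0], [0.0, 0.25, 0.5])
```

Sensitivity and NPV both depend on the false-negative count, which is why a protocol that misses fewer cancers improves both figures at once.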

18.
Methods ; 210: 52-59, 2023 02.
Article in English | MEDLINE | ID: mdl-36682423

ABSTRACT

The process of design/discovery of drugs involves the identification and design of novel molecules that have the desired properties and bind well to a given disease-relevant target. One of the main challenges to effectively identify potential drug candidates is to explore the vast drug-like chemical space to find novel chemical structures with desired physicochemical properties and biological characteristics. Moreover, the chemical space of currently available molecular libraries is only a small fraction of the total possible drug-like chemical space. Deep molecular generative models have received much attention and provide an alternative approach to the design and discovery of molecules. To efficiently explore the drug-like space, we first constructed the drug-like dataset and then performed the generative design of drug-like molecules using a Conditional Randomized Transformer approach with the molecular access system (MACCS) fingerprint as a condition and compared it with previously published molecular generative models. The results show that the deep molecular generative model explores the wider drug-like chemical space. The generated drug-like molecules share the chemical space with known drugs, and the drug-like space captured by the combination of quantitative estimation of drug-likeness (QED) and quantitative estimate of protein-protein interaction targeting drug-likeness (QEPPI) can cover a larger drug-like space. Finally, we show the potential application of the model in design of inhibitors of MDM2-p53 protein-protein interaction. Our results demonstrate the potential application of deep molecular generative models for guided exploration in drug-like chemical space and molecular design.


Subject(s)
Drug Design , Molecular Models
19.
Methods ; 211: 10-22, 2023 03.
Article in English | MEDLINE | ID: mdl-36764588

ABSTRACT

Deep learning is improving and changing the process of de novo molecular design at a rapid pace. In recent years, great progress has been made in drug discovery and development by using deep generative models for de novo molecular design. However, most of the existing methods are string-based or graph-based and are limited by the lack of some very important properties, such as the three-dimensional information of molecules. We propose DNMG, a deep generative adversarial network (GAN) combined with transfer learning. Specifically, we use a Wasserstein-variant GAN-based network architecture that considers the 3D grid spatial information of the ligand with atomic physicochemical properties to generate a representation of the molecule, which is then parsed into SMILES strings using an improved captioning network. Comprehensive experiments demonstrate the ability of DNMG to generate valid and novel drug-like ligands. The DNMG model is used to design inhibitors for three targets, MK14, FNTA, and CDK2. The computational results show that the molecules generated by DNMG have better binding ability to the target proteins and better physicochemical properties. Overall, our deep generative model has excellent potential to generate molecules with high binding affinity for targets and explore the space of drug-like chemistry.


Subject(s)
Drug Design , Drug Discovery , Molecular Models , Drug Discovery/methods , Ligands , Proteins
20.
Mol Divers ; 2024 Aug 04.
Article in English | MEDLINE | ID: mdl-39097862

ABSTRACT

The deep molecular generative model has recently become a research hotspot in pharmacy. This paper analyzes a large number of recent reports and reviews these models. In the central part of this paper, four compound databases and two molecular representation methods are compared. Five model architectures and applications for deep molecular generative models are introduced in detail. Three evaluation metrics for model evaluation are listed. Finally, the limitations and challenges in this field are discussed to provide a reference and basis for developing and researching new models in the future.
