1.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38446739

ABSTRACT

Antimicrobial peptides (AMPs), short peptides with diverse functions, effectively target and combat various organisms. The widespread misuse of chemical antibiotics has led to increasing microbial resistance. Due to their low drug resistance and toxicity, AMPs are considered promising substitutes for traditional antibiotics. While existing deep learning technology enhances AMP generation, it also presents certain challenges. Firstly, AMP generation overlooks the complex interdependencies among amino acids. Secondly, current models fail to integrate crucial tasks like screening, attribute prediction and iterative optimization. Consequently, we develop an integrated deep learning framework, Diff-AMP, that automates AMP generation, identification, attribute prediction and iterative optimization. We innovatively integrate kinetic diffusion and attention mechanisms into the reinforcement learning framework for efficient AMP generation. Additionally, our prediction module incorporates pre-training and transfer learning strategies for precise AMP identification and screening. We employ a convolutional neural network for multi-attribute prediction and a reinforcement learning-based iterative optimization strategy to produce diverse AMPs. This framework automates molecule generation, screening, attribute prediction and optimization, thereby advancing AMP research. We have also deployed Diff-AMP on a web server, with code, data and server details available in the Data Availability section.


Subjects
Amino Acids , Antimicrobial Peptides , Anti-Bacterial Agents , Diffusion , Kinetics
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38600666

ABSTRACT

Predicting the drug response of cancer cell lines is crucial for advancing personalized cancer treatment, yet remains challenging due to tumor heterogeneity and individual diversity. In this study, we present a deep learning-based framework named Deep neural network Integrating Prior Knowledge (DIPK), which adopts self-supervised techniques to integrate multiple valuable sources of information, including gene interaction relationships, gene expression profiles and molecular topologies, to enhance prediction accuracy and robustness. We demonstrated the superior performance of DIPK compared to existing methods on both known and novel cells and drugs, underscoring the importance of gene interaction relationships in drug response prediction. In addition, DIPK extends its applicability to single-cell RNA sequencing data, showcasing its capability for single-cell-level response prediction and cell identification. Further, we assess the applicability of DIPK on clinical data. DIPK accurately predicted a higher response to paclitaxel in the pathological complete response (pCR) group compared to the residual disease group, affirming the better response of the pCR group to the chemotherapy compound. We believe that the integration of DIPK into clinical decision-making processes has the potential to enhance individualized treatment strategies for cancer patients.


Subjects
Deep Learning , Neoplasms , Humans , Neural Networks, Computer , Neoplasms/drug therapy , Neoplasms/genetics , Cell Line
3.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36526280

ABSTRACT

Graph neural networks based on deep learning methods have been extensively applied to molecular property prediction because of their powerful feature learning ability and good performance. However, most of them are black boxes and cannot give a reasonable explanation of the underlying prediction mechanisms, which seriously reduces people's trust in neural network-based prediction models. Here we proposed a novel graph neural network named iteratively focused graph network (IFGN), which can gradually identify the key atoms/groups in the molecule that are closely related to the predicted properties through a multistep focus mechanism. At the same time, combining the multistep focus mechanism with visualization can also generate multistep interpretations, thus allowing us to gain a deep understanding of the predictive behaviors of the model. On all eight datasets studied, the IFGN model achieved good prediction performance, indicating that the proposed multistep focus mechanism can also improve the performance of the model in addition to increasing its interpretability. For the convenience of researchers, the corresponding website (http://graphadmet.cn/works/IFGN) was also developed and can be used free of charge.


Subjects
Learning , Neural Networks, Computer , Humans , Research Personnel
4.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37427977

ABSTRACT

Studies have shown that the mechanism of action of many drugs is related to miRNA. In-depth research on the relationship between miRNA and drugs can provide theoretical foundations and practical approaches for various areas, such as drug target discovery, drug repositioning and biomarker research. Traditional biological experiments to test miRNA-drug susceptibility are costly and time-consuming. Thus, sequence- or topology-based deep learning methods are recognized in this field for their efficiency and accuracy. However, these methods have limitations in dealing with sparse topologies and higher-order information of miRNA (drug) features. In this work, we propose GCFMCL, a model for multi-view contrastive learning based on graph collaborative filtering. To the best of our knowledge, this is the first attempt to incorporate a contrastive learning strategy into the graph collaborative filtering framework to predict the sensitivity relationships between miRNA and drug. The proposed multi-view contrastive learning method is divided into a topological contrastive objective and a feature contrastive objective: (1) For the homogeneous neighbors of the topological graph, we propose a novel topological contrastive learning method that constructs the contrastive target from the topological neighborhood information of nodes. (2) The proposed model obtains feature contrastive targets from high-order feature information according to the correlation of node features, and mines potential neighborhood relationships in the feature space. The proposed multi-view contrastive learning effectively alleviates the impact of heterogeneous node noise and graph data sparsity in graph collaborative filtering, and significantly enhances the performance of the model. Our study employs a dataset derived from the NoncoRNA and ncDR databases, encompassing 2049 experimentally validated miRNA-drug sensitivity associations. Five-fold cross-validation shows that the Area Under the Curve (AUC), Area Under the Precision-Recall Curve (AUPR) and F1-score (F1) of GCFMCL reach 95.28%, 95.66% and 89.77%, outperforming the state-of-the-art (SOTA) method by margins of 2.73%, 3.42% and 4.96%, respectively. Our code and data can be accessed at https://github.com/kkkayle/GCFMCL.


Subjects
Drug Delivery Systems , MicroRNAs , Area Under Curve , Databases, Factual , Drug Discovery , MicroRNAs/genetics
5.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37099690

ABSTRACT

Rapid and accurate prediction of drug-target affinity can accelerate and improve the drug discovery process. Recent studies show that deep learning models may have the potential to provide fast and accurate drug-target affinity prediction. However, the existing deep learning models still have their own disadvantages that make it difficult to complete the task satisfactorily. Complex-based models rely heavily on the time-consuming docking process, and complex-free models lack interpretability. In this study, we introduced a novel knowledge-distillation insights drug-target affinity prediction model with feature fusion inputs to make fast, accurate and explainable predictions. We benchmarked the model on public affinity prediction and virtual screening datasets. The results show that it outperformed previous state-of-the-art models and achieved comparable performance to previous complex-based models. Finally, we study the interpretability of this model through visualization and find it can provide meaningful explanations for pairwise interaction. We believe this model can further improve the drug-target affinity prediction for its higher accuracy and reliable interpretability.


Subjects
Benchmarking , Drug Discovery , Drug Delivery Systems
6.
Methods ; 227: 17-26, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38705502

ABSTRACT

Messenger RNA (mRNA) is vital for post-transcriptional gene regulation, acting as the direct template for protein synthesis. However, the methods available for predicting mRNA subcellular localization need to be improved and enhanced. Notably, few existing algorithms can annotate mRNA sequences with multiple localizations. In this work, we propose the mRNA-CLA, an innovative multi-label subcellular localization prediction framework for mRNA, leveraging a deep learning approach with a multi-head self-attention mechanism. The framework employs a multi-scale convolutional layer to extract sequence features across different regions and uses a self-attention mechanism explicitly designed for each sequence. Paired with Position Weight Matrices (PWMs) derived from the convolutional neural network layers, our model offers interpretability in the analysis. In particular, we perform a base-level analysis of mRNA sequences from diverse subcellular localizations to determine the nucleotide specificity corresponding to each site. Our evaluations demonstrate that the mRNA-CLA model substantially outperforms existing methods and tools.


Subjects
Deep Learning , RNA, Messenger , RNA, Messenger/genetics , RNA, Messenger/metabolism , Computational Biology/methods , Neural Networks, Computer , Humans , Algorithms
7.
Proc Natl Acad Sci U S A ; 119(31): e2204114119, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35878019

ABSTRACT

The lack of effective and safe analgesics for chronic pain management has been a health problem associated with people's livelihoods for many years. Analgesic peptides have recently shown significant therapeutic potential, as they are devoid of opioid-related adverse effects. Programmed cell death protein 1 (PD-1) is widely expressed in neurons. Activation of PD-1 by PD-L1 modulates neuronal excitability and evokes significant analgesic effects, making it a promising target for pain treatment. However, the research and development of small molecule analgesic peptides targeting PD-1 have not been reported. Here, we screened the peptide H-20 using high-throughput screening. The in vitro data demonstrated that H-20 binds to PD-1 with micromolar affinity, evokes Src homology 2 domain-containing tyrosine phosphatase 1 (SHP-1) phosphorylation, and diminishes nociceptive signals in dorsal root ganglion (DRG) neurons. Preemptive treatment with H-20 effectively attenuates perceived pain in naïve WT mice. Spinal H-20 administration displayed effective and longer-lasting analgesia in multiple preclinical pain models with a reduction in or absence of tolerance, abuse liability, constipation, itch, and motor coordination impairment. In summary, our findings reveal that H-20 is a promising candidate drug that ameliorates chronic pain in the clinic.


Subjects
Analgesics , Chronic Pain , Peptides , Programmed Cell Death 1 Receptor , Analgesics/pharmacology , Analgesics, Opioid , Animals , Chronic Pain/drug therapy , Ganglia, Spinal/metabolism , Mice , Peptides/pharmacology , Programmed Cell Death 1 Receptor/metabolism
8.
Mol Cancer ; 23(1): 129, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38902727

ABSTRACT

Malignant tumors have increasing morbidity and high mortality, and their occurrence and development is a complicated process. The development of sequencing technologies has enabled us to gain a better understanding of the underlying genetic and molecular mechanisms in tumors. In recent years, spatial transcriptomics sequencing technologies have developed rapidly and allow the quantification and illustration of gene expression in the spatial context of tissues. Compared with traditional transcriptomics technologies, spatial transcriptomics technologies not only detect gene expression levels in cells, but also reveal the spatial location of genes within tissues, the cell composition of biological tissues, and the interactions between cells. Here we summarize the development of spatial transcriptomics technologies, spatial transcriptomics tools and their applications in cancer research. We also discuss the limitations and challenges of current spatial transcriptomics approaches, as well as future development and prospects.


Subjects
Gene Expression Profiling , Neoplasms , Transcriptome , Humans , Neoplasms/genetics , Neoplasms/pathology , Animals , Gene Expression Regulation, Neoplastic , Computational Biology/methods , Biomarkers, Tumor/genetics
9.
Bioconjug Chem ; 35(5): 638-652, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38669628

ABSTRACT

Aberrant canonical NF-κB signaling has been implicated in diseases, such as autoimmune disorders and cancer. Direct disruption of the interaction of NEMO and IKKα/β has been developed as a novel way to inhibit the overactivation of NF-κB. Peptides are a potential solution for disrupting protein-protein interactions (PPIs); however, they typically suffer from poor stability in vivo and limited tissue penetration, hampering their widespread use as new chemical biology tools and potential therapeutics. In this work, decafluorobiphenyl-cysteine SNAr chemistry, molecular modeling, and biological validation allowed the development of peptide PPI inhibitors. The resulting cyclic peptide specifically inhibited canonical NF-κB signaling in vitro and in vivo, and presented positive metabolic stability, anti-inflammatory effects, and low cytotoxicity. Importantly, our results also revealed that cyclic peptides had huge potential in acute lung injury (ALI) treatment, and confirmed the role of the decafluorobiphenyl-based cyclization strategy in enhancing the biological activity of peptide NEMO-IKKα/β inhibitors. Moreover, it provided a promising method for the development of peptide-PPI inhibitors.


Subjects
Acute Lung Injury , I-kappa B Kinase , Lipopolysaccharides , Peptides, Cyclic , I-kappa B Kinase/metabolism , I-kappa B Kinase/antagonists & inhibitors , Acute Lung Injury/drug therapy , Acute Lung Injury/chemically induced , Acute Lung Injury/metabolism , Animals , Mice , Peptides, Cyclic/chemistry , Peptides, Cyclic/pharmacology , Humans , NF-kappa B/metabolism , Protein Binding , Cyclization
10.
J Chem Inf Model ; 64(7): 2798-2806, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37643082

ABSTRACT

Plant small secretory peptides (SSPs) play an important role in the regulation of biological processes in plants. Accurately predicting SSPs enables efficient exploration of their functions. Traditional experimental verification methods are very reliable and accurate, but they require expensive equipment and a lot of time. Machine learning speeds up the prediction of SSPs, but the instability of feature extraction limits this type of method. Therefore, this paper proposes a new feature-correction-based model for SSP recognition in plants, abbreviated as SE-SSP. The model has three main advantages: First, the use of transformer encoders can better reveal implicit features. Second, we design a feature correction module suitable for sequences, named 2-D SENET, to adaptively adjust the features to obtain a more robust feature representation. Third, we stack multiple linear modules to further extract deep information from the samples. At the same time, training based on a contrastive learning strategy can alleviate the problem of sparse samples. We conduct experiments on publicly available data sets, and the results verify that our model shows excellent performance. The proposed model can be used as a convenient and effective SSP prediction tool in the future. Our data and code are publicly available at https://github.com/wrab12/SE-SSP/.


Subjects
Electric Power Supplies , Machine Learning , Biological Transport , Peptides , Research Design
11.
J Chem Inf Model ; 64(7): 2912-2920, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37920888

ABSTRACT

Deep learning methods can accurately study noncoding RNA protein interactions (NPI), which is of great significance in gene regulation, human disease, and other fields. However, the computational method for predicting NPI in large-scale dynamic ncRNA protein bipartite graphs is rarely discussed, which is an online modeling and prediction problem. In addition, the results published by researchers on the Web site cannot meet real-time needs due to the large amount of basic data and long update cycles. Therefore, we propose a real-time method based on the dynamic ncRNA-protein bipartite graph learning framework, termed ML-GNN, which can model and predict the NPIs in real time. Our proposed method has the following advantages: first, the meta-learning strategy can alleviate the problem of large prediction errors in sparse neighborhood samples; second, dynamic modeling of newly added data can reduce computational pressure and predict NPIs in real-time. In the experiment, we built a dynamic bipartite graph based on 300000 NPIs from the NPInterv4.0 database. The experimental results indicate that our model achieved excellent performance in multiple experiments. The code for the model is available at https://github.com/taowang11/ML-NPI, and the data can be downloaded freely at http://bigdata.ibp.ac.cn/npinter4.


Subjects
RNA, Untranslated , Research Personnel , Humans , Databases, Factual , RNA, Untranslated/genetics
12.
J Chem Inf Model ; 64(4): 1213-1228, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38302422

ABSTRACT

Deep learning-based de novo molecular design has recently gained significant attention. While numerous DL-based generative models have been successfully developed for designing novel compounds, the majority of the generated molecules lack sufficiently novel scaffolds or high drug-like profiles. The aforementioned issues may not be fully captured by commonly used metrics for the assessment of molecular generative models, such as novelty, diversity, and quantitative estimation of the drug-likeness score. To address these limitations, we proposed a genetic algorithm-guided generative model called GARel (genetic algorithm-based receptor-ligand interaction generator), a novel framework for training a DL-based generative model to produce drug-like molecules with novel scaffolds. To efficiently train the GARel model, we utilized dense net to update the parameters based on molecules with novel scaffolds and drug-like features. To demonstrate the capability of the GARel model, we used it to design inhibitors for three targets: AA2AR, EGFR, and SARS-Cov2. The results indicate that GARel-generated molecules feature more diverse and novel scaffolds and possess more desirable physicochemical properties and favorable docking scores. Compared with other generative models, GARel makes significant progress in balancing novelty and drug-likeness, providing a promising direction for the further development of DL-based de novo design methodology with potential impacts on drug discovery.


Subjects
Drug Design , RNA, Viral , Ligands , Algorithms , Drug Discovery
13.
Bioorg Chem ; 142: 106952, 2024 01.
Article in English | MEDLINE | ID: mdl-37952486

ABSTRACT

PARP1 is a multifaceted component of DNA repair and chromatin remodeling, making it an effective therapeutic target for cancer therapy. The recently reported proteolytic targeting chimera (PROTAC) could effectively degrade PARP1 through the ubiquitin-proteasome pathway, expanding the therapeutic application of PARP1 blocking. In this study, a series of nitrogen heterocyclic PROTACs were designed and synthesized through ternary complex simulation analysis based on our previous work. Our efforts have resulted in a potent PARP1 degrader, D6 (DC50 = 25.23 nM), with high selectivity due to the nitrogen heterocyclic linker specifically generating multiple interactions with the PARP1-CRBN PPI surface. Moreover, D6 exhibited strong cytotoxicity to the triple-negative breast cancer cell line MDA-MB-231 (IC50 = 1.04 µM). The proteomic results showed that D6 exerts its antitumor effect by intensifying DNA damage through interception of the CDC25C-CDK1 axis, halting cell cycle transition in triple-negative breast cancer cells. Furthermore, in an in vivo study, D6 showed a promising PK property with moderate oral absorption activity. D6 could also effectively inhibit tumor growth (TGI rate = 71.4% at 40 mg/kg) without other signs of toxicity in MDA-MB-231 tumor-bearing mice. In summary, we have identified an original scaffold and potent PARP1 PROTAC that provided a novel intervention strategy for the treatment of triple-negative breast cancer.


Subjects
Triple Negative Breast Neoplasms , Humans , Mice , Animals , Triple Negative Breast Neoplasms/pathology , Proteomics , Cell Proliferation , Cell Cycle Checkpoints , Nitrogen , Cell Line, Tumor , cdc25 Phosphatases , Poly (ADP-Ribose) Polymerase-1 , CDC2 Protein Kinase
14.
BMC Genomics ; 24(1): 742, 2023 Dec 05.
Article in English | MEDLINE | ID: mdl-38053026

ABSTRACT

BACKGROUND: DNA methylation, instrumental in numerous life processes, underscores the paramount importance of its accurate prediction. Recent studies suggest that deep learning, due to its capacity to extract profound insights, provides a more precise DNA methylation prediction. However, issues related to the stability and generalization performance of these models persist. RESULTS: In this study, we introduce an efficient and stable DNA methylation prediction model. This model incorporates a feature fusion approach, adaptive feature correction technology, and a contrastive learning strategy. The proposed model presents several advantages. First, DNA sequences are encoded at four levels to comprehensively capture intricate information across multi-scale and low-span features. Second, we design a sequence-specific feature correction module that adaptively adjusts the weights of sequence features. This improvement enhances the model's stability and scalability, or its generality. Third, our contrastive learning strategy mitigates the instability issues resulting from sparse data. To validate our model, we conducted multiple sets of experiments on commonly used datasets, demonstrating the model's robustness and stability. Simultaneously, we amalgamate various datasets into a single, unified dataset. The experimental outcomes from this combined dataset substantiate the model's robust adaptability. CONCLUSIONS: Our research findings affirm that the StableDNAm model is a general, stable, and effective instrument for DNA methylation prediction. It holds substantial promise for providing invaluable assistance in future methylation-related research and analyses.


Subjects
DNA Methylation , Protein Processing, Post-Translational
15.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32778891

ABSTRACT

Deep learning is an important branch of artificial intelligence that has been successfully applied to medicine and two-dimensional ligand design. Three-dimensional (3D) ligand generation in the 3D pocket of a protein target is an interesting and challenging issue for drug design by deep learning. Here, the MolAICal software is introduced to provide a way to generate 3D drugs in the 3D pocket of protein targets by combining the merits of deep learning models and classical algorithms. The MolAICal software mainly contains two modules for 3D drug design. The first module employs a genetic algorithm, a deep learning model trained on FDA-approved drug fragments and Vinardo score fitting on the basis of the PDBbind database for drug design. The second module uses a deep learning generative model trained on drug-like molecules from the ZINC database and molecular docking invoked automatically through AutoDock Vina. In addition, Lipinski's rule of five, Pan-assay interference compounds (PAINS), synthetic accessibility (SA) and other user-defined rules are introduced for filtering out unwanted ligands in MolAICal. To demonstrate the drug design modules of MolAICal, the membrane protein glucagon receptor and the non-membrane protein SARS-CoV-2 main protease are chosen as the investigative drug targets. The results show that MolAICal can generate diverse and novel ligands with good binding scores and appropriate XLOGP values. We believe that MolAICal can use the advantages of deep learning models and classical programming for designing 3D drugs in protein pockets. MolAICal is free for any nonprofit purpose and is accessible at https://molaical.github.io.


Subjects
Algorithms , Artificial Intelligence , Drug Design , Proteins/chemistry , Software , Databases, Protein , Quantitative Structure-Activity Relationship
16.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33147620

ABSTRACT

MOTIVATION: Computational methods accelerate drug discovery and play an important role in biomedicine, such as molecular property prediction and compound-protein interaction (CPI) identification. A key challenge is to learn useful molecular representation. In the early years, molecular properties are mainly calculated by quantum mechanics or predicted by traditional machine learning methods, which requires expert knowledge and is often labor-intensive. Nowadays, graph neural networks have received significant attention because of the powerful ability to learn representation from graph data. Nevertheless, current graph-based methods have some limitations that need to be addressed, such as large-scale parameters and insufficient bond information extraction. RESULTS: In this study, we proposed a graph-based approach and employed a novel triplet message mechanism to learn molecular representation efficiently, named triplet message networks (TrimNet). We show that TrimNet can accurately complete multiple molecular representation learning tasks with significant parameter reduction, including the quantum properties, bioactivity, physiology and CPI prediction. In the experiments, TrimNet outperforms the previous state-of-the-art method by a significant margin on various datasets. Besides the few parameters and high prediction accuracy, TrimNet could focus on the atoms essential to the target properties, providing a clear interpretation of the prediction tasks. These advantages have established TrimNet as a powerful and useful computational tool in solving the challenging problem of molecular representation learning. AVAILABILITY: The quantum and drug datasets are available on the website of MoleculeNet: http://moleculenet.ai. The source code is available in GitHub: https://github.com/yvquanli/trimnet. CONTACT: xjyao@lzu.edu.cn, songsen@tsinghua.edu.cn.


Subjects
Drug Discovery , Machine Learning , Software
17.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33940598

ABSTRACT

How to produce expressive molecular representations is a fundamental challenge in artificial intelligence-driven drug discovery. Graph neural network (GNN) has emerged as a powerful technique for modeling molecular data. However, previous supervised approaches usually suffer from the scarcity of labeled data and poor generalization capability. Here, we propose a novel molecular pre-training graph-based deep learning framework, named MPG, that learns molecular representations from large-scale unlabeled molecules. In MPG, we proposed a powerful GNN for modelling molecular graph named MolGNet, and designed an effective self-supervised strategy for pre-training the model at both the node and graph-level. After pre-training on 11 million unlabeled molecules, we revealed that MolGNet can capture valuable chemical insights to produce interpretable representation. The pre-trained MolGNet can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of drug discovery tasks, including molecular properties prediction, drug-drug interaction and drug-target interaction, on 14 benchmark datasets. The pre-trained MolGNet in MPG has the potential to become an advanced molecular encoder in the drug discovery pipeline.


Subjects
Databases, Chemical , Drug Delivery Systems , Drug Discovery , Models, Molecular , Neural Networks, Computer
18.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33418562

ABSTRACT

Machine-learning (ML)-based scoring functions (MLSFs) have gradually emerged as a promising alternative for protein-ligand binding affinity prediction and structure-based virtual screening. However, clouds of doubts have still been raised against the benefits of this novel type of scoring functions (SFs). In this study, to benchmark the performance of target-specific MLSFs on a relatively unbiased dataset, the MLSFs trained from three representative protein-ligand interaction representations were assessed on the LIT-PCBA dataset, and the classical Glide SP SF and three types of ligand-based quantitative structure-activity relationship (QSAR) models were also utilized for comparison. Two major aspects in virtual screening campaigns, including prediction accuracy and hit novelty, were systematically explored. The calculation results illustrate that the tested target-specific MLSFs yielded generally superior performance over the classical Glide SP SF, but they could hardly outperform the 2D fingerprint-based QSAR models. Although substantial improvements could be achieved by integrating multiple types of protein-ligand interaction features, the MLSFs were still not sufficient to exceed MACCS-based QSAR models. In terms of the correlations between the hit ranks or the structures of the top-ranked hits, the MLSFs developed by different featurization strategies would have the ability to identify quite different hits. Nevertheless, it seems that target-specific MLSFs do not have the intrinsic attributes of a traditional SF and may not be a substitute for classical SFs. In contrast, MLSFs can be regarded as a new derivative of ligand-based QSAR models. It is expected that our study may provide valuable guidance for the assessment and further development of target-specific MLSFs.


Subjects
Databases, Protein , Machine Learning , Molecular Docking Simulation , Proteins/chemistry , Ligands , Quantitative Structure-Activity Relationship
19.
Brief Bioinform ; 22(1): 497-514, 2021 01 18.
Article in English | MEDLINE | ID: mdl-31982914

ABSTRACT

How to accurately estimate protein-ligand binding affinity remains a key challenge in computer-aided drug design (CADD). In many cases, it has been shown that the binding affinities predicted by classical scoring functions (SFs) cannot correlate well with experimentally measured biological activities. In the past few years, machine learning (ML)-based SFs have gradually emerged as potential alternatives and outperformed classical SFs in a series of studies. In this study, to better recognize the potential of classical SFs, we have conducted a comparative assessment of 25 commonly used SFs. Accordingly, the scoring power was systematically estimated by using the state-of-the-art ML methods that replaced the original multiple linear regression method to refit individual energy terms. The results show that the newly-developed ML-based SFs consistently performed better than classical ones. In particular, gradient boosting decision tree (GBDT) and random forest (RF) achieved the best predictions in most cases. The newly-developed ML-based SFs were also tested on another benchmark modified from PDBbind v2007, and the impacts of structural and sequence similarities were evaluated. The results indicated that the superiority of the ML-based SFs could be fully guaranteed when sufficient similar targets were contained in the training set. Moreover, the effect of the combinations of features from multiple SFs was explored, and the results indicated that combining NNscore2.0 with one to four other classical SFs could yield the best scoring power. However, it was not applicable to derive a generic target-specific SF or SF combination.


Subjects
Drug Development/methods , Machine Learning/standards , Proteomics/methods , Animals , Drug Development/standards , Humans , Ligands , Protein Binding , Proteome/metabolism , Proteomics/standards
20.
Phys Rev Lett ; 130(5): 052302, 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36800457

ABSTRACT

Many transport coefficients of the quark-gluon plasma and nuclear structure functions can be written as gauge invariant correlation functions of non-Abelian field strengths dressed with Wilson lines. We discuss the applicability of axial gauge n·A=0 to calculate them. In particular, we address issues that appear when one attempts to trivialize the Wilson lines in the correlation functions by gauge fixing. We find it is always impossible to completely remove the gauge fields n·A in Wilson lines that extend to infinity in the n direction by means of gauge transformations. We show how the obstruction appears in an explicit example of a perturbative calculation, and we also explain it more generally from the perspective of the path integral that defines the theory. Our results explain why the two correlators that define the heavy quark and quarkonium transport coefficients, which are seemingly equal in axial gauge, are actually different physical quantities of the quark-gluon plasma and have different values. Furthermore, our findings provide insights into the difference between two inequivalent gluon parton distribution functions.
