Results 1 - 13 of 13
1.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38084920

ABSTRACT

Protein-ligand binding affinity (PLBA) prediction is a fundamental task in drug discovery. Recently, various deep learning-based models have predicted binding affinity by taking the three-dimensional (3D) structure of protein-ligand complexes as input, and have achieved astounding progress. However, due to the scarcity of high-quality training data, the generalization ability of current models is still limited. Although a vast amount of affinity data is available in large-scale databases such as ChEMBL, issues such as inconsistent affinity measurement labels (i.e. IC50, Ki, Kd), different experimental conditions, and the lack of available 3D binding structures complicate the development of high-precision affinity prediction models using these data. To address these issues, we (i) propose Multi-task Bioassay Pre-training (MBP), a pre-training framework for structure-based PLBA prediction; and (ii) construct a pre-training dataset called ChEMBL-Dock with more than 300k experimentally measured affinity labels and about 2.8M docked 3D structures. By introducing multi-task pre-training to treat the prediction of different affinity labels as different tasks and classifying relative rankings between samples from the same bioassay, MBP learns robust and transferable structural knowledge from our new ChEMBL-Dock dataset with varied and noisy labels. Experiments substantiate the capability of MBP on the structure-based PLBA prediction task. To the best of our knowledge, MBP is the first affinity pre-training model and shows great potential for future development. The MBP web server is now available for free at: https://huggingface.co/spaces/jiaxianustc/mbp.


Subjects
Drug Discovery , Proteins , Ligands , Proteins/chemistry , Protein Binding , Affinity Labels
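
The abstract of record 1 mentions classifying relative rankings between samples from the same bioassay. A minimal sketch of how such within-assay training pairs might be built follows. The record layout `(assay_id, sample_id, affinity)`, the function name `within_assay_pairs`, and the decision to skip ties are illustrative assumptions, not details taken from the MBP paper.

```python
from collections import defaultdict
from itertools import combinations


def within_assay_pairs(records):
    """Build (id_a, id_b, label) pairs from samples that share an assay.

    label is 1 when sample a has the larger affinity value, else 0.
    Pairs with equal values are skipped (an assumption of this sketch).
    """
    by_assay = defaultdict(list)
    for assay_id, sample_id, affinity in records:
        by_assay[assay_id].append((sample_id, affinity))

    pairs = []
    for samples in by_assay.values():
        # Only compare measurements made under the same assay conditions.
        for (id_a, val_a), (id_b, val_b) in combinations(samples, 2):
            if val_a == val_b:
                continue
            pairs.append((id_a, id_b, 1 if val_a > val_b else 0))
    return pairs


# Made-up records, for illustration only.
records = [
    ("assay1", "x", 7.0),
    ("assay1", "y", 6.0),
    ("assay2", "z", 5.0),
    ("assay2", "w", 5.0),
    ("assay2", "v", 8.0),
]
pairs = within_assay_pairs(records)
```

Restricting comparisons to a single assay avoids ranking values that were measured under different experimental conditions, which is the source of label noise the abstract describes.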
2.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37328552

ABSTRACT

AlphaFold-Multimer has greatly improved protein complex structure prediction, but its accuracy also depends on the quality of the multiple sequence alignment (MSA) formed by the interacting homologs (i.e. interologs) of the complex under prediction. Here we propose a novel method, ESMPair, that can identify interologs of a complex using protein language models. We show that ESMPair can generate better interologs than the default MSA generation method in AlphaFold-Multimer. Our method results in better complex structure prediction than AlphaFold-Multimer by a large margin (+10.7% in terms of the Top-5 best DockQ), especially when the predicted complex structures have low confidence. We further show that by combining several MSA generation methods, we may obtain even better complex structure prediction accuracy than AlphaFold-Multimer (+22% in terms of the Top-5 best DockQ). By systematically analyzing the factors that affect our algorithm, we find that the diversity of the MSA of interologs significantly affects the prediction accuracy. Moreover, we show that ESMPair performs particularly well on complexes in eukaryotes.


Subjects
Algorithms , Proteins , Proteins/chemistry , Sequence Alignment , Eukaryota/metabolism
3.
Commun Chem ; 6(1): 123, 2023 Jun 14.
Article in English | MEDLINE | ID: mdl-37316673

ABSTRACT

Mutation-induced drug resistance is a significant challenge to the clinical treatment of many diseases, as structural changes in proteins can diminish drug efficacy. Understanding how mutations affect protein-ligand binding affinities is crucial for developing new drugs and therapies. However, the lack of a large-scale and high-quality database has hindered research progress in this area. To address this issue, we have developed MdrDB, a database that integrates data from seven publicly available datasets and is the largest database of its kind. By integrating information on drug sensitivity and cell line mutations from Genomics of Drug Sensitivity in Cancer and DepMap, MdrDB has substantially expanded the existing drug resistance data. MdrDB comprises 100,537 samples of 240 proteins (which encompass 5119 total PDB structures), 2503 mutations, and 440 drugs. Each sample brings together 3D structures of wild-type and mutant protein-ligand complexes, binding affinity changes upon mutation (ΔΔG), and biochemical features. Experimental results with MdrDB demonstrate its effectiveness in significantly enhancing the performance of commonly used machine learning models when predicting ΔΔG in three standard benchmarking scenarios. In conclusion, MdrDB is a comprehensive database that can advance the understanding of mutation-induced drug resistance and accelerate the discovery of novel chemicals.

4.
Chem Sci ; 14(8): 2054-2069, 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36845922

ABSTRACT

Metalloproteins play indispensable roles in various biological processes ranging from reaction catalysis to free radical scavenging, and they are also pertinent to numerous pathologies including cancer, HIV infection, neurodegeneration, and inflammation. Discovery of high-affinity ligands for metalloproteins powers the treatment of these pathologies. Extensive efforts have been made to develop in silico approaches, such as molecular docking and machine learning (ML)-based models, for fast identification of ligands binding to heterogeneous proteins, but few of them have exclusively concentrated on metalloproteins. In this study, we first compiled the largest metalloprotein-ligand complex dataset containing 3079 high-quality structures, and systematically evaluated the scoring and docking powers of three competitive docking tools (i.e., PLANTS, AutoDock Vina and Glide SP) for metalloproteins. Then, a structure-based deep graph model called MetalProGNet was developed to predict metalloprotein-ligand interactions. In the model, the coordination interactions between metal ions and protein atoms and the interactions between metal ions and ligand atoms were explicitly modelled through graph convolution. The binding features were then predicted by the informative molecular binding vector learned from a noncovalent atom-atom interaction network. The evaluation on the internal metalloprotein test set, the independent ChEMBL dataset towards 22 different metalloproteins and the virtual screening dataset indicated that MetalProGNet outperformed various baselines. Finally, a noncovalent atom-atom interaction masking technique was employed to interpret MetalProGNet, and the learned knowledge accords with our understanding of physics.

5.
Cell Regen ; 12(1): 8, 2023 Jan 05.
Article in English | MEDLINE | ID: mdl-36600111

ABSTRACT

Inflammatory bowel disease (IBD) is a chronic inflammatory condition caused by multiple genetic and environmental factors. Numerous genes are implicated in the etiology of IBD, but the diagnosis of IBD is challenging. Here, XGBoost, a machine learning prediction model, has been used to distinguish IBD from healthy cases following elaborate feature selection. Using combined unsupervised clustering analysis and the XGBoost feature selection method, we successfully identified a 32-gene signature that can predict IBD occurrence in new cohorts with an accuracy of 0.8651. The signature shows enrichment in neutrophil extracellular trap formation and cytokine signaling in the immune system. The probability threshold of the XGBoost-based classification model can be adjusted to fit personalized lifestyle and health status. Therefore, this study reveals potential IBD-related biomarkers that facilitate an effective personalized diagnosis of IBD.
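
The abstract of record 5 notes that the probability threshold of the classifier can be adjusted. The sketch below illustrates, in plain Python, how moving that threshold trades sensitivity against specificity. The function names, the example labels, and the example probabilities are all made up for illustration; they are not the study's model output, and no XGBoost call is shown.

```python
def classify(probabilities, threshold=0.5):
    """Label a case 1 (IBD) when its predicted probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels and predicted probabilities, for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.6, 0.4, 0.55, 0.3, 0.1]

for threshold in (0.5, 0.35):
    sens, spec = sensitivity_specificity(y_true, classify(probs, threshold))
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the threshold flags more cases as positive, so it can only keep or raise sensitivity while keeping or lowering specificity; which balance is appropriate depends on the clinical context.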

6.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35262669

ABSTRACT

Drug resistance is a major threat to global health and a significant concern throughout the clinical treatment of diseases and drug development. Mutation in proteins that is related to drug binding is a common cause of adaptive drug resistance. Therefore, quantitative estimates of how mutations affect the interaction between a drug and its target protein would be of vital significance for drug development and clinical practice. Computational methods that rely on molecular dynamics simulations and Rosetta protocols, as well as machine learning methods, have been shown to be capable of predicting ligand affinity changes upon protein mutation. However, severely limited sample sizes and heavy noise induce overfitting and generalization issues that have impeded wide adoption of machine learning for studying drug resistance. In this paper, we propose a robust machine learning method, termed SPLDExtraTrees, which can accurately predict ligand binding affinity changes upon protein mutation and identify resistance-causing mutations. Specifically, the proposed method ranks training data following a specific scheme that starts with easy-to-learn samples and gradually incorporates harder and more diverse samples into the training, and then iterates between sample weight recalculations and model updates. In addition, we calculate additional physics-based structural features to provide the machine learning model with valuable domain knowledge on proteins for these data-limited predictive tasks. The experiments substantiate the capability of the proposed method for predicting kinase inhibitor resistance under three scenarios, achieving predictive accuracy comparable with that of molecular dynamics and Rosetta methods at much lower computational cost.


Subjects
Machine Learning , Proteins , Ligands , Molecular Dynamics Simulation , Mutation , Proteins/chemistry
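
The abstract of record 6 describes a training scheme that starts with easy samples, admits harder ones over time, and alternates between recomputing sample weights and updating the model. The sketch below shows only the generic hard-threshold form of self-paced weighting. It is not the SPLDExtraTrees procedure: the diversity term mentioned in the abstract is omitted, and the function names, the threshold `lam`, and the geometric growth schedule are assumptions made for illustration.

```python
def self_paced_weights(losses, lam):
    """Hard self-paced weights: keep a sample only if its current loss is below lam."""
    return [1.0 if loss < lam else 0.0 for loss in losses]


def pace_schedule(lam0, growth, rounds):
    """Thresholds that grow each round, so harder samples are admitted later."""
    return [lam0 * growth ** r for r in range(rounds)]


# Made-up per-sample losses, for illustration only. In a real loop these
# would be recomputed from the refitted model at the start of every round.
losses = [0.1, 0.5, 2.0]

for lam in pace_schedule(lam0=0.5, growth=2.0, rounds=3):
    weights = self_paced_weights(losses, lam)
    print(f"lam={lam}: weights={weights}")
```

In a full loop, each round would refit a model that accepts per-sample weights using the current `weights`, then recompute `losses` before the next threshold is applied.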
7.
Nucleic Acids Res ; 50(1): 46-56, 2022 01 11.
Article in English | MEDLINE | ID: mdl-34850940

ABSTRACT

Clustering cells and depicting the lineage relationship among cell subpopulations are fundamental tasks in single-cell omics studies. However, existing analytical methods face challenges in stratifying cells, tracking cellular trajectories, and identifying critical points of cell transitions. To overcome these challenges, we proposed a novel Markov hierarchical clustering algorithm (MarkovHC), a topological clustering method that leverages the metastability of exponentially perturbed Markov chains for systematically reconstructing the cellular landscape. Briefly, MarkovHC starts with local connectivity and density derived from the input and outputs a hierarchical structure for the data. We first benchmarked MarkovHC on five simulated datasets and ten public single-cell datasets with known labels. Then, we used MarkovHC to investigate the multi-level architectures and transition processes during human embryo preimplantation development and gastric cancer progression. MarkovHC found heterogeneous cell states and sub-cell types in lineage-specific progenitor cells and revealed the most probable transition paths and critical points in the cellular processes. These results demonstrated MarkovHC's effectiveness in facilitating the stratification of cells, identification of cell populations, and characterization of cellular trajectories and critical points.


Subjects
Computational Biology/methods , Single-Cell Analysis/methods , Blastocyst/cytology , Blastocyst/metabolism , Carcinogenesis/genetics , Carcinogenesis/metabolism , Cell Lineage , Humans , Markov Chains
8.
Commun Biol ; 4(1): 1420, 2021 12 21.
Article in English | MEDLINE | ID: mdl-34934174

ABSTRACT

Elevated aldehyde dehydrogenase (ALDH) activity correlates with poor outcome for many solid tumors, as ALDHs may regulate cell proliferation and chemoresistance of cancer stem cells (CSCs). Accordingly, potent and selective inhibitors of key ALDH enzymes may represent a novel CSC-directed treatment paradigm for ALDH+ cancer types. Of the many ALDH isoforms, we and others have implicated the elevated expression of ALDH1A3 in mesenchymal glioma stem cells (MES GSCs) as a target for the development of novel therapeutics. To this end, our structure of human ALDH1A3 combined with in silico modeling identifies a selective, active-site inhibitor of ALDH1A3. The lead compound, MCI-INI-3, is a selective competitive inhibitor of human ALDH1A3 and shows a poor inhibitory effect on the structurally related isoform ALDH1A1. Mass spectrometry-based cellular thermal shift analysis reveals that ALDH1A3 is the primary binding protein for MCI-INI-3 in MES GSC lysates. The inhibitory effect of MCI-INI-3 on retinoic acid biosynthesis is comparable with that of ALDH1A3 knockout, suggesting that effective inhibition of ALDH1A3 is achieved with MCI-INI-3. Further development is warranted to characterize the role of ALDH1A3 and retinoic acid biosynthesis in glioma stem cell growth and differentiation.


Subjects
Aldehyde Oxidoreductases/antagonists & inhibitors , Glioma/metabolism , Neoplastic Stem Cells/metabolism , Tretinoin/metabolism , Humans
9.
Adv Sci (Weinh) ; 8(24): e2102092, 2021 12.
Article in English | MEDLINE | ID: mdl-34723439

ABSTRACT

Combination therapy has long been used in cancer treatment to overcome drug resistance related to monotherapy. Increased pharmacological data and the rapid development of deep learning methods have enabled the construction of models to predict and screen drug pairs. However, the size of drug libraries is restricted to hundreds to thousands of compounds. The ScaffComb framework, which aims to bridge the gaps in the virtual screening of drug combinations in large-scale databases, is proposed here. Inspired by phenotype-based drug design, ScaffComb integrates phenotypic information into molecular scaffolds, which can be used to screen the drug library and identify potent drug combinations. First, ScaffComb is validated using the US Food and Drug Administration dataset, and known drug combinations are successfully reidentified. Then, ScaffComb is applied to screen the ZINC and ChEMBL databases, which yields novel drug combinations and reveals an ability to discover new synergistic mechanisms. To our knowledge, ScaffComb is the first method to use phenotype-based virtual screening of drug combinations in large-scale chemical datasets.


Subjects
Antineoplastic Agents/therapeutic use , Datasets as Topic/statistics & numerical data , Drug Evaluation, Preclinical/methods , Neoplasms/drug therapy , Cell Line, Tumor , Drug Combinations , Drug Design , Humans , Phenotype
10.
Proc Natl Acad Sci U S A ; 116(31): 15696-15705, 2019 07 30.
Article in English | MEDLINE | ID: mdl-31308225

ABSTRACT

The neuronal cell death-promoting loss of cytoplasmic K+ following injury is mediated by an increase in Kv2.1 potassium channels in the plasma membrane. This phenomenon relies on Kv2.1 binding to syntaxin 1A via 9 amino acids within the channel intrinsically disordered C terminus. Preventing this interaction with a cell and blood-brain barrier-permeant peptide is neuroprotective in an in vivo stroke model. Here a rational approach was applied to define the key molecular interactions between syntaxin and Kv2.1, some of which are shared with mammalian uncoordinated-18 (munc18). Armed with this information, we found a small molecule Kv2.1-syntaxin-binding inhibitor (cpd5) that improves cortical neuron survival by suppressing SNARE-dependent enhancement of Kv2.1-mediated currents following excitotoxic injury. We validated that cpd5 selectively displaces Kv2.1-syntaxin-binding peptides from syntaxin and, at higher concentrations, munc18, but without affecting either synaptic or neuronal intrinsic properties in brain tissue slices at neuroprotective concentrations. Collectively, our findings provide insight into the role of syntaxin in neuronal cell death and validate an important target for neuroprotection.


Subjects
Brain/metabolism , Neuroprotective Agents , Shab Potassium Channels/metabolism , Syntaxin 1/metabolism , Animals , Munc18 Proteins/metabolism , Neuroprotective Agents/chemistry , Neuroprotective Agents/pharmacology , Rats , SNARE Proteins/metabolism
11.
PLoS Comput Biol ; 14(12): e1006651, 2018 12.
Article in English | MEDLINE | ID: mdl-30532261

ABSTRACT

An expanded chemical space is essential for improved identification of small molecules for emerging therapeutic targets. However, the identification of targets for novel compounds is biased towards the synthesis of known scaffolds that bind familiar protein families, limiting the exploration of chemical space. To change this paradigm, we validated a new pipeline that identifies small molecule-protein interactions and works even for compounds lacking similarity to known drugs. Based on differential mRNA profiles in multiple cell types exposed to drugs and in which gene knockdowns (KD) were conducted, we showed that drugs induce gene regulatory networks that correlate with those produced after silencing protein-coding genes. Next, we applied supervised machine learning to exploit drug-KD signature correlations and enriched our predictions using an orthogonal structure-based screen. As a proof-of-principle for this regimen, top-10/top-100 target prediction accuracies of 26% and 41%, respectively, were achieved on a validation set of 152 FDA-approved drugs and 3104 potential targets. We then predicted targets for 1680 compounds and validated chemical interactors with four targets that have proven difficult to chemically modulate, including non-covalent inhibitors of HRAS and KRAS. Importantly, drug-target interactions manifest as gene expression correlations between drug treatment and both target gene KD and KD of genes that act up- or down-stream of the target, even for relatively weak binders. These correlations provide new insights into the cellular response to disrupting protein interactions and highlight the complex genetic phenotypes of drug treatment. With further refinement, our pipeline may accelerate the identification and development of novel chemical classes by screening compound-target interactions.


Subjects
Drug Discovery/methods , Gene Expression Profiling/methods , Proteins/chemistry , Proteins/drug effects , Cell Line , Computational Biology , Computer Simulation , Databases, Nucleic Acid/statistics & numerical data , Drug Discovery/statistics & numerical data , Drug Evaluation, Preclinical/methods , Drug Evaluation, Preclinical/statistics & numerical data , Gene Expression Profiling/statistics & numerical data , Gene Knockdown Techniques , Gene Ontology , Gene Regulatory Networks/drug effects , Humans , Models, Molecular , Molecular Docking Simulation , Protein Kinase Inhibitors/chemistry , Protein Kinase Inhibitors/pharmacology , Proteins/genetics , Ubiquitin-Protein Ligases/antagonists & inhibitors , Ubiquitin-Protein Ligases/chemistry , Ubiquitin-Protein Ligases/genetics , Wortmannin/chemistry , Wortmannin/pharmacology , ras Proteins/antagonists & inhibitors , ras Proteins/chemistry , ras Proteins/genetics
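
The abstract of record 11 reports top-10 and top-100 target prediction accuracies. One common way to compute such a figure is sketched below, assuming a drug counts as a hit when at least one of its known targets appears among its k highest-ranked predictions. That hit definition, the function name `top_k_accuracy`, and the example data are assumptions; the abstract does not state how the authors computed the metric.

```python
def top_k_accuracy(ranked_targets, known_targets, k):
    """Fraction of drugs with at least one known target in their top-k predictions.

    ranked_targets: dict mapping drug -> list of targets, best prediction first.
    known_targets:  dict mapping drug -> set of experimentally known targets.
    """
    if not ranked_targets:
        raise ValueError("ranked_targets must not be empty")
    hits = sum(
        1
        for drug, ranking in ranked_targets.items()
        if set(ranking[:k]) & known_targets.get(drug, set())
    )
    return hits / len(ranked_targets)


# Made-up rankings and known targets, for illustration only.
ranked_targets = {
    "drugA": ["t1", "t2", "t3"],
    "drugB": ["t4", "t5", "t6"],
}
known_targets = {"drugA": {"t2"}, "drugB": {"t6"}}

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(ranked_targets, known_targets, k)}")
```

Under this definition the accuracy cannot decrease as k grows, which matches the ordering of the reported top-10 and top-100 values.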
12.
Sci Rep ; 7(1): 1789, 2017 05 11.
Article in English | MEDLINE | ID: mdl-28496195

ABSTRACT

C-terminus of Hsc/p70-Interacting Protein (CHIP) is a homodimeric E3 ubiquitin ligase. Each CHIP monomer consists of a tetratricopeptide-repeat (TPR), helix-turn-helix (HH), and U-box domain. In contrast to nearly all homodimeric proteins, CHIP is asymmetric. To uncover the origins of asymmetry, we performed molecular dynamics simulations of dimer assembly. We determined that a CHIP monomer is most stable when the HH domain has an extended helix that supports intra-monomer TPR-U-box interaction, blocking the E2-binding surface of the U-box. We also discovered that monomers first dimerize symmetrically through their HH domains, which then triggers U-box dimerization. This brings the extended helices into close proximity, including a repulsive stretch of positively charged residues. Unable to smoothly unwind, this conflict bends the helices until the helix of one protomer breaks to relieve the repulsion. The abrupt snapping of the helix forces the C-terminal residues of the other protomer to disrupt that protomer's TPR-U-box tight binding interface, swiftly exposing and activating one of the E2 binding sites. Mutagenesis and biochemical experiments confirm that C-terminal residues are necessary both to maintain CHIP stability and function. This novel mechanism indicates how a ubiquitin ligase maintains an inactive monomeric form that rapidly activates only after asymmetric assembly.


Subjects
Protein Multimerization , Ubiquitin-Protein Ligases/chemistry , Ubiquitin-Protein Ligases/metabolism , Animals , Enzyme Activation , Humans , Models, Molecular , Multiprotein Complexes , Protein Conformation , Protein Domains , Protein Folding , Protein Interaction Domains and Motifs
13.
J Comput Aided Mol Des ; 30(9): 695-706, 2016 09.
Article in English | MEDLINE | ID: mdl-27573981

ABSTRACT

Induced fit or protein flexibility can make a given structure less useful for docking and/or scoring. The 2015 Drug Design Data Resource (D3R) Grand Challenge provided a unique opportunity to prospectively test optimal strategies for virtual screening in these types of targets: heat shock protein 90 (HSP90), a protein with multiple ligand-induced binding modes; and mitogen-activated protein kinase kinase kinase kinase 4 (MAP4K4), a kinase with a large flexible pocket. Using previously known co-crystal structures, we tested predictions from methods that keep the receptor structure fixed and used (a) multiple receptor/ligand co-crystals as binding templates for minimization or docking ("close"), (b) methods that align or dock to a single receptor ("cross"), and (c) a hybrid approach that chose from multiple bound ligands as initial templates for minimization to a single receptor ("min-cross"). Pose prediction using our "close" models resulted in average ligand RMSDs of 0.32 and 1.6 Å for HSP90 and MAP4K4, respectively, the most accurate models of the community-wide challenge. On the other hand, affinity ranking using our "cross" methods performed well overall despite the fact that a fixed receptor cannot model ligand-induced structural changes. In addition, "close" methods that leverage the co-crystals of the different binding modes of HSP90 also predicted the best affinity ranking. Our studies suggest that analysis of changes in the receptor structure upon ligand binding can help select an optimal virtual screening strategy.


Subjects
Drug Design , HSP90 Heat-Shock Proteins/chemistry , Intracellular Signaling Peptides and Proteins/chemistry , Protein Serine-Threonine Kinases/chemistry , Small Molecule Libraries/chemistry , Algorithms , Binding Sites , Crystallography, X-Ray , Ligands , Models, Molecular , Protein Binding , Protein Conformation , User-Computer Interface
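
The abstract of record 13 reports pose accuracy as ligand RMSD in Å. A minimal sketch of that quantity follows, assuming the predicted and reference poses list the same atoms in the same order and already share the receptor's coordinate frame, so no superposition is applied. The function name `ligand_rmsd` and the coordinates are illustrative, and symmetry-equivalent atom orderings are not handled.

```python
import math


def ligand_rmsd(predicted, reference):
    """Root-mean-square deviation between two poses of the same ligand.

    Both arguments are sequences of (x, y, z) tuples in Å, with atoms in
    matching order. No alignment or symmetry correction is performed.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    return math.sqrt(squared / len(predicted))


# Made-up two-atom poses, for illustration only: every atom is shifted 1 Å in z.
predicted_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd = ligand_rmsd(predicted_pose, reference_pose)
```

Skipping superposition is deliberate here: for docking, the pose is judged relative to the receptor, so translating or rotating the ligand to fit the reference would hide placement errors.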