ABSTRACT
MOTIVATION: Understanding metal-protein interactions can provide structural and functional insights into cellular processes. As the number of protein sequences increases, developing fast yet precise computational approaches to predict and annotate metal-binding sites becomes imperative. Quick and resource-efficient pre-trained protein language model (pLM) embeddings have successfully predicted binding sites from protein sequences despite not using structural or evolutionary features (multiple sequence alignments). Using residue-level embeddings from the pLMs, we have developed a sequence-based method (M-Ionic) to identify metal-binding proteins and predict residues involved in metal binding. RESULTS: On independent validation of recent proteins, M-Ionic reports an area under the curve (AUROC) of 0.83 (recall = 84.6%) in distinguishing metal-binding from non-binding proteins, compared to an AUROC of 0.74 (recall = 61.8%) for the next best method. In addition to comparable performance to the state-of-the-art method for identifying metal-binding residues (Ca2+, Mg2+, Mn2+, Zn2+), M-Ionic provides binding probabilities for six additional ions (i.e. Cu2+, PO43-, SO42-, Fe2+, Fe3+, Co2+). We show that the pLM embedding of a single residue contains sufficient information about its neighbours to predict its binding properties. AVAILABILITY AND IMPLEMENTATION: M-Ionic can be used on your protein of interest using a Google Colab Notebook (https://bit.ly/40FrRbK). The GitHub repository (https://github.com/TeamSundar/m-ionic) contains all code and data.
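The two metrics quoted above (AUROC and recall) can be illustrated with a minimal sketch. This is not the M-Ionic evaluation code; the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made only for illustration.

```python
def recall(y_true, y_pred):
    """Fraction of truly binding proteins that were predicted as binding."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Computed by direct pairwise comparison; ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = metal binding) and predicted probabilities.
labels = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.5, 0.1]
preds = [1 if p >= 0.5 else 0 for p in probs]  # assumed 0.5 threshold

print(auroc(labels, probs))   # threshold-free ranking quality
print(recall(labels, preds))  # sensitivity at the chosen threshold
```

AUROC does not depend on a threshold, whereas recall does, which is why the abstract reports both.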
Subjects
Metals; Proteins; Proteins/chemistry; Amino Acid Sequence; Binding Sites; Ions; Protein Domains; Metals/chemistry; Metals/metabolism
ABSTRACT
The CRISPR/Cas9 genome editing technology has transformed basic and translational research in biology and medicine. However, the advances are hindered by off-target effects and limited knowledge of the mechanism of the Cas9 protein. Machine learning models have been proposed for the prediction of Cas9 activity at unintended sites, yet feature engineering plays a major role in the outcome of the predictors. This study evaluates the improvement in the performance of such predictors upon inclusion of epigenetic and DNA shape feature groups in the conventionally used sequence-based Cas9 target and off-target datasets. The approach used neural networks trained on a diverse range of parameters, allowing us to systematically assess the performance increase across three designed datasets: (i) sequence only, (ii) sequence and epigenetic features, and (iii) sequence, epigenetic and DNA shape features. The addition of DNA shape information significantly improved predictive performance, as evaluated by the Akaike and Bayesian information criteria. The evaluation of individual feature importance by permutation and LIME-based methods also indicates that not only sequence features like mismatches and nucleotide composition, but also base pairing parameters like opening and stretch, which are indicative of distortion in the DNA-RNA hybrid in the presence of mismatches, influence model outcomes.
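As a reminder of how the two criteria named above trade fit against model size, here is a minimal sketch using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The abstract does not say how the likelihood or parameter count was obtained for the neural networks, so every number below is hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


# Hypothetical fits: a larger feature set is only preferred if the gain
# in log-likelihood outweighs the penalty for the extra parameters.
candidates = {
    "sequence only": {"log_likelihood": -520.0, "n_params": 40},
    "sequence + epigenetic": {"log_likelihood": -505.0, "n_params": 48},
    "sequence + epigenetic + DNA shape": {"log_likelihood": -470.0, "n_params": 60},
}
n_samples = 1000  # assumed dataset size

for name, fit in candidates.items():
    a = aic(fit["log_likelihood"], fit["n_params"])
    b = bic(fit["log_likelihood"], fit["n_params"], n_samples)
    print(f"{name}: AIC={a:.1f} BIC={b:.1f}")
```

BIC penalizes each extra parameter by ln n rather than 2, so it favours smaller models more strongly than AIC once n exceeds about 7.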
Subjects
CRISPR-Cas Systems; DNA; Gene Editing; Machine Learning; Neural Networks, Computer; CRISPR-Cas Systems/genetics; DNA/genetics; DNA/chemistry; Gene Editing/methods; Nucleic Acid Conformation; Humans; Bayes Theorem; Epigenesis, Genetic
ABSTRACT
The global spread of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) since 2019 has led to a continuous evolution of viral variants, with the latest concern being the Omicron (B.1.1.529) variant. In this study, classical molecular dynamics simulations were conducted to elucidate the biophysical aspects of the Omicron spike protein's receptor-binding domain (RBD) in its interaction with human angiotensin-converting enzyme 2 (hACE2) and a neutralizing antibody, comparing it to the wildtype (WT). To model the Omicron variant, 15 in silico mutations were introduced in the RBD region of WT (retrieved from PDB). The simulations of WT spike-hACE2 and Omicron spike-hACE2 complexes revealed comparable binding stability and dynamics. Notably, the Q493R mutation in the Omicron spike increased interactions with hACE2, particularly with ASP38 and ASP355. Additionally, mutations such as N417K, T478K, and Y505H contributed to enhanced structural stability in the Omicron variant. Conversely, when comparing WT with Omicron in complex with a neutralizing antibody, simulation results demonstrated poorer binding dynamics and stability for the Omicron variant. The E484K mutation significantly decreased binding interactions, resulting in an overall decrease in binding energy (∼-57 kcal/mol) compared to WT (∼-84 kcal/mol). This study provides valuable molecular insights into the heightened infectivity of the Omicron variant, shedding light on the specific mutations influencing its interactions with hACE2 and neutralizing antibodies.
Subjects
Angiotensin-Converting Enzyme 2; Antibodies, Neutralizing; Molecular Dynamics Simulation; Protein Binding; SARS-CoV-2; Spike Glycoprotein, Coronavirus; Angiotensin-Converting Enzyme 2/metabolism; Angiotensin-Converting Enzyme 2/chemistry; Angiotensin-Converting Enzyme 2/genetics; Spike Glycoprotein, Coronavirus/metabolism; Spike Glycoprotein, Coronavirus/genetics; Spike Glycoprotein, Coronavirus/chemistry; Antibodies, Neutralizing/immunology; Antibodies, Neutralizing/metabolism; Humans; SARS-CoV-2/metabolism; SARS-CoV-2/genetics; SARS-CoV-2/immunology; COVID-19/virology; COVID-19/metabolism; COVID-19/immunology; Mutation; Binding Sites; Antibodies, Viral/immunology; Antibodies, Viral/metabolism; Antibodies, Viral/chemistry
ABSTRACT
Mycobacterium tuberculosis (M. tb), the causative pathogen of tuberculosis (TB), remains the leading cause of death from a single infectious agent. Furthermore, its evolution into multi-drug resistant (MDR) and extremely drug-resistant (XDR) strains necessitates either de novo identification of drug targets/candidates or the repurposing of existing drugs against known targets. Repurposing of drugs has gained traction recently, with orphan drugs being exploited for new indications. In the current study, we have combined drug repurposing with a polypharmacological targeting approach to modulate the structure-function of multiple proteins in M. tb. Based on the previously established essentiality of genes in M. tb, four proteins implicated in acceleration of protein folding (PpiB), chaperone-assisted protein folding (MoxR1), microbial replication (RipA) and host immune modulation (S-adenosyl dependent methyltransferase, sMTase) were selected. Genetic diversity analyses of the target proteins showed accumulation of mutations outside the respective substrate/drug binding sites. Using a composite receptor-template based screening method followed by molecular dynamics simulations, we have identified potential candidates from the FDA-approved drugs database: Anidulafungin (anti-fungal), Azilsartan (anti-hypertensive) and Degarelix (anti-cancer). Isothermal titration calorimetric analyses showed that the drugs can bind with high affinity to the target proteins and interfere with the known protein-protein interaction of MoxR1 and RipA. Cell-based inhibitory assays of these drugs against M. tb (H37Ra) culture indicate their potential to interfere with pathogen growth and replication. Topographic assessment of drug-treated bacteria showed induction of morphological aberrations in M. tb. The approved candidates may also serve as scaffolds for optimization into future anti-mycobacterial agents that can target MDR strains of M. tb.
Subjects
Antitubercular Agents; Drug Repositioning; Mycobacterium tuberculosis; Mycobacterium tuberculosis/drug effects; Mycobacterium tuberculosis/genetics; Antitubercular Agents/pharmacology; Extensively Drug-Resistant Tuberculosis/drug therapy; Anidulafungin/pharmacology; Bacterial Proteins/genetics; Protein Structure, Tertiary; Molecular Dynamics Simulation
ABSTRACT
BACKGROUND: Mitochondria are the cell organelles that produce most of the chemical energy required to power the cell's biochemical reactions. Despite being part of a eukaryotic host cell, mitochondria contain a separate genome, whose origin is linked with the endosymbiosis of a prokaryotic cell by the host cell, and encode independent genomic information. Mitochondrial genomes accommodate essential genes and are regularly utilized in biotechnology and phylogenetics. Various assemblers capable of generating complete mitochondrial genomes are being continuously developed. These tools often take as input whole-genome sequencing data that contain reads from the mitochondrial genome. To date, no published work has systematically compared all the available tools for assembling human mitochondrial genomes using short-read sequencing data. This evaluation is required to identify the best tool that can be well-optimized for small-scale projects or even national-level research. RESULTS: In this study, we have tested the mitochondrial genome assemblers on both simulated datasets and whole genome sequencing (WGS) datasets of humans. For the highest computational setting of 16 computational threads with the simulated dataset having 1000X read depth, MitoFlex took the least execution time of 69 s, and IOGA took the longest execution time of 1278 s. NOVOPlasty utilized the least computational memory of approximately 0.098 GB for the same setting, whereas IOGA utilized the highest computational memory of 11.858 GB. In the case of WGS datasets for humans, GetOrganelle and MitoFlex performed the best in capturing SNP information, with a mean F1-score of 0.919 at the sequencing depth of 10X. MToolBox and NOVOPlasty performed consistently across all sequencing depths with mean F1-scores of 0.897 and 0.890, respectively.
CONCLUSIONS: Based on the overall performance metrics and consistency in assembly quality for all sequencing data, MToolBox performed the best. However, NOVOPlasty was the second fastest tool in execution time despite being single-threaded, and it utilized the least computational resources among all the assemblers when tested on simulated datasets. Therefore, NOVOPlasty may be more practical when there is a significant sample size and a lack of computational resources. Besides, as long-read sequencing gains popularity, mitochondrial genome assemblers must be developed to use long-read sequencing data.
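The F1-score used above to compare SNP capture can be sketched as follows. This is an illustrative calculation, not the benchmarking pipeline from the study; the representation of a SNP as a (position, alternate base) tuple and the example variants are assumptions.

```python
def snp_f1(called, truth):
    """F1-score of a called SNP set against a truth set.

    Each SNP is represented as a (position, alternate_base) tuple,
    which is an assumed encoding for this sketch.
    """
    called, truth = set(called), set(truth)
    tp = len(called & truth)  # SNPs present in both sets
    if tp == 0:
        return 0.0
    precision = tp / len(called)  # share of calls that are correct
    recall = tp / len(truth)      # share of true SNPs recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical truth set and assembler output for one sample.
truth = {(73, "G"), (263, "G"), (750, "G"), (1438, "G")}
called = {(73, "G"), (263, "G"), (750, "G"), (16519, "C")}

print(snp_f1(called, truth))
```

A mean F1-score over samples, as reported in the abstract, would average this value across all samples at a given sequencing depth.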
Subjects
Genome, Mitochondrial; Humans; Genome, Human; Mitochondria/genetics; Benchmarking; Biotechnology
ABSTRACT
BACKGROUND: Survival and drug response are two highly emphasized clinical outcomes in cancer research that direct the prognosis of a cancer patient. Here, we have proposed a late multi-omics integration framework that robustly quantifies survival and drug response for breast cancer patients, with a focus on the relative predictive ability of the available omics data types. Neighborhood component analysis (NCA), a supervised feature selection algorithm, selected relevant features from multi-omics datasets retrieved from The Cancer Genome Atlas (TCGA) and Genomics of Drug Sensitivity in Cancer (GDSC) databases. A neural network framework, fed with NCA-selected features, was used to develop survival and drug response prediction models for breast cancer patients. The drug response framework used regression and unsupervised clustering (K-means) to segregate samples into responders and non-responders based on their predicted IC50 values (Z-score). RESULTS: The survival prediction framework was highly effective in categorizing patients into risk subtypes, with an accuracy of 94%. Compared to single-omics and early integration approaches, our drug response prediction models performed significantly better and were able to predict IC50 values (Z-score) with a mean square error (MSE) of 1.154 and an overall regression value of 0.92, showing a linear relationship between predicted and actual IC50 values. CONCLUSION: The proposed omics integration strategy provides an effective way of extracting critical information from diverse omics data types, enabling estimation of prognostic indicators. Such integrative models with high predictive power would have a significant impact and utility in precision oncology.
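The clustering step described above (K-means on predicted IC50 Z-scores to separate responders from non-responders) can be sketched in one dimension with k = 2. This is a simplified illustration and not the authors' implementation; the function name, the initialization at the minimum and maximum, the toy Z-scores, and the convention that the lower-IC50 cluster is labelled "responder" are all assumptions.

```python
def two_means_1d(values, max_iter=100):
    """Split 1-D values into two clusters with Lloyd's algorithm (k = 2).

    Returns one label per value. The cluster with the lower centre is
    labelled 'responder' on the assumption that lower IC50 means
    higher drug sensitivity.
    """
    lo, hi = min(values), max(values)  # assumed initial centres
    if lo == hi:
        raise ValueError("need at least two distinct values")
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centre.
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Update step: centres move to the mean of their members.
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # converged
        lo, hi = new_lo, new_hi
    return [
        "responder" if abs(v - lo) <= abs(v - hi) else "non-responder"
        for v in values
    ]


# Hypothetical predicted IC50 Z-scores for six samples.
z_scores = [-1.2, -0.9, -1.0, 0.8, 1.1, 1.3]
print(two_means_1d(z_scores))
```

Clustering on the predicted values avoids fixing an arbitrary Z-score cutoff, at the cost of the split depending on the cohort being analysed.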
Subjects
Breast Neoplasms; Deep Learning; Pharmaceutical Preparations; Breast Neoplasms/drug therapy; Breast Neoplasms/genetics; Genomics; Humans; Precision Medicine
ABSTRACT
The ease of programming the CRISPR/Cas9 system to target a specific location within the genome has paved the way for many clinical and industrial applications. However, its widespread use is still limited owing to its off-target effects. Though this off-target activity has been reported to be dependent on both sgRNA sequence and experimental conditions, a clear understanding of the factors imparting specificity to the CRISPR/Cas9 system is important. A machine learning-based computational model has been developed for prediction of off-targets that are more likely to be cleaved in vivo, with an accuracy of 91.49%. The sequence features important for the prediction of positive off-targets were found to be accessibility, mismatches, GC-content and position-specific conservation of nucleotides. The instructions and code to generate the dataset and reproduce the analysis have been made available at http://web.iitd.ac.in/crispcut/off-targets/.
Subjects
CRISPR-Cas Systems; Machine Learning; RNA/genetics; Algorithms; RNA Editing
ABSTRACT
The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) outbreak in December 2019 has caused a global pandemic. The rapid mutation rate of the virus has created alarming situations worldwide and has been linked to false negatives in RT-PCR tests. It has also increased the chances of reinfection and immune escape. Recently, various lineages, namely B.1.1.7 (Alpha), B.1.617.1 (Kappa), B.1.617.2 (Delta) and B.1.617.3, have caused rapid infection around the globe. To understand the biophysical perspective, we have performed molecular dynamics simulations of four different spike (receptor binding domain)-hACE2 complexes, namely wildtype (WT), Alpha variant (N501Y spike mutant), Kappa (L452R, E484Q) and Delta (L452R, T478K), and compared their dynamics, binding energy and molecular interactions. Our results show that the mutations caused a significant increase in the binding energy between the spike and hACE2 in the Alpha and Kappa variants. In the case of the Kappa and Delta variants, the mutations L452R, T478K and E484Q increased the stability and intra-chain interactions in the spike protein, which may change the ability of neutralizing antibodies to interact with these spike variants. Further, we found that the Alpha variant had increased hydrogen interaction with Lys353 of hACE2 and more binding affinity in comparison to WT. The current study provides the biophysical basis for understanding the molecular mechanism and rationale behind the increase in the transmissivity and infectivity of the mutants compared to wild-type SARS-CoV-2.
Subjects
Angiotensin-Converting Enzyme 2/metabolism; COVID-19/transmission; SARS-CoV-2/pathogenicity; Spike Glycoprotein, Coronavirus/metabolism; Angiotensin-Converting Enzyme 2/ultrastructure; Antibodies, Neutralizing/immunology; Antibodies, Neutralizing/metabolism; Antibodies, Viral/immunology; Antibodies, Viral/metabolism; COVID-19/virology; Crystallography, X-Ray; Humans; Molecular Dynamics Simulation; Mutation; Protein Stability; SARS-CoV-2/genetics; SARS-CoV-2/immunology; Spike Glycoprotein, Coronavirus/genetics; Spike Glycoprotein, Coronavirus/immunology; Spike Glycoprotein, Coronavirus/ultrastructure; Thermodynamics
ABSTRACT
The ability to direct the CRISPR/Cas9 nuclease to a unique target site within a genome would have broad use in targeted genome engineering. However, CRISPR RNA is reported to bind to other genomic locations that differ from the intended target site by a few nucleotides, demonstrating significant off-target activity. We have developed the CRISPcut tool, which screens the off-targets using various parameters and predicts the ideal genomic target for guide RNAs in human cell lines. sgRNAs for four different types of Cas9 nucleases can be designed, with an option for the user to work with different PAM sequences. Direct experimental measurement of genome-wide DNA accessibility is incorporated, which effectively restricts the prediction of CRISPR targets to open chromatin. An option to predict target sites for paired CRISPR nickases is also provided. The tool has been validated using a dataset of experimentally used sgRNAs and their identified off-targets. URL: http://web.iitd.ac.in/crispcut.
Subjects
CRISPR-Cas Systems; Gene Editing/methods; RNA, Guide, Kinetoplastida/genetics; Software; Targeted Gene Repair/methods; CRISPR-Associated Protein 9/genetics; CRISPR-Associated Protein 9/metabolism; Chromatin/chemistry; Humans; Nucleotide Motifs; RNA, Guide, Kinetoplastida/metabolism
ABSTRACT
The anti-metastatic and anti-angiogenic activities of triethylene glycol derivatives have been reported. In this study, we investigated their molecular mechanism(s) using bioinformatics and experimental tools. By molecular dynamics analysis, we found that (i) triethylene glycol dimethacrylate (TD-10) and tetraethylene glycol dimethacrylate (TD-11) can act as inhibitors of the catalytic domain of matrix metalloproteinases (MMP-2, MMP-7 and MMP-9) by binding to the S1' pocket of MMP-2 and MMP-9 and the catalytic Zn ion binding site of MMP-7, and that (ii) TD-11 can cause local disruption of the secondary structure of vascular endothelial growth factor A (VEGFA) dimer and exhibit stable interaction at the binding interface of VEGFA receptor R1 complex. Cell-culture-based in vitro experiments showed anti-metastatic phenotypes as seen in migration and invasion assays in cancer cells by both TD-10 and TD-11. Underlying biochemical evidence revealed downregulation of VEGF and MMPs at the protein level; MMP-9 was also downregulated at the transcriptional level. By molecular analyses, we demonstrate that TD-10 and TD-11 target stress chaperone mortalin at the transcription and translational level, yielding decreased expression of vimentin, fibronectin and hnRNP-K, and increase in extracellular matrix (ECM) proteins (collagen IV and E-cadherin) endorsing reversal of epithelial-mesenchymal transition (EMT) signaling.
Subjects
Computational Biology; Neoplasm Metastasis/drug therapy; Neoplasms/drug therapy; Polyethylene Glycols/chemistry; Cadherins/genetics; Cell Line, Tumor; Cell Movement/drug effects; Epithelial-Mesenchymal Transition; Gene Expression Regulation, Neoplastic/drug effects; Humans; Matrix Metalloproteinase 2/genetics; Matrix Metalloproteinase 9/genetics; Neoplasm Metastasis/pathology; Neoplasms/pathology; Polyethylene Glycols/therapeutic use; Signal Transduction/genetics
ABSTRACT
Fucoxanthin is commonly found in marine organisms; however, to date, it has been one of the scarcely explored natural compounds. We investigated its activities in human cancer cell culture-based viability, migration, and molecular assays, and found that it possesses strong anticancer and anti-metastatic activities that work irrespective of the p53 status of cancer cells. In our experiments, fucoxanthin caused the transcriptional suppression of mortalin. Cell phenotype-driven molecular analyses on control and treated cells demonstrated that fucoxanthin caused a decrease in hallmark proteins associated with cell proliferation, survival, and the metastatic spread of cancer cells at doses that were relatively safe to the normal cells. The data suggested that the cancer therapy regimen may benefit from the recruitment of fucoxanthin; hence, it warrants further attention for basic mechanistic studies as well as drug development.
Subjects
Cell Survival/drug effects; Xanthophylls/pharmacology; Antineoplastic Agents/pharmacology; Aquatic Organisms/chemistry; Cell Line; Cell Line, Tumor; Cell Proliferation/drug effects; Fibroblasts/drug effects; Humans
ABSTRACT
Synthetic lethality occurs when the co-occurrence of two genetic events is unfavorable for the survival of the cell or organism. The conventional approach of high-throughput screening of synthetic lethal targets using chemical compounds has been replaced by RNAi technology. CRISPR/Cas9, an RNA-guided endonuclease system, is the most recent technology for this work. Here, we have discussed the major considerations involved in designing a CRISPR/Cas9-based screening experiment for identification of synthetic lethal targets. These mainly include the CRISPR library to be used, the cell types for conducting the experiment, the most appropriate screening strategy and ways of selecting the desired phenotypes from the complete cell population. The complete knockdown of genes can be achieved using CRISPR/Cas9 knockout libraries. For higher quality loss-of-function screens, haploid cells with a defective homology-directed DNA repair mechanism could be used. Two widely used screening formats are arrayed and pooled screens, followed by negative or positive selection of the cells with the desired phenotype. However, the pooled screening format with negative selection of cells serves best. The advantages of using the CRISPR/Cas9 system over RNAi approaches have also been discussed. Finally, some studies using CRISPR/Cas9 for genome-wide knockout screening in human cells and computational approaches for identification of synthetic lethal interactions have been discussed.
Subjects
CRISPR-Cas Systems/genetics; Computational Biology/methods; Genetic Therapy/methods; High-Throughput Screening Assays/methods; Neoplasms/genetics; DNA Repair/genetics; Endonucleases/genetics; Gene Knockdown Techniques/methods; Gene Library; Genetic Engineering/methods; High-Throughput Nucleotide Sequencing; Humans; Loss of Function Mutation/genetics; Neoplasms/therapy; RNA Interference; RNA, Small Interfering/genetics; Sequence Analysis, DNA; Small Molecule Libraries; Synthetic Lethal Mutations/genetics
ABSTRACT
Drug discovery, in simple words, is about finding small molecular compounds that can interact with specific bio-macromolecules, mainly proteins, thereby bringing about a desired effect in the functioning of the target molecules. Virtual screening of large compound libraries using computational approaches has emerged as a strong alternative to the cost- and labor-intensive high-throughput screening carried out in laboratories. Virtual high-throughput screening greatly reduces the number of compounds that need systematic analysis using biochemical assays before entering clinical trials. Here, we first give a brief overview of the rationale behind virtual screening, the types of virtual screening (structure-based, ligand-based and inverse virtual screening), and the challenges that need to be addressed to improve the existing strategies. Subsequently, we describe the methodology adopted for virtual screening of small molecules, peptides and proteins. Finally, we use a few case studies to provide better insight into the application of computer-aided high-throughput screening.
Subjects
Computational Biology/methods; Drug Discovery/methods; Peptides/chemistry; Proteins/chemistry; Small Molecule Libraries/chemistry; Drug Design; Ligands; Molecular Docking Simulation; Molecular Targeted Therapy/methods; Protein Binding; Structure-Activity Relationship
ABSTRACT
BACKGROUND: Engineering zinc finger protein motifs for specific binding to double-stranded DNA is critical for targeted genome editing. Most existing tools for predicting DNA-binding specificity in zinc fingers are trained on data obtained from naturally occurring proteins, thereby skewing the predictions. Moreover, these mostly neglect the cooperativity exhibited by zinc fingers. METHODS: Here, we present an ab-initio method that is based on mutation of the key α-helical residues of individual fingers of the parent template for Zif-268 and its consensus sequence (PDB ID: 1AAY). In an attempt to elucidate the mechanism of zinc finger protein-DNA interactions, we evaluated and compared three approaches, differing in the amino acid mutations introduced in the Zif-268 parent template and in the mode of binding they try to mimic, i.e., modular or synergistic mode of binding. RESULTS: Comparative evaluation of the three strategies reveals that the synergistic mode of binding appears to mimic the ideal mechanism of DNA-zinc finger protein binding. Analysis of the predictions made by all three strategies indicates a strong dependence of zinc finger binding specificity on the amino acid propensity and the position of a 3-bp DNA sub-site in the target DNA sequence. Moreover, the binding affinity of the individual zinc fingers was found to increase in the order Finger 1 < Finger 2 < Finger 3, thus confirming the cooperative effect. CONCLUSIONS: Our analysis offers novel insights into the prediction of ZFPs for target DNA sequences, and the approaches have been made available as an easy-to-use web server at http://web.iitd.ac.in/~sundar/zifpredict_ihbe.
Subjects
Binding Sites; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/metabolism; DNA/chemistry; DNA/metabolism; Zinc Fingers; Amino Acid Sequence; Base Sequence; Consensus Sequence; DNA/genetics; Hydrogen Bonding; Models, Molecular; Molecular Conformation; Mutation; Protein Binding
ABSTRACT
BACKGROUND: The ability to engineer zinc finger proteins that bind to a DNA sequence of choice is essential for targeted genome editing to be possible. Experimental techniques and molecular docking have been successful in predicting protein-DNA interactions; however, they are highly time and resource intensive. Here, we present a novel algorithm designed for high-throughput prediction of the optimal zinc finger protein for 9 bp DNA sequences of choice. In accordance with the principles of information theory, a subset identified using K-means clustering was used as a representative for the space of all possible 9 bp DNA sequences. The modeling and simulation results obtained from this subset, assuming a synergistic mode of binding, were used to train an ensemble micro neural network. The synergistic mode of binding is the closest to the DNA-protein binding seen in nature and gives much higher quality predictions, while the time and resources required increase exponentially as a trade-off. Our algorithm is inspired by an ensemble machine learning approach and incorporates the predictions made by 100 parallel neural networks, each with a different hidden layer architecture designed to pick up different features from the training dataset, to predict optimal zinc finger proteins for any 9 bp target DNA. RESULTS: The model gave an average of 83% sequence identity for the testing dataset. The BLAST e-values are well within the statistical confidence interval of E-05 for 100% of the testing samples. The geometric mean and median of the BLAST e-values were found to be 1.70E-12 and 7.00E-12, respectively. For final validation of the approach, we compared our predictions against optimal ZFPs reported in the literature for a set of experimentally studied DNA sequences.
The accuracy, as measured by the average string identity between our predictions and the optimal zinc finger protein reported in literature for a 9 bp DNA target was found to be as high as 81% for DNA targets with a consensus sequence GCNGNNGCN reported in literature. Moreover, the average string identity of our predictions for a catalogue of over 100 9 bp DNA for which the optimal zinc finger protein has been reported in literature was found to be 71%. CONCLUSIONS: Validation with experimental data shows that our tool is capable of domain adaptation and thus scales well to datasets other than the training set with high accuracy. As synergistic binding comes the closest to the ideal mode of binding, our algorithm predicts biologically relevant results in sync with the experimental data present in the literature. While there have been disjointed attempts to approach this problem synergistically reported in literature, there is no work covering the whole sample space. Our algorithm allows designing zinc finger proteins for DNA targets of the user's choice, opening up new frontiers in the field of targeted genome editing. This algorithm is also available as an easy to use web server, ZifNN, at http://web.iitd.ac.in/~sundar/ZifNN/ .
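The "average string identity" used above as the accuracy measure can be sketched as a position-by-position comparison of equal-length sequences. The abstract does not give the exact definition used by ZifNN, so this is one plausible reading; the function names and the example recognition-helix strings are hypothetical.

```python
def percent_identity(predicted, reference):
    """Percentage of positions at which two equal-length strings agree."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have equal length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(predicted, reference))
    return 100.0 * matches / len(reference)


def mean_identity(pairs):
    """Average percent identity over (predicted, reference) pairs."""
    return sum(percent_identity(p, r) for p, r in pairs) / len(pairs)


# Hypothetical predicted vs. reported recognition-helix residues.
pairs = [
    ("RSDHLTR", "RSDHLTR"),  # identical
    ("RSDELTR", "RSDHLTR"),  # one mismatch out of seven
]
print(mean_identity(pairs))
```

Unlike a BLAST e-value, this measure ignores substitutions between chemically similar residues, so it is a stricter but cruder summary of agreement.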
Subjects
DNA-Binding Proteins/chemistry; DNA/chemistry; Models, Molecular; Neural Networks, Computer; Zinc Fingers; Algorithms; Binding Sites; DNA/metabolism; DNA-Binding Proteins/metabolism; Molecular Conformation; Protein Binding
ABSTRACT
BACKGROUND: Transcription factors, which regulate the expression inventory of a cell, interact with their respective DNA through a specific recognition pattern, which, if well exploited, may enable targeted genome engineering. The most widely studied transcription factors are zinc finger proteins, which bind to their target DNA via direct and indirect recognition at the interaction interface. Exploiting the binding specificity and affinity of the interaction between the zinc fingers and the respective DNA can help in generating engineered zinc fingers for therapeutic applications. Experimental evidence substantiates the effect of indirect interactions, such as DNA deformation and desolvation kinetics, in enabling ZFPs to achieve partial sequence specificity based on the structural properties of DNA. Exploring the structure-function relationships of the existing zinc finger-DNA complexes at the indirect recognition level can aid in predicting the probable zinc fingers that could bind to any target DNA. Deformation energy, which is the energy required to bend DNA from its native shape to its shape when bound to the ZFP, is an effect of the indirect recognition mechanism. Water is treated as a co-reactant in the affinity studies of ZFP-DNA binding equilibria, which takes into account the unavoidable change in hydration that occurs when these two solvated surfaces come into contact. RESULTS: Aspects like desolvation and DNA deformation have been theoretically investigated based on simulations and free energy perturbation data, revealing a consensus in correlating affinity and specificity as well as stability for ZFP-DNA interactions. Greater loss of water at the interaction interface of the DNA corresponds to binding with higher affinity, eventually distorting the DNA to a greater extent, as reflected by the change in major groove width and DNA tilt, stretch and rise.
CONCLUSION: Most prediction algorithms for ZFPs do not account for water loss at the interface. The above findings may significantly affect these algorithms. Further, the sequence-dependent deformation in the DNA upon complexation with our prototype, as well as the preference of bases at the 2nd and 3rd positions of the repeating triplet, provide new insight into changes in indirect interactions that have not yet been probed.
Subjects
DNA/chemistry; DNA/metabolism; Early Growth Response Protein 1/chemistry; Early Growth Response Protein 1/metabolism; Algorithms; Base Sequence; Binding Sites; Hydrogen Bonding; Kinetics; Molecular Docking Simulation; Protein Binding
ABSTRACT
Hypericin, a natural compound from Hypericum perforatum (St. John's wort), has been identified as a specific inhibitor of Leishmania donovani spermidine synthase (LdSS) using integrated computational and biochemical approaches. Hypericin showed in vitro inhibition of recombinant LdSS enzyme activity. The in vivo estimation of spermidine levels in Leishmania promastigotes after hypericin treatment showed significant decreases in the spermidine pools of the parasites, indicating target specificity of the inhibitor molecule. The inhibitor, hypericin, showed significant antileishmanial activity, and the mode of death showed necrosis-like features. Further, decreased trypanothione levels and increased glutathione levels with elevated reactive oxygen species (ROS) levels were observed after hypericin treatment. Supplementation with trypanothione in the medium with hypericin treatment restored in vivo trypanothione levels and ROS levels but could not prevent necrosis-like death of the parasites. However, supplementation with spermidine in the medium with hypericin treatment restored in vivo spermidine levels and parasite death was prevented to a large extent. The data overall suggest that the parasite death due to spermidine starvation as a result of LdSS inhibition is not related to elevated levels of reactive oxygen species. This suggests the involvement of spermidine in processes other than redox metabolism in Leishmania parasites. Moreover, the work provides a novel scaffold, i.e., hypericin, as a potent antileishmanial molecule.
Subjects
Enzyme Inhibitors/pharmacology , Leishmania donovani/drug effects , Perylene/analogs & derivatives , Spermidine Synthase/antagonists & inhibitors , Spermidine/metabolism , Animals , Anthracenes , Antiprotozoal Agents/pharmacology , Glutathione/analogs & derivatives , Glutathione/metabolism , Glutathione/pharmacology , Leishmania donovani/metabolism , Macrophages/drug effects , Oxidation-Reduction , Perylene/pharmacology , Reactive Oxygen Species/metabolism , Spermidine/analogs & derivatives , Spermidine/pharmacology
ABSTRACT
BACKGROUND: Interaction of the small peptide hormone glucagon with the glucagon receptor (GCGR) stimulates the release of glucose from hepatic cells during fasting; hence, GCGR plays a significant role in glucose homeostasis. Inhibiting the interaction between glucagon and its receptor has been reported to control hepatic glucose overproduction, and GCGR has therefore emerged as an attractive therapeutic target for the treatment of type II diabetes mellitus. RESULTS: In the present study, a large library of natural compounds was screened against the seven-transmembrane (7TM) domain of GCGR to identify novel therapeutic molecules that can inhibit the binding of glucagon to GCGR. Molecular dynamics simulations were performed to study the dynamic behaviour of the docked complexes, and the molecular interactions between the screened compounds and the ligand-binding residues of GCGR were analysed in detail. The top-scoring compounds were also compared with the previously documented GCGR inhibitors MK-0893 and LY2409021 with respect to binding affinity and ADME properties. Finally, we report two natural drug-like compounds, PIB and CAA, which showed good binding affinity for GCGR and are potent inhibitors of its functional activity. CONCLUSION: This study provides evidence for the application of these compounds as prospective small-molecule ligands against type II diabetes. Novel natural drug-like inhibitors of the 7TM domain of GCGR were identified that showed high binding affinity and potent inhibition of GCGR.
Subjects
Computational Biology/methods , Type 2 Diabetes Mellitus/drug therapy , Type 2 Diabetes Mellitus/metabolism , Glucagon/antagonists & inhibitors , Pharmaceutical Preparations/metabolism , Glucagon Receptors/antagonists & inhibitors , Blood Glucose/analysis , Glucagon/metabolism , High-Throughput Screening Assays , Humans , Liver/drug effects , Liver/metabolism , Molecular Dynamics Simulation , Peptide Library , Pharmaceutical Preparations/chemistry , Protein Binding/drug effects , Protein Conformation , Pyrazoles/pharmacology , Glucagon Receptors/metabolism , beta-Alanine/analogs & derivatives , beta-Alanine/pharmacology
ABSTRACT
Cancer is a lethal disease that affects numerous people worldwide. Chemotherapy is one of the most effective treatment regimens for cancer. Nevertheless, anticancer drugs face a high failure rate owing to safety and efficacy issues. Drug failure could be reduced by developing drug leads with lower toxicity and higher efficacy. Computer-aided drug discovery supports the identification of drug leads by working with protein and ligand structures or their representations. The simplified molecular input line entry system (SMILES) is a linear notation that represents the structure of a molecule using symbols and alphanumeric characters. A SMILES representation encodes ring and scaffold structures. Mining ring and scaffold patterns from molecular SMILES would help in inferring biological properties from molecular patterns. Moreover, the emergence of artificial intelligence (AI) technologies could accelerate the identification of effective anticancer drug leads. AI algorithms, known for their pattern recognition ability, could be used to identify molecular patterns in SMILES representations, thereby enabling property prediction. Consequently, we developed a multilayer perceptron (MLP) model for the prediction of anticancer activity using SMILES from NCI-60 cancer growth inhibition data. Furthermore, the 8 most frequent scaffolds were identified in a preliminary analysis of the cancer growth inhibition data and ChEMBL drugs. The developed MLP model classified anticancer and nonanticancer compounds with a classification accuracy of 0.92. In addition, benchmarking against other machine learning algorithms showed that the MLP model performed better.
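To make the idea of mining ring patterns from SMILES concrete, the following is a minimal sketch that counts ring closures directly from the string. It is not the authors' pipeline, whose feature extraction is not described in the abstract; the function name `count_ring_closures` is hypothetical, and the approach assumes well-formed SMILES in which every ring-closure label appears exactly twice. A production workflow would more likely rely on a cheminformatics toolkit.

```python
def count_ring_closures(smiles):
    """Count ring-closure bonds in a SMILES string.

    Ring closures are written as matching labels (a digit, or '%' followed
    by two digits) on the two atoms that close the ring. Digits inside
    square brackets are isotopes, hydrogen counts or charges, so they are
    skipped.
    """
    labels = 0
    in_bracket = False
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif not in_bracket:
            if ch == "%":
                labels += 1
                i += 2          # skip the two digits of a %nn label
            elif ch.isdigit():
                labels += 1
        i += 1
    return labels // 2          # each closure contributes two labels


examples = {
    "ethanol": "CCO",
    "benzene": "c1ccccc1",
    "naphthalene": "c1ccc2ccccc2c1",
}
ring_counts = {name: count_ring_closures(s) for name, s in examples.items()}
print(ring_counts)
```

A count like this could serve as one input feature among many for a classifier such as an MLP; on its own it only captures how many rings are closed, not which scaffold they form.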