Results 1 - 20 of 104
1.
Bioinformatics ; 40(1)2024 01 02.
Article in English | MEDLINE | ID: mdl-38175787

ABSTRACT

MOTIVATION: Understanding metal-protein interaction can provide structural and functional insights into cellular processes. As the number of protein sequences increases, developing fast yet precise computational approaches to predict and annotate metal-binding sites becomes imperative. Quick and resource-efficient pre-trained protein language model (pLM) embeddings have successfully predicted binding sites from protein sequences despite not using structural or evolutionary features (multiple sequence alignments). Using residue-level embeddings from the pLMs, we have developed a sequence-based method (M-Ionic) to identify metal-binding proteins and predict residues involved in metal binding. RESULTS: On independent validation of recent proteins, M-Ionic reports an area under the curve (AUROC) of 0.83 (recall = 84.6%) in distinguishing metal binding from non-binding proteins compared to AUROC of 0.74 (recall = 61.8%) of the next best method. In addition to comparable performance to the state-of-the-art method for identifying metal-binding residues (Ca2+, Mg2+, Mn2+, Zn2+), M-Ionic provides binding probabilities for six additional ions (i.e. Cu2+, PO43-, SO42-, Fe2+, Fe3+, Co2+). We show that the pLM embedding of a single residue contains sufficient information about its neighbours to predict its binding properties. AVAILABILITY AND IMPLEMENTATION: M-Ionic can be used on your protein of interest using a Google Colab Notebook (https://bit.ly/40FrRbK). The GitHub repository (https://github.com/TeamSundar/m-ionic) contains all code and data.


Subject(s)
Metals , Proteins , Proteins/chemistry , Amino Acid Sequence , Binding Sites , Ions , Protein Domains , Metals/chemistry , Metals/metabolism
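
Illustrative note for entry 1: the abstract reports AUROC and recall for separating metal-binding from non-binding proteins. The sketch below shows how these two metrics can be computed from labels and scores. It is a minimal illustration, not the M-Ionic evaluation code; the function names, the 0.5 decision threshold, and the toy data are assumptions made for this example.

```python
from typing import Sequence


def recall(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of true positives (label 1) that were also predicted as 1."""
    positives = sum(1 for y in labels if y == 1)
    if positives == 0:
        raise ValueError("recall is undefined without positive labels")
    true_pos = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    return true_pos / positives


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = metal-binding protein, 0 = non-binding protein.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.7, 0.4, 0.8, 0.6, 0.2]
# Assumed decision threshold of 0.5 to turn scores into binary calls.
toy_calls = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_auroc = auroc(toy_labels, toy_scores)
toy_recall = recall(toy_labels, toy_calls)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based implementation would be preferable for large benchmark sets.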
2.
Methods ; 226: 120-126, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38641083

ABSTRACT

The CRISPR/Cas9 genome editing technology has transformed basic and translational research in biology and medicine. However, the advances are hindered by off-target effects and a paucity in the knowledge of the mechanism of the Cas9 protein. Machine learning models have been proposed for the prediction of Cas9 activity at unintended sites, yet feature engineering plays a major role in the outcome of the predictors. This study evaluates the improvement in the performance of similar predictors upon inclusion of epigenetic and DNA shape feature groups in the conventionally used sequence-based Cas9 target and off-target datasets. The approach involved the utilization of neural networks trained on a diverse range of parameters, allowing us to systematically assess the performance increase for the meticulously designed datasets- (i) sequence only, (ii) sequence and epigenetic features, and (iii) sequence, epigenetic and DNA shape feature datasets. The addition of DNA shape information significantly improved predictive performance, evaluated by Akaike and Bayesian information criteria. The evaluation of individual feature importance by permutation and LIME-based methods also indicates that not only sequence features like mismatches and nucleotide composition, but also base pairing parameters like opening and stretch, that are indicative of distortion in the DNA-RNA hybrid in the presence of mismatches, influence model outcomes.


Subject(s)
CRISPR-Cas Systems , DNA , Gene Editing , Machine Learning , Neural Networks, Computer , CRISPR-Cas Systems/genetics , DNA/genetics , DNA/chemistry , Gene Editing/methods , Nucleic Acid Conformation , Humans , Bayes Theorem , Epigenesis, Genetic
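
Illustrative note for entry 2: the abstract compares models trained on nested feature sets using the Akaike and Bayesian information criteria. The sketch below applies the standard definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, to two hypothetical models. How the study obtained the likelihood and parameter count for its neural networks is not stated in the abstract, so every number below is an assumption chosen only to show how the two criteria can disagree.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


# Hypothetical models fitted to the same 500 samples (not values from the study).
n_samples = 500
small_model = {"log_likelihood": -120.0, "n_params": 10}   # e.g. sequence only
large_model = {"log_likelihood": -100.0, "n_params": 25}   # e.g. extra feature groups

aic_small = aic(small_model["log_likelihood"], small_model["n_params"])
aic_large = aic(large_model["log_likelihood"], large_model["n_params"])
bic_small = bic(small_model["log_likelihood"], small_model["n_params"], n_samples)
bic_large = bic(large_model["log_likelihood"], large_model["n_params"], n_samples)
```

In this toy case AIC favours the larger model while BIC, which penalizes parameters more heavily once n exceeds about 7, favours the smaller one; reporting both criteria guards against that ambiguity.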
3.
J Struct Biol ; 216(2): 108087, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38494148

ABSTRACT

The global spread of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) since 2019 has led to a continuous evolution of viral variants, with the latest concern being the Omicron (B.1.1.529) variant. In this study, classical molecular dynamics simulations were conducted to elucidate the biophysical aspects of the Omicron spike protein's receptor-binding domain (RBD) in its interaction with human angiotensin-converting enzyme 2 (hACE2) and a neutralizing antibody, comparing it to the wildtype (WT). To model the Omicron variant, 15 in silico mutations were introduced in the RBD region of WT (retrieved from PDB). The simulations of WT spike-hACE2 and Omicron spike-hACE2 complexes revealed comparable binding stability and dynamics. Notably, the Q493R mutation in the Omicron spike increased interactions with hACE2, particularly with ASP38 and ASP355. Additionally, mutations such as N417K, T478K, and Y505H contributed to enhanced structural stability in the Omicron variant. Conversely, when comparing WT with Omicron in complex with a neutralizing antibody, simulation results demonstrated poorer binding dynamics and stability for the Omicron variant. The E484K mutation significantly decreased binding interactions, resulting in an overall decrease in binding energy (∼-57 kcal/mol) compared to WT (∼-84 kcal/mol). This study provides valuable molecular insights into the heightened infectivity of the Omicron variant, shedding light on the specific mutations influencing its interactions with hACE2 and neutralizing antibodies.


Subject(s)
Angiotensin-Converting Enzyme 2 , Antibodies, Neutralizing , Molecular Dynamics Simulation , Protein Binding , SARS-CoV-2 , Spike Glycoprotein, Coronavirus , Angiotensin-Converting Enzyme 2/metabolism , Angiotensin-Converting Enzyme 2/chemistry , Angiotensin-Converting Enzyme 2/genetics , Spike Glycoprotein, Coronavirus/metabolism , Spike Glycoprotein, Coronavirus/genetics , Spike Glycoprotein, Coronavirus/chemistry , Antibodies, Neutralizing/immunology , Antibodies, Neutralizing/metabolism , Humans , SARS-CoV-2/metabolism , SARS-CoV-2/genetics , SARS-CoV-2/immunology , COVID-19/virology , COVID-19/metabolism , COVID-19/immunology , Mutation , Binding Sites , Antibodies, Viral/immunology , Antibodies, Viral/metabolism , Antibodies, Viral/chemistry
4.
Biochem J ; 480(14): 1079-1096, 2023 07 26.
Article in English | MEDLINE | ID: mdl-37306466

ABSTRACT

Mycobacterium tuberculosis (M. tb), the causative pathogen of tuberculosis (TB), remains the leading cause of death from a single infectious agent. Furthermore, its evolution to multi-drug resistant (MDR) and extremely drug-resistant (XDR) strains necessitates de novo identification of drug targets/candidates or the repurposing of existing drugs against known targets. Repurposing of drugs has gained traction recently where orphan drugs are exploited for new indications. In the current study, we have combined drug repurposing with a polypharmacological targeting approach to modulate structure-function of multiple proteins in M. tb. Based on previously established essentiality of genes in M. tb, four proteins implicated in acceleration of protein folding (PpiB), chaperone assisted protein folding (MoxR1), microbial replication (RipA) and host immune modulation (S-adenosyl dependent methyltransferase, sMTase) were selected. Genetic diversity analyses in target proteins showed accumulation of mutations outside respective substrate/drug binding sites. Using a composite receptor-template based screening method followed by molecular dynamics simulations, we have identified potential candidates from the FDA approved drugs database: Anidulafungin (anti-fungal), Azilsartan (anti-hypertensive) and Degarelix (anti-cancer). Isothermal titration calorimetric analyses showed that the drugs can bind with high affinity to target proteins and interfere with known protein-protein interaction of MoxR1 and RipA. Cell based inhibitory assays of these drugs against M. tb (H37Ra) culture indicate their potential to interfere with pathogen growth and replication. Topographic assessment of drug-treated bacteria showed induction of morphological aberrations in M. tb. The approved candidates may also serve as scaffolds for optimization to future anti-mycobacterial agents which can target MDR strains of M. tb.


Subject(s)
Antitubercular Agents , Drug Repositioning , Mycobacterium tuberculosis , Mycobacterium tuberculosis/drug effects , Mycobacterium tuberculosis/genetics , Antitubercular Agents/pharmacology , Extensively Drug-Resistant Tuberculosis/drug therapy , Anidulafungin/pharmacology , Bacterial Proteins/genetics , Protein Structure, Tertiary , Molecular Dynamics Simulation
5.
BMC Bioinformatics ; 24(1): 341, 2023 Sep 13.
Article in English | MEDLINE | ID: mdl-37704952

ABSTRACT

BACKGROUND: Mitochondria are the cell organelles that produce most of the chemical energy required to power the cell's biochemical reactions. Despite being a part of a eukaryotic host cell, the mitochondria contain a separate genome whose origin is linked with the endosymbiosis of a prokaryotic cell by the host cell and encode independent genomic information throughout their genomes. Mitochondrial genomes accommodate essential genes and are regularly utilized in biotechnology and phylogenetics. Various assemblers capable of generating complete mitochondrial genomes are being continuously developed. These tools often use whole-genome sequencing data as an input containing reads from the mitochondrial genome. Till now, no published work has explored the systematic comparison of all the available tools for assembling human mitochondrial genomes using short-read sequencing data. This evaluation is required to identify the best tool that can be well-optimized for small-scale projects or even national-level research. RESULTS: In this study, we have tested the mitochondrial genome assemblers for both simulated datasets and whole genome sequencing (WGS) datasets of humans. For the highest computational setting of 16 computational threads with the simulated dataset having 1000X read depth, MitoFlex took the least execution time of 69 s, and IOGA took the longest execution time of 1278 s. NOVOPlasty utilized the least computational memory of approximately 0.098 GB for the same setting, whereas IOGA utilized the highest computational memory of 11.858 GB. In the case of WGS datasets for humans, GetOrganelle and MitoFlex performed the best in capturing the SNPs information with a mean F1-score of 0.919 at the sequencing depth of 10X. MToolBox and NOVOPlasty performed consistently across all sequencing depths with a mean F1 score of 0.897 and 0.890, respectively. 
CONCLUSIONS: Based on the overall performance metrics and consistency in assembly quality for all sequencing data, MToolBox performed the best. However, NOVOPlasty was the second fastest tool in execution time despite being single-threaded, and it utilized the least computational resources among all the assemblers when tested on simulated datasets. Therefore, NOVOPlasty may be more practical when there is a significant sample size and a lack of computational resources. Besides, as long-read sequencing gains popularity, mitochondrial genome assemblers must be developed to use long-read sequencing data.


Subject(s)
Genome, Mitochondrial , Humans , Genome, Human , Mitochondria/genetics , Benchmarking , Biotechnology
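
Illustrative note for entry 5: the abstract ranks assemblers by the F1-score of the SNPs they recover. The sketch below computes F1 from a truth set and a called set of variants. It is a minimal illustration under stated assumptions: representing a SNP as a (position, alternate allele) tuple, the function name, and the toy variants are all hypothetical and are not taken from the benchmark.

```python
from typing import Set, Tuple

Snp = Tuple[int, str]  # assumed representation: (position, alternate allele)


def snp_f1(truth: Set[Snp], called: Set[Snp]) -> float:
    """Harmonic mean of precision and recall for a set of called SNPs."""
    true_pos = len(truth & called)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(called)
    recall = true_pos / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy variant sets (positions and alleles are illustrative only).
truth_snps = {(73, "G"), (263, "G"), (750, "G"), (1438, "G")}
called_snps = {(73, "G"), (263, "G"), (750, "G"), (16519, "C")}

toy_f1 = snp_f1(truth_snps, called_snps)
```

Here three of four true SNPs are recovered and one call is spurious, so precision and recall are both 0.75 and so is F1.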
6.
BMC Genomics ; 22(1): 214, 2021 Mar 24.
Article in English | MEDLINE | ID: mdl-33761889

ABSTRACT

BACKGROUND: Survival and drug response are two highly emphasized clinical outcomes in cancer research that direct the prognosis of a cancer patient. Here, we have proposed a late multi-omics integrative framework that robustly quantifies survival and drug response for breast cancer patients with a focus on the relative predictive ability of available omics datatypes. Neighborhood component analysis (NCA), a supervised feature selection algorithm, selected relevant features from multi-omics datasets retrieved from The Cancer Genome Atlas (TCGA) and Genomics of Drug Sensitivity in Cancer (GDSC) databases. A neural network framework, fed with NCA selected features, was used to develop survival and drug response prediction models for breast cancer patients. The drug response framework used regression and unsupervised clustering (K-means) to segregate samples into responders and non-responders based on their predicted IC50 values (Z-score). RESULTS: The survival prediction framework was highly effective in categorizing patients into risk subtypes with an accuracy of 94%. Compared to single-omics and early integration approaches, our drug response prediction models performed significantly better and were able to predict IC50 values (Z-score) with a mean square error (MSE) of 1.154 and an overall regression value of 0.92, showing a linear relationship between predicted and actual IC50 values. CONCLUSION: The proposed omics integration strategy provides an effective way of extracting critical information from diverse omics data types enabling estimation of prognostic indicators. Such integrative models with high predictive power would have a significant impact and utility in precision oncology.


Subject(s)
Breast Neoplasms , Deep Learning , Pharmaceutical Preparations , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Genomics , Humans , Precision Medicine
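
Illustrative note for entry 6: the abstract describes using K-means to split samples into responders and non-responders from predicted IC50 Z-scores. The sketch below runs a two-cluster K-means on one-dimensional values. It is a simplified stand-in, not the study's pipeline: the initialization at the minimum and maximum, the function name, and the toy Z-scores are assumptions, and labelling the low-IC50 cluster as "responders" is the usual convention rather than something the abstract spells out.

```python
from typing import List, Sequence, Tuple


def two_means_1d(values: Sequence[float], max_iter: int = 100) -> Tuple[List[int], List[float]]:
    """Two-cluster K-means on 1-D data; cluster 0 starts at the minimum value."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("need at least two distinct values to form two clusters")
    centers = [lo, hi]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        new_labels = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1 for v in values
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels, centers


# Hypothetical predicted IC50 Z-scores for six samples.
toy_z = [-1.2, -0.9, -1.0, 0.8, 1.1, 1.2]
toy_labels, toy_centers = two_means_1d(toy_z)
# Assumed convention: the cluster with the lower center (0) holds the responders.
responders = [i for i, lab in enumerate(toy_labels) if lab == 0]
```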
7.
Genomics ; 112(5): 3609-3614, 2020 09.
Article in English | MEDLINE | ID: mdl-32353475

ABSTRACT

The ease of programming the CRISPR/Cas9 system for targeting a specific location within the genome has paved the way for many clinical and industrial applications. However, its widespread use is still limited owing to its off-target effects. Though this off-target activity has been reported to be dependent on both sgRNA sequence and experimental conditions, a clear understanding of the factors imparting specificity to the CRISPR/Cas9 system is important. A machine learning-based computational model has been developed for prediction of off-targets more likely to be cleaved in vivo, with an accuracy of 91.49%. The sequence features important for the prediction of positive off-targets were found to be accessibility, mismatches, GC-content and position-specific conservation of nucleotides. The instructions and code to generate the dataset and reproduce the analysis have been made available at http://web.iitd.ac.in/crispcut/off-targets/.


Subject(s)
CRISPR-Cas Systems , Machine Learning , RNA/genetics , Algorithms , RNA Editing
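
Illustrative note for entry 7: the abstract names mismatches and GC-content among the sequence features that matter for off-target prediction. The sketch below derives these two features for one guide and one candidate off-target site. It is a minimal illustration; the function names and the 20-nt toy sequences are assumptions and are not taken from the published dataset or code.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def mismatch_positions(guide: str, site: str) -> List[int]:
    """1-based positions at which the guide and the candidate site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return [
        i
        for i, (g, s) in enumerate(zip(guide.upper(), site.upper()), start=1)
        if g != s
    ]


# Hypothetical 20-nt guide and off-target site (illustrative sequences only).
toy_guide = "GACGTTACCGGATTCAGCTA"
toy_site = "GACGTTACCGGTTTCAGCTG"

features = {
    "gc_content": gc_content(toy_guide),
    "n_mismatches": len(mismatch_positions(toy_guide, toy_site)),
    "mismatch_positions": mismatch_positions(toy_guide, toy_site),
}
```

Keeping the mismatch positions, not just their count, leaves room for position-specific features such as the conservation signal the abstract mentions.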
8.
Int J Mol Sci ; 22(17)2021 Aug 24.
Article in English | MEDLINE | ID: mdl-34502041

ABSTRACT

The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) outbreak in December 2019 has caused a global pandemic. The rapid mutation rate in the virus has created alarming situations worldwide and is being attributed to the false negativity in RT-PCR tests. It has also increased the chances of reinfection and immune escape. Recently various lineages, namely B.1.1.7 (Alpha), B.1.617.1 (Kappa), B.1.617.2 (Delta) and B.1.617.3, have caused rapid infection around the globe. To understand the biophysical perspective, we have performed molecular dynamics simulations of four different spike (receptor binding domain)-hACE2 complexes, namely wildtype (WT), Alpha variant (N501Y spike mutant), Kappa (L452R, E484Q) and Delta (L452R, T478K), and compared their dynamics, binding energy and molecular interactions. Our results show that the mutations have caused a significant increase in the binding energy between the spike and hACE2 in Alpha and Kappa variants. In the case of Kappa and Delta variants, the mutations L452R, T478K and E484Q increased the stability and intra-chain interactions in the spike protein, which may change the interaction ability of neutralizing antibodies to these spike variants. Further, we found that the Alpha variant had increased hydrogen interaction with Lys353 of hACE2 and more binding affinity in comparison to WT. The current study provides the biophysical basis for understanding the molecular mechanism and rationale behind the increase in the transmissivity and infectivity of the mutants compared to wild-type SARS-CoV-2.


Subject(s)
Angiotensin-Converting Enzyme 2/metabolism , COVID-19/transmission , SARS-CoV-2/pathogenicity , Spike Glycoprotein, Coronavirus/metabolism , Angiotensin-Converting Enzyme 2/ultrastructure , Antibodies, Neutralizing/immunology , Antibodies, Neutralizing/metabolism , Antibodies, Viral/immunology , Antibodies, Viral/metabolism , COVID-19/virology , Crystallography, X-Ray , Humans , Molecular Dynamics Simulation , Mutation , Protein Stability , SARS-CoV-2/genetics , SARS-CoV-2/immunology , Spike Glycoprotein, Coronavirus/genetics , Spike Glycoprotein, Coronavirus/immunology , Spike Glycoprotein, Coronavirus/ultrastructure , Thermodynamics
9.
Genomics ; 111(4): 560-566, 2019 07.
Article in English | MEDLINE | ID: mdl-29605634

ABSTRACT

The ability to direct the CRISPR/Cas9 nuclease to a unique target site within a genome would have broad use in targeted genome engineering. However, CRISPR RNA is reported to bind to other genomic locations that differ from the intended target site by a few nucleotides, demonstrating significant off-target activity. We have developed the CRISPcut tool that screens the off-targets using various parameters and predicts the ideal genomic target for guide RNAs in human cell lines. sgRNAs for four different types of Cas9 nucleases can be designed with an option for the user to work with different PAM sequences. Direct experimental measurement of genome-wide DNA accessibility is incorporated that effectively restricts the prediction of CRISPR targets to open chromatin. An option to predict target sites for paired CRISPR nickases is also provided. The tool has been validated using a dataset of experimentally used sgRNAs and their identified off-targets. URL: http://web.iitd.ac.in/crispcut.


Subject(s)
CRISPR-Cas Systems , Gene Editing/methods , RNA, Guide, Kinetoplastida/genetics , Software , Targeted Gene Repair/methods , CRISPR-Associated Protein 9/genetics , CRISPR-Associated Protein 9/metabolism , Chromatin/chemistry , Humans , Nucleotide Motifs , RNA, Guide, Kinetoplastida/metabolism
10.
Int J Mol Sci ; 21(15)2020 Jul 30.
Article in English | MEDLINE | ID: mdl-32751717

ABSTRACT

The anti-metastatic and anti-angiogenic activities of triethylene glycol derivatives have been reported. In this study, we investigated their molecular mechanism(s) using bioinformatics and experimental tools. By molecular dynamics analysis, we found that (i) triethylene glycol dimethacrylate (TD-10) and tetraethylene glycol dimethacrylate (TD-11) can act as inhibitors of the catalytic domain of matrix metalloproteinases (MMP-2, MMP-7 and MMP-9) by binding to the S1' pocket of MMP-2 and MMP-9 and the catalytic Zn ion binding site of MMP-7, and that (ii) TD-11 can cause local disruption of the secondary structure of vascular endothelial growth factor A (VEGFA) dimer and exhibit stable interaction at the binding interface of VEGFA receptor R1 complex. Cell-culture-based in vitro experiments showed anti-metastatic phenotypes as seen in migration and invasion assays in cancer cells by both TD-10 and TD-11. Underlying biochemical evidence revealed downregulation of VEGF and MMPs at the protein level; MMP-9 was also downregulated at the transcriptional level. By molecular analyses, we demonstrate that TD-10 and TD-11 target stress chaperone mortalin at the transcription and translational level, yielding decreased expression of vimentin, fibronectin and hnRNP-K, and increase in extracellular matrix (ECM) proteins (collagen IV and E-cadherin) endorsing reversal of epithelial-mesenchymal transition (EMT) signaling.


Subject(s)
Computational Biology , Neoplasm Metastasis/drug therapy , Neoplasms/drug therapy , Polyethylene Glycols/chemistry , Cadherins/genetics , Cell Line, Tumor , Cell Movement/drug effects , Epithelial-Mesenchymal Transition , Gene Expression Regulation, Neoplastic/drug effects , Humans , Matrix Metalloproteinase 2/genetics , Matrix Metalloproteinase 9/genetics , Neoplasm Metastasis/pathology , Neoplasms/pathology , Polyethylene Glycols/therapeutic use , Signal Transduction/genetics
11.
Mar Drugs ; 17(6)2019 Jun 05.
Article in English | MEDLINE | ID: mdl-31195739

ABSTRACT

Fucoxanthin is commonly found in marine organisms; however, to date, it has been one of the scarcely explored natural compounds. We investigated its activities in human cancer cell culture-based viability, migration, and molecular assays, and found that it possesses strong anticancer and anti-metastatic activities that work irrespective of the p53 status of cancer cells. In our experiments, fucoxanthin caused the transcriptional suppression of mortalin. Cell phenotype-driven molecular analyses on control and treated cells demonstrated that fucoxanthin caused a decrease in hallmark proteins associated with cell proliferation, survival, and the metastatic spread of cancer cells at doses that were relatively safe to the normal cells. The data suggested that the cancer therapy regimen may benefit from the recruitment of fucoxanthin; hence, it warrants further attention for basic mechanistic studies as well as drug development.


Subject(s)
Cell Survival/drug effects , Xanthophylls/pharmacology , Antineoplastic Agents/pharmacology , Aquatic Organisms/chemistry , Cell Line , Cell Line, Tumor , Cell Proliferation/drug effects , Fibroblasts/drug effects , Humans
12.
Methods ; 131: 66-73, 2017 12 01.
Article in English | MEDLINE | ID: mdl-28710008

ABSTRACT

Synthetic lethality occurs when co-occurrence of two genetic events is unfavorable for the survival of the cell or organism. The conventional approach of high throughput screening of synthetic lethal targets using chemical compounds has been replaced by RNAi technology. CRISPR/Cas9, an RNA guided endonuclease system is the most recent technology for this work. Here, we have discussed the major considerations involved in designing a CRISPR/Cas9 based screening experiment for identification of synthetic lethal targets. It mainly includes CRISPR library to be used, cell types for conducting the experiment, the most appropriate screening strategy and ways of selecting the desired phenotypes from the complete cell population. The complete knockdown of genes can be achieved using CRISPR/Cas9 knockout libraries. For higher quality loss-of-function screens, haploid cells with defective homology-directed DNA repair mechanism could be used. Two widely used screening formats include arrayed and pooled screens followed by negative or positive selection of the cells with desired phenotype. However, pooled screening format with negative selection of cells serves the best. The advantages of using CRISPR/Cas9 system over the other RNAi approaches have also been discussed. Finally, some studies using CRISPR/Cas9 for genome-wide knockout screening in human cells and computational approaches for identification of synthetic lethal interactions have been discussed.


Subject(s)
CRISPR-Cas Systems/genetics , Computational Biology/methods , Genetic Therapy/methods , High-Throughput Screening Assays/methods , Neoplasms/genetics , DNA Repair/genetics , Endonucleases/genetics , Gene Knockdown Techniques/methods , Gene Library , Genetic Engineering/methods , High-Throughput Nucleotide Sequencing , Humans , Loss of Function Mutation/genetics , Neoplasms/therapy , RNA Interference , RNA, Small Interfering/genetics , Sequence Analysis, DNA , Small Molecule Libraries , Synthetic Lethal Mutations/genetics
13.
Methods ; 131: 10-21, 2017 12 01.
Article in English | MEDLINE | ID: mdl-28843611

ABSTRACT

Drug discovery, in simple words, is all about finding small molecular compounds that possess the potential to interact with specific bio-macromolecules, mainly proteins, thereby bringing a desired effect in the functioning of the target molecules. Virtual screening of large compound libraries using computational approaches has come up as a great alternative to cost and labor-intensive high-throughput screening carried out in laboratories. Virtual high-throughput screening enormously reduces the number of compounds for systematic analysis using biochemical assays before entering the clinical trials. Here, we first give a brief overview of the rationale behind virtual screening, types of virtual screening (structure-based, ligand-based and inverse virtual screening), and challenges that need to be addressed to improve the existing strategies. Subsequently, we describe the methodology adopted for virtual screening of small molecules, peptides and proteins. Finally, we use a few case studies to provide a better insight into the application of computer-aided high-throughput screening.


Subject(s)
Computational Biology/methods , Drug Discovery/methods , Peptides/chemistry , Proteins/chemistry , Small Molecule Libraries/chemistry , Drug Design , Ligands , Molecular Docking Simulation , Molecular Targeted Therapy/methods , Protein Binding , Structure-Activity Relationship
15.
BMC Genomics ; 17(Suppl 13): 1037, 2016 12 22.
Article in English | MEDLINE | ID: mdl-28155654

ABSTRACT

BACKGROUND: Engineering zinc finger protein motifs for specific binding to double-stranded DNA is critical for targeted genome editing. Most existing tools for predicting DNA-binding specificity in zinc fingers are trained on data obtained from naturally occurring proteins, thereby skewing the predictions. Moreover, these mostly neglect the cooperativity exhibited by zinc fingers. METHODS: Here, we present an ab-initio method that is based on mutation of the key α-helical residues of individual fingers of the parent template for Zif-268 and its consensus sequence (PDB ID: 1AAY). In an attempt to elucidate the mechanism of zinc finger protein-DNA interactions, we evaluated and compared three approaches, differing in the amino acid mutations introduced in the Zif-268 parent template, and the mode of binding they try to mimic, i.e., modular and synergistic mode of binding. RESULTS: Comparative evaluation of the three strategies reveals that the synergistic mode of binding appears to mimic the ideal mechanism of DNA-zinc finger protein binding. Analysis of the predictions made by all three strategies indicate strong dependence of zinc finger binding specificity on the amino acid propensity and the position of a 3-bp DNA sub-site in the target DNA sequence. Moreover, the binding affinity of the individual zinc fingers was found to increase in the order Finger 1 < Finger 2 < Finger 3, thus confirming the cooperative effect. CONCLUSIONS: Our analysis offers novel insights into the prediction of ZFPs for target DNA sequences and the approaches have been made available as an easy to use web server at http://web.iitd.ac.in/~sundar/zifpredict_ihbe.


Subject(s)
Binding Sites , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , DNA/chemistry , DNA/metabolism , Zinc Fingers , Amino Acid Sequence , Base Sequence , Consensus Sequence , DNA/genetics , Hydrogen Bonding , Models, Molecular , Molecular Conformation , Mutation , Protein Binding
16.
BMC Genomics ; 17(Suppl 13): 1033, 2016 12 22.
Article in English | MEDLINE | ID: mdl-28155662

ABSTRACT

BACKGROUND: The ability to engineer zinc finger proteins binding to a DNA sequence of choice is essential for targeted genome editing to be possible. Experimental techniques and molecular docking have been successful in predicting protein-DNA interactions; however, they are highly time and resource intensive. Here, we present a novel algorithm designed for high throughput prediction of the optimal zinc finger protein for 9 bp DNA sequences of choice. In accordance with the principles of information theory, a subset identified by using K-means clustering was used as a representative for the space of all possible 9 bp DNA sequences. The modeling and simulation results assuming synergistic mode of binding obtained from this subset were used to train an ensemble micro neural network. Synergistic mode of binding is the closest to the DNA-protein binding seen in nature, and gives much higher quality predictions, while the time and resources increase exponentially in the trade off. Our algorithm is inspired by an ensemble machine learning approach, and incorporates the predictions made by 100 parallel neural networks, each with a different hidden layer architecture designed to pick up different features from the training dataset to predict optimal zinc finger proteins for any 9 bp target DNA. RESULTS: The model gave an accuracy of an average 83% sequence identity for the testing dataset. The BLAST e-values are well within the statistical confidence interval of E-05 for 100% of the testing samples. The geometric mean and median value for the BLAST e-values were found to be 1.70E-12 and 7.00E-12 respectively. For final validation of the approach, we compared our predictions against optimal ZFPs reported in literature for a set of experimentally studied DNA sequences.
The accuracy, as measured by the average string identity between our predictions and the optimal zinc finger protein reported in literature for a 9 bp DNA target was found to be as high as 81% for DNA targets with a consensus sequence GCNGNNGCN reported in literature. Moreover, the average string identity of our predictions for a catalogue of over 100 9 bp DNA for which the optimal zinc finger protein has been reported in literature was found to be 71%. CONCLUSIONS: Validation with experimental data shows that our tool is capable of domain adaptation and thus scales well to datasets other than the training set with high accuracy. As synergistic binding comes the closest to the ideal mode of binding, our algorithm predicts biologically relevant results in sync with the experimental data present in the literature. While there have been disjointed attempts to approach this problem synergistically reported in literature, there is no work covering the whole sample space. Our algorithm allows designing zinc finger proteins for DNA targets of the user's choice, opening up new frontiers in the field of targeted genome editing. This algorithm is also available as an easy to use web server, ZifNN, at http://web.iitd.ac.in/~sundar/ZifNN/ .


Subject(s)
DNA-Binding Proteins/chemistry , DNA/chemistry , Models, Molecular , Neural Networks, Computer , Zinc Fingers , Algorithms , Binding Sites , DNA/metabolism , DNA-Binding Proteins/metabolism , Molecular Conformation , Protein Binding
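
Illustrative note for entry 16: the abstract measures accuracy as the average string identity between predicted and literature-reported zinc finger proteins. The sketch below computes a position-wise identity for equal-length sequences and averages it over pairs. It is a minimal illustration under stated assumptions: the abstract does not define how identity was computed or aligned, and the function names and the 7-residue toy strings are hypothetical.

```python
from typing import Iterable, Tuple


def string_identity(predicted: str, reference: str) -> float:
    """Fraction of positions with identical residues (equal-length, ungapped)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for p, r in zip(predicted, reference) if p == r)
    return matches / len(predicted)


def mean_identity(pairs: Iterable[Tuple[str, str]]) -> float:
    """Average string identity over (predicted, reference) pairs."""
    scores = [string_identity(p, r) for p, r in pairs]
    if not scores:
        raise ValueError("need at least one pair")
    return sum(scores) / len(scores)


# Hypothetical 7-residue recognition helices: (predicted, reported in literature).
toy_pairs = [
    ("RSDHLTT", "RSDELTR"),  # 5 of 7 positions agree
    ("QSGNLTE", "QSGNLTE"),  # identical
]
toy_mean_identity = mean_identity(toy_pairs)
```

An ungapped comparison suits fixed-length recognition helices; sequences of differing length would need an alignment step first, which this sketch deliberately leaves out.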
17.
BMC Genomics ; 16 Suppl 12: S5, 2015.
Article in English | MEDLINE | ID: mdl-26677774

ABSTRACT

BACKGROUND: Transcription factors, regulating the expression inventory of a cell, interact with their respective DNA subjugated by a specific recognition pattern, which if well exploited may ensure targeted genome engineering. The most widely studied transcription factors are zinc finger proteins that bind to their target DNA via direct and indirect recognition levels at the interaction interface. Exploiting the binding specificity and affinity of the interaction between the zinc fingers and the respective DNA can help in generating engineered zinc fingers for therapeutic applications. Experimental evidence clearly substantiates the effect of indirect interactions, such as DNA deformation and desolvation kinetics, in empowering ZFPs to accomplish partial sequence specificity functioning around structural properties of DNA. Exploring the structure-function relationships of the existing zinc finger-DNA complexes at the indirect recognition level can aid in predicting the probable zinc fingers that could bind to any target DNA. Deformation energy, which defines the energy required to bend DNA from its native shape to its shape when bound to the ZFP, is an effect of the indirect recognition mechanism. Water is treated as a co-reactant for unfurling the affinity studies in ZFP-DNA binding equilibria, taking into account the unavoidable change in hydration that occurs when these two solvated surfaces come into contact. RESULTS: Aspects like desolvation and DNA deformation have been theoretically investigated based on simulations and free energy perturbation data revealing a consensus in correlating affinity and specificity as well as stability for ZFP-DNA interactions. Greater loss of water at the interaction interface of the DNA calls for binding with higher affinity, eventually distorting the DNA to a greater extent accounted by the change in major groove width and DNA tilt, stretch and rise.
CONCLUSION: Most prediction algorithms for ZFPs do not account for water loss at the interface. The above findings may significantly affect these algorithms. Further the sequence dependent deformation in the DNA upon complexation with our prototype as well as preference of bases at the 2nd and 3rd position of the repeating triplet provide an absolutely new insight about the indirect interactions undergoing a change that have not been probed yet.


Subject(s)
DNA/chemistry , DNA/metabolism , Early Growth Response Protein 1/chemistry , Early Growth Response Protein 1/metabolism , Algorithms , Base Sequence , Binding Sites , Hydrogen Bonding , Kinetics , Molecular Docking Simulation , Protein Binding
18.
Antimicrob Agents Chemother ; 59(1): 15-24, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25313212

ABSTRACT

Hypericin, a natural compound from Hypericum perforatum (St. John's wort), has been identified as a specific inhibitor of Leishmania donovani spermidine synthase (LdSS) using integrated computational and biochemical approaches. Hypericin showed in vitro inhibition of recombinant LdSS enzyme activity. The in vivo estimation of spermidine levels in Leishmania promastigotes after hypericin treatment showed significant decreases in the spermidine pools of the parasites, indicating target specificity of the inhibitor molecule. The inhibitor, hypericin, showed significant antileishmanial activity, and the mode of death showed necrosis-like features. Further, decreased trypanothione levels and increased glutathione levels with elevated reactive oxygen species (ROS) levels were observed after hypericin treatment. Supplementation with trypanothione in the medium with hypericin treatment restored in vivo trypanothione levels and ROS levels but could not prevent necrosis-like death of the parasites. However, supplementation with spermidine in the medium with hypericin treatment restored in vivo spermidine levels and parasite death was prevented to a large extent. The data overall suggest that the parasite death due to spermidine starvation as a result of LdSS inhibition is not related to elevated levels of reactive oxygen species. This suggests the involvement of spermidine in processes other than redox metabolism in Leishmania parasites. Moreover, the work provides a novel scaffold, i.e., hypericin, as a potent antileishmanial molecule.


Subject(s)
Enzyme Inhibitors/pharmacology , Leishmania donovani/drug effects , Perylene/analogs & derivatives , Spermidine Synthase/antagonists & inhibitors , Spermidine/metabolism , Animals , Anthracenes , Antiprotozoal Agents/pharmacology , Glutathione/analogs & derivatives , Glutathione/metabolism , Glutathione/pharmacology , Leishmania donovani/metabolism , Macrophages/drug effects , Oxidation-Reduction , Perylene/pharmacology , Reactive Oxygen Species/metabolism , Spermidine/analogs & derivatives , Spermidine/pharmacology
19.
BMC Bioinformatics ; 15 Suppl 16: S13, 2014.
Article in English | MEDLINE | ID: mdl-25521597

ABSTRACT

BACKGROUND: Interaction of the small peptide hormone glucagon with the glucagon receptor (GCGR) stimulates the release of glucose from hepatic cells during fasting; hence GCGR performs a significant function in glucose homeostasis. Inhibiting the interaction between glucagon and its receptor has been reported to control hepatic glucose overproduction, and thus GCGR has emerged as an attractive therapeutic target for the treatment of type II diabetes mellitus. RESULTS: In the present study, a large library of natural compounds was screened against the 7 transmembrane domain of GCGR to identify novel therapeutic molecules that can inhibit the binding of glucagon to GCGR. Molecular dynamics simulations were performed to study the dynamic behaviour of the docked complexes, and the molecular interactions between the screened compounds and the ligand-binding residues of GCGR were analysed in detail. The top-scoring compounds were also compared with the already documented GCGR inhibitors MK-0893 and LY2409021 for their binding affinity and other ADME properties. Finally, we report two natural drug-like compounds, PIB and CAA, which showed good binding affinity for GCGR and are potent inhibitors of its functional activity. CONCLUSION: This study contributes evidence for the application of these compounds as prospective small ligand molecules against type II diabetes. Novel natural drug-like inhibitors against the 7 transmembrane domain of GCGR have been identified, which showed high binding affinity and potent inhibition of GCGR.


Subject(s)
Computational Biology/methods , Diabetes Mellitus, Type 2/drug therapy , Diabetes Mellitus, Type 2/metabolism , Glucagon/antagonists & inhibitors , Pharmaceutical Preparations/metabolism , Receptors, Glucagon/antagonists & inhibitors , Blood Glucose/analysis , Glucagon/metabolism , High-Throughput Screening Assays , Humans , Liver/drug effects , Liver/metabolism , Molecular Dynamics Simulation , Peptide Library , Pharmaceutical Preparations/chemistry , Protein Binding/drug effects , Protein Conformation , Pyrazoles/pharmacology , Receptors, Glucagon/metabolism , beta-Alanine/analogs & derivatives , beta-Alanine/pharmacology
20.
ACS Omega ; 9(28): 30645-30653, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39035912

ABSTRACT

Cancer is a lethal disease that affects numerous people worldwide. Chemotherapy stands as one of the most effective treatment regimens to combat cancer. Nevertheless, anticancer drugs face a high failure rate due to safety and efficacy issues. Drug failure could be subdued by instigating drug leads with reduced toxicity and enhanced efficacy. Computer-aided drug discovery endorses drug leads in manoeuvring protein and ligand structures or representations. Simplified molecular input line entry system (SMILES) is a linear notation representing the three-dimensional structure of a molecule using symbols and alphanumeric characters. SMILES representation hoards rings and scaffold structures in its depiction. Mining ring and scaffold patterns from molecular SMILES would assist in ascertaining biological properties based on molecular patterns. Moreover, the emergence of artificial intelligence (AI) technologies would accelerate identification of efficient anticancer drug leads. AI algorithms proclaimed for their pattern recognition ability could be employed for identifying molecular patterns from SMILES representation, thereby enabling property prediction. Consequently, we developed a multilayer perceptron (MLP) model for the prediction of anticancer activity using SMILES of NCI-60 cancer growth inhibition data. Furthermore, the top 8 frequent scaffolds were identified on preliminary analysis of cancer growth inhibition data and ChEMBL drugs. The developed MLP model classified anticancer and nonanticancer compounds with a classification accuracy of 0.92. Also, benchmarking of the developed model with machine learning algorithms exhibited better performance of the MLP model.
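
As an illustration of what mining ring patterns from a SMILES string can mean in practice, the following minimal Python sketch counts ring-closure bonds directly from the text of a SMILES string. It is not the method used in the article above; the function name `count_ring_closures` is hypothetical, and the sketch assumes standard SMILES ring-bond notation (single-digit labels and two-digit `%nn` labels, with digits inside square brackets belonging to the atom rather than to a ring).

```python
def count_ring_closures(smiles: str) -> int:
    """Count ring-closure bonds in a SMILES string.

    Hypothetical helper for illustration only. Assumes standard SMILES:
    a digit outside square brackets is a ring-bond label, '%nn' is a
    two-digit ring-bond label, and digits inside '[...]' describe the
    atom (isotope, hydrogen count, charge) and are ignored.
    """
    labels = 0
    in_bracket = False
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif not in_bracket:
            if ch == "%":
                # '%nn' is one label; skip over its two digits.
                labels += 1
                i += 2
            elif ch.isdigit():
                labels += 1
        i += 1
    # Each ring-closure bond is written as a pair of matching labels.
    return labels // 2


if __name__ == "__main__":
    for name, smi in [("ethanol", "CCO"),
                      ("benzene", "c1ccccc1"),
                      ("naphthalene", "c1ccc2ccccc2c1")]:
        print(name, count_ring_closures(smi))
```

A count of this kind is only a crude descriptor; a cheminformatics toolkit would be needed to extract actual scaffolds.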
