Results 1 - 20 of 61
1.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36988160

ABSTRACT

Small open reading frames (smORFs) encoding proteins less than 100 amino acids (aa) are known to be important regulators of key cellular processes. However, their computational identification remains a challenge. Based on a comprehensive analysis of known prokaryotic small ORFs, we have developed the ProsmORF-pred resource which uses a machine learning (ML)-based method for prediction of smORFs in the prokaryotic genome sequences. ProsmORF-pred consists of two ML models, one for initiation site recognition in nucleic acid sequences upstream of putative start codons and the other uses translated amino acid sequences to decipher functional protein like sequences. The nucleotide sequence-based initiation site recognition model has been trained using longer ORFs (>100 aa) in the same genome while the ML model for identification of protein like sequences has been trained using annotated smORFs from Escherichia coli. Comprehensive benchmarking of ProsmORF-pred reveals that its performance is comparable to other state-of-the-art approaches on the annotated smORF set derived from 32 prokaryotic genomes. Its performance is distinctly superior to other tools like PRODIGAL and RANSEPS for prediction of newly identified smORFs which have a length range of 10-30 aa, where prediction of smORFs has been a major challenge. Apart from identification of smORFs in genomic sequences, ProsmORF-pred can also aid in functional annotation of the predicted smORFs based on sequence similarity and genomic neighbourhood similarity searches in ProsmORFDB, a well-curated database of known smORFs. ProsmORF-pred along with its backend database ProsmORFDB is available as a user-friendly web server (http://www.nii.ac.in/prosmorfpred.html).


Subject(s)
Genome , Proteins , Open Reading Frames , Proteins/genetics , Genomics , Amino Acid Sequence
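
The candidate-generation step that precedes any smORF classifier can be sketched as follows. This is a minimal illustration, not the ProsmORF-pred implementation: the start-codon set, the forward-strand-only scan, the 10 aa lower bound and the function name `find_candidate_smorfs` are all assumptions made for the example. The upper bound follows the "less than 100 aa" definition in the abstract.

```python
# Illustrative sketch only: enumerates candidate smORFs on the forward strand.
# Start codons and length bounds are assumptions, not ProsmORF-pred settings.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_candidate_smorfs(seq, min_aa=10, max_aa=100):
    """Return (start, end, length_aa) tuples for forward-strand ORFs.

    start and end are 0-based, end-exclusive, and include the stop codon.
    Only ORFs with min_aa <= length_aa < max_aa are reported.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None  # first start codon of the ORF currently being read
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                length_aa = (i - start) // 3  # codons before the stop
                if min_aa <= length_aa < max_aa:
                    hits.append((start, i + 3, length_aa))
                start = None
    return hits
```

Each reported interval would then be passed to downstream models, such as an initiation-site model and a protein-likeness model, which are not reproduced here.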
2.
Microbiol Spectr ; 11(1): e0259722, 2023 02 14.
Article in English | MEDLINE | ID: mdl-36507669

ABSTRACT

Type III polyketide synthases (PKSs) found across Streptomyces species are primarily known for synthesis of a vast repertoire of clinically and industrially relevant secondary metabolites. However, our understanding of the functional relevance of these bioactive metabolites in Streptomyces physiology is still limited. Recently, a role of type III PKS harboring gene cluster in producing alternate electron carrier, polyketide quinone (PkQ) was established in a related member of the Actinobacteria, Mycobacteria, highlighting the critical role these secondary metabolites play in primary cellular metabolism of the producer organism. Here, we report the developmental stage-specific transcriptional regulation of homologous type III PKS containing gene cluster in freshwater Streptomyces sp. strain MNU77. Gene expression analysis revealed the type III PKS gene cluster to be stringently regulated, with significant upregulation observed during the dormant sporulation stage of Streptomyces sp. MNU77. In contrast, the expression levels of only known electron carrier, menaquinone biosynthetic genes were interestingly found to be downregulated. Our liquid chromatography-high-resolution mass spectrometry (LC-HRMS) analysis of a metabolite extract from the Streptomyces sp. MNU77 spores also showed 10 times more metabolic abundance of PkQs than menaquinones. Furthermore, through heterologous complementation studies, we demonstrate that Streptomyces sp. MNU77 type III PKS rescues a respiratory defect of the Mycobacterium smegmatis type III PKS deletion mutant. Together, our studies reveal that freshwater Streptomyces sp. MNU77 robustly produces novel PkQs during the sporulation stage, suggesting utilization of PkQs as alternate electron carriers across Actinobacteria during dormant hypoxic conditions.
IMPORTANCE: The complex developmental life cycle of Streptomyces sp. mandates efficient cellular respiratory reconfiguration for a smooth transition from aerated nutrient-rich vegetative hyphal growth to the hypoxic-dormant sporulation stage. Polyketide quinones (PkQs) have recently been identified as a class of alternate electron carriers from a related member of the Actinobacteria, Mycobacteria, that facilitates maintenance of membrane potential in oxygen-deficient niches. Our studies with the newly identified freshwater Streptomyces sp. strain MNU77 show conditional transcriptional upregulation and metabolic abundance of PkQs in the spore state of the Streptomyces life cycle. In parallel, the levels of menaquinones, the only known Streptomyces electron carrier, were downregulated, suggesting deployment of PkQs as universal electron carriers in low-oxygen, unfavorable conditions across the Actinobacteria family.


Subject(s)
Polyketides , Streptomyces , Streptomyces/genetics , Streptomyces/metabolism , Vitamin K 2/metabolism , Polyketides/metabolism , Quinones/metabolism
3.
Int J Biol Macromol ; 215: 489-500, 2022 Aug 31.
Article in English | MEDLINE | ID: mdl-35709874

ABSTRACT

The aim of the current study is to investigate the role of the CAD domain in the activation mechanism of calcium dependent protein kinase-1 of Plasmodium falciparum (PfCDPK1) and explore the possibility of allosteric inhibition of this kinase. PfCDPK1 belongs to the CDPK family of apicomplexan kinases, which have a C-terminal CAD domain. Microsecond scale MD simulations were performed on modeled structures of complete PfCDPK1 and its kinase domain alone. The simulations revealed that in the absence of CAD the salt bridge between Glu116 in the αC-helix and Lys85 in the β3-sheet of the kinase breaks after 200 ns, resulting in an inactive conformation of the kinase, but the salt bridge stays intact in the complete protein, stabilizing it in the active conformation. These results highlight the novel CAD mediated allosteric stabilization of the crucial salt bridge which is a hallmark of active conformation of kinase domains. The mechanistic details of the allosteric activation revealed by our study open up the possibility for design of allosteric inhibitors of PfCDPK1 kinase by disrupting the kinase:CAD interactions. Using a combination of machine learning and structure-based in silico screening, we have identified novel PPI modulators for allosteric inactivation of PfCDPK1 kinase.


Subject(s)
Plasmodium falciparum , Protozoan Proteins , Allosteric Regulation , Molecular Conformation , Plasmodium falciparum/metabolism , Protein Conformation , Protozoan Proteins/chemistry
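
Tracking whether a salt bridge such as Glu116:Lys85 stays intact along a trajectory reduces to a per-frame distance check. The sketch below assumes each frame has already been reduced to one coordinate for the acidic group and one for the basic group, in ångströms. The 4.0 Å cutoff and the function names are illustrative assumptions, not values taken from the study.

```python
import math


def salt_bridge_intact(acid_xyz, base_xyz, cutoff=4.0):
    """True if the two charged groups are within the cutoff distance (Å).

    The 4.0 Å default is a commonly used rule of thumb, assumed here.
    """
    return math.dist(acid_xyz, base_xyz) <= cutoff


def first_break_frame(frames, cutoff=4.0):
    """Return the index of the first frame where the bridge is broken, or None.

    frames is an iterable of (acid_xyz, base_xyz) coordinate pairs.
    """
    for idx, (acid_xyz, base_xyz) in enumerate(frames):
        if not salt_bridge_intact(acid_xyz, base_xyz, cutoff):
            return idx
    return None
```

Multiplying the returned frame index by the trajectory's save interval gives the approximate simulation time at which the bridge is first lost.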
4.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35753700

ABSTRACT

Even though several in silico tools are available for prediction of the phosphorylation sites for mammalian, yeast or plant proteins, currently no software is available for predicting phosphosites for Plasmodium proteins. However, the availability of a significant amount of phospho-proteomics data during the last decade and advances in machine learning (ML) algorithms have opened up opportunities for deciphering phosphorylation patterns of the plasmodial system and developing ML-based phosphosite prediction tools for Plasmodium. We have developed Pf-Phospho, an ML-based method for prediction of phosphosites by training Random Forest classifiers using a large data set of 12 096 phosphosites of Plasmodium falciparum and Plasmodium berghei. Of the 12 096 known phosphosites, 75% of sites have been used for training/validation of the classifier, while the remaining 25% have been used as completely unseen test data for blind testing. It is encouraging to note that Pf-Phospho can predict the kinase-independent phosphosites with 84% sensitivity, 75% specificity and 78% precision. In addition, it can also predict kinase-specific phosphosites for five plasmodial kinases: PfPKG, Plasmodium falciparum, PfPKA, PfPK7 and PbCDPK4 with high accuracy. Pf-Phospho (http://www.nii.ac.in/pfphospho.html) outperforms other widely used phosphosite prediction tools, which have been trained using mammalian phosphoproteome data. It also has been integrated with other widely used resources such as PlasmoDB, MPMP, Pfam and recently available ML-based predicted structures by AlphaFold2. Currently, Pf-Phospho is the only bioinformatics resource available for ML-based prediction of phospho-signaling networks of Plasmodium and is a user-friendly platform for integrative analysis of phospho-signaling along with metabolic and protein-protein interaction networks.


Subject(s)
Proteome , Software , Animals , Machine Learning , Mammals , Phosphorylation , Plasmodium falciparum
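
The 75%/25% hold-out design and the three reported metrics can be made concrete with a short sketch. It uses only the standard library and is not the Pf-Phospho code: the function names, the fixed seed and the toy labels in the test are assumptions for illustration.

```python
import random


def split_train_test(items, test_fraction=0.25, seed=0):
    """Shuffle items reproducibly and return (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and precision for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as None.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # fraction of true sites recovered
        "specificity": ratio(tn, tn + fp),  # fraction of non-sites rejected
        "precision": ratio(tp, tp + fp),    # fraction of predictions that are true
    }
```

With 12 096 sites, `split_train_test` yields 9072 training and 3024 held-out items. The metrics are meaningful as a blind test only if they are computed on the held-out portion.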
5.
Proc Natl Acad Sci U S A ; 119(8)2022 02 22.
Article in English | MEDLINE | ID: mdl-35193957

ABSTRACT

Mycobacterium tuberculosis (Mtb) endures a combination of metal scarcity and toxicity throughout the human infection cycle, contributing to complex clinical manifestations. Pathogens counteract this paradoxical dysmetallostasis by producing specialized metal trafficking systems. Capture of extracellular metal by siderophores is a widely accepted mode of iron acquisition, and Mtb iron-chelating siderophores, mycobactin, have been known since 1965. Currently, it is not known whether Mtb produces zinc scavenging molecules. Here, we characterize low-molecular-weight zinc-binding compounds secreted and imported by Mtb for zinc acquisition. These molecules, termed kupyaphores, are produced by a 10.8 kbp biosynthetic cluster and consist of a dipeptide core of ornithine and phenylalaninol, where amino groups are acylated with isonitrile-containing fatty acyl chains. Kupyaphores are stringently regulated and support Mtb survival under both nutritional deprivation and intoxication conditions. A kupyaphore-deficient Mtb strain is unable to mobilize sufficient zinc and shows reduced fitness upon infection. We observed early induction of kupyaphores in Mtb-infected mice lungs after infection, and these metabolites disappeared after 2 wk. Furthermore, we identify an Mtb-encoded isonitrile hydratase, which can possibly mediate intracellular zinc release through covalent modification of the isonitrile group of kupyaphores. Mtb clinical strains also produce kupyaphores during early passages. Our study thus uncovers a previously unknown zinc acquisition strategy of Mtb that could modulate host-pathogen interactions and disease outcome.


Subject(s)
Lipopeptides/metabolism , Mycobacterium tuberculosis/metabolism , Zinc/metabolism , Animals , Bacterial Proteins/metabolism , Biological Transport , Chelating Agents/metabolism , Disease Models, Animal , Homeostasis , Host-Pathogen Interactions , Metals/metabolism , Mice , Mice, Inbred BALB C , Mycobacterium tuberculosis/growth & development , Siderophores/metabolism , Tuberculosis/microbiology
6.
Mol Inform ; 41(2): e2100178, 2022 02.
Article in English | MEDLINE | ID: mdl-34633768

ABSTRACT

Recent fragment-based drug design efforts have generated huge amounts of information on water and small molecule fragment binding sites on SARS-CoV-2 Mpro and preference of the sites for various types of chemical moieties. However, this information has not been effectively utilized to develop automated tools for in silico drug discovery which are routinely used for screening large compound libraries. Utilization of this information in the development of pharmacophore models can help in bridging this gap. In this study, information on water and small molecule fragments bound to Mpro has been utilized to develop a novel Water Pharmacophore (Waterphore) model. The Waterphore model can also implicitly represent the conformational flexibilities of binding pockets in terms of pharmacophore features. The Waterphore model derived from 173 apo- or small molecule fragment-bound structures of Mpro has been validated by using a dataset of 68 known bioactive inhibitors and 78 crystal structure bound inhibitors of SARS-CoV-2 Mpro. It is encouraging to note that, even though no inhibitor data has been used in developing the Waterphore model, it could successfully identify the known inhibitors from a library of decoys with a ROC-AUC of 0.81 and active hit rate (AHR) of 70%. The Waterphore model is also general enough for potential applications for other drug targets.


Subject(s)
Antiviral Agents/chemistry , Coronavirus 3C Proteases/antagonists & inhibitors , Protease Inhibitors , SARS-CoV-2/drug effects , COVID-19 , Humans , Molecular Docking Simulation , Protease Inhibitors/chemistry , Water
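
A ROC-AUC for separating known inhibitors from decoys can be read as the probability that a randomly chosen active outscores a randomly chosen decoy. The sketch below computes it by direct pairwise comparison, with ties counted as half. It is a generic illustration of the metric, not the Waterphore scoring code, and the scores in the test are made up.

```python
def roc_auc(scores_active, scores_decoy):
    """ROC-AUC as the fraction of (active, decoy) pairs ranked correctly.

    Higher scores are assumed to mean "more likely active"; ties count 0.5.
    """
    if not scores_active or not scores_decoy:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for a in scores_active:
        for d in scores_decoy:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(scores_active) * len(scores_decoy))
```

The pairwise form is quadratic in the number of compounds, which is fine for a few hundred molecules. For large libraries a rank-based formulation would be the usual choice.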
7.
J Mol Biol ; 433(11): 166887, 2021 05 28.
Article in English | MEDLINE | ID: mdl-33972022

ABSTRACT

RiPPMiner-Genome is a unique bioinformatics resource for identifying Biosynthetic Gene Clusters (BGC) for RiPPs (Ribosomally Synthesized and Post-translationally Modified Peptides) and automated prediction of crosslinked chemical structures of RiPPs starting from genomic sequences. It is a major update of the RiPPMiner webserver, which used only peptide sequence of RiPP precursors as input for predicting RiPP class and crosslinked chemical structures. Other major improvements are, machine learning (ML) based identification of correct RiPP precursor peptide from among multiple small ORFs (Open Reading Frames) in a BGC, prediction of the cleavage site and cross-links in thiopeptides and identification of non-crosslinked modified residues in lanthipeptides. It has been benchmarked on a dataset of 204 experimentally characterized RiPP BGCs. RiPPMiner-Genome also facilitates visualization of the RiPP BGCs and depiction of the chemical structure of crosslinked RiPP. It also has an interface for searching characterized RiPPs, similar to the predicted core peptide sequence or crosslinked chemical structure.


Subject(s)
Cross-Linking Reagents/chemistry , Data Mining , Genome, Bacterial , Internet , Peptides/metabolism , Protein Processing, Post-Translational , Ribosomes/metabolism , Software , Automation , Base Sequence , Lactococcus/genetics , Machine Learning , Reproducibility of Results
8.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33839740

ABSTRACT

Small molecule modulators of protein-protein interactions (PPIs) are being pursued as novel anticancer, antiviral and antimicrobial drug candidates. We have utilized a large data set of experimentally validated PPI modulators and developed machine learning classifiers for prediction of new small molecule modulators of PPI. Our analysis reveals that using random forest (RF) classifier, general PPI Modulators independent of PPI family can be predicted with ROC-AUC higher than 0.9, when training and test sets are generated by random split. The performance of the classifier on data sets very different from those used in training has also been estimated by using different state of the art protocols for removing various types of bias in division of data into training and test sets. The family-specific PPIM predictors developed in this work for 11 clinically important PPI families also have prediction accuracies of above 90% in majority of the cases. All these ML-based predictors have been implemented in a freely available software named SMMPPI for prediction of small molecule modulators for clinically relevant PPIs like RBD:hACE2, Bromodomain_Histone, BCL2-Like_BAX/BAK, LEDGF_IN, LFA_ICAM, MDM2-Like_P53, RAS_SOS1, XIAP_Smac, WDR5_MLL1, KEAP1_NRF2 and CD4_gp120. We have identified novel chemical scaffolds as inhibitors for RBD_hACE2 PPI involved in host cell entry of SARS-CoV-2. Docking studies for some of the compounds reveal that they can inhibit RBD_hACE2 interaction by high affinity binding to interaction hotspots on RBD. Some of these new scaffolds have also been found in SARS-CoV-2 viral growth inhibitors reported recently; however, it is not known if these molecules inhibit the entry phase.


Subject(s)
Angiotensin-Converting Enzyme 2/antagonists & inhibitors , Machine Learning , Protein Interaction Maps , Proteins/metabolism , SARS-CoV-2/metabolism , Angiotensin-Converting Enzyme 2/metabolism , Humans
9.
Bioinformatics ; 37(5): 603-611, 2021 05 05.
Article in English | MEDLINE | ID: mdl-33010151

ABSTRACT

MOTIVATION: Even though genome mining tools have successfully identified large numbers of non-ribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) biosynthetic gene clusters (BGCs) in bacterial genomes, currently no tool can predict the chemical structure of the secondary metabolites biosynthesized by these BGCs. Lack of algorithms for predicting complex macrocyclization patterns of linear PK/NRP biosynthetic intermediates has been the major bottleneck in deciphering the final bioactive chemical structures of PKs/NRPs by genome mining. RESULTS: Using a large dataset of known chemical structures of macrocyclized PKs/NRPs, we have developed a machine learning (ML) algorithm for distinguishing the correct macrocyclization pattern of PKs/NRPs from the library of all theoretically possible cyclization patterns. Benchmarking of this ML classifier on completely independent datasets has revealed ROC-AUC and PR-AUC values of 0.82 and 0.81, respectively. This cyclization prediction algorithm has been used to develop SBSPKSv3, a genome mining tool for completely automated prediction of macrocyclized structures of NRPs/PKs. SBSPKSv3 has been extensively benchmarked on a dataset of over 100 BGCs with known PKs/NRPs products. AVAILABILITY AND IMPLEMENTATION: The macrocyclization prediction pipeline and all the datasets used in this study are freely available at http://www.nii.ac.in/sbspks3.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Polyketides , Machine Learning , Multigene Family , Peptide Synthases/genetics , Peptides , Polyketide Synthases/genetics
10.
Nat Commun ; 10(1): 1231, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30874556

ABSTRACT

The Mycobacterium tuberculosis kinase PknB is essential for growth and survival of the pathogen in vitro and in vivo. Here we report the results of our efforts to elucidate the mechanism of regulation of PknB activity. The specific residues in the PknB extracytoplasmic domain that are essential for ligand interaction and survival of the bacterium are identified. The extracytoplasmic domain interacts with mDAP-containing LipidII, and this is abolished upon mutation of the ligand-interacting residues. Abrogation of ligand-binding or sequestration of the ligand leads to aberrant localization of PknB. Contrary to the prevailing hypothesis, abrogation of ligand-binding is linked to activation loop hyperphosphorylation, and indiscriminate hyperphosphorylation of PknB substrates as well as other proteins, ultimately causing loss of homeostasis and cell death. We propose that the ligand-kinase interaction directs the appropriate localization of the kinase, coupled to stringently controlled activation of PknB, and consequently the downstream processes thereof.


Subject(s)
Mycobacterium tuberculosis/physiology , Phosphorylation/physiology , Protein Domains/genetics , Protein Serine-Threonine Kinases/metabolism , Uridine Diphosphate N-Acetylmuramic Acid/analogs & derivatives , Homeostasis/physiology , Ligands , Mutation , Protein Binding/genetics , Protein Serine-Threonine Kinases/genetics , Uridine Diphosphate N-Acetylmuramic Acid/metabolism
11.
BMC Struct Biol ; 18(1): 19, 2018 12 19.
Article in English | MEDLINE | ID: mdl-30563492

ABSTRACT

BACKGROUND: Antibody, the primary effector molecule of the immune system, evolves after initial encounter with the antigen from a precursor form to a mature one to effectively deal with the antigen. Antibodies of a lineage diverge through antigen-directed isolated pathways of maturation to exhibit distinct recognition potential. In the context of evolution in immune recognition, diversity of antigen cannot be ignored. While there are reports on antibody lineage, structural perspective with respect to diverse recognition potential in a lineage has never been studied. Hence, it is crucial to evaluate how maturation leads to topological tailoring within a lineage enabling them to interact with significantly distinct antigens. RESULTS: A data-driven approach was undertaken for the study. Global experimental mouse and human antibody-antigen complex structures from PDB were compiled into a coherent database of germline-linked antibodies bound with distinct antigens. Structural analysis of all lineages showed variations in CDRs of both H and L chains. Observations of conformational adaptation made from analysis of static structures were further evaluated by characterizing dynamics of interaction in two lineages, mouse VH1-84 and human VH5-51. Sequence and structure analysis of the lineages explained that somatic mutations altered the geometries of individual antibodies with common structural constraints in some CDRs. Additionally, conformational landscape obtained from molecular dynamics simulations revealed that incoming pathogen led to further conformational divergence in the paratope (as observed across datasets) even while maintaining similar overall backbone topology. MM-GB/SA analysis showed binding energies to be in physiological range. Results of the study are coherent with experimental observations. CONCLUSIONS: The findings of this study highlight basic structural principles shaping the molecular evolution of a lineage for significantly diverse antigens. 
Antibodies of a lineage follow different developmental pathways while preserving the imprint of the germline. From the study, it can be generalized that structural diversification of the paratope is an outcome of natural selection of a conformation from an available ensemble, which is further optimized for antigen interaction. The study establishes that starting from a common lineage, antibodies can mature to recognize a wide range of antigens. This hypothesis can be further tested and validated experimentally.


Subject(s)
Antibodies/immunology , Antigen-Antibody Complex/chemistry , Amino Acid Sequence , Animals , Antibodies/chemistry , Antibodies/genetics , Databases, Protein , Humans , Mice , Molecular Dynamics Simulation , Protein Structure, Tertiary , Sequence Alignment , Thermodynamics
12.
J Biomol Struct Dyn ; 36(1): 139-151, 2018 01.
Article in English | MEDLINE | ID: mdl-27928938

ABSTRACT

miRNA biogenesis is a multistage process for the generation of a mature miRNA and involves several different proteins. In this work, we have carried out both sequence- and structure-based analysis for crucial proteins involved in miRNA biogenesis, namely Dicer, Drosha, Argonaute (Ago), and Exportin-5 to understand evolution of these proteins in animal kingdom and also to identify key sequence and structural features that are determinants of their function. Our analysis reveals that in animals the miRNA biogenesis pathway first originated in molluscs. The phylogeny of Dicer and Ago indicated evolution through gene duplication followed by sequence divergence that resulted in functional divergence. Our detailed structural analysis also revealed that RIIIDb domains of Drosha and Dicer, share significant similarity in sequence, structure, and substrate-binding pocket. On the other hand, PAZ domains of Dicer and Ago show only conservation of the substrate-binding pockets in the catalytic sites despite significant divergence in sequence and overall structure. Based on a comparative structural analysis of all four human Ago proteins (hAgo1-4) and their known biochemical activity, we have also attempted to identify key residues in Ago2 which are responsible for the unique slicer activity of hAgo2 among all isoforms. We have identified six key residues in N domain of hAgo2, which are located far away from the catalytic pocket, but might be playing a major role in slicer activity of hAgo2 protein because of their involvement in mRNA binding.


Subject(s)
Argonaute Proteins/genetics , DEAD-box RNA Helicases/genetics , Karyopherins/genetics , MicroRNAs/genetics , Ribonuclease III/genetics , Amino Acid Sequence , Animals , Argonaute Proteins/classification , Argonaute Proteins/metabolism , Base Sequence , Binding Sites , DEAD-box RNA Helicases/classification , DEAD-box RNA Helicases/metabolism , Evolution, Molecular , Humans , Karyopherins/classification , Karyopherins/metabolism , MicroRNAs/metabolism , Phylogeny , Protein Binding , Ribonuclease III/classification , Ribonuclease III/metabolism , Sequence Homology, Amino Acid
13.
Nucleic Acids Res ; 45(W1): W80-W88, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28499008

ABSTRACT

Ribosomally synthesized and post-translationally modified peptides (RiPPs) constitute a rapidly growing class of natural products with diverse structures and bioactivities. We have developed RiPPMiner, a novel bioinformatics resource for deciphering chemical structures of RiPPs by genome mining. RiPPMiner derives its predictive power from machine learning based classifiers, trained using a well curated database of more than 500 experimentally characterized RiPPs. RiPPMiner uses Support Vector Machine to distinguish RiPP precursors from other small proteins and classify the precursors into 12 sub-classes of RiPPs. For classes like lanthipeptide, cyanobactin, lasso peptide and thiopeptide, RiPPMiner can predict leader cleavage site and complex cross-links between post-translationally modified residues starting from genome sequences. RiPPMiner can identify correct cross-link pattern in a core peptide from among a very large number of combinatorial possibilities. Benchmarking of prediction accuracy of RiPPMiner on a large lanthipeptide dataset indicated high sensitivity, specificity, accuracy and precision. RiPPMiner also provides interfaces for visualization of the chemical structure, downloading of simplified molecular-input line-entry system and searching for RiPPs having similar sequences or chemical structures. The backend database of RiPPMiner provides information about modification system, precursor sequence, leader and core sequence, modified residues, cross-links and gene cluster for more than 500 experimentally characterized RiPPs. RiPPMiner is available at http://www.nii.ac.in/rippminer.html.


Subject(s)
Peptides/chemistry , Peptides/metabolism , Protein Processing, Post-Translational , Software , Computational Biology , Internet , Machine Learning , Peptides/classification , RNA Cleavage , Sequence Homology, Amino Acid , Support Vector Machine
14.
Nucleic Acids Res ; 45(W1): W72-W79, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28460065

ABSTRACT

Genome guided discovery of novel natural products has been a promising approach for identification of new bioactive compounds. SBSPKS web-server has been a valuable resource for analysis of polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) gene clusters. We have developed an updated version - SBSPKSv2 which is based on comprehensive analysis of sequence, structure and secondary metabolite chemical structure data from 311 experimentally characterized PKS/NRPS gene clusters with known biosynthetic products. A completely new feature of SBSPKSv2 is the inclusion of features for search in chemical space. It allows the user to compare the chemical structure of a given secondary metabolite to the chemical structures of biosynthetic intermediates and final products. For identification of catalytic domains, SBSPKS now uses profile based searches, which are computationally faster and have high sensitivity. HMM profiles have also been added for a number of new domains and motif information has been used for distinguishing condensation (C), epimerization (E) and cyclization (Cy) domains of NRPS. In summary, the new and updated SBSPKSv2 is a versatile tool for genome mining and analysis of polyketide and non-ribosomal peptide biosynthetic pathways in chemical space. The server is available at: http://www.nii.ac.in/sbspks2.html.


Subject(s)
Peptide Synthases/chemistry , Polyketide Synthases/chemistry , Software , Biosynthetic Pathways/genetics , Catalytic Domain , Genomics , Internet , Peptide Synthases/genetics , Polyketide Synthases/genetics , Secondary Metabolism/genetics , Sequence Analysis
15.
Biochemistry ; 56(5): 723-735, 2017 02 07.
Article in English | MEDLINE | ID: mdl-28076679

ABSTRACT

LIN28 protein inhibits biogenesis of miRNAs belonging to the let-7 family by binding to precursor forms of miRNAs. Overexpression of LIN28 and low levels of let-7 miRNAs are associated with several forms of cancer cells. We have performed multiple explicit solvent molecular dynamics simulations ranging from 200 to 500 ns in length on different isoforms of preE-let-7 in complex with LIN28 and also in isolation to identify structural features and key specificity-determining residues (SDRs) that are important for the inhibitory role of LIN28. Our simulations suggest that a conserved structural feature of the loop regions of preE-let-7 miRNAs is more important for LIN28 recognition than sequence conservation among members of the let-7 family or the presence of the GGAG motif in the 3' region. The loop region consisting of a minimum of five nucleotides helps pre-miRNAs to acquire a conformation ideal for binding to LIN28, but pre-let-7c-2 prefers a conformation with a three-nucleotide loop. Thus, our simulations provide a theoretical rationale for the recent experimental observation of the escape of LIN28-mediated repression by pre-let-7c-2. The essential structural and sequence features highlighted in this study might aid in designing synthetic small molecule inhibitors for modulating LIN28-let-7 interaction in malignant cells. We have also identified crucial SDRs of the LIN28-preE-let-7 complex involving 13 residues of LIN28 and 10 residues of the pre-miRNA. On the basis of the conservation profile of these 13 SDRs, we have identified 10 novel proteins that are not annotated as LIN28 like but are similar in sequence, domain, or fold level to LIN28.


Subject(s)
MicroRNAs/chemistry , Molecular Dynamics Simulation , Nucleotides/chemistry , RNA Precursors/chemistry , RNA-Binding Proteins/chemistry , Amino Acid Sequence , Base Sequence , Binding Sites , Conserved Sequence , Gene Expression , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Nucleic Acid Conformation , Nucleotides/metabolism , Protein Binding , RNA Precursors/genetics , RNA Precursors/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Sequence Alignment
16.
Biol Direct ; 11(1): 48, 2016 Sep 21.
Article in English | MEDLINE | ID: mdl-27655048

ABSTRACT

BACKGROUND: PDZ domains recognize short sequence stretches usually present at the C-terminus of their interaction partners. Because of the involvement of PDZ domains in many important biological processes, several attempts have been made for developing bioinformatics tools for genome-wide identification of PDZ interaction networks. Currently available tools for prediction of interaction partners of PDZ domains utilize a machine learning approach. Since they have been trained using experimental substrate specificity data for specific PDZ families, their applicability is limited to PDZ families closely related to the training set. These tools also do not allow analysis of PDZ-peptide interaction interfaces. RESULTS: We have used a structure based approach to develop modPDZpep, a program to predict the interaction partners of human PDZ domains and analyze structural details of PDZ interaction interfaces. modPDZpep predicts interaction partners by using structural models of PDZ-peptide complexes and evaluating binding energy scores using residue based statistical pair potentials. Since it does not require training using experimental data on peptide binding affinity, it can predict substrates for diverse PDZ families. Because of the use of a simple scoring function for binding energy, it is also fast enough for genome scale structure based analysis of PDZ interaction networks. Benchmarking using artificial as well as real negative datasets indicates good predictive power with ROC-AUC values in the range of 0.7 to 0.9 for a large number of human PDZ domains. Another novel feature of modPDZpep is its ability to map novel PDZ mediated interactions in human protein-protein interaction networks, either by utilizing available experimental phage display data or by structure based predictions. CONCLUSIONS: In summary, we have developed modPDZpep, a web-server for structure based analysis of human PDZ domains. 
It is freely available at http://www.nii.ac.in/modPDZpep.html or http://202.54.226.235/modPDZpep.html . REVIEWERS: This article was reviewed by Michael Gromiha and Zoltán Gáspári.
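
The ROC-AUC values quoted in the abstract above can be read as the probability that a randomly chosen true binder scores better than a randomly chosen non-binder. The sketch below illustrates that general definition only; it is not the modPDZpep implementation. The function name `roc_auc` and the example scores are hypothetical, and it assumes that a higher score means a more likely binder, so binding energies, where lower is better, would have to be negated first.

```python
def roc_auc(scores_pos, scores_neg):
    """Pairwise (Mann-Whitney) estimate of ROC-AUC.

    scores_pos: scores of known binders (hypothetical input).
    scores_neg: scores of known non-binders (hypothetical input).
    Assumes higher score = more likely binder; ties count as half a win.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Made-up scores, used only to show the call; not data from the paper.
    print(roc_auc([0.9, 0.4], [0.5, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of binders from non-binders.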

17.
Sci Rep ; 6: 31418, 2016 08 16.
Article in English | MEDLINE | ID: mdl-27526776

ABSTRACT

Protein-protein interactions mediated by phosphotyrosine binding (PTB) domains play a crucial role in various cellular processes. To understand the structural basis of substrate recognition by PTB domains, multiple explicit-solvent atomistic simulations of 100 ns duration have been carried out on 6 PTB-peptide complexes with known binding affinities. MM/PBSA binding energy values calculated from these MD trajectories and residue-based statistical pair potential scores show good correlation with the experimental dissociation constants. Our analysis also shows that modeled structures of PTB domains can be used to develop less compute-intensive, residue-level statistical pair potential approaches for predicting interaction partners of PTB domains.
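
Computed binding energies are usually compared with experimental dissociation constants on a free-energy scale, using the standard relation ΔG = RT ln(Kd / 1 M). The sketch below shows that conversion. The function name and the default temperature of 298.15 K are assumptions for illustration; the abstract does not state the experimental conditions or how the authors performed the comparison.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_to_delta_g(kd_molar, temperature_k=298.15):
    """Convert a dissociation constant (in mol/L) to a standard binding
    free energy in kcal/mol, via dG = RT * ln(Kd / 1 M).

    temperature_k is an assumed default, not a value from the source.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    # A 1 micromolar binder, chosen only as an example value.
    print(round(kd_to_delta_g(1e-6), 2))
```

Tighter binders (smaller Kd) give more negative ΔG, so a good scoring function should rank complexes in the same order as these values.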


Subject(s)
Phosphotyrosine/metabolism , Protein Domains , Models, Molecular , Molecular Dynamics Simulation , Protein Binding , Substrate Specificity
18.
Synth Syst Biotechnol ; 1(2): 80-88, 2016 Jun.
Article in English | MEDLINE | ID: mdl-29062931

ABSTRACT

In silico methods for linking genomic space to chemical space have played a crucial role in genomics-driven discovery of new natural products as well as biosynthesis of altered natural products by engineering of biosynthetic pathways. Here we give an overview of available computational tools and then briefly describe a novel computational framework, namely retro-biosynthetic enumeration of biosynthetic reactions, which can add to the repertoire of computational tools available for connecting natural products to their biosynthetic gene clusters. Most of the currently available bioinformatics tools for analysis of secondary metabolite biosynthetic gene clusters use the "Genes to Metabolites" approach. In contrast, the "Metabolites to Genes" or retro-biosynthetic approach involves enumerating the biochemical transformations or enzymatic reactions that would generate a given chemical moiety starting from a set of precursor molecules, and then identifying enzymatic domains that can potentially catalyze the enumerated transformations. In this article, we first give a brief overview of the presently available in silico tools and approaches for analysis of secondary metabolite biosynthetic pathways. We also discuss our preliminary work on development of algorithms for retro-biosynthetic enumeration of biochemical transformations to formulate a novel computational method for identifying genes associated with biosynthesis of a given polyketide or nonribosomal peptide.

19.
Biochemistry ; 54(33): 5209-24, 2015 Aug 25.
Article in English | MEDLINE | ID: mdl-26249842

ABSTRACT

The Fic domain was recently shown to catalyze AMPylation, the transfer of AMP from ATP to hydroxyl side chains of diverse eukaryotic proteins ranging from Rho GTPases to the chaperone BiP. We have carried out a series of explicit-solvent molecular dynamics (MD) simulations of up to 1 µs duration on the apo, holo, and substrate/product-bound IbpA Fic domain (IbpAFic2). Simulations on holo-IbpAFic2 revealed that binding of Mg²⁺ to the α and β phosphates is crucial for preserving catalytically important contacts involving ATP. Comparative analysis of the MD trajectories demonstrated how binding of ATP allosterically induces conformational changes in the distal switch II binding region of Fic domains, thereby aiding substrate recognition. Our simulations have also identified crucial aromatic-aromatic interactions that stabilize the orientation of the catalytic histidine for inline nucleophilic attack during AMPylation, thus providing a structural basis for the evolutionary conservation of these aromatic residue pairs in Fic domains. By analyzing interface residue pairs whose interactions persist over the microsecond trajectory, we identified a tetrapeptide stretch involved in substrate recognition. A structure-based genome-wide search revealed a distinct conservation pattern for this segment in different Fic subfamilies, further supporting its proposed role in substrate recognition. In addition, combined use of simulations and phylogenetic analysis has helped in the discovery of a new subfamily of Fic proteins that harbor a conserved Lys/Arg in place of the inhibitory Glu of the regulatory helix. We propose the novel possibility of auto-enhancement of AMPylation activity in this new subfamily via movement of the regulatory helix, in contrast to the auto-inhibition seen in most Fic proteins.


Subject(s)
Heat-Shock Proteins/chemistry , Heat-Shock Proteins/metabolism , Molecular Dynamics Simulation , Phylogeny , Protein Processing, Post-Translational , Adenosine Monophosphate/metabolism , Adenosine Triphosphate/metabolism , Amino Acid Motifs , Apoenzymes/antagonists & inhibitors , Apoenzymes/chemistry , Apoenzymes/metabolism , Biocatalysis , Conserved Sequence , Heat-Shock Proteins/antagonists & inhibitors , Magnesium/metabolism , Protein Structure, Tertiary , Solvents/chemistry
20.
Sci Rep ; 5: 10804, 2015 Jun 03.
Article in English | MEDLINE | ID: mdl-26039278

ABSTRACT

AMPylation is a novel post-translational modification (PTM) involving covalent attachment of an AMP moiety to threonine/tyrosine side chains of a protein. AMPylating enzymes belonging to three different families, namely Fic/Doc, GS-ATase and DrrA, have been experimentally characterized. The involvement of these novel enzymes in a myriad of biological processes makes them interesting candidates for genome-wide search. We have used SVM and HMM to develop a computational protocol for identification of AMPylation domains and their classification into various functional subfamilies catalyzing AMPylation, deAMPylation, phosphorylation and phosphocholine transfer. Our analysis has not only identified novel PTM-catalyzing enzymes among unannotated proteins, but has also revealed how this novel enzyme family has evolved to generate functional diversity through subtle changes in the sequences/structures of the proteins. Phylogenetic analysis of Fic/Doc has revealed three new isofunctional subfamilies, thus adding to their functional divergence. Also, the frequent occurrence of Fic/Doc proteins on highly mobile and unstable genomic islands indicates their evolution via extensive horizontal gene transfer. On the other hand, phylogenetic analyses indicate lateral evolution of the GS-ATase family and an early duplication event responsible for the AMPylation and deAMPylation activities of GS-ATase. Our analysis also reveals the molecular basis of substrate specificity of DrrA proteins.


Subject(s)
Adenosine Monophosphate/metabolism , Biological Evolution , Protein Processing, Post-Translational , Amino Acid Motifs , Conserved Sequence , Gene Transfer, Horizontal , Genomic Islands , Genomics/methods , Humans , Markov Chains , Models, Molecular , Multigene Family , Phylogeny , Protein Conformation , Protein Interaction Domains and Motifs , Substrate Specificity , Support Vector Machine