1.
Int J Mol Sci ; 25(3)2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38338720

ABSTRACT

Estrogens play critical roles in embryonic development, gonadal sex differentiation, behavior, and reproduction in vertebrates and in several human cancers. Estrogens are synthesized from testosterone and androstenedione by the endoplasmic reticulum membrane-bound P450 aromatase/cytochrome P450 oxidoreductase complex (CYP19/CPR). Here, we report the characterization of novel mammalian CYP19 isoforms encoded by CYP19 gene copies. These CYP19 isoforms are all defined by a combination of mutations in the N-terminal transmembrane helix (E42K, D43N) and in helix C of the catalytic domain (P146T, F147Y). The mutant CYP19 isoforms show increased androgen conversion due to the KN transmembrane helix. In addition, the TY substitutions in helix C result in a substrate preference for androstenedione. Our structural models suggest that CYP19 mutants may interact differently with the membrane (affecting substrate uptake) and with CPR (affecting electron transfer), providing structural clues for the catalytic differences.


Subject(s)
Aromatase , Animals , Female , Humans , Pregnancy , Amino Acids , Androstenedione , Aromatase/genetics , Aromatase/metabolism , Estrogens/metabolism , Mammals/metabolism , Protein Isoforms , Protein Structure, Tertiary/genetics , Protein Structure, Secondary/genetics
2.
J Virol ; 97(6): e0046523, 2023 06 29.
Article in English | MEDLINE | ID: mdl-37199624

ABSTRACT

Coronavirus genome replication and expression are mediated by the viral replication-transcription complex (RTC) which is assembled from multiple nonstructural proteins (nsp). Among these, nsp12 represents the central functional subunit. It harbors the RNA-directed RNA polymerase (RdRp) domain and contains, at its N terminus, an additional domain called NiRAN which is widely conserved in coronaviruses and other nidoviruses. In this study, we produced bacterially expressed coronavirus nsp12s to investigate and compare NiRAN-mediated NMPylation activities from representative alpha- and betacoronaviruses. We found that the four coronavirus NiRAN domains characterized to date have a number of conserved properties, including (i) robust nsp9-specific NMPylation activities that appear to operate largely independently of the C-terminal RdRp domain, (ii) nucleotide substrate preference for UTP followed by ATP and other nucleotides, (iii) dependence on divalent metal ions, with Mn2+ being preferred over Mg2+, and (iv) a key role of N-terminal residues (particularly Asn2) of nsp9 for efficient formation of a covalent phosphoramidate bond between NMP and the N-terminal amino group of nsp9. In this context, a mutational analysis confirmed the conservation and critical role of Asn2 across different subfamilies of the family Coronaviridae, as shown by studies using chimeric coronavirus nsp9 variants in which six N-terminal residues were replaced with those from other corona-, pito- and letovirus nsp9 homologs. The combined data of this and previous studies reveal a remarkable degree of conservation among coronavirus NiRAN-mediated NMPylation activities, supporting a key role of this enzymatic activity in viral RNA synthesis and processing. 
IMPORTANCE There is strong evidence that coronaviruses and other large nidoviruses evolved a number of unique enzymatic activities, including an additional RdRp-associated NiRAN domain, that are conserved in nidoviruses but not in most other RNA viruses. Previous studies of the NiRAN domain mainly focused on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and suggested different functions for this domain, such as NMPylation/RNAylation of nsp9, RNA guanylyltransferase activities involved in canonical and/or unconventional RNA capping pathways, and other functions. To help resolve partly conflicting information on substrate specificities and metal ion requirements reported previously for the SARS-CoV-2 NiRAN NMPylation activity, we extended these earlier studies by characterizing representative alpha- and betacoronavirus NiRAN domains. The study revealed that key features of NiRAN-mediated NMPylation activities, such as protein and nucleotide specificity and metal ion requirements, are very well conserved among genetically divergent coronaviruses, suggesting potential avenues for future antiviral drug development targeting this essential viral enzyme.


Subject(s)
Coronaviridae , Protein Domains , RNA-Dependent RNA Polymerase , Humans , Nucleotides/metabolism , RNA, Viral/metabolism , RNA-Dependent RNA Polymerase/genetics , RNA-Dependent RNA Polymerase/metabolism , SARS-CoV-2/enzymology , Viral Nonstructural Proteins/metabolism , Coronaviridae/enzymology , Coronaviridae/genetics , Protein Domains/physiology , Viral Proteins/metabolism , Conserved Sequence , Protein Structure, Secondary/genetics , Vero Cells
3.
PLoS One ; 18(3): e0282741, 2023.
Article in English | MEDLINE | ID: mdl-36952491

ABSTRACT

The interaction between human Growth Hormone (hGH) and hGH Receptor (hGHR) has basic relevance to cancer and growth disorders, and hGH is the scaffold for Pegvisomant, an anti-acromegaly therapeutic. For the latter reason, hGH has been extensively engineered by early workers to improve binding and other properties. We are particularly interested in E174, which belongs to the hGH zinc-binding triad; the substitution E174A is known to significantly increase binding, but until now no explanation has been offered. We generated this and several computationally selected single-residue substitutions at the hGHR-binding site of hGH. We find that, while many successfully slow down dissociation of the hGH-hGHR complex once bound, they also slow down the association of hGH with hGHR. The E174A substitution induces a change in the Circular Dichroism spectrum that suggests the appearance of coiled-coiling. Here we show that E174A increases the affinity of hGH for hGHR because the off-rate is slowed down more than the on-rate. For E174Y (and certain mutations at other sites) the slowdown in on-rate was greater than that of the off-rate, leading to decreased affinity. The results point to a link between structure, zinc binding, and hGHR-binding affinity in hGH.
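The on-rate/off-rate argument in this abstract can be illustrated with a minimal sketch. It assumes a simple 1:1 binding model in which the equilibrium dissociation constant is K_D = k_off / k_on; the abstract does not state which model the authors fitted. The function names and all numerical values are hypothetical and are not data from the study.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """Return K_D = k_off / k_on for a simple 1:1 binding model (assumed here)."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


def fold_change_in_kd(on_slowdown: float, off_slowdown: float) -> float:
    """Fold change in K_D when k_on is divided by on_slowdown and
    k_off is divided by off_slowdown.

    A value below 1 means tighter binding (higher affinity);
    a value above 1 means weaker binding.
    """
    if on_slowdown <= 0 or off_slowdown <= 0:
        raise ValueError("slowdown factors must be positive")
    return on_slowdown / off_slowdown


# Hypothetical illustration: the off-rate slows more than the on-rate,
# so K_D falls and affinity rises (the pattern described for E174A).
tighter = fold_change_in_kd(on_slowdown=2.0, off_slowdown=8.0)

# Hypothetical illustration: the on-rate slows more than the off-rate,
# so K_D rises and affinity falls (the pattern described for E174Y).
weaker = fold_change_in_kd(on_slowdown=8.0, off_slowdown=2.0)
```

Under this assumption, only the ratio of the two slowdown factors matters: a substitution can slow association and still increase affinity, provided it slows dissociation by a larger factor.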


Subject(s)
Human Growth Hormone , Human Growth Hormone/chemistry , Human Growth Hormone/genetics , Human Growth Hormone/metabolism , Humans , Amino Acid Substitution , Protein Binding/genetics , Receptors, Somatotropin/metabolism , Protein Structure, Secondary/genetics , Alanine/chemistry , Alanine/genetics , Glutamic Acid/chemistry , Glutamic Acid/genetics , Zinc/chemistry , Conserved Sequence , Amino Acid Sequence
4.
J Biol Chem ; 299(1): 102777, 2023 01.
Article in English | MEDLINE | ID: mdl-36496072

ABSTRACT

Long QT syndrome (LQTS) is a human inherited heart condition that can cause life-threatening arrhythmia including sudden cardiac death. Mutations in the ubiquitous Ca2+-sensing protein calmodulin (CaM) are associated with LQTS, but the molecular mechanism by which these mutations lead to irregular heartbeats is not fully understood. Here, we use a multidisciplinary approach including protein biophysics, structural biology, confocal imaging, and patch-clamp electrophysiology to determine the effect of the disease-associated CaM mutation E140G on CaM structure and function. We present novel data showing that mutant-regulated CaMKIIδ kinase activity is impaired with a significant reduction in enzyme autophosphorylation rate. We report the first high-resolution crystal structure of a LQTS-associated CaM variant in complex with the CaMKIIδ peptide, which shows significant structural differences, compared to the WT complex. Furthermore, we demonstrate that the E140G mutation significantly disrupted Cav1.2 Ca2+/CaM-dependent inactivation, while cardiac ryanodine receptor (RyR2) activity remained unaffected. In addition, we show that the LQTS-associated mutation alters CaM's Ca2+-binding characteristics, secondary structure content, and interaction with key partners involved in excitation-contraction coupling (CaMKIIδ, Cav1.2, RyR2). In conclusion, LQTS-associated CaM mutation E140G severely impacts the structure-function relationship of CaM and its regulation of CaMKIIδ and Cav1.2. This provides a crucial insight into the molecular factors contributing to CaM-mediated arrhythmias with a central role for CaMKIIδ.


Subject(s)
Calcium Channels, L-Type , Calcium-Calmodulin-Dependent Protein Kinase Type 2 , Calmodulin , Long QT Syndrome , Humans , Arrhythmias, Cardiac/genetics , Arrhythmias, Cardiac/physiopathology , Calcium/metabolism , Calcium Channels, L-Type/genetics , Calcium Channels, L-Type/metabolism , Calmodulin/genetics , Calmodulin/metabolism , Long QT Syndrome/genetics , Myocytes, Cardiac/metabolism , Ryanodine Receptor Calcium Release Channel/genetics , Ryanodine Receptor Calcium Release Channel/metabolism , Calcium-Calmodulin-Dependent Protein Kinase Type 2/genetics , Calcium-Calmodulin-Dependent Protein Kinase Type 2/metabolism , Mutation , Protein Structure, Secondary/genetics , Protein Binding/genetics , Crystallography
5.
Genes (Basel) ; 12(11)2021 10 21.
Article in English | MEDLINE | ID: mdl-34828263

ABSTRACT

PMM2-CDG is a rare disease that causes hypoglycosylation of multiple proteins, preventing their full functionality. So far, no direct genotype-phenotype correlations have been identified. We carried out a retrospective cohort study on 26 PMM2-CDG patients. We collected the identified genotype, as well as continuous variables indicating the disease severity (based on the Nijmegen Pediatric CDG Rating Score, or NPCRS) and dichotomous variables reflecting the patients' phenotype. The phenotypic effects of patients' genotype were studied using non-parametric and Chi-Square tests. Seventeen different pathogenic variants were studied. Variants with zero enzyme activity had no significant impact on the Nijmegen score. Pathogenic variants involving the stabilization/folding domain had a significantly lower total NPCRS (p = 0.017): presence of the p.Cys241Ser mutation was associated with a significantly lower subscore 1,3 and NPCRS (p = 0.04), and thus with a less severe phenotype. On the other hand, variants involving the dimerization domain, p.Pro113Leu and p.Phe119Leu, resulted in a significantly higher NPCRS (p = 0.002), which indicates a worse clinical course. These findings give better insight into the phenotypic prognosis of PMM2-CDG according to its molecular basis.


Subject(s)
Congenital Disorders of Glycosylation/genetics , Congenital Disorders of Glycosylation/pathology , Genetic Association Studies , Phosphotransferases (Phosphomutases)/deficiency , Adolescent , Adult , Belgium/epidemiology , Child , Child, Preschool , Congenital Disorders of Glycosylation/epidemiology , Female , Genotype , Humans , Infant , Male , Middle Aged , Models, Molecular , Mutation , Phenotype , Phosphotransferases (Phosphomutases)/chemistry , Phosphotransferases (Phosphomutases)/genetics , Protein Structure, Secondary/genetics , Retrospective Studies , Severity of Illness Index , United States/epidemiology , Young Adult
6.
Nat Commun ; 12(1): 5656, 2021 09 27.
Article in English | MEDLINE | ID: mdl-34580305

ABSTRACT

Glycosyltransferases (GTs) play fundamental roles in nearly all cellular processes through the biosynthesis of complex carbohydrates and glycosylation of diverse protein and small molecule substrates. The extensive structural and functional diversification of GTs presents a major challenge in mapping the relationships connecting sequence, structure, fold and function using traditional bioinformatics approaches. Here, we present a convolutional neural network with attention (CNN-attention) based deep learning model that leverages simple secondary structure representations generated from primary sequences to provide GT fold prediction with high accuracy. The model learns distinguishing secondary structure features free of primary sequence alignment constraints and is highly interpretable. It delineates sequence and structural features characteristic of individual fold types, while classifying them into distinct clusters that group evolutionarily divergent families based on shared secondary structural features. We further extend our model to classify GT families of unknown folds and variants of known folds. By identifying families that are likely to adopt novel folds such as GT91, GT96 and GT97, our studies expand the GT fold landscape and prioritize targets for future structural studies.


Subject(s)
Deep Learning , Glycosyltransferases/metabolism , Protein Folding , Amino Acid Sequence/genetics , Computational Biology/methods , Databases, Genetic , Datasets as Topic , Glycosylation , Glycosyltransferases/genetics , Protein Structure, Secondary/genetics , Protein Structure, Tertiary/genetics , Sequence Alignment
7.
Biochemistry ; 60(40): 3007-3015, 2021 10 12.
Article in English | MEDLINE | ID: mdl-34541851

ABSTRACT

Human Pumilio (hPUM) is a structurally well-analyzed RNA-binding protein that has been used recently for artificial RNA binding. Structural analysis revealed that amino acids at positions 12, 13, and 16 in the repeats from R1 to R8 each contact one specific RNA base in the eight-nucleotide RNA target. The functions of the N- and C-terminal flanking repeats R1' and R8', however, remain unclear. Here, we report how the repeats contribute to overall RNA binding. We first prepared three mutants in which R1' and/or R8' were deleted and then analyzed RNA binding using gel shift assays. The assays showed that all deletion mutants bound to their target less than the original hPUM, but that R1' contributed more than R8', unlike Drosophila PUM. We next investigated which amino acid residues of R1' or R8' were responsible for RNA binding. With detailed analysis of the protein tertiary structure, we found a hydrophobic core in each of the repeats. We therefore mutated all hydrophobic amino acid residues in each core to alanine. The gel shift assays with the resulting mutants revealed that both hydrophobic cores contributed to RNA binding, with the hydrophobic core of R1' having an especially significant influence. In the present study, we demonstrate that the flanking R1' and R8' repeats are indispensable for RNA binding of hPUM and suggest that hydrophobic R1'-R1 interactions may stabilize the whole hPUM structure.


Subject(s)
RNA-Binding Proteins/metabolism , RNA/metabolism , Amino Acid Sequence , Electrophoretic Mobility Shift Assay , Humans , Hydrophobic and Hydrophilic Interactions , Mutagenesis , Mutation , Protein Binding/genetics , Protein Domains/genetics , Protein Structure, Secondary/genetics , RNA/chemistry , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/genetics
8.
Nucleic Acids Res ; 49(16): 9496-9507, 2021 09 20.
Article in English | MEDLINE | ID: mdl-34403479

ABSTRACT

The recent discovery of the bona fide telomerase RNA (TR) from plants reveals conserved and unique secondary structure elements and the opportunity for new insight into the telomerase RNP. Here we examine how two highly conserved proteins previously implicated in Arabidopsis telomere maintenance, AtPOT1a and AtNAP57 (dyskerin), engage plant telomerase. We report that AtPOT1a associates with Arabidopsis telomerase via interaction with TERT. While loss of AtPOT1a does not impact AtTR stability, the templating domain is more accessible in pot1a mutants, supporting the conclusion that AtPOT1a stimulates telomerase activity but does not facilitate telomerase RNP assembly. We also show that, despite the absence of a canonical H/ACA binding motif within AtTR, dyskerin binds AtTR with high affinity and specificity in vitro via a plant-specific three-way junction (TWJ). A core element of the TWJ is the P1a stem, which unites the 5' and 3' ends of AtTR. P1a is required for dyskerin-mediated stimulation of telomerase repeat addition processivity in vitro, and for AtTR accumulation and telomerase activity in vivo. The deployment of vertebrate-like accessory proteins and unique RNA structural elements by Arabidopsis telomerase provides a new platform for exploring telomerase biogenesis and evolution.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Nuclear Proteins/genetics , RNA-Binding Proteins/genetics , RNA/genetics , Telomerase/genetics , Animals , Arabidopsis/growth & development , Phylogeny , Protein Structure, Secondary/genetics , Telomere/genetics , Telomere-Binding Proteins/genetics
9.
Front Endocrinol (Lausanne) ; 12: 698511, 2021.
Article in English | MEDLINE | ID: mdl-34220721

ABSTRACT

Strong efforts have been placed on understanding the physiological roles and therapeutic potential of the proglucagon peptide hormones including glucagon, GLP-1 and GLP-2. However, little is known about the extent and magnitude of variability in the amino acid composition of the proglucagon precursor and its mature peptides. Here, we identified 184 unique missense variants in the human proglucagon gene GCG obtained from exome and whole-genome sequencing of more than 450,000 individuals across diverse sub-populations. This provides an unprecedented source of population-wide genetic variation data on missense mutations and insights into the evolutionary constraint spectrum of proglucagon-derived peptides. We show that the stereotypical peptides glucagon, GLP-1 and GLP-2 display fewer evolutionary alterations and are more likely to be functionally affected by genetic variation compared to the rest of the gene products. Elucidating the spectrum of genetic variations and estimating the impact of how a peptide variant may influence human physiology and pathophysiology through changes in ligand binding and/or receptor signalling, are vital and serve as the first important step in understanding variability in glucose homeostasis, amino acid metabolism, intestinal epithelial growth, bone strength, appetite regulation, and other key physiological parameters controlled by these hormones.


Subject(s)
Glucagon-Like Peptides/genetics , Proglucagon/genetics , Amino Acid Sequence , DNA Mutational Analysis , Datasets as Topic , Gene Frequency , Glucagon/chemistry , Glucagon/genetics , Glucagon-Like Peptide 1/chemistry , Glucagon-Like Peptide 1/genetics , Glucagon-Like Peptide 2/chemistry , Glucagon-Like Peptide 2/genetics , Glucagon-Like Peptides/chemistry , Humans , Models, Molecular , Mutation, Missense , Pharmacogenomic Testing , Proglucagon/chemistry , Protein Precursors/chemistry , Protein Precursors/genetics , Protein Structure, Secondary/genetics
10.
Methods Mol Biol ; 2315: 99-110, 2021.
Article in English | MEDLINE | ID: mdl-34302672

ABSTRACT

Oligomers of G protein-coupled receptors (GPCRs) are closely related to their biochemical and biological functions and have been conserved during the course of molecular evolution. The mechanisms of GPCR interactions and the reason why GPCRs interact with one another have remained elusive. Accurate interface prediction is useful for generating guidelines for mutation and inhibition experiments and would accelerate investigations of the molecular mechanisms of GPCR oligomerization and signaling. We have developed a method to predict the interfaces for GPCR oligomerization. Our method detects clusters of conserved residues along the surfaces of transmembrane helices, using a multiple sequence alignment and a target GPCR or closely related structure. This chapter outlines our method and introduces some problems that occur with it, along with our future direction to extend the method to interface predictions for general membrane proteins.


Subject(s)
Computational Biology/methods , Membrane Proteins/chemistry , Protein Structure, Secondary/genetics , Receptors, G-Protein-Coupled/chemistry , Evolution, Molecular , Membrane Proteins/genetics , Mutation/genetics , Receptors, G-Protein-Coupled/genetics , Sequence Alignment , Signal Transduction/genetics
11.
FEBS J ; 288(19): 5768-5780, 2021 10.
Article in English | MEDLINE | ID: mdl-33843134

ABSTRACT

Mycophenolic acid (MPA) is a fungal natural product and first-line immunosuppressive drug for organ transplantations and autoimmune diseases. In the compartmentalized biosynthesis of MPA, the acyl-coenzyme A (CoA) hydrolase MpaH' located in peroxisomes catalyzes the highly specific hydrolysis of MPA-CoA to produce the final product MPA. The strict substrate specificity of MpaH' not only averts undesired hydrolysis of various cellular acyl-CoAs, but also prevents MPA-CoA from further peroxisomal β-oxidation catabolism. To elucidate the structural basis for this important property, in this study, we solve the crystal structures of the substrate-free form of MpaH' and the MpaH'S139A mutant in complex with the product MPA. The MpaH' structure reveals a canonical α/β-hydrolase fold with an unusually large cap domain and a rare location of the acidic residue D163 of catalytic triad after strand β6. MpaH' also forms an atypical dimer with the unique C-terminal helices α13 and α14 arming the cap domain of the other protomer and indirectly participating in the substrate binding. With these characteristics, we propose that MpaH' and its homologs form a new subfamily of α/β hydrolase fold protein. The crystal structure of MpaH'S139A/MPA complex and the modeled structure of MpaH'/MPA-CoA, together with the structure-guided mutagenesis analysis and isothermal titration calorimetry (ITC) measurements, provide important mechanistic insights into the high substrate specificity of MpaH'.


Subject(s)
Acyl Coenzyme A/chemistry , Hydrolases/ultrastructure , Mycophenolic Acid/metabolism , Peroxisomes/ultrastructure , Amino Acid Sequence/genetics , Catalytic Domain/genetics , Hydrolases/chemistry , Hydrolases/genetics , Mycophenolic Acid/chemistry , Penicillium/genetics , Penicillium/ultrastructure , Peroxisomes/enzymology , Protein Structure, Secondary/genetics , Substrate Specificity/genetics
12.
Proc Natl Acad Sci U S A ; 118(17)2021 04 27.
Article in English | MEDLINE | ID: mdl-33875592

ABSTRACT

The amino acid sequences of proteins have evolved over billions of years, preserving their structures and functions while responding to evolutionary forces. Are there conserved sequence and structural elements that preserve the protein folding mechanisms? The functionally diverse and ancient (βα)1-8 TIM barrel motif may answer this question. We mapped the complex six-state folding free energy surface of a ∼3.6 billion y old, bacterial indole-3-glycerol phosphate synthase (IGPS) TIM barrel enzyme by equilibrium and kinetic hydrogen-deuterium exchange mass spectrometry (HDX-MS). HDX-MS on the intact protein reported exchange in the native basin and the presence of two thermodynamically distinct on- and off-pathway intermediates in slow but dynamic equilibrium with each other. Proteolysis revealed protection in a small (α1β2) and a large cluster (β5α5β6α6β7) and that these clusters form cores of stability in Ia and Ibp. The strongest protection in both states resides in β4α4 with the highest density of branched aliphatic side chain contacts in the folded structure. Similar correlations were observed previously for an evolutionarily distinct archaeal IGPS, emphasizing a key role for hydrophobicity in stabilizing common high-energy folding intermediates. A bioinformatics analysis of IGPS sequences from the three superkingdoms revealed an exceedingly high hydrophobicity and surprising α-helix propensity for β4, preceded by a highly conserved βα-hairpin clamp that links β3 and β4. The conservation of the folding mechanisms for archaeal and bacterial IGPS proteins reflects the conservation of key elements of sequence and structure that first appeared in the last universal common ancestor of these ancient proteins.


Subject(s)
Indole-3-Glycerol-Phosphate Synthase/metabolism , Protein Domains/physiology , Protein Structure, Secondary/genetics , Amino Acid Sequence/genetics , Amino Acids/genetics , Bacterial Proteins/chemistry , Hydrogen Bonding , Indole-3-Glycerol-Phosphate Synthase/physiology , Kinetics , Models, Molecular , Protein Conformation , Protein Domains/genetics , Protein Folding , Sequence Homology, Amino Acid , Thermodynamics
13.
Sci Rep ; 11(1): 7526, 2021 04 06.
Article in English | MEDLINE | ID: mdl-33824364

ABSTRACT

The stability of proteins is an important factor for industrial and medical applications. Improving protein stability is one of the main subjects in protein engineering. In a previous study, we improved the stability of a four-helix bundle dimeric de novo protein (WA20) by five mutations. The stabilised mutant (H26L/G28S/N34L/V71L/E78L, SUWA) showed an extremely high denaturation midpoint temperature (Tm). Although SUWA is a remarkably hyperstable protein, in protein design and engineering, it is an attractive challenge to rationally explore more stable mutants. In this study, we predicted stabilising mutations of WA20 by in silico saturation mutagenesis and molecular dynamics simulation, and experimentally confirmed three stabilising mutations of WA20 (N22A, N22E, and H86K). The stability of a double mutant (N22A/H86K, rationally optimised WA20, ROWA) was greatly improved compared with WA20 (ΔTm = 10.6 °C). The model structures suggested that N22A enhances the stability of the α-helices and N22E and H86K contribute to salt-bridge formation for protein stabilisation. These mutations were also added to SUWA and improved its Tm. Remarkably, the most stable mutant of SUWA (N22E/H86K, rationally optimised SUWA, ROSA) showed the highest Tm (129.0 °C). These new thermostable mutants will be useful as a component of protein nanobuilding blocks to construct supramolecular protein complexes.


Subject(s)
Protein Conformation, alpha-Helical/genetics , Protein Engineering/methods , Protein Structure, Secondary/genetics , Amino Acid Sequence/genetics , Molecular Dynamics Simulation , Mutagenesis, Site-Directed/methods , Protein Denaturation , Protein Stability , Protein Structure, Secondary/physiology , Proteins/metabolism
14.
J Comput Biol ; 28(4): 346-361, 2021 04.
Article in English | MEDLINE | ID: mdl-33617347

ABSTRACT

Accurate predictions of protein structure properties, for example, secondary structure and solvent accessibility, are essential in analyzing the structure and function of a protein. Position-specific scoring matrix (PSSM) features are widely used in structure property prediction. However, some proteins may have low-quality PSSM features due to insufficient homologous sequences, leading to limited prediction accuracy. To address this limitation, we propose an enhancing scheme for PSSM features. We introduce the "Bagging MSA" (multiple sequence alignment) method to calculate PSSM features used to train our model, adopt a convolutional network to capture local context features and bidirectional long short-term memory for long-term dependencies, and integrate them under an unsupervised framework. Structure property prediction models are then built upon such enhanced PSSM features for more accurate predictions. Moreover, we develop two frameworks to evaluate the effectiveness of the enhanced PSSM features, which also bring the proposed method into real-world scenarios. Empirical evaluation on the CB513, CASP11, and CASP12 data sets indicates that our unsupervised enhancing scheme indeed generates more informative PSSM features for structure property prediction.


Subject(s)
Computational Biology , Deep Learning , Protein Conformation , Proteins/ultrastructure , Algorithms , Neural Networks, Computer , Position-Specific Scoring Matrices , Protein Structure, Secondary/genetics , Proteins/genetics , Sequence Alignment
15.
J Mol Biol ; 433(7): 166846, 2021 04 02.
Article in English | MEDLINE | ID: mdl-33549587

ABSTRACT

Chromosome ends are protected by guanosine-rich telomere DNA that forms stable G-quadruplex (G4) structures. The heterodimeric POT1-TPP1 complex interacts specifically with telomere DNA to shield it from illicit DNA damage repair and to resolve secondary structure that impedes telomere extension. The mechanism by which POT1-TPP1 accomplishes these tasks is poorly understood. Here, we establish the kinetic framework for POT1-TPP1 binding and unfolding of telomere G4 DNA. Our data identify two modes of POT1-TPP1 destabilization of G4 DNA that are governed by protein concentration. At low concentrations, POT1-TPP1 passively captures transiently unfolded G4s. At higher concentrations, POT1-TPP1 proteins bind to G4s to actively destabilize the DNA structures. Cancer-associated POT1-TPP1 mutations impair multiple reaction steps in this process, resulting in less efficient destabilization of G4 structures. The mechanistic insight highlights the importance of cell cycle dependent expression and localization of the POT1-TPP1 complex and distinguishes diverse functions of this complex in telomere maintenance.


Subject(s)
Aminopeptidases/genetics , G-Quadruplexes , Serine Proteases/genetics , Telomere-Binding Proteins/genetics , Telomere/genetics , Humans , Multiprotein Complexes/genetics , Multiprotein Complexes/ultrastructure , Mutation/genetics , Protein Binding/genetics , Protein Conformation , Protein Structure, Secondary/genetics , Shelterin Complex , Telomerase/genetics
16.
Front Immunol ; 12: 763044, 2021.
Article in English | MEDLINE | ID: mdl-35087515

ABSTRACT

Cytolytic T cell responses are predicted to be biased towards membrane proteins. The peptide-binding grooves of most alleles of histocompatibility complex class I (MHC-I) are relatively hydrophobic; therefore, peptide fragments derived from human transmembrane helices (TMHs) are predicted to be presented more often than would be expected based on their abundance in the proteome. However, the physiological reason why membrane proteins might be over-presented is unclear. In this study, we show that the predicted over-presentation of TMH-derived peptides is general, as it is predicted for bacteria and viruses and for both MHC-I and MHC-II, and confirmed by re-analysis of epitope databases. Moreover, we show that TMHs are evolutionarily more conserved, because single nucleotide polymorphisms (SNPs) are present relatively less frequently in TMH-coding chromosomal regions compared to regions coding for extracellular and cytoplasmic protein regions. Thus, our findings suggest that both cytolytic and helper T cells are more tuned to respond to membrane proteins, because these are evolutionarily more conserved. We speculate that TMHs are less prone to mutations that enable pathogens to evade T cell responses.


Subject(s)
Antigen Presentation/genetics , Epitopes, T-Lymphocyte/genetics , Histocompatibility Antigens Class II/genetics , Histocompatibility Antigens Class I/genetics , Membrane Proteins/genetics , Protein Structure, Secondary/genetics , Alleles , Antigen Presentation/immunology , Chromosomes/genetics , Chromosomes/immunology , Cytoplasm/genetics , Cytoplasm/immunology , Epitopes, T-Lymphocyte/immunology , Histocompatibility Antigens Class I/immunology , Histocompatibility Antigens Class II/immunology , Humans , Membrane Proteins/immunology , Peptides/genetics , Peptides/immunology , Polymorphism, Single Nucleotide/genetics , Polymorphism, Single Nucleotide/immunology , T-Lymphocytes, Helper-Inducer/immunology
17.
IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2409-2419, 2021.
Article in English | MEDLINE | ID: mdl-32149653

ABSTRACT

Protein Secondary Structural Class (PSSC) information is important in investigating further challenges of protein sequences like protein fold recognition, protein tertiary structure prediction, and analysis of protein functions for drug discovery. Identification of PSSC using biological methods is time-consuming and cost-intensive. Several computational models have been developed to predict the structural class; however, they generalize poorly. Hence, predicting PSSC based on protein sequences is still proving to be an uphill task. In this article, we propose an effective, novel and generalized prediction model consisting of a feature modeling stage and an ensemble of classifiers. The proposed feature modeling extracts discriminating information (features) by leveraging three techniques: (i) Embedding - features are extracted on the basis of spatial residue arrangements of the sequences using word embedding approaches; (ii) SkipXGram Bi-gram - various sets of skipped bi-gram features are extracted from the sequences; and (iii) General Statistical (GS) - features are extracted that cover the global information of structural sequences. The combined effective sets of features are trained and classified using an ensemble of three classifiers: Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Machines (GBM). The proposed model, when assessed on five benchmark datasets (high and low sequence similarity), viz. z277, z498, 25PDB, 1189, and FC699, reported an overall accuracy of 93.55, 97.58, 81.82, 81.11, and 93.93 percent respectively. The proposed model is further validated on a large-scale updated low similarity (≤ 25%) dataset, where it achieved an overall accuracy of 81.11 percent. The proposed generalized model is robust and consistently outperformed several state-of-the-art models on all the five benchmark datasets.
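The skipped bi-gram idea in technique (ii) can be sketched as follows. This is a minimal illustration, not the authors' SkipXGram implementation: the abstract does not define the feature precisely, so the definition used here (counting pairs of symbols separated by a fixed number of skipped positions) is an assumption. The function name and the example string are hypothetical.

```python
from collections import Counter


def skip_bigram_counts(sequence: str, skip: int) -> Counter:
    """Count symbol pairs (sequence[i], sequence[i + skip + 1]).

    skip=0 gives ordinary adjacent bi-grams; skip=1 pairs each symbol
    with the one two positions ahead, and so on. This definition is an
    assumption made for illustration.
    """
    if skip < 0:
        raise ValueError("skip must be non-negative")
    gap = skip + 1
    return Counter(
        sequence[i] + sequence[i + gap] for i in range(len(sequence) - gap)
    )


# Hypothetical secondary-structure string (H = helix, E = strand, C = coil).
example = "HEHEC"
adjacent = skip_bigram_counts(example, skip=0)  # pairs HE, EH, HE, EC
skipped = skip_bigram_counts(example, skip=1)   # pairs HH, EE, HC
```

Collecting such counts for several skip values gives the "various sets" of features the abstract mentions, which could then be normalized and concatenated before being passed to a classifier.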


Subject(s)
Computational Biology/methods , Machine Learning , Protein Structure, Secondary/genetics , Proteins , Amino Acid Sequence/genetics , Databases, Protein , Proteins/chemistry , Proteins/classification , Proteins/genetics , Sequence Analysis, Protein , Support Vector Machine
18.
J Comput Biol ; 28(4): 362-364, 2021 04.
Article in English | MEDLINE | ID: mdl-33259717

ABSTRACT

Recently, a deep learning-based method for enhancing Position-Specific Scoring Matrices (PSSMs), Bagging Multiple Sequence Alignment (MSA) Learning (Guo et al.), was proposed, and its effectiveness has been demonstrated empirically. The program EPTool implements Bagging MSA Learning and provides a complete training and evaluation workflow for the PSSM-enhancing model. It can handle different input data sets and various computing algorithms to train the enhancing model, and ultimately improves PSSM quality for proteins with insufficient homologous sequences. In addition, EPTool provides several convenient applications, such as a PSSM feature calculator and PSSM feature visualization. In this article, we present the design of EPTool and briefly introduce its functionalities and applications. Detailed instructions for accessing the tool are also provided.


Subject(s)
Protein Conformation , Protein Structure, Secondary/genetics , Proteins/ultrastructure , Software , Algorithms , Computational Biology , Databases, Protein , Position-Specific Scoring Matrices , Proteins/genetics , Sequence Alignment
19.
FEBS J ; 288(11): 3428-3447, 2021 06.
Article in English | MEDLINE | ID: mdl-33319437

ABSTRACT

Precise control of protein and messenger RNA (mRNA) degradation is essential for cellular metabolism and homeostasis. Controlled and specific degradation of both molecular species requires their engagement with the respective degradation machineries; this engagement involves a disordered/unstructured segment of the substrate traversing the degradation tunnel of the machinery and accessing the catalytic sites. However, while molecular factors influencing protein degradation have been extensively explored on a genome scale and in multiple organisms, such a comprehensive understanding is still missing for mRNAs. Here, we analyzed multiple genome-scale experimental yeast mRNA half-life datasets in light of experimentally derived mRNA secondary structures and protein binding data, along with high-resolution X-ray crystallographic structures of the RNase machines. The results reveal a consistent genome-scale trend: mRNAs comprising longer terminal and/or internal unstructured segments have significantly shorter half-lives, and the lengths of the 5'-terminal, 3'-terminal, and internal unstructured segments that affect mRNA half-life are compatible with the molecular structures of the 5' exo-, 3' exo-, and endoribonuclease machineries. Sequestration into ribonucleoprotein complexes lengthens mRNA half-life, presumably by burying ribonuclease engagement sites under oligomeric interfaces. After gene duplication, differences in terminal unstructured lengths, proportions of internal unstructured segments, and oligomerization modes result in significantly altered half-lives of paralogous mRNAs. A side-by-side comparison of the molecular principles underlying controlled protein and mRNA degradation in yeast reveals remarkable mechanistic similarities and suggests how the intrinsic structural features of the two molecular species, at two different levels of the central dogma, regulate their half-lives on a genome scale.
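The three quantities this abstract relates to mRNA half-life (5'-terminal, 3'-terminal, and internal unstructured segment lengths) can be illustrated with a short Python sketch. It assumes the secondary structure is available in dot-bracket notation, where `.` marks an unpaired nucleotide; that input format and the function name `unstructured_segments` are assumptions for illustration, not the study's pipeline, which used experimentally derived structures.

```python
import re


def unstructured_segments(dot_bracket):
    """Return (five_prime, three_prime, longest_internal) unpaired run lengths.

    `dot_bracket` is a secondary structure string in which '.' is an unpaired
    nucleotide and any other character is paired. For a fully unpaired
    molecule, both terminal lengths equal the full length and the internal
    length is 0.
    """
    # Leading and trailing runs of unpaired nucleotides.
    five_prime = len(dot_bracket) - len(dot_bracket.lstrip("."))
    three_prime = len(dot_bracket) - len(dot_bracket.rstrip("."))
    # Unpaired runs flanked by paired nucleotides on both sides.
    core = dot_bracket.strip(".")
    longest_internal = max((len(run) for run in re.findall(r"\.+", core)), default=0)
    return five_prime, three_prime, longest_internal


if __name__ == "__main__":
    print(unstructured_segments("..((...))...."))
```

A per-transcript table of these three lengths could then be compared against measured half-lives, for example with a rank correlation.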


Subject(s)
Endoribonucleases/genetics , Nucleic Acid Conformation , RNA Stability/genetics , RNA, Messenger/ultrastructure , Endoribonucleases/ultrastructure , Genome, Fungal/genetics , Half-Life , Protein Structure, Secondary/genetics , RNA, Messenger/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/ultrastructure
20.
FEBS Open Bio ; 10(10): 1947-1956, 2020 10.
Article in English | MEDLINE | ID: mdl-33017095

ABSTRACT

Poor immunogenicity of small proteins is a major hurdle in developing vaccines or producing antibodies for biopharmaceutical use. Here, we systematically analyzed the effects of 10 solubility controlling peptide tags (SCP-tags) on the immunogenicity of a non-immunogenic model protein, bovine pancreatic trypsin inhibitor (BPTI-19A; 6 kDa). CD, fluorescence, DLS, SLS, and AUC measurements indicated that the SCP-tags changed neither the secondary structure content, the tertiary structure, nor the monomeric state of the protein. ELISA results indicated that the 5-proline (C5P) and 5-arginine (C5R) tags unexpectedly increased the IgG level of BPTI-19A by 240- and 73-fold, respectively, suggesting that non-oligomerizing SCP-tags may provide a novel method for increasing the immunogenicity of a protein in a highly specific manner.


Subject(s)
Adaptive Immunity/genetics , Peptides/immunology , Protein Engineering/methods , Aprotinin/genetics , Aprotinin/immunology , Models, Molecular , Mutagenesis, Site-Directed/methods , Protein Conformation , Protein Structure, Secondary/genetics , Proteins/genetics , Solubility/drug effects