ABSTRACT
One-third of the human proteome consists of membrane proteins, which are particularly vulnerable to misfolding and often require folding assistance by molecular chaperones. Calnexin (CNX), which engages client proteins via its sugar-binding lectin domain, is one of the most abundant ER chaperones and plays an important role in membrane protein biogenesis. Based on mass spectrometric analyses, we show here that calnexin interacts with a large number of nonglycosylated membrane proteins, indicative of additional nonlectin binding modes. We find that calnexin preferentially binds misfolded membrane proteins and that it uses its single transmembrane domain (TMD) for client recognition. Combining experimental and computational approaches, we systematically dissect signatures for intramembrane client recognition by calnexin, and identify sequence motifs within the calnexin TMD region that mediate client binding. Building on this, we show that intramembrane client binding potentiates the chaperone functions of calnexin. Together, these data reveal a widespread role of calnexin client recognition in the lipid bilayer, which synergizes with its established lectin-based substrate binding. Molecular chaperones thus can combine different interaction modes to support the biogenesis of the diverse eukaryotic membrane proteome.
Subjects
Molecular Chaperones , Proteome , Humans , Calnexin/metabolism , Proteome/metabolism , Molecular Chaperones/metabolism , Lectins/metabolism , Membrane Proteins/metabolism , Protein Folding , Calcium-Binding Proteins/metabolism
ABSTRACT
The intramembrane protease γ-secretase has broad physiological functions, but also contributes to Notch-dependent tumors and Alzheimer's disease. While γ-secretase cleaves numerous membrane proteins, only a few nonsubstrates are known. Thus, a fundamental open question is how γ-secretase distinguishes substrates from nonsubstrates and whether sequence-based features or post-translational modifications of membrane proteins contribute to substrate recognition. Using mass spectrometry-based proteomics, we identified several type I membrane proteins with short ectodomains that were inefficiently or not cleaved by γ-secretase, including 'pituitary tumor-transforming gene 1-interacting protein' (PTTG1IP). To analyze the mechanism preventing cleavage of these putative nonsubstrates, we used the validated substrate FN14 as a backbone and replaced its transmembrane domain (TMD), where γ-cleavage occurs, with those of nonsubstrates. Surprisingly, some nonsubstrate TMDs were efficiently cleaved in the FN14 backbone, demonstrating that a cleavable TMD is necessary, but not sufficient for cleavage by γ-secretase. Cleavage efficiencies varied by up to 200-fold. Other TMDs, including that of PTTG1IP, were still barely cleaved within the FN14 backbone. Pharmacological and mutational experiments revealed that the PTTG1IP TMD is palmitoylated, which prevented cleavage by γ-secretase. We conclude that the TMD sequence of a membrane protein and its palmitoylation can be key factors determining substrate recognition and cleavage efficiency by γ-secretase.
Subjects
Amyloid Precursor Protein Secretases , Lipoylation , Amyloid Precursor Protein Secretases/genetics , Amyloid Precursor Protein Secretases/metabolism , Membrane Proteins/metabolism , Protein Domains , Protein Processing, Post-Translational , Amyloid beta-Protein Precursor/metabolism
ABSTRACT
Plant genomics plays a pivotal role in enhancing global food security and sustainability by offering innovative solutions for improving crop yield, disease resistance, and stress tolerance. As the number of sequenced genomes grows and the accuracy and contiguity of genome assemblies improve, structural annotation of plant genomes continues to be a significant challenge due to their large size, polyploidy, and rich repeat content. In this paper, we present an overview of the current landscape in crop genomics research, highlighting the diversity of genomic characteristics across various crop species. We also assessed the accuracy of popular gene prediction tools in identifying genes within crop genomes and examined the factors that impact their performance. Our findings highlight the strengths and limitations of BRAKER2 and Helixer as leading structural genome annotation tools and underscore the impact of genome complexity, fragmentation, and repeat content on their performance. Furthermore, we evaluated the suitability of the predicted proteins as a reliable search space in proteomics studies using mass spectrometry data. Our results provide valuable insights for future efforts to refine and advance the field of structural genome annotation.
Subjects
Crops, Agricultural , Genome, Plant , Molecular Sequence Annotation , Proteomics , Crops, Agricultural/genetics , Proteomics/methods , Genomics/methods , Plant Proteins/genetics , Plant Proteins/metabolism
ABSTRACT
To understand how cells communicate in the nervous system, it is essential to define their secretome, which is challenging for primary cells because large cell numbers are required. Here, we miniaturized secretome analysis by developing the "high-performance secretome protein enrichment with click sugars" (hiSPECS) method. To demonstrate its broad utility, hiSPECS was used to identify the secretory response of brain slices upon LPS-induced neuroinflammation and to establish the cell type-resolved mouse brain secretome resource using primary astrocytes, microglia, neurons, and oligodendrocytes. This resource allowed mapping the cellular origin of CSF proteins and revealed that an unexpectedly high number of secreted proteins in vitro and in vivo are proteolytically cleaved membrane protein ectodomains. Two examples are neuronally secreted ADAM22 and CD200, which we identified as substrates of the Alzheimer-linked protease BACE1. hiSPECS and the brain secretome resource can be widely exploited to systematically study protein secretion and brain function and to identify cell type-specific biomarkers for CNS diseases.
Subjects
Amyloid Precursor Protein Secretases/metabolism , Aspartic Acid Endopeptidases/metabolism , Astrocytes/metabolism , Brain/metabolism , Microglia/metabolism , Neurons/metabolism , Oligodendroglia/metabolism , Proteomics/methods , Software , ADAM Proteins/cerebrospinal fluid , ADAM Proteins/metabolism , Amyloid Precursor Protein Secretases/antagonists & inhibitors , Amyloid Precursor Protein Secretases/cerebrospinal fluid , Animals , Antigens, CD/cerebrospinal fluid , Antigens, CD/metabolism , Aspartic Acid Endopeptidases/antagonists & inhibitors , Aspartic Acid Endopeptidases/cerebrospinal fluid , Brain/cytology , Cells, Cultured , Cerebrospinal Fluid Proteins , Chromatography, Liquid , Gene Ontology , Lipopolysaccharides/pharmacology , Mice , Mice, Inbred C57BL , Nerve Tissue Proteins/cerebrospinal fluid , Nerve Tissue Proteins/metabolism , Principal Component Analysis , Proteome/metabolism , Tandem Mass Spectrometry
ABSTRACT
Membrane proteins are unique in that they interact with lipid bilayers, making them indispensable for transporting molecules and relaying signals between and across cells. Due to the significance of these proteins' functions, mutations often have profound effects on the fitness of the host. This is apparent both from experimental studies, which have implicated numerous missense variants in diseases, and from evolutionary signals that help elucidate the physicochemical constraints imposed by intermembrane and aqueous environments. In this review, we report on the current state of knowledge on missense variants (also referred to as single amino acid variants) affecting membrane proteins, as well as the insights that can be extrapolated from data already available. This includes an overview of the annotations for membrane protein variants that have been collated within databases dedicated to the topic, bioinformatics approaches that leverage evolutionary information to shed light on previously uncharacterized membrane protein structures or interaction interfaces, tools for predicting the effects of mutations tailored specifically towards the characteristics of membrane proteins, and two clinically relevant case studies explaining the implications of mutated membrane proteins in cancer and cardiomyopathy.
Subjects
Cardiomyopathies/genetics , Evolution, Molecular , Membrane Proteins , Mutation, Missense , Neoplasm Proteins , Neoplasms/genetics , Amino Acid Substitution , Computational Biology , Humans , Membrane Proteins/chemistry , Membrane Proteins/genetics , Neoplasm Proteins/chemistry , Neoplasm Proteins/genetics , Protein Conformation
ABSTRACT
SUMMARY: The ability of a T cell to recognize foreign peptides is defined by a single α and a single β hypervariable complementarity determining region (CDR3), which together form the T-cell receptor (TCR) heterodimer. In ~30-35% of T cells, two α chains are expressed at the mRNA level but only one α chain is part of the functional TCR. This effect can also be observed for β chains, although it is less common. The identification of functional α/β chain pairs is instrumental in high-throughput characterization of therapeutic TCRs. TCRpair is the first method that predicts whether an α and β chain pair forms a functional, HLA-A*02:01 specific TCR without requiring the sequence of a recognized peptide. By taking additional amino acids flanking the CDR3 regions into account, TCRpair achieves an AUC of 0.71. AVAILABILITY AND IMPLEMENTATION: TCRpair is implemented in Python using TensorFlow 2.0 and is freely available at https://www.github.com/amoesch/TCRpair. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
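For readers unfamiliar with the AUC figure quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores using the rank-based (Mann-Whitney) formulation. It is not TCRpair's evaluation code; the function name `roc_auc` and the example scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Probability that a random positive example scores higher than a random
    negative example (ties count as one half). This equals the area under
    the ROC curve."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores, NOT taken from the paper:
# "positive" = functional alpha/beta pair, "negative" = non-functional pair.
functional = [0.9, 0.6, 0.4]
non_functional = [0.5, 0.3]
print(roc_auc(functional, non_functional))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a frame of reference for the reported value of 0.71.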
Subjects
Receptors, Antigen, T-Cell, alpha-beta , Receptors, Antigen, T-Cell , Amino Acid Sequence , Receptors, Antigen, T-Cell, alpha-beta/chemistry , Receptors, Antigen, T-Cell, alpha-beta/genetics , Receptors, Antigen, T-Cell, alpha-beta/metabolism , Receptors, Antigen, T-Cell/chemistry , T-Lymphocytes/metabolism , Complementarity Determining Regions/chemistry , Complementarity Determining Regions/genetics , Peptides , HLA-A Antigens/metabolism
ABSTRACT
Neuronal cell lines are important model systems to study mechanisms of neurodegenerative diseases. One example is the Lund Human Mesencephalic (LUHMES) cell line, which can differentiate into dopaminergic-like neurons and is frequently used to study mechanisms of Parkinson's disease and neurotoxicity. Neuronal differentiation of LUHMES cells is commonly verified with selected neuronal markers, but little is known about the proteome-wide protein abundance changes during differentiation. Using mass spectrometry and label-free quantification (LFQ), the proteomes of differentiated and undifferentiated LUHMES cells and of primary murine midbrain neurons are compared. Neuronal differentiation induced substantial changes of the LUHMES cell proteome, with proliferation-related proteins being strongly down-regulated and neuronal and dopaminergic proteins, such as L1CAM and α-synuclein (SNCA), being up to 1,000-fold up-regulated. Several of these proteins, including MAPT and SYN1, may be useful as new markers for experimentally validating neuronal differentiation of LUHMES cells. Primary midbrain neurons are slightly more closely related to differentiated than to undifferentiated LUHMES cells, in particular with respect to the abundance of proteins related to neurodegeneration. In summary, the analysis demonstrates that differentiated LUHMES cells are a suitable model for studies on neurodegeneration and provides a resource of the proteome-wide changes during neuronal differentiation. (ProteomeXchange identifier PXD020044).
Subjects
Mesencephalon , Proteome , Animals , Cell Differentiation , Humans , Mice , Neurons , alpha-Synuclein
ABSTRACT
BACKGROUND: This study is motivated by the following three considerations: a) the physico-chemical properties of transmembrane (TM) proteins are distinctly different from those of globular proteins, necessitating the development of specialized structure prediction techniques, b) for many structural features no specialized predictors for TM proteins are available at all, and c) deep learning algorithms make it possible to automate the feature engineering process and thus facilitate the development of multi-target methods for predicting several protein properties at once. RESULTS: We present AllesTM, an integrated tool to predict almost all structural features of transmembrane proteins that can be extracted from atomic coordinate data. It blends several machine learning algorithms: random forests and gradient boosting machines, convolutional neural networks in their original form as well as those enhanced by dilated convolutions and residual connections, and, finally, long short-term memory architectures. AllesTM outperforms other available methods in predicting residue depth in the membrane, flexibility, topology, and relative solvent accessibility in the bound state, while in torsion angle, secondary structure and monomer relative solvent accessibility prediction it lags only slightly behind the currently leading technique SPOT-1D. High accuracy on a multitude of prediction targets and easy installation make AllesTM a one-stop shop for many typical problems in the structural bioinformatics of transmembrane proteins. CONCLUSIONS: In addition to presenting a highly accurate prediction method and eliminating the need to install and maintain many different software tools, we also provide a comprehensive overview of the impact of different machine learning algorithms and parameter choices on the prediction performance. AllesTM is freely available at https://github.com/phngs/allestm.
Subjects
Alleles , Computational Biology/methods , Membrane Proteins/chemistry
ABSTRACT
Membrane proteins are unique in that segments thereof concurrently reside in vastly different physicochemical environments: the extracellular space, the lipid bilayer, and the cytoplasm. Accordingly, the effects of missense variants disrupting their sequence depend greatly on the characteristics of the environment of the protein segment affected as well as the function it performs. Because membrane proteins have many crucial roles (transport, signal transduction, cell adhesion, etc.), compromising their functionality often leads to diseases including cancers, diabetes mellitus or cystic fibrosis. Here, we report a suite of sequence-based computational methods, "Pred-MutHTP", for discriminating between disease-causing and neutral alterations in their sequence. With a data set of 11,846 disease-causing and 9,533 neutral mutations, we obtained an accuracy of 74% and 78% with 10-fold group-wise cross-validation and test set, respectively. The features used in the models include evolutionary information, physicochemical properties, neighboring residue information, and specialized membrane protein attributes incorporating the number of transmembrane segments, substitution matrices specific to membrane proteins as well as residue distributions occurring in specific topological regions. Across 11 disease classes, the method achieved accuracies in the range of 75-85%. The model designed specifically for the transmembrane segments achieved an accuracy of 85% on the test set with a sensitivity and specificity of 86% and 83%, respectively. This renders our method the current state-of-the-art with regard to predicting the effects of variants in the transmembrane protein segments. Pred-MutHTP allows predicting the effect of any variant occurring in a membrane protein and is available at https://www.iitm.ac.in/bioinfo/PredMutHTP/.
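The accuracy, sensitivity and specificity quoted above all derive from a binary confusion matrix. The sketch below shows the standard definitions, treating disease-causing variants as the positive class. It is not Pred-MutHTP code; the class name `ConfusionCounts` and the example counts are hypothetical and were picked only so that the ratios are easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a disease-causing (positive) vs. neutral (negative) classifier."""
    tp: int  # disease-causing variants predicted as disease-causing
    fn: int  # disease-causing variants predicted as neutral
    tn: int  # neutral variants predicted as neutral
    fp: int  # neutral variants predicted as disease-causing


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of disease-causing variants that were correctly flagged."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of neutral variants that were correctly left unflagged."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all variants that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts, NOT taken from the paper.
example = ConfusionCounts(tp=86, fn=14, tn=83, fp=17)
print(sensitivity(example), specificity(example), accuracy(example))
```

Because accuracy weights the two classes by their sizes, it generally lies between sensitivity and specificity, as the reported 85% does between 86% and 83%.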
Subjects
Computational Biology/methods , Genetic Association Studies , Genetic Predisposition to Disease , Membrane Proteins/genetics , Mutation , Software , Algorithms , Chemical Phenomena , Genetic Association Studies/methods , Humans , Membrane Proteins/chemistry , ROC Curve , Reproducibility of Results , Web Browser
ABSTRACT
Accurate prediction of amino acid residue contacts is an important prerequisite for generating high-quality 3D models of transmembrane (TM) proteins. While a large number of compositional, evolutionary, and structural properties of proteins can be used to train contact prediction methods, recent research suggests that coevolution between residues provides the strongest indication of their spatial proximity. We have developed a deep learning approach, DeepHelicon, to predict inter-helical residue contacts in TM proteins by considering only coevolutionary features. DeepHelicon comprises a two-stage supervised learning process by residual neural networks for a gradual refinement of contact maps, followed by variance reduction by an ensemble of models. We present a benchmark study of 12 contact predictors and conclude that DeepHelicon together with the two other state-of-the-art methods DeepMetaPSICOV and Membrain2 outperforms the 10 remaining algorithms on all datasets and at all settings. On a set of 44 TM proteins with an average length of 388 residues, DeepHelicon achieves the best performance among all benchmarked methods in predicting the top L/5 and L/2 inter-helical contacts, with a mean precision of 87.42% and 77.84%, respectively. On a set of 57 relatively small TM proteins with an average length of 298 residues, DeepHelicon ranks second best after DeepMetaPSICOV. DeepHelicon produces the most accurate predictions for large proteins with more than 10 transmembrane helices. Coevolutionary features alone make it possible to predict inter-helical residue contacts with an accuracy sufficient for generating acceptable 3D models for up to 30% of proteins using a fully automated modeling method such as CONFOLD2.
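The "top L/5" and "top L/2" precision values above follow a common contact-prediction convention: rank all predicted residue pairs by score, keep the best L/k of them, and report the fraction that are true contacts. A minimal sketch is given below, assuming that L is the sequence length and that the score dictionary already contains only the inter-helical pairs of interest; the abstract does not spell out these details, and the function name and example data are hypothetical.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # (residue index i, residue index j)


def top_fraction_precision(
    scores: Dict[Pair, float],
    true_contacts: Set[Pair],
    seq_len: int,
    divisor: int,
) -> float:
    """Precision among the top seq_len // divisor highest-scoring pairs."""
    if not scores:
        raise ValueError("no predicted pairs")
    n_top = max(1, seq_len // divisor)
    # Sort pairs by predicted score, best first, and keep the top n_top.
    ranked = sorted(scores, key=scores.get, reverse=True)[:n_top]
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Hypothetical toy data, NOT taken from the paper.
predicted = {(1, 8): 0.9, (2, 7): 0.8, (3, 6): 0.4, (1, 5): 0.2}
observed = {(1, 8), (3, 6)}
print(top_fraction_precision(predicted, observed, seq_len=10, divisor=5))
```

Dividing by the number of pairs actually ranked, rather than by L/k, keeps the value well defined when fewer than L/k pairs were scored; other benchmarks may handle that case differently.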
Subjects
Membrane Proteins/chemistry , Algorithms , Amino Acids/chemistry , Computational Biology/methods , Databases, Protein , Neural Networks, Computer , Protein Structure, Secondary , Sequence Analysis, Protein/methods
ABSTRACT
BACKGROUND: Voltage-gated sodium channels Nav1.x mediate the rising phase of the action potential in excitable cells. Variations in the SCN5A gene, which encodes the hNav1.5 channel, are associated with arrhythmias and other heart diseases. About 1,400 SCN5A variants are listed in public databases, but for more than 30% of these the clinical significance is unknown and can currently only be derived by bioinformatics approaches. METHODS AND RESULTS: We used the ClinVar, SwissVar, Humsavar, gnomAD, and Ensembl databases to assemble a dataset of 1392 hNav1.5 variants (370 pathogenic variants, 602 benign variants and 420 variants of uncertain significance) as well as a dataset of 1766 damaging variants in 20 human sodium and calcium channel paralogs. Twelve in silico tools were tested for their ability to predict damaging mutations in hNav1.5. The best performing tool, MutPred, correctly predicted 93% of damaging variants in our hNav1.5 dataset. Among the 86 hNav1.5 variants for which electrophysiological data are also available, MutPred correctly predicted 82% of damaging variants. In the subset of 420 uncharacterized hNav1.5 variants, MutPred predicted 196 new pathogenic variants. Among these, 74 variants are also annotated as damaging in at least one hNav1.5 paralog. CONCLUSIONS: Using a combination of sequence-based bioinformatics techniques and paralogous annotation, we have substantially expanded the knowledge on disease variants in the cardiac sodium channel and assigned a pathogenic status to a number of mutations that so far have been described as variants of uncertain significance. A list of reclassified hNav1.5 variants and their properties is provided.
Subjects
Mutation , NAV1.5 Voltage-Gated Sodium Channel/genetics , Computer Simulation , Genetic Predisposition to Disease , Genomics/methods , Heart Diseases/genetics , Humans , Models, Molecular , NAV1.5 Voltage-Gated Sodium Channel/chemistry , Protein Conformation
ABSTRACT
Many integral membrane proteins, just like their globular counterparts, form either transient or permanent multi-subunit complexes to fulfill specific cellular roles. Although numerous interactions between these proteins have been experimentally determined, the structural coverage of the complexes is very low. Therefore, the computational identification of the amino acid residues involved in the interaction interfaces is a crucial step towards the functional annotation of all membrane proteins. Here, we present MBPred, a sequence-based method for predicting the interface residues in transmembrane proteins. A unique feature of our method is that it contains separate random forest models for two different use cases: (a) when the location of transmembrane regions is precisely known from a crystal structure, and (b) when it is predicted from sequence. In stark contrast to the aqueous-exposed protein segments, we found that the interaction sites located in the membrane are not enriched for evolutionary conservation, most likely due to their restricted amino acid composition or their random distribution among buried and exposed residues. On the other hand, residue co-evolution proved to be a very informative feature which has not so far been used for predicting interaction sites in individual proteins. MBPred reaches AUC, precision and recall values of 0.79/0.73, 0.69/0.51 and 0.55/0.48 on the cross-validation and independent test dataset, respectively, thus outperforming the previously published method of Bordner as well as all methods trained on globular proteins. Moreover, we show that for the majority of complete interface patches, the method captures more than 50% of the involved residues.
Subjects
Biological Evolution , Membrane Proteins/metabolism , Algorithms , Binding Sites , Computational Biology , Databases, Protein , Protein Binding
ABSTRACT
Mutations in transmembrane proteins (TMPs) have diverse effects on their structure and functions, which may lead to various diseases. In the present study, we have investigated variations in human membrane proteins and found that negatively charged to positively charged/polar and nonpolar to nonpolar changes are dominant in disease-causing and neutral mutations, respectively. Further, we analyzed the top 10 preferred mutations in 14 different disease classes and found that each class has at least two Arg mutations. Moreover, in cardiovascular diseases and congenital disorders of metabolism, Cys mutations occur more frequently in single-pass proteins, whereas Arg and nonpolar residues are more frequently substituted in multi-pass membrane proteins. The immune system diseases are enriched in C → R and C → Y mutations in inside and outside regions. On the other hand, in the membrane region, E → K and R → Q mutations are prevalent. The comparison of mutations in topologically similar regions of globular and membrane proteins showed that Ser and Thr mutations cause deleterious effects in membrane regions, whereas Cys and the charged residues Asp and Arg are prevalent in the buried regions of globular proteins. Our comprehensive analysis of disease-associated mutations in transmembrane proteins will be useful for developing prediction tools.
Subjects
Membrane Proteins/chemistry , Humans , Membrane Proteins/genetics , Mutation/genetics , Mutation, Missense/genetics , Protein Conformation
ABSTRACT
Motivation: The V3 loop of the gp120 glycoprotein of the Human Immunodeficiency Virus 1 (HIV-1) is considered to be responsible for viral coreceptor tropism. gp120 interacts with the CD4 receptor of the host cell and subsequently V3 binds either CCR5 or CXCR4. Because the CCR5 coreceptor is targeted by entry inhibitors, a reliable prediction of the coreceptor usage of HIV-1 is of great interest for antiretroviral therapy. Although several methods for the prediction of coreceptor tropism are available, almost all of them have been developed based only on subtype B sequences, and it has been shown in several studies that the prediction for non-B sequences, in particular subtype A sequences, is less reliable. Thus, the aim of the current study was to develop a reliable prediction model for subtype A viruses. Results: Our new model SCOTCH is based on a stacking approach of classifier ensembles and shows a significantly better performance for subtype A sequences compared to other available models. In particular for low false positive rates (between 0.05 and 0.2, i.e. recommendation in the German and European Guidelines for tropism prediction), SCOTCH shows significantly better prediction performances in terms of partial area under the curves and diagnostic odds ratios compared to existing tools, and thus can be used to reliably predict coreceptor tropism for subtype A sequences. Availability and implementation: SCOTCH can be downloaded/accessed at http://www.heiderlab.de.
Subjects
HIV Envelope Protein gp120/metabolism , HIV Infections/metabolism , HIV-1/metabolism , Sequence Analysis, Protein/methods , Software , Viral Tropism , CCR5 Receptor Antagonists , Computational Biology/methods , HIV Infections/virology , HIV-1/physiology , Humans , Receptors, CCR5/drug effects , Receptors, CCR5/metabolism , Receptors, CXCR4/metabolism
ABSTRACT
Motivation: Existing sources of experimental mutation data do not consider the structural environment of amino acid substitutions or distinguish between soluble and membrane proteins. They also suffer from a number of further limitations, including data redundancy, lack of disease classification, incompatible information content, and ambiguous annotations (e.g. the same mutation being annotated as disease and benign). Results: We have developed a novel database, MutHTP, which contains information on 183 395 disease-associated and 17 827 neutral mutations in human transmembrane proteins. For each mutation site MutHTP provides a description of its location with respect to the membrane protein topology, structural environment (if available) and functional features. Comprehensive visualization, search, display and download options are available. Availability and implementation: The database is publicly available at http://www.iitm.ac.in/bioinfo/MutHTP/. The website is implemented using HTML, PHP and JavaScript and supports recent versions of all major browsers, such as Firefox, Chrome and Opera. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects
Membrane Proteins/genetics , Mutation , Software , Databases, Factual , Humans
ABSTRACT
Hepatitis delta virus (HDV) is an RNA virus which leads to both acute and chronic forms of hepatitis. At present, HDV isolates have been classified into eight major genotypes distributed over different geographical regions. The recent increase in available HDV sequences in Europe and worldwide has enabled us to revisit the taxonomic classification of HDV. A total of 116 large hepatitis delta antigen (L-HDAg) nucleotide sequences and 13 full-length HDV genome sequences belonging to genotype-1 from our European cohort, as well as 621 L-HDAg nucleotide sequences belonging to genotype-1 to genotype-8 retrieved from NCBI GenBank, were included in this study. All 116 isolates of our cohort and 341 of 621 isolates (60%) account for genotype-1, while the remaining 40% of isolates were unevenly distributed across genotype-2 to genotype-8. Phylogenetic analysis of 98 L-HDAg sequences selected after elimination of redundant sequences of all 737 isolates was performed to identify plausible subtypes within HDV genotype-1. Pairwise genetic distances for L-HDAg sequences were calculated to estimate the inter-genotype and inter-subtype differences. The HDV genotype-1 isolates phylogenetically formed five distinct clusters (genotype 1a-1e), each of them corresponding to a distinct geographic region. Two distinct subtypes for HDV genotype-2 and -4 (i.e., -2a and -2b; -4a and -4b, respectively) could be identified based on isolate sequences from GenBank. The previously defined genotype-1 to genotype-8 have an inter-genotypic difference of ≥10%, while the newly defined subtypes of genotype-1, -2 and -4 show an inter-subtype difference of ≥3% to <10% from the average diversity. In addition, we identified unique amino acid residues, known as specificity-determining positions, amongst the proposed subtypes.
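The distance thresholds above (≥10% between genotypes, ≥3% to <10% between subtypes) can be illustrated with a small sketch. It assumes a plain p-distance (proportion of differing sites) between two already aligned sequences, skipping gap positions; the abstract does not state which distance model the authors used, so this is an illustration of the thresholding idea rather than their procedure. Function names and example sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.
    Columns where either sequence has a gap ('-') are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable (gap-free) positions")
    return mismatches / compared


def classify_pair(distance: float,
                  subtype_min: float = 0.03,
                  genotype_min: float = 0.10) -> str:
    """Apply the >=10% (genotype) and >=3% to <10% (subtype) cut-offs."""
    if distance >= genotype_min:
        return "different genotype"
    if distance >= subtype_min:
        return "different subtype"
    return "same subtype"


# Hypothetical toy alignment, NOT real L-HDAg sequences.
d = p_distance("ACGTACGTAC", "ACGTACGTAA")
print(d, classify_pair(d))
```

In practice such distances would be computed over all isolate pairs and summarized per genotype or subtype; the pairwise form is shown here only to make the cut-offs concrete.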
Subjects
Genetic Variation , Genome, Viral , Genotype , Hepatitis D/epidemiology , Hepatitis D/virology , Hepatitis Delta Virus/classification , Hepatitis Delta Virus/genetics , Europe/epidemiology , Humans , Phylogeny , Phylogeography , Recombination, Genetic
ABSTRACT
Chronic HBV infection results in various clinical manifestations due to different levels of immune response. In recent years, hepatitis B treatment has improved by long-term administration of nucleos(t)ide analogues (NUCs) and peg-interferon. Nucleic acid polymers (NAPs; REP 2139-Ca and REP 2139-Mg) are new antiviral drugs that block the assembly of subviral particles, thus preventing the release of HBsAg and allowing its clearance and restoration of functional control of infection when combined with various immunotherapies. In the REP 102 study (NCT02646189), 9 of 12 patients showed substantial reduction of HBsAg and seroconversion to anti-HBs in response to REP 2139-Ca, whereas 3 of 12 patients did not show responses (>1 log reduction of HBsAg and HBV DNA from baseline). We characterized the dynamic changes of HBV quasispecies (QS) within the major hydrophilic region (MHR) of the 'pre-S/S' open reading frame including the 'a' determinant in responders and nonresponders of the REP 102 study and four untreated matched controls. HBV QS complexity at baseline varied slightly between responders and nonresponders (P = .28). However, the responders showed a significant decline in viral complexity (P = .001) as REP 2139-Ca therapy progressed, but no significant change in complexity was observed among the nonresponders (P = .99). The MHR mutations were more frequently observed in responders than in nonresponders and matched controls. No mutations were observed in the 'a' determinant of the major QS population which may interfere with the detection of HBsAg by diagnostic assays. No specific mutations were found within the MHR which could explain patients' poor HBsAg response during REP 2139-Ca therapy.
Subjects
Hepatitis B Surface Antigens/immunology , Hepatitis B e Antigens/immunology , Hepatitis B virus , Hepatitis B, Chronic/epidemiology , Adult , Antiviral Agents/therapeutic use , DNA, Viral , Female , Genetic Variation , Genotype , Hepatitis B Antibodies/immunology , Hepatitis B virus/drug effects , Hepatitis B virus/genetics , Hepatitis B virus/immunology , Hepatitis B, Chronic/drug therapy , Hepatitis B, Chronic/immunology , Hepatitis B, Chronic/virology , High-Throughput Nucleotide Sequencing , Humans , Male , Nucleic Acids/therapeutic use , Polymers/therapeutic use , Quasispecies/genetics , Quasispecies/immunology , Young Adult
ABSTRACT
Translation of consecutive prolines causes ribosome stalling, which is alleviated but cannot be fully compensated by the elongation factor P. However, the presence of polyproline motifs in about one third of the E. coli proteins underlines their potential functional importance, which remains largely unexplored. We conducted an evolutionary analysis of polyproline motifs in the proteomes of 43 E. coli strains and found evidence of evolutionary selection against translational stalling, which is especially pronounced in proteins with high translational efficiency. Against the overall trend of polyproline motif loss in evolution, we observed their enrichment in the vicinity of translational start sites, in the inter-domain regions of multi-domain proteins, and downstream of transmembrane helices. Our analysis demonstrates that the time gain caused by ribosome pausing at polyproline motifs might be advantageous in protein regions bracketing domains and transmembrane helices. Polyproline motifs might therefore be crucial for co-translational folding and membrane insertion.
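As an illustration of the kind of motif scan such an analysis relies on, the sketch below locates runs of consecutive prolines in a protein sequence. The minimum run length (`min_len`) is an assumption made here for illustration; the abstract speaks only of "consecutive prolines" and does not give the exact motif definition the authors used. The function name and example sequence are hypothetical.

```python
import re
from typing import List, Tuple


def find_polyproline_motifs(protein: str, min_len: int = 2) -> List[Tuple[int, int]]:
    """Return (start, length) for every maximal run of at least `min_len`
    consecutive prolines. `start` is a 0-based index into the sequence."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # P{n,} matches n or more consecutive 'P' characters (greedy, so each
    # match is a maximal run).
    pattern = re.compile(r"P{%d,}" % min_len)
    return [(m.start(), m.end() - m.start())
            for m in pattern.finditer(protein.upper())]


# Hypothetical toy sequence, NOT a real E. coli protein.
print(find_polyproline_motifs("MPPKLPPPGAP"))
```

Relating the reported positions to annotated start sites, domain boundaries or transmembrane helices would then be a matter of comparing `start` against those coordinates.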
Subjects
Amino Acid Motifs , Escherichia coli/metabolism , Peptide Chain Elongation, Translational , Peptides/chemistry , Protein Biosynthesis , Escherichia coli Proteins/metabolism , Evolution, Molecular , Peptide Elongation Factors/metabolism , Phylogeny , Protein Folding , Proteome/metabolism , Proteomics , Ribosomes/metabolism
ABSTRACT
Secondary structure elements in the coding regions of mRNAs play an important role in gene expression and regulation, but distinguishing functional from non-functional structures remains challenging. Here we investigate the temperature dependence of sequence-structure relationships in the coding regions based on the recent PARTE data by Wan et al. Our main finding is that the regions with high and low thermostability (high Tm and low Tm regions) are under evolutionary pressure to preserve RNA secondary structure and primary sequence, respectively. Sequences of low Tm regions display a higher degree of evolutionary conservation compared to high Tm regions. Low Tm regions are under strong synonymous constraint, while high Tm regions are not. These findings imply that high Tm regions contain thermo-stable functionally important RNA structures, which impose relaxed evolutionary constraint on sequence as long as the base-pairing patterns remain intact. By contrast, low thermostability regions contain single-stranded functionally important conserved RNA sequence elements accessible for binding by other molecules. We also find that theoretically predicted structures of paralogous mRNA pairs become more similar with increasing temperature, while experimentally measured structures tend to diverge, which implies that the melting pathways of RNA structures cannot be fully captured by current computational approaches.