ABSTRACT
Translation of noncoding regions is ubiquitous and upregulated in disease. Kesner et al. [1] elucidate the mechanism by which the BAG6 complex exerts quality control over noncoding translation while targeting stable, noncanonical polypeptides to cellular membranes.
Subjects
Molecular Chaperones, Peptides, Quality Control
ABSTRACT
Proteogenomic identification of translated small open reading frames has revealed thousands of previously unannotated, largely uncharacterized microproteins, or polypeptides of fewer than 100 amino acids, and alternative proteins (alt-proteins) that are co-encoded with canonical proteins and are often larger. The subcellular localizations of microproteins and alt-proteins are generally unknown but can have significant implications for their functions. Proximity biotinylation is an attractive approach to define the protein composition of subcellular compartments in cells and in animals. Here, we developed a high-throughput technology to map unannotated microproteins and alt-proteins to subcellular localizations by proximity biotinylation with TurboID (MicroID). More than 150 microproteins and alt-proteins are associated with subnuclear organelles. One alt-protein, alt-LAMA3, localizes to the nucleolus and functions in pre-rRNA transcription. We applied MicroID in a mouse model, validating expression of a conserved nuclear microprotein, and establishing MicroID for discovery of microproteins and alt-proteins in vivo.
Subjects
Peptides, Proteins, Animals, Cell Nucleolus, Mice, Open Reading Frames, Peptides/genetics, Proteins/genetics
ABSTRACT
Many unannotated microproteins and alternative proteins (alt-proteins) are coencoded with canonical proteins, but few of their functions are known. Motivated by the hypothesis that alt-proteins undergoing regulated synthesis could play important cellular roles, we developed a chemoproteomic pipeline to identify nascent alt-proteins in human cells. We identified 22 actively translated alt-proteins or N-terminal extensions, one of which is post-transcriptionally upregulated by DNA damage stress. We further defined a nucleolar, cell-cycle-regulated alt-protein that negatively regulates assembly of the pre-60S ribosomal subunit (MINAS-60). Depletion of MINAS-60 increases the amount of cytoplasmic 60S ribosomal subunit, upregulating global protein synthesis and cell proliferation. Mechanistically, MINAS-60 represses the rate of late-stage pre-60S assembly and export to the cytoplasm. Together, these results implicate MINAS-60 as a potential checkpoint inhibitor of pre-60S assembly and demonstrate that chemoproteomics enables hypothesis generation for uncharacterized alt-proteins.
Subjects
Saccharomyces cerevisiae Proteins, Cell Cycle Proteins/metabolism, Humans, Ribosomal RNA, Ribosomal Proteins/metabolism, Large Eukaryotic Ribosome Subunits/genetics, Large Eukaryotic Ribosome Subunits/metabolism, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae Proteins/metabolism
ABSTRACT
The annotation of the mammalian protein-coding genome is incomplete. Arbitrary size restriction of open reading frames (ORFs) and the absolute requirement for a methionine codon as the sole initiator of translation have constrained the identification of potentially important transcripts with non-canonical protein-coding potential [1,2]. Here, using unbiased transcriptomic approaches in macrophages that respond to bacterial infection, we show that ribosomes associate with a large number of RNAs that were previously annotated as 'non-protein coding'. Although the idea that such non-canonical ORFs can encode functional proteins is controversial [3,4], we identify a range of short and non-ATG-initiated ORFs that can generate stable and spatially distinct proteins. Notably, we show that the translation of a new ORF 'hidden' within the long non-coding RNA Aw112010 is essential for the orchestration of mucosal immunity during both bacterial infection and colitis. This work expands our interpretation of the protein-coding genome and demonstrates that proteinaceous products generated from non-canonical ORFs are crucial for the immune response in vivo. We therefore propose that the misannotation of non-canonical ORF-containing genes as non-coding RNAs may obscure the essential role of a multitude of previously undiscovered protein-coding genes in immunity and disease.
Subjects
Mucosal Immunity/genetics, Open Reading Frames/genetics, Protein Biosynthesis, Long Noncoding RNA/genetics, Animals, Bacterial Infections/genetics, Bacterial Infections/immunology, Bacterial Infections/metabolism, Bacterial Infections/microbiology, Colitis/genetics, Colitis/immunology, Colitis/metabolism, Mucosal Immunity/drug effects, Interleukin-12/biosynthesis, Lipopolysaccharides/pharmacology, Macrophages/immunology, Macrophages/metabolism, Mice, Protein Biosynthesis/drug effects, Protein Biosynthesis/genetics, Long Noncoding RNA/metabolism, Ribosomes/drug effects, Ribosomes/metabolism, Salmonella typhimurium/immunology, Transcriptome/drug effects, Transcriptome/genetics
ABSTRACT
Advances in proteogenomic technologies have revealed hundreds to thousands of translated small open reading frames (sORFs) that encode microproteins in genomes across evolutionary space. While many microproteins have now been shown to play critical roles in biology and human disease, a majority of recently identified microproteins have little or no experimental evidence regarding their functionality. Computational tools have some limitations for analysis of short, poorly conserved microprotein sequences, so additional approaches are needed to determine the role of each member of this recently discovered polypeptide class. A currently underexplored avenue in the study of microproteins is structure prediction and determination, which delivers a depth of functional information. In this review, we provide a brief overview of microprotein discovery methods, then examine examples of microprotein structures (and, conversely, intrinsic disorder) that have been experimentally determined using crystallography, cryo-electron microscopy, and NMR, which provide insight into their molecular functions and mechanisms. Additionally, we discuss examples of predicted microprotein structures that have provided insight or context regarding their function. Analysis of microprotein structure at the angstrom level, and confirmation of predicted structures, therefore, has potential to identify translated microproteins that are of biological importance and to provide molecular mechanism for their in vivo roles.
Subjects
Micropeptides, Proteogenomics, Humans, Cryoelectron Microscopy, Peptides, Proteogenomics/methods, Open Reading Frames
ABSTRACT
Thousands of unannotated small and alternative open reading frames (smORFs and alt-ORFs, respectively) have recently been revealed in mammalian genomes. While hundreds of mammalian smORF- and alt-ORF-encoded proteins (SEPs and alt-proteins, respectively) affect cell proliferation, the overwhelming majority of smORFs and alt-ORFs remain uncharacterized at the molecular level. Complicating the task of identifying the biological roles of smORFs and alt-ORFs, the SEPs and alt-proteins that they encode exhibit limited sequence homology to protein domains of known function. Experimental techniques for the functionalization of these gene classes are therefore required. Approaches combining chemical labeling and quantitative proteomics have greatly advanced our ability to identify and characterize functional SEPs and alt-proteins in high throughput. In this review, we briefly describe the principles of proteomic discovery of SEPs and alt-proteins, then summarize how these technologies interface with chemical labeling for identification of SEPs and alt-proteins with specific properties, as well as in defining the interactome of SEPs and alt-proteins.
Subjects
Peptides, Proteomics, Animals, Open Reading Frames, Peptides/chemistry, Proteins/genetics, Genome, Mammals/metabolism
ABSTRACT
V(D)J recombination assembles and diversifies Ig and T cell receptor genes in developing B and T lymphocytes. The reaction is initiated by the RAG1-RAG2 protein complex which binds and cleaves at discrete gene segments in the antigen receptor loci. To identify mechanisms that regulate V(D)J recombination, we used proximity-dependent biotin identification to analyze the interactomes of full-length and truncated forms of RAG1 in pre-B cells. This revealed an association of RAG1 with numerous nucleolar proteins in a manner dependent on amino acids 216 to 383 and allowed identification of a motif required for nucleolar localization. Experiments in transformed pre-B cell lines and cultured primary pre-B cells reveal a strong correlation between disruption of nucleoli, reduced association of RAG1 with a nucleolar marker, and increased V(D)J recombination activity. Mutation of the RAG1 nucleolar localization motif boosts recombination while removal of the first 215 amino acids of RAG1, required for efficient egress from nucleoli, reduces recombination activity. Our findings indicate that nucleolar sequestration of RAG1 is a negative regulatory mechanism in V(D)J recombination and identify regions of the RAG1 N-terminal region that control nucleolar association and egress.
Subjects
Cell Nucleolus/metabolism, Homeodomain Proteins/metabolism, V(D)J Recombination, Amino Acid Motifs, Animals, Cell Nucleolus/genetics, Cultured Cells, Homeodomain Proteins/chemistry, Homeodomain Proteins/genetics, Mice, B-Lymphoid Precursor Cells/metabolism, Protein Transport
ABSTRACT
Proteogenomic identification of translated small open reading frames in humans has revealed thousands of microproteins, or polypeptides of fewer than 100 amino acids, that were previously invisible to geneticists. Hundreds of microproteins have been shown to be essential for cell growth and proliferation, and many regulate macromolecular complexes. However, the vast majority of microproteins remain functionally uncharacterized, and many lack secondary structure and exhibit limited evolutionary conservation. One such intrinsically disordered microprotein is NBDY, a 68-amino acid component of membraneless organelles known as P-bodies. In this work, we show that NBDY can undergo liquid-liquid phase separation, a biophysical process thought to underlie the formation of membraneless organelles, in the presence of RNA in vitro. Phosphorylation of NBDY drives liquid phase remixing in vitro and macroscopic P-body dissociation in cells undergoing growth factor signaling and cell division. These results suggest that NBDY phosphorylation enables regulation of P-body dynamics during cell proliferation and, more broadly, that intrinsically disordered microproteins may contribute to liquid-liquid phase separation and remixing behavior to affect cellular processes.
Subjects
Intrinsically Disordered Proteins/chemical synthesis, Biomolecular Condensates, Humans, Intrinsically Disordered Proteins/chemistry, Particle Size, Phosphorylation
ABSTRACT
Recent ribosome profiling and proteomic studies have revealed the presence of thousands of novel coding sequences, referred to as small open reading frames (sORFs), in prokaryotic and eukaryotic genomes. These genes have defied discovery via traditional genomic tools not only because they tend to be shorter than standard gene annotation length cutoffs, but also because they are, as a class, enriched in sequence properties previously assumed to be unusual, including non-AUG start codons. In this review, we summarize what is currently known about the incidence, efficiency, and mechanism of non-AUG start codon usage in prokaryotes and eukaryotes, and provide examples of regulatory and functional sORFs that initiate at non-AUG codons. While only a handful of non-AUG-initiated novel genes have been characterized in detail to date, their participation in important biological processes suggests that an improved understanding of this class of genes is needed.
Subjects
Initiator Codon/chemistry, Genome, Open Reading Frames, Translational Peptide Chain Initiation, Proteome/genetics, Ribosomes/genetics, Initiator Codon/metabolism, Computational Biology/methods, Eukaryota/genetics, Eukaryota/metabolism, High-Throughput Nucleotide Sequencing, Molecular Sequence Annotation/methods, Prokaryotic Cells/metabolism, Protein Sorting Signals/genetics, Proteome/classification, Proteome/metabolism, Ribosomes/classification, Ribosomes/metabolism
ABSTRACT
Polypeptides generated from proteolytic processing of protein precursors, or proteolytic proteoforms, play an important role in diverse biological functions and diseases. However, their often-small size and intricate post-translational biogenesis preclude the use of simple genetic tagging in their cellular studies. Herein, we develop a labeling strategy for this class of proteoforms, based on residue-specific genetic code expansion labeling with a molecular beacon design. We demonstrate the utility of such a design by creating a molecular beacon reporter to detect amyloid-β peptides, known to be involved in the pathogenesis of Alzheimer's disease, as they are produced from amyloid precursor protein (APP) along the endocytic pathway of living cells.
Subjects
Amyloid beta-Peptides/metabolism, Amyloid beta-Protein Precursor/metabolism, Lysine/analogs & derivatives, Amino Acyl-tRNA Synthetases/genetics, Amino Acyl-tRNA Synthetases/metabolism, Amyloid beta-Peptides/chemistry, Amyloid beta-Protein Precursor/genetics, Archaeal Proteins/genetics, Archaeal Proteins/metabolism, Genetic Code, HEK293 Cells, Humans, Lysine/chemistry, Lysine/metabolism, Methanosarcina/enzymology, Fluorescence Microscopy, Site-Directed Mutagenesis, Post-Translational Protein Processing
ABSTRACT
Decapping is the first committed step in 5'-to-3' RNA decay, and in the cytoplasm of human cells, multiple decapping enzymes regulate the stabilities of distinct subsets of cellular transcripts. However, the complete set of RNAs regulated by any individual decapping enzyme remains incompletely mapped, and no consensus sequence or property is currently known to unambiguously predict decapping enzyme substrates. Dcp2 was the first-identified and best-studied eukaryotic decapping enzyme, but it has been shown to regulate the stability of <400 transcripts in mammalian cells to date. Here, we globally profile changes in the stability of the human transcriptome in Dcp2 knockout cells via TimeLapse-seq. We find that P-body enrichment is the strongest correlate of Dcp2-dependent decay and that modification with m6A exhibits an additive effect with P-body enrichment for Dcp2 targeting. These results are consistent with a model in which P-bodies represent sites where translationally repressed transcripts are sorted for decay by soluble cytoplasmic decay complexes through additional molecular marks.
Subjects
Endoribonucleases/metabolism, Animals, Cytoplasm/genetics, Cytoplasm/metabolism, Endoribonucleases/genetics, Humans, Biological Models, RNA Stability/genetics, RNA Stability/physiology, Transcriptome/genetics, Transcriptome/physiology
ABSTRACT
Proteogenomic identification of translated small open reading frames in humans has revealed thousands of microproteins, or polypeptides of fewer than 100 amino acids, that were previously invisible to geneticists. Hundreds of microproteins have been shown to be essential for cell growth and proliferation, and many regulate macromolecular complexes. One such regulatory microprotein is NBDY, a 68-amino acid component of the human cytoplasmic RNA decapping complex. Heterologously expressed NBDY was previously reported to regulate cytoplasmic ribonucleoprotein granules known as P-bodies and reporter gene stability, but the global effect of endogenous NBDY on the cellular transcriptome remained undefined. In this work, we demonstrate that endogenous NBDY directly interacts with the human RNA decapping complex through EDC4 and DCP1A and localizes to P-bodies. Global profiling of RNA stability changes in NBDY knockout (KO) cells reveals dysregulated stability of more than 1400 transcripts. DCP2 substrate transcript half-lives are both increased and decreased in NBDY KO cells, which correlates with 5' UTR length. NBDY deletion additionally alters the stability of non-DCP2 target transcripts, possibly as a result of downregulated expression of nonsense-mediated decay factors in NBDY KO cells. We present a comprehensive model of the regulation of RNA stability by NBDY.
Subjects
RNA Caps/chemistry, RNA Caps/metabolism, HEK293 Cells, Humans, Nonsense-Mediated mRNA Decay/genetics, Nonsense-Mediated mRNA Decay/physiology, Open Reading Frames/genetics, RNA Stability, Messenger RNA/chemistry, Messenger RNA/metabolism
ABSTRACT
Ribosome profiling and mass spectrometry have revealed thousands of small and alternative open reading frames (sm/alt-ORFs) that are translated into polypeptides variously termed microproteins and alt-proteins in mammalian cells. Some micro-/alt-proteins exhibit stress-, cell-type-, and/or tissue-specific expression; understanding this regulated expression will be critical to elucidating their functions. While differential translation has been inferred by ribosome profiling, quantitative mass spectrometry-based proteomics is needed for direct comparison of microprotein and alt-protein expression between samples and conditions. However, while label-free quantitative proteomics has been applied to detect stress-dependent expression of bacterial microproteins, this approach has not yet been demonstrated for analysis of differential expression of unannotated ORFs in the more complex human proteome. Here, we present global micro-/alt-protein quantitation in two human leukemia cell lines, K562 and MOLT4. We identify 12 unannotated proteins that are differentially expressed in these cell lines. The expression of six micro-/alt-proteins from cDNA was validated biochemically, and two were found to localize to the nucleus. Thus, we demonstrate that label-free comparative proteomics enables quantitation of micro-/alt-protein expression between human cell lines. We anticipate that this workflow will enable the discovery of regulated sm/alt-ORF products across many biological conditions in human cells.
Subjects
Proteome, Proteomics, Cell Line, Humans, Mass Spectrometry, Open Reading Frames, Proteome/genetics
ABSTRACT
Despite decades of accumulated knowledge about proteins and their post-translational modifications (PTMs), numerous questions remain regarding their molecular composition and biological function. One of the most fundamental queries is the extent to which the combinations of DNA-, RNA- and PTM-level variations explode the complexity of the human proteome. Here, we outline what we know from current databases and measurement strategies including mass spectrometry-based proteomics. In doing so, we examine prevailing notions about the number of modifications displayed on human proteins and how they combine to generate the protein diversity underlying health and disease. We frame central issues regarding determination of protein-level variation and PTMs, including some paradoxes present in the field today. We use this framework to assess existing data and to ask the question, "How many distinct primary structures of proteins (proteoforms) are created from the 20,300 human genes?" We also explore prospects for improving measurements to better regularize protein-level biology and efficiently associate PTMs to function and phenotype.
Subjects
Human Genome, Post-Translational Protein Processing, Proteins/chemistry, Proteome/chemistry, Proteomics/methods, Protein Databases, Humans, Mass Spectrometry, Phenotype, Protein Biosynthesis, Protein Isoforms/chemistry, Ubiquitin/chemistry
ABSTRACT
Proteomic detection of non-annotated microproteins indicates the translation of hundreds of small open reading frames (smORFs) in human cells, but whether these microproteins are functional or not is unknown. Here, we report the discovery and characterization of a 7-kDa human microprotein we named non-annotated P-body dissociating polypeptide (NoBody). NoBody interacts with mRNA decapping proteins, which remove the 5' cap from mRNAs to promote 5'-to-3' decay. Decapping proteins participate in mRNA turnover and nonsense-mediated decay (NMD). NoBody localizes to mRNA-decay-associated RNA-protein granules called P-bodies. Modulation of NoBody levels reveals that its abundance is anticorrelated with cellular P-body numbers and alters the steady-state levels of a cellular NMD substrate. These results implicate NoBody as a novel component of the mRNA decapping complex and demonstrate potential functionality of a newly discovered microprotein.
Subjects
Carrier Proteins/metabolism, Endoribonucleases/chemistry, Endoribonucleases/metabolism, Messenger RNA/metabolism, Animals, COS Cells, Cultured Cells, Chlorocebus aethiops, Humans, RNA Caps/metabolism, Messenger RNA/chemistry, Messenger RNA/genetics
ABSTRACT
Processing bodies (P-bodies) are cytoplasmic ribonucleoprotein (RNP) granules primarily composed of translationally repressed mRNAs and proteins related to mRNA decay, suggesting roles in post-transcriptional regulation. P-bodies are conserved in eukaryotic cells and exhibit properties of liquid droplets. However, the function of P-bodies in translational repression and/or mRNA decay remains contentious. Here we review recent advances in our understanding of the molecular composition of P-bodies, the interactions and processes that regulate P-body liquid-liquid phase separation (LLPS), and the cellular localization of mRNA decay machinery, in the context of how these discoveries refine models of P-body function.
Subjects
Cytoplasmic Granules/genetics, Post-Transcriptional RNA Processing/genetics, Ribonucleoproteins/genetics, Gene Expression Regulation/genetics, Proteins/genetics, RNA Stability/genetics, Messenger RNA/genetics
ABSTRACT
Recent advances in proteomics and genomics have enabled discovery of thousands of previously nonannotated small open reading frames (smORFs) in genomes across evolutionary space. Furthermore, quantitative mass spectrometry has recently been applied to analysis of regulated smORF expression. However, bottom-up proteomics has remained relatively insensitive to membrane proteins, suggesting they may have been underdetected in previous studies. In this report, we add biochemical membrane protein enrichment to our previously developed label-free quantitative proteomics protocol, revealing a never-before-identified heat shock protein in Escherichia coli K12. This putative smORF-encoded heat shock protein, GndA, is likely to be ~36-55 amino acids in length and contains a predicted transmembrane helix. We validate heat shock-regulated expression of the gndA smORF and demonstrate that a GndA-GFP fusion protein cofractionates with the cell membrane. Quantitative membrane proteomics therefore has the ability to reveal nonannotated small proteins that may play roles in bacterial stress responses.
Subjects
Escherichia coli K12/physiology, Escherichia coli Proteins/metabolism, Small Heat-Shock Proteins/metabolism, Heat-Shock Proteins/metabolism, Heat-Shock Response, Membrane Proteins/metabolism, Molecular Models, Open Reading Frames, High-Pressure Liquid Chromatography, Escherichia coli/enzymology, Escherichia coli/physiology, Escherichia coli K12/enzymology, Escherichia coli K12/growth & development, Escherichia coli Proteins/chemistry, Escherichia coli Proteins/genetics, Bacterial Gene Expression Regulation, Green Fluorescent Proteins/chemistry, Green Fluorescent Proteins/genetics, Green Fluorescent Proteins/metabolism, Heat-Shock Proteins/chemistry, Heat-Shock Proteins/genetics, Small Heat-Shock Proteins/chemistry, Small Heat-Shock Proteins/genetics, Membrane Proteins/chemistry, Membrane Proteins/genetics, Molecular Sequence Annotation, Phosphogluconate Dehydrogenase/chemistry, Phosphogluconate Dehydrogenase/genetics, Phosphogluconate Dehydrogenase/metabolism, Alpha-Helical Protein Conformation, Protein Interaction Domains and Motifs, Protein Transport, Proteogenomics/methods, Proteomics/methods, Recombinant Fusion Proteins/chemistry, Recombinant Fusion Proteins/metabolism, Tandem Mass Spectrometry
ABSTRACT
Recent advances in mass spectrometry-based proteomics have revealed translation of previously nonannotated microproteins from thousands of small open reading frames (smORFs) in prokaryotic and eukaryotic genomes. Facile methods to determine cellular functions of these newly discovered microproteins are now needed. Here, we couple semiquantitative comparative proteomics with whole-genome database searching to identify two nonannotated, homologous cold shock-regulated microproteins in Escherichia coli K12 substr. MG1655, as well as two additional constitutively expressed microproteins. We apply molecular genetic approaches to confirm expression of these cold shock proteins (YmcF and YnfQ) at reduced temperatures and identify the noncanonical ATT start codons that initiate their translation. These proteins are conserved in related Gram-negative bacteria and are predicted to be structured, which, in combination with their cold shock upregulation, suggests that they are likely to have biological roles in the cell. These results reveal that previously unknown factors are involved in the response of E. coli to lowered temperatures and suggest that further nonannotated, stress-regulated E. coli microproteins may remain to be found. More broadly, comparative proteomics may enable discovery of regulated, and therefore potentially functional, products of smORF translation across many different organisms and conditions.