Results 1 - 9 of 9
1.
Mol Cell Proteomics ; 18(8 suppl 1): S126-S140, 2019 08 09.
Article in English | MEDLINE | ID: mdl-31040227

ABSTRACT

PROTEOFORMER is a pipeline that enables the automated processing of data derived from ribosome profiling (RIBO-seq, i.e. the sequencing of ribosome-protected mRNA fragments). As such, genome-wide ribosome occupancies lead to the delineation of data-specific translation product candidates and these can improve the mass spectrometry-based identification. Since its first publication, different upgrades, new features and extensions have been added to the PROTEOFORMER pipeline. Some of the most important upgrades include P-site offset calculation during mapping, comprehensive data pre-exploration, the introduction of two alternative proteoform calling strategies and extended pipeline output features. These novelties are illustrated by analyzing ribosome profiling data of human HCT116 and Jurkat cells. The different proteoform calling strategies are used alongside one another and in the end combined together with reference sequences from UniProt. Matching mass spectrometry data are searched against this extended search space with MaxQuant. Overall, besides annotated proteoforms, this pipeline leads to the identification and validation of different categories of new proteoforms, including translation products of up- and downstream open reading frames, 5' and 3' extended and truncated proteoforms, single amino acid variants, splice variants and translation products of so-called noncoding regions. Further, proof-of-concept is reported for the improvement of spectrum matching by including Prosit, a deep neural network strategy that adds extra fragmentation spectrum intensity features to the analysis. In the light of ribosome profiling-driven proteogenomics, it is shown that this allows validating the spectrum matches of newly identified proteoforms with elevated stringency. These updates and novel conclusions provide new insights and lessons for the ribosome profiling-based proteogenomic research field.
More practical information on the pipeline, raw code, the user manual (README) and explanations on the different modes of availability can be found at the GitHub repository of PROTEOFORMER: https://github.com/Biobix/proteoformer.


Subject(s)
Proteogenomics/methods , Ribosomes/metabolism , Chromatography, Liquid , HCT116 Cells , Humans , Jurkat Cells , Tandem Mass Spectrometry
2.
Mol Cell Proteomics ; 16(6): 1064-1080, 2017 06.
Article in English | MEDLINE | ID: mdl-28432195

ABSTRACT

Proteogenomics is an emerging research field that still lacks a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes.


Subject(s)
Arabidopsis/genetics , Arabidopsis/metabolism , Protein Biosynthesis/genetics , Genome, Plant , Peptide Library , Peptides/genetics , Peptides/metabolism , Plant Proteins/genetics , Plant Proteins/metabolism , Proteogenomics
3.
Nucleic Acids Res ; 45(20): e168, 2017 Nov 16.
Article in English | MEDLINE | ID: mdl-28977509

ABSTRACT

Prokaryotic genome annotation is highly dependent on automated methods, as manual curation cannot keep up with the exponential growth of sequenced genomes. Current automated methods depend heavily on sequence composition and often underestimate the complexity of the proteome. We developed RibosomeE Profiling Assisted (re-)AnnotaTION (REPARATION), a de novo machine learning algorithm that takes advantage of experimental protein synthesis evidence from ribosome profiling (Ribo-seq) to delineate translated open reading frames (ORFs) in bacteria, independent of genome annotation (https://github.com/Biobix/REPARATION). REPARATION evaluates all possible ORFs in the genome and estimates minimum thresholds based on a growth curve model to screen for spurious ORFs. We applied REPARATION to three annotated bacterial species to obtain a more comprehensive mapping of their translation landscape in support of experimental data. In all cases, we identified hundreds of novel (small) ORFs including variants of previously annotated ORFs and >70% of all (variants of) annotated protein coding ORFs were predicted by REPARATION to be translated. Our predictions are supported by matching mass spectrometry proteomics data, sequence composition and conservation analysis. REPARATION is unique in that it makes use of experimental translation evidence to intrinsically perform a de novo ORF delineation in bacterial genomes irrespective of the sequence features linked to open reading frames.


Subject(s)
Bacillus subtilis/genetics , Computational Biology/methods , Escherichia coli K12/genetics , Genome, Bacterial/genetics , Molecular Sequence Annotation/methods , Salmonella typhimurium/genetics , Algorithms , Chromosome Mapping , Machine Learning , Open Reading Frames/genetics , Ribosomes/genetics
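
Editorial illustration for record 3: the abstract states that REPARATION "evaluates all possible ORFs in the genome" before screening out spurious ones. The sketch below shows only what such an exhaustive enumeration could look like on the forward strand. It is not REPARATION's code. The start codon set (ATG, GTG, TTG), the `min_codons` parameter and the function name `candidate_orfs` are assumptions made for this example; the Ribo-seq based scoring and the growth curve thresholds described in the abstract are not modelled.

```python
# Hypothetical sketch: enumerate every start-to-stop candidate ORF on the
# forward strand of a DNA sequence. Start codon choice is an assumption.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def candidate_orfs(seq, min_codons=2):
    """Return sorted (start, end) pairs, 0-based and end-exclusive.

    Each pair runs from a start codon to the first in-frame stop codon
    (stop included). `min_codons` counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        open_starts = []  # start positions seen since the last stop codon
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if codon in START_CODONS:
                open_starts.append(pos)
            elif codon in STOP_CODONS:
                end = pos + 3
                for start in open_starts:
                    if (end - start) // 3 - 1 >= min_codons:
                        orfs.append((start, end))
                open_starts = []
    return sorted(orfs)


if __name__ == "__main__":
    # Toy sequence, not real genomic data.
    for start, end in candidate_orfs("ATGAAAGTGCCCTAA"):
        print(start, end)
```

Every in-frame start upstream of a stop is reported, so one stop codon can yield several nested candidates (the "variants" of an ORF). The reverse strand can be covered by running the same function on the reverse complement.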
4.
Nucleic Acids Res ; 45(13): 7997-8013, 2017 Jul 27.
Article in English | MEDLINE | ID: mdl-28541577

ABSTRACT

Alternative translation initiation mechanisms such as leaky scanning and reinitiation potentiate the polycistronic nature of human transcripts. By allowing for reprogrammed translation, these mechanisms can mediate biological responses to stimuli. We combined proteomics with ribosome profiling and mRNA sequencing to identify the biological targets of translation control triggered by the eukaryotic translation initiation factor 1 (eIF1), a protein implicated in the stringency of start codon selection. We quantified expression changes of over 4000 proteins and 10 000 actively translated transcripts, leading to the identification of 245 transcripts undergoing translational control mediated by upstream open reading frames (uORFs) upon eIF1 deprivation. Here, the stringency of start codon selection and preference for an optimal nucleotide context were largely diminished leading to translational upregulation of uORFs with suboptimal start. Interestingly, genes affected by eIF1 deprivation were implicated in energy production and sensing of metabolic stress.


Subject(s)
Eukaryotic Initiation Factors/metabolism , Neoplasm Proteins/metabolism , Nerve Tissue Proteins/metabolism , Peptide Chain Initiation, Translational , Cell Line , Codon, Initiator , Energy Metabolism/genetics , Eukaryotic Initiation Factors/antagonists & inhibitors , Eukaryotic Initiation Factors/genetics , Gene Expression , Gene Knockdown Techniques , HCT116 Cells , Humans , Neoplasm Proteins/antagonists & inhibitors , Neoplasm Proteins/genetics , Nerve Tissue Proteins/antagonists & inhibitors , Nerve Tissue Proteins/genetics , Nucleic Acid Conformation , Open Reading Frames , RNA, Messenger/chemistry , RNA, Messenger/genetics , RNA, Messenger/metabolism , Ribosomes/genetics , Ribosomes/metabolism , Stress, Physiological/genetics
5.
BMC Biol ; 15(1): 76, 2017 08 30.
Article in English | MEDLINE | ID: mdl-28854918

ABSTRACT

BACKGROUND: While methods for annotation of genes are increasingly reliable, the exact identification of translation initiation sites remains a challenging problem. Since the N-termini of proteins often contain regulatory and targeting information, developing a robust method for start site identification is crucial. Ribosome profiling reads show distinct patterns of read length distributions around translation initiation sites. These patterns are typically lost in standard ribosome profiling analysis pipelines, when reads from footprints are adjusted to determine the specific codon being translated. RESULTS: Utilising these signatures in combination with nucleotide sequence information, we build a model capable of predicting translation initiation sites and demonstrate its high accuracy using N-terminal proteomics. Applying this to prokaryotic translatomes, we re-annotate translation initiation sites and provide evidence of N-terminal truncations and extensions of previously annotated coding sequences. These re-annotations are supported by the presence of structural and sequence-based features next to N-terminal peptide evidence. Finally, our model identifies 61 novel genes previously undiscovered in the Salmonella enterica genome. CONCLUSIONS: Signatures within ribosome profiling read length distributions can be used in combination with nucleotide sequence information to provide accurate genome-wide identification of translation initiation sites.


Subject(s)
Bacteria/metabolism , Bacterial Proteins/metabolism , Protein Processing, Post-Translational , Ribosomes/metabolism
6.
Mol Syst Biol ; 12(2): 858, 2016 Feb 18.
Article in English | MEDLINE | ID: mdl-26893308

ABSTRACT

To understand the impact of alternative translation initiation on a proteome, we performed a proteome-wide study on protein turnover using positional proteomics and ribosome profiling to distinguish between N-terminal proteoforms of individual genes. By combining pulsed SILAC with N-terminal COFRADIC, we monitored the stability of 1,941 human N-terminal proteoforms, including 147 N-terminal proteoform pairs that originate from alternative translation initiation, alternative splicing or incomplete processing of the initiator methionine. N-terminally truncated proteoforms were less abundant than canonical proteoforms and often displayed altered stabilities, likely attributed to individual protein characteristics, including intrinsic disorder, but independent of N-terminal amino acid identity or truncation length. We discovered that the removal of initiator methionine by methionine aminopeptidases reduced the stability of processed proteoforms, while susceptibility for N-terminal acetylation did not seem to influence protein turnover rates. Taken together, our findings reveal differences in protein stability between N-terminal proteoforms and point to a role for alternative translation initiation and co-translational initiator methionine removal, next to alternative splicing, in the overall regulation of proteome homeostasis.


Subject(s)
Proteome/genetics , Proteomics , Acetylation , Alternative Splicing , Amino Acids/chemistry , Chromatography, Liquid , Cycloheximide/pharmacology , Gene Expression Profiling , Humans , Jurkat Cells , Peptide Chain Initiation, Translational , Proteolysis , Proteome/metabolism , Ribosomes/genetics , Ribosomes/metabolism , T-Lymphocytes/cytology , T-Lymphocytes/metabolism , Tandem Mass Spectrometry , Ubiquitination
7.
Nucleic Acids Res ; 43(5): e29, 2015 Mar 11.
Article in English | MEDLINE | ID: mdl-25510491

ABSTRACT

An increasing number of studies integrate mRNA sequencing data into MS-based proteomics to complement the translation product search space. However, several factors, including extensive regulation of mRNA translation and the need for three- or six-frame-translation, impede the use of mRNA-seq data for the construction of a protein sequence search database. With that in mind, we developed the PROTEOFORMER tool that automatically processes data of the recently developed ribosome profiling method (sequencing of ribosome-protected mRNA fragments), resulting in genome-wide visualization of ribosome occupancy. Our tool also includes a translation initiation site calling algorithm allowing the delineation of the open reading frames (ORFs) of all translation products. A complete protein synthesis-based sequence database can thus be compiled for mass spectrometry-based identification. This approach increases the overall protein identification rates by 3% and 11% (improved and new identifications) for human and mouse, respectively, and enables proteome-wide detection of 5'-extended proteoforms, upstream ORF translation and near-cognate translation start sites. The PROTEOFORMER tool is available as a stand-alone pipeline and has been implemented in the Galaxy framework for ease of use.


Subject(s)
Computational Biology/methods , Mass Spectrometry/methods , Proteome/metabolism , Proteomics/methods , Ribosomes/metabolism , Amino Acid Sequence , Animals , Cells, Cultured , Databases, Protein , Genome/genetics , HCT116 Cells , High-Throughput Nucleotide Sequencing/methods , Humans , Mice , Molecular Sequence Data , Open Reading Frames/genetics , Protein Biosynthesis/genetics , Proteome/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Reproducibility of Results , Ribosomes/genetics , Sequence Homology, Amino Acid
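
Editorial illustration for record 7: the abstract names "the need for three- or six-frame-translation" as one obstacle to building a search database from mRNA-seq data. The sketch below shows what a six-frame translation is, using the standard genetic code. It is not part of PROTEOFORMER; the function names and the toy input are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return translations keyed '+1'..'+3' (forward) and '-1'..'-3' (reverse)."""
    seq = seq.upper()
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    # Toy sequence, not real transcript data.
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```

Each nucleotide sequence yields six candidate protein strings, most of which are never translated in the cell. That inflation of the search space is the cost the abstract contrasts with ORFs delineated from ribosome profiling evidence.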
8.
Proteomics ; 14(23-24): 2688-98, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25156699

ABSTRACT

Next-generation transcriptome sequencing is increasingly integrated with MS to enhance MS-based protein and peptide identification. Recently, a breakthrough in transcriptome analysis was achieved with the development of ribosome profiling (ribo-seq). This technology is based on the deep sequencing of ribosome-protected mRNA fragments, thereby enabling the direct observation of in vivo protein synthesis at the transcript level. In order to explore the impact of a ribo-seq-derived protein sequence search space on MS/MS spectrum identification, we performed a comprehensive proteome study on a human cancer cell line, using both shotgun and N-terminal proteomics, next to ribosome profiling, which was used to delineate (alternative) translational reading frames. By including protein-level evidence of sample-specific genetic variation and alternative translation, this strategy improved the identification score of 69 proteins and identified 22 new proteins in the shotgun experiment. Furthermore, we discovered 18 new alternative translation start sites in the N-terminal proteomics data and observed a correlation between the quantitative measures of ribo-seq and shotgun proteomics with a Pearson correlation coefficient ranging from 0.483 to 0.664. Overall, this study demonstrated the benefits of ribosome profiling for MS-based protein and peptide identification and we believe this approach could develop into a common practice for next-generation proteomics.


Subject(s)
Computational Biology/methods , Proteins/metabolism , Proteomics/methods , Ribosomes/metabolism , HCT116 Cells , Humans , Protein Biosynthesis/genetics , Proteins/genetics , Tandem Mass Spectrometry
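
Editorial illustration for record 8: the abstract reports Pearson correlation coefficients between quantitative measures of ribo-seq and shotgun proteomics. The sketch below shows how such a coefficient is computed from paired per-protein values. The function name and the numbers in the usage section are hypothetical and are not taken from the study; whether the authors log-transformed their values first is not stated in the abstract.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical paired measurements for five proteins (made-up values).
    ribo_seq_counts = [120.0, 45.0, 300.0, 10.0, 80.0]
    protein_intensities = [2.1, 1.0, 3.9, 0.7, 2.5]
    print(round(pearson_r(ribo_seq_counts, protein_intensities), 3))
```

A coefficient near 1 would mean that proteins with more ribosome footprints are also proportionally more abundant in the proteomics data. The reported range of 0.483 to 0.664 indicates a moderate positive relationship.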
9.
Front Plant Sci ; 12: 778804, 2021.
Article in English | MEDLINE | ID: mdl-35069635

ABSTRACT

Alternative translation initiation is a widespread event in biology that can shape multiple protein forms or proteoforms from a single gene. However, the respective contribution of alternative translation to protein complexity remains largely enigmatic. By complementary ribosome profiling and N-terminal proteomics (i.e., riboproteogenomics), we provide clear-cut evidence for ~90 N-terminal proteoform pairs shaped by (alternative) translation initiation in Arabidopsis thaliana. Next to several cases additionally confirmed by directed mutagenesis, identified alternative protein N-termini follow the enzymatic rules of co-translational N-terminal protein acetylation and initiator methionine removal. In contrast to other eukaryotic models, N-terminal acetylation in plants cannot generally be considered as a proxy of translation initiation because of its posttranslational occurrence on mature proteolytic neo-termini (N-termini) localized in the chloroplast stroma. Quantification of N-terminal acetylation revealed differing co- vs. posttranslational N-terminal acetylation patterns. Intriguingly, our data additionally hint at alternative translation initiation serving as a common mechanism to supply protein copies in multiple cellular compartments, as alternative translation sites are often in close proximity to cleavage sites of N-terminal transit sequences of nuclear-encoded chloroplastic and mitochondrial proteins. Overall, riboproteogenomics screening enables the identification of (differentially localized) N-terminal proteoforms raised upon alternative translation.
