What can Ribo-seq and proteomics tell us about the non-canonical proteome?
Prensner, John R; Abelin, Jennifer G; Kok, Leron W; Clauser, Karl R; Mudge, Jonathan M; Ruiz-Orera, Jorge; Bassani-Sternberg, Michal; Deutsch, Eric W; van Heesch, Sebastiaan.
Affiliations
  • Prensner JR; Department of Pediatrics, Division of Pediatric Hematology/Oncology, University of Michigan Medical School, Ann Arbor, MI 48109, USA.
  • Abelin JG; Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
  • Kok LW; Princess Máxima Center for Pediatric Oncology, Heidelberglaan 25, 3584 CS, Utrecht, the Netherlands.
  • Clauser KR; Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
  • Mudge JM; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
  • Ruiz-Orera J; Cardiovascular and Metabolic Sciences, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany.
  • Bassani-Sternberg M; Ludwig Institute for Cancer Research, University of Lausanne, Agora Center Bugnon 25A, 1005 Lausanne, Switzerland.
  • Deutsch EW; Department of Oncology, Centre hospitalier universitaire vaudois (CHUV), Rue du Bugnon 46, 1005 Lausanne, Switzerland.
  • van Heesch S; Agora Cancer Research Centre, 1011 Lausanne, Switzerland.
bioRxiv; 2023 May 18.
Article in English | MEDLINE | ID: mdl-37292611
ABSTRACT
Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product, and what fraction of those can be understood as proteins in the conventional sense of the term. A further complication is that published estimates of the number of non-canonical ORFs vary widely, by around 30-fold, from several thousand to several hundred thousand. Taken together, this research has left the genomics and proteomics communities excited by the prospect of new coding regions in the human genome but searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".

In brief
The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. Because the field is nascent, many questions about non-canonical ORFs remain. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and of immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and are not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.

Highlights
  • Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products.
  • Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results.
  • Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations.
  • A framework for standardized non-canonical ORF evidence will advance the research field.

Full text: 1 Database: MEDLINE Language: En Journal: BioRxiv Year: 2023 Document type: Article Country of affiliation: United States
