Results 1 - 5 of 5
1.
Anal Chem ; 95(28): 10610-10617, 2023 07 18.
Article in English | MEDLINE | ID: mdl-37424072

ABSTRACT

Alternative splicing allows a small number of human genes to encode a large number of proteoforms that play essential roles in normal and disease physiology. Some low-abundance proteoforms may remain undiscovered because of limited detection and analysis capabilities. Peptides coencoded by novel exons and annotated exons separated by introns are called novel junction peptides, and they are the key to identifying novel proteoforms. Traditional de novo sequencing does not take into account the specific composition of novel junction peptides and is therefore less accurate for them. We first developed a novel de novo sequencing algorithm, CNovo, which outperformed the mainstream tools PEAKS and Novor on all six test sets. We then built on CNovo to develop a semi-de novo sequencing algorithm, SpliceNovo, specifically for identifying novel junction peptides. SpliceNovo identifies junction peptides with much higher accuracy than CNovo, CJunction, PEAKS, and Novor. The built-in CNovo in SpliceNovo can also be replaced with other, more accurate de novo sequencing algorithms to further improve its performance. Using SpliceNovo, we also identified and validated two novel proteoforms of the human EIF4G1 and ELAVL1 genes. Our results significantly improve the ability to discover novel proteoforms through de novo sequencing.


Subject(s)
Algorithms , Peptides , Humans , Peptides/genetics , Peptides/chemistry , Sequence Analysis , Exons , Introns , Sequence Analysis, Protein/methods
2.
J Proteome Res ; 20(12): 5294-5303, 2021 12 03.
Article in English | MEDLINE | ID: mdl-34420305

ABSTRACT

In eukaryotes, alternative pre-mRNA splicing allows a single gene to encode different protein isoforms that function in many biological processes and are used as biomarkers or therapeutic targets for diseases. Although protein isoforms in the human genome are well annotated, we speculate that some low-abundance protein isoforms may still be under-annotated, because most genes have a primary coding product and alternative protein isoforms tend to be expressed at low levels. A peptide coencoded by a novel exon and an annotated exon separated by an intron is known as a novel junction peptide. In the absence of known transcripts and homologous proteins, traditional proteogenomics based on whole-genome six-frame translation cannot identify novel junction peptides or capture novel alternative splice sites. In this article, we first propose a strategy and tool for identifying novel junction peptides, called CJunction. We then integrate it into a proteogenomics workflow specifically designed for novel protein isoform discovery and apply it to a deep-coverage HeLa mass spectrometry data set (ProteomeXchange identifier PXD004452). We identified and validated three novel protein isoforms of two functionally important genes, NHSL1 (the causative gene of Nance-Horan syndrome) and EEF1B2 (a translation elongation factor), which supports our hypothesis. Because of their novel N-termini, these protein isoforms differ significantly in sequence from the annotated gene-coding products, suggesting that they may have substantially different functions.


Subject(s)
Alternative Splicing , Guanine Nucleotide Exchange Factors/genetics , Peptide Elongation Factor 1/genetics , Proteins , Proteogenomics , Genome, Human , Guanine Nucleotide Exchange Factors/metabolism , Humans , Mass Spectrometry , Peptide Elongation Factor 1/metabolism , Peptides/chemistry , Protein Isoforms/genetics , Protein Isoforms/metabolism , Proteins/genetics , Proteins/metabolism , Proteogenomics/methods
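
Illustrative note on record 2: the abstract states that six-frame translation of genomic DNA cannot recover a peptide that spans an exon-exon junction. The following is a minimal Python sketch of why, not the CJunction method. The sequences `exon1`, `intron`, and `exon2` and the function names are hypothetical toy values chosen for illustration; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 0, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translate(dna):
    """Translate all three frames of both strands ('*' marks a stop codon)."""
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            frames.append(translate(strand[offset:]))
    return frames


# Hypothetical toy gene: two exons separated by a short GT...AG intron.
exon1 = "ATGGCTAGCAAGCTG"      # encodes M A S K L
intron = "GTAAGTTTTTTTCAG"     # removed by splicing
exon2 = "GAAGATCCAAGATGG"      # encodes E D P R W

genomic = exon1 + intron + exon2
spliced = exon1 + exon2

# "LEDPR" takes L from exon1 and EDPR from exon2, so it spans the junction.
junction_peptide = "LEDPR"

in_spliced = junction_peptide in translate(spliced)
in_genomic = any(junction_peptide in f for f in six_frame_translate(genomic))
print("found in spliced transcript:", in_spliced)
print("found in six-frame genomic translation:", in_genomic)
```

In this toy case the junction peptide appears only after the intron is removed, which is why a search database built purely from six-frame genomic translation has no entry that such a peptide's spectrum could match.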
3.
J Proteome Res ; 17(7): 2335-2344, 2018 07 06.
Article in English | MEDLINE | ID: mdl-29897761

ABSTRACT

Microproteins are peptides composed of 100 amino acids (AA) or fewer, encoded by small open reading frames (smORFs). It has been demonstrated that microproteins participate in and regulate a wide range of functions in cells. However, the annotation and identification of microproteins is challenging, in part owing to their low molecular weight, low abundance, and hydrophobicity. These factors have led to smORFs being left unannotated during genome annotation and have made their identification at the protein level difficult. Large-scale enrichment of microproteins in proteogenomics has made it possible to efficiently identify microproteins and discover unannotated smORFs in Saccharomyces cerevisiae. We integrated four microprotein-specific enrichment strategies to enhance coverage. We identified 117 microproteins, verified 31 missing proteins (MPs), and discovered 3 novel smORFs. In total, 31 proteins were confirmed as MPs by spectrum quality checking. Three novel smORFs (YKL104W-A, YHR052C-B, and YHR054C-B) were retained after spectrum quality checking, peptide synthesis, homologue matching, and so on. This study not only demonstrates that there are potential smORF candidates still to be annotated in an extensively studied organism but also presents an efficient strategy for the discovery of small MPs. All MS data sets have been deposited in ProteomeXchange with identifier PXD008586.


Subject(s)
Open Reading Frames/genetics , Proteogenomics/methods , Saccharomyces cerevisiae Proteins/analysis , Saccharomyces cerevisiae/genetics , Datasets as Topic , Mass Spectrometry/methods , Molecular Weight , Saccharomyces cerevisiae/chemistry
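
Illustrative note on record 3: the abstract defines microproteins as products of 100 amino acids or fewer encoded by smORFs. A minimal Python sketch of that length criterion is given below. It is not the study's pipeline: the function name `find_smorfs`, the `min_aa` cutoff, the restriction to ATG starts on the forward strand, and the toy sequences are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(dna, max_aa=100, min_aa=2):
    """Return (start, end, n_aa) for forward-strand ATG..stop ORFs that
    encode between min_aa and max_aa residues (stop codon not counted).

    Assumptions of this sketch: only ATG is treated as a start codon, only
    the forward strand is scanned, and the first ATG after the previous stop
    is used as the start of each ORF.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3        # residues before the stop
                if min_aa <= n_aa <= max_aa:
                    hits.append((start, i + 3, n_aa))
                start = None                   # close the ORF
    return hits


# Hypothetical toy sequences.
short_orf = "CCATGAAATTTGGGTAACC"              # ATG AAA TTT GGG TAA, 4 residues
long_orf = "ATG" + "GCT" * 150 + "TAA"         # 151 residues, above the cutoff

print(find_smorfs(short_orf))
print(find_smorfs(long_orf))
```

The second call returns no hits because the encoded product exceeds 100 residues. A real scan would also cover the reverse strand and would need a principled lower length bound.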
4.
J Proteome Res ; 17(12): 4178-4185, 2018 12 07.
Article in English | MEDLINE | ID: mdl-30277781

ABSTRACT

In 2012, the Chromosome-centric Human Proteome Project (C-HPP) launched an investigation of missing proteins (MPs) to complete the Human Proteome Project (HPP). The majority of the MPs were distributed in low-molecular-weight (LMW) ranges, especially from 0 to 40 kDa. LMW protein identification is challenging owing to the proteins' short length, low abundance, and hydrophobicity. Furthermore, many sequences from trypsin digestion are unlikely to yield detectable peptides or an MS2 spectrum of reasonable quality. Therefore, we focused on small MPs by combining LMW protein enrichment with a complementary protease-pair strategy, using trypsin and LysargiNase, for human testis samples. In-depth testis LMW protein profiling resulted in the identification of 4063 proteins, of which 2565 were LMW proteins and 1130 had pairs of peptides generated from both trypsin and LysargiNase. This provided additional mass spectral evidence for further verification of small MPs. Finally, two MPs were verified from the seven MP candidates. One of them, Q8N688, was verified with two series of continuous and complementary b/y-product ions from the pairs of spectra for tryptic and LysargiNase-digested peptides after "mirror spectrum" matching. This enabled confident identification of the representative peptides for the target MPs. In contrast, the two verified peptides for Q86WR6 were identified with the same strategy from the gel-separation and gel-elution samples, respectively. Although the other five MP candidates showed high-quality spectra, they could not be sufficiently distinguished as PE1s and require further verification. All MS data sets have been deposited in ProteomeXchange with identifier PXD010093.


Subject(s)
Peptides/analysis , Testis/chemistry , Humans , Male , Mass Spectrometry/methods , Molecular Weight , Peptide Hydrolases/metabolism
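
Illustrative note on record 4: the abstract relies on pairs of peptides produced by trypsin and LysargiNase. Trypsin cleaves on the C-terminal side of lysine (K) and arginine (R), whereas LysargiNase cleaves on the N-terminal side, so the two digests of one protein share peptide cores with the basic residue at opposite ends. The Python sketch below shows only this in silico pairing. It is not the paper's "mirror spectrum" matching, which compares MS2 spectra. The protein sequence and function names are hypothetical, and missed cleavages and the trypsin proline rule are ignored.

```python
def trypsin_digest(protein):
    """Cleave after every K or R (simplified: no proline rule)."""
    peptides, current = [], ""
    for aa in protein:
        current += aa
        if aa in "KR":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides


def lysarginase_digest(protein):
    """Cleave before every K or R."""
    peptides, current = [], ""
    for aa in protein:
        if aa in "KR" and current:
            peptides.append(current)
            current = ""
        current += aa
    if current:
        peptides.append(current)
    return peptides


def mirror_pairs(protein):
    """Pair tryptic peptides (core + K/R) with LysargiNase peptides
    (K/R + core) that share the same non-empty core sequence."""
    lys_by_core = {
        p[1:]: p for p in lysarginase_digest(protein)
        if p[0] in "KR" and len(p) > 1
    }
    pairs = []
    for t in trypsin_digest(protein):
        core = t[:-1]
        if t[-1] in "KR" and core in lys_by_core:
            pairs.append((t, lys_by_core[core]))
    return pairs


toy_protein = "MASKLEDPRWGK"   # hypothetical sequence
print(trypsin_digest(toy_protein))
print(lysarginase_digest(toy_protein))
print(mirror_pairs(toy_protein))
```

Each pair covers the same core residues with the basic residue moved from one terminus to the other. That shared core is what lets the two spectra of a pair serve as complementary evidence for a single sequence.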
5.
Sheng Wu Gong Cheng Xue Bao ; 34(11): 1860-1869, 2018 Nov 25.
Article in Chinese | MEDLINE | ID: mdl-30499281

ABSTRACT

Small proteins (SPs) are defined as peptides of 100 amino acids or fewer encoded by short open reading frames (sORFs). SPs participate in a wide range of cellular functions, including gene regulation, cell signaling, and metabolism. However, most annotated SPs across living organisms currently lack expression evidence at the protein level and are regarded as missing proteins (MPs). Highly efficient SP identification is a prerequisite for their functional study and contributes to the search for MPs. In this study, we identified 72 SPs and successfully validated 9 MPs from Saccharomyces cerevisiae using an SP enrichment strategy. In-depth analysis showed that the factors causing these proteins to be missed were low molecular weight, low abundance, hydrophobicity, lower codon usage bias, and instability. SP-based enrichment can be used as an MP search strategy, which may provide a foundation for further research into their function.


Subject(s)
Saccharomyces cerevisiae Proteins/analysis , Saccharomyces cerevisiae , Codon , Open Reading Frames , Peptides