1.
Nat Commun ; 13(1): 744, 2022 02 08.
Article in English | MEDLINE | ID: mdl-35136070

ABSTRACT

The integration of genomics and proteomics data (proteogenomics) holds the promise of furthering the in-depth understanding of human disease. However, sample mix-up is a pervasive problem in proteogenomics because of the complexity of sample processing. Here, we present a pipeline for Sample Matching in Proteogenomics (SMAP) to verify sample identity and ensure data integrity. SMAP infers sample-dependent protein-coding variants from quantitative mass spectrometry (MS), and aligns the MS-based proteomic samples with genomic samples by two discriminant scores. Theoretical analysis with simulated data indicates that SMAP is capable of uniquely matching proteomic and genomic samples when ≥20% genotypes of individual samples are available. When SMAP was applied to a large-scale dataset generated by the PsychENCODE BrainGVEX project, 54 samples (19%) were corrected. The correction was further confirmed by ribosome profiling and chromatin sequencing (ATAC-seq) data from the same set of samples. Our results demonstrate that SMAP is an effective tool for sample verification in a large-scale MS-based proteogenomics study. SMAP is publicly available at https://github.com/UND-Wanglab/SMAP, and a web-based version can be accessed at https://smap.shinyapps.io/smap/.
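The matching idea in this abstract can be illustrated with a minimal sketch. It is not SMAP's implementation: the function names, the genotype encoding (allele dosage 0/1/2, `None` for an unobserved variant), and the two scores used here (a concordance fraction and its margin over the runner-up) are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple

# Hypothetical encoding: variant ID -> allele dosage (0, 1, 2),
# or None when the variant could not be inferred for this sample.
Genotypes = Dict[str, Optional[int]]


def concordance(a: Genotypes, b: Genotypes) -> float:
    """Fraction of variants observed in both samples whose genotypes agree."""
    shared = [v for v in a if a[v] is not None and b.get(v) is not None]
    if not shared:
        return 0.0
    return sum(a[v] == b[v] for v in shared) / len(shared)


def match_sample(
    proteomic: Genotypes, genomic_panel: Dict[str, Genotypes]
) -> Tuple[str, float, float]:
    """Return (best genomic sample, its concordance, margin over the runner-up)."""
    scores = sorted(
        ((concordance(proteomic, g), name) for name, g in genomic_panel.items()),
        reverse=True,
    )
    best_score, best_name = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_name, best_score, best_score - runner_up


# Toy data (invented for this example).
panel = {
    "G1": {"v1": 0, "v2": 1, "v3": 2, "v4": 1},
    "G2": {"v1": 1, "v2": 1, "v3": 0, "v4": 2},
}
prot = {"v1": 0, "v2": 1, "v3": 2, "v4": None}
best_name, best_score, margin = match_sample(prot, panel)
```

A large margin suggests an unambiguous match; a proteomic sample whose best match differs from its labeled genomic sample would be a candidate mix-up.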


Subjects
Datasets as Topic, Proteogenomics/methods, Chromatin Immunoprecipitation Sequencing, Data Analysis, Female, Humans, Male, Mass Spectrometry/methods, Mass Spectrometry/statistics & numerical data, Proteogenomics/statistics & numerical data, RNA-Seq, Software, Whole Genome Sequencing
3.
J Proteome Res ; 17(11): 3681-3692, 2018 11 02.
Article in English | MEDLINE | ID: mdl-30295032

ABSTRACT

Modern mass spectrometry now permits genome-scale and quantitative measurements of biological proteomes. However, analysis of specific specimens is currently hindered by the incomplete representation of biological variability of protein sequences in canonical reference proteomes and the technical demands for their construction. Here, we report ProteomeGenerator, a framework for de novo and reference-assisted proteogenomic database construction and analysis based on sample-specific transcriptome sequencing and high-accuracy mass spectrometry proteomics. This enables the assembly of proteomes encoded by actively transcribed genes, including sample-specific protein isoforms resulting from non-canonical mRNA transcription, splicing, or editing. To improve the accuracy of protein isoform identification in non-canonical proteomes, ProteomeGenerator relies on statistical target-decoy database matching calibrated using sample-specific controls. Its current implementation includes automatic integration with MaxQuant mass spectrometry proteomics algorithms. We applied this method for the proteogenomic analysis of splicing factor SRSF2 mutant leukemia cells, demonstrating high-confidence identification of non-canonical protein isoforms arising from alternative transcriptional start sites, intron retention, and cryptic exon splicing as well as improved accuracy of genome-scale proteome discovery. Additionally, we report proteogenomic performance metrics for current state-of-the-art implementations of SEQUEST HT, MaxQuant, Byonic, and PEAKS mass spectral analysis algorithms. Finally, ProteomeGenerator is implemented as a Snakemake workflow within a Singularity container for one-step installation in diverse computing environments, thereby enabling open, scalable, and facile discovery of sample-specific, non-canonical, and neomorphic biological proteomes.
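The target-decoy matching mentioned in this abstract can be sketched generically. The code below shows a common textbook form of target-decoy false discovery rate estimation (decoys divided by targets above a score threshold, converted to q-values). It is an assumption-laden illustration and does not reproduce ProteomeGenerator's sample-specific calibration; the function name and input layout are hypothetical.

```python
from typing import List, Tuple


def target_decoy_qvalues(
    psms: List[Tuple[float, bool]]
) -> List[Tuple[float, bool, float]]:
    """Take (score, is_decoy) pairs; return (score, is_decoy, q_value)
    sorted by descending score."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Running FDR estimate at each rank: decoys so far / targets so far.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this rank or any lower-scoring rank.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    qvals.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qvals)]


# Toy peptide-spectrum matches (invented for this example).
toy = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (5.0, True)]
result = target_decoy_qvalues(toy)
accepted = [s for s, d, q in result if not d and q <= 0.01]
```

Here `accepted` holds the target matches that pass a 1% q-value cutoff in the toy data.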


Subjects
Algorithms, Peptides/chemistry, Proteomics/methods, RNA, Messenger/genetics, Software, Transcriptome, Alternative Splicing, Amino Acid Sequence, Cell Line, Tumor, Humans, Leukocytes/metabolism, Leukocytes/pathology, Mass Spectrometry/statistics & numerical data, Molecular Sequence Annotation, Mutation, Peptide Mapping/statistics & numerical data, Peptides/classification, Peptides/isolation & purification, Proteogenomics/methods, Proteogenomics/statistics & numerical data, Proteome, RNA, Messenger/metabolism, Serine-Arginine Splicing Factors/genetics, Serine-Arginine Splicing Factors/metabolism
4.
J Proteome Res ; 16(7): 2639-2644, 2017 07 07.
Article in English | MEDLINE | ID: mdl-28573858

ABSTRACT

The introduction of new standard formats, proBAM and proBed, improves the integration of genomics and proteomics information, thus aiding proteogenomics applications. These novel formats enable peptide spectrum matches (PSM) to be stored, inspected, and analyzed within the context of the genome. However, an easy-to-use and transparent tool to convert mass spectrometry identification files to these new formats is indispensable. ProBAMconvert enables the conversion of common identification file formats (mzIdentML, mzTab, and pepXML) to proBAM/proBed using an intuitive interface. Furthermore, ProBAMconvert enables information to be output at both the PSM and peptide levels and has a command line interface in addition to the graphical user interface. Detailed documentation and a completely worked-out tutorial are available at http://probam.biobix.be.


Subjects
Computational Biology/methods, Genome, Peptides/analysis, Proteogenomics/statistics & numerical data, User-Computer Interface, Algorithms, Animals, Chromosome Mapping/statistics & numerical data, Humans, Information Storage and Retrieval, Proteogenomics/methods