Results 1 - 9 of 9
1.
Cell ; 166(3): 755-765, 2016 Jul 28.
Article in English | MEDLINE | ID: mdl-27372738

ABSTRACT

To provide a detailed analysis of the molecular components and underlying mechanisms associated with ovarian cancer, we performed a comprehensive mass-spectrometry-based proteomic characterization of 174 ovarian tumors previously analyzed by The Cancer Genome Atlas (TCGA), of which 169 were high-grade serous carcinomas (HGSCs). Integrating our proteomic measurements with the genomic data yielded a number of insights into the disease, such as how different copy-number alterations influence the proteome, the proteins associated with chromosomal instability, the sets of signaling pathways that diverse genome rearrangements converge on, and the ones most associated with short overall survival. Specific protein acetylations associated with homologous recombination deficiency suggest a potential means for stratifying patients for therapy. In addition to providing a valuable resource, these findings provide a view of how the somatic genome drives the cancer proteome and associations between protein and post-translational modification levels and clinical outcomes in HGSC.


Subjects
Neoplasm Proteins/genetics ; Neoplasms, Cystic, Mucinous, and Serous/genetics ; Ovarian Neoplasms/genetics ; Proteome ; Acetylation ; Chromosomal Instability ; DNA Repair ; DNA, Neoplasm ; Female ; Gene Dosage ; Humans ; Mass Spectrometry ; Phosphoproteins/genetics ; Protein Processing, Post-Translational ; Survival Analysis
2.
J Proteome Res ; 18(6): 2433-2445, 2019 Jun 07.
Article in English | MEDLINE | ID: mdl-31020842

ABSTRACT

A high-quality genome annotation greatly facilitates successful cell line engineering. Standard draft genome annotation pipelines are based largely on de novo gene prediction, homology, and RNA-Seq data. However, draft annotations can suffer from incorrect predictions of translated sequence, inaccurate splice isoforms, and missing genes. Here, we generated a draft annotation for the newly assembled Chinese hamster genome and used RNA-Seq, proteomics, and Ribo-Seq to experimentally annotate the genome. We identified 3529 new proteins compared to the hamster RefSeq protein annotation and 2256 novel translational events (e.g., alternative splices, mutations, and novel splices). Finally, we used this pipeline to identify the source of translated retroviruses contaminating recombinant products from Chinese hamster ovary (CHO) cell lines, including 119 type-C retroviruses, thus enabling future efforts to eliminate retroviruses to reduce the costs incurred with retroviral particle clearance. In summary, the improved annotation provides a more accurate resource for CHO cell line engineering by facilitating the interpretation of omics data, the definition of cellular pathways, and the engineering of complex phenotypes.


Subjects
Cricetulus/genetics ; Genome/genetics ; Proteogenomics ; Proteomics/methods ; Animals ; CHO Cells ; Cricetinae ; Molecular Sequence Annotation/methods ; RNA-Seq/methods ; Sequence Analysis, RNA/methods
3.
Mol Cell Proteomics ; 16(12): 2111-2124, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29046389

ABSTRACT

Immunotherapy is becoming increasingly important in the fight against cancers, using and manipulating the body's immune response to treat tumors. Understanding the immune repertoire (the collection of immunological proteins) of treated and untreated cells is possible at the genomic level but technically difficult at the protein level. Standard protein databases do not include the highly divergent sequences of somatically rearranged immunoglobulin genes, which may lead to missed identifications in a mass spectrometry search. We introduce a novel proteogenomic approach, AbScan, to identify these highly variable antibody peptides, by developing a customized antibody database construction method using RNA-seq reads aligned to immunoglobulin (Ig) genes. AbScan starts by filtering transcript (RNA-seq) reads that match the template for Ig genes. The retained reads are used to construct a repertoire graph using the "split" de Bruijn graph: a graph structure that improves on the standard de Bruijn graph to capture the high diversity of Ig genes in a compact manner. AbScan corrects for sequencing errors and converts the graph to a format suitable for searching with MS/MS search tools. We used AbScan to create an antibody database from 90 RNA-seq colorectal tumor samples. Next, we used proteogenomic analysis to search MS/MS spectra of matched colorectal samples from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) against the AbScan-generated database. AbScan identified 1,940 distinct antibody peptides. Correlating with previously identified Single Amino-Acid Variants (SAAVs) in the tumor samples, we identified 163 pairs (antibody peptide, SAAV) with a significant co-occurrence pattern in the 90 samples. The presence of coexpressed antibody and mutated peptides was correlated with survival time of the individuals. Our results suggest that AbScan (https://github.com/csw407/AbScan.git) is an effective tool for a proteomic exploration of the immune response in cancers.


Subjects
Colorectal Neoplasms/immunology ; Genomics/methods ; Immunoglobulins/chemistry ; Peptides/genetics ; Proteomics/methods ; Algorithms ; Cell Line, Tumor ; Colorectal Neoplasms/genetics ; Databases, Genetic ; Databases, Protein ; Humans ; Immunoglobulins/genetics ; Peptides/chemistry ; Sequence Analysis, RNA ; Tandem Mass Spectrometry
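
The graph construction step in the abstract above can be illustrated with a standard de Bruijn graph. This is only a minimal sketch: it is not AbScan's "split" de Bruijn graph, and the function name `build_de_bruijn`, the toy reads, and the choice of k are assumptions made for illustration.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Standard de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Hypothetical helper for illustration; AbScan's "split" variant differs.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix; the count is the
            # number of times that k-mer was seen across all reads.
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph

# Invented reads that share a prefix and diverge at one base.
reads = ["ACGTAC", "ACGTTC"]
graph = build_de_bruijn(reads, k=4)
branch = dict(graph["CGT"])  # successors of the node where the reads diverge
```

In this toy case the node `CGT` has two outgoing edges, which is how sequence diversity shows up as branching in such a graph.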
4.
J Proteome Res ; 14(9): 3555-67, 2015 Sep 04.
Article in English | MEDLINE | ID: mdl-26139413

ABSTRACT

Aiming toward an improved understanding of the regulation of proteins in cancer, recent studies from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) have focused on analyzing cancer tissue using proteomic technologies and workflows. Although many proteogenomics approaches for the study of cancer samples have been proposed, serious methodological challenges remain, especially in the identification of multiple mutational variants or structural variations such as fusion gene events. In addition, although immune system genes play an important role in cancer, identification of IgG peptides remains challenging in proteomic data sets. Here, we describe an integrative proteogenomic method that extends the limit of proteogenomic searches to identify multiple variant peptides as well as immunoglobulin gene variations/rearrangements using customized mining of RNA-seq data. Our results also provide the first extensive characterization of tumor immune response and demonstrate the potential of this method to improve the molecular characterization of tumor subtypes.


Subjects
Genomics ; Immunoglobulins/chemistry ; Mutation ; Peptides/genetics ; Proteomics ; Alternative Splicing ; Amino Acid Sequence ; Databases, Protein ; Humans ; Molecular Sequence Data ; Peptides/chemistry ; Tandem Mass Spectrometry
5.
Proteomics ; 14(23-24): 2719-30, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25263569

ABSTRACT

Cancer is driven by the acquisition of somatic DNA lesions. Distinguishing the early driver mutations from subsequent passenger mutations is key to molecular subtyping of cancers, understanding cancer progression, and the discovery of novel biomarkers. The advances of genomics technologies (whole-genome, exome, and transcript sequencing, collectively referred to as NGS (next-generation sequencing)) have fueled recent studies on somatic mutation discovery. However, the vision is challenged by the complexity, redundancy, and errors in genomic data, and the difficulty of investigating the proteome translated portion of aberrant genes using only genomic approaches. Combinations of proteomic and genomic technologies are increasingly being employed. Various strategies have been employed to allow the usage of large-scale NGS data for conventional MS/MS searches. This paper provides a discussion of applying different strategies relating to large database search and FDR (false discovery rate)-based error control, and their implications for cancer proteogenomics. Moreover, it extends and develops the idea of a unified genomic variant database that can be searched by any MS sample. A total of 879 BAM files downloaded from the TCGA repository were used to create a 4.34 GB unified FASTA database that contained 2,787,062 novel splice junctions, 38,464 deletions, 1,105 insertions, and 182,302 substitutions. Proteomic data from a single ovarian carcinoma sample (439,858 spectra) were searched against the database. By applying the most conservative FDR measure, we have identified 524 novel peptides and 65,578 known peptides at a 1% FDR threshold. The novel peptides include interesting examples of doubly mutated peptides, frame-shifts, and nonsample-recruited mutations, which emphasize the strength of our approach.


Subjects
High-Throughput Nucleotide Sequencing/methods ; Neoplasms/metabolism ; Proteomics/methods ; Databases, Protein ; Humans ; Neoplasms/genetics ; Peptides/genetics
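
The abstract above reports identifications at a 1% FDR threshold but does not say which estimator was used. As a hedged illustration, the sketch below assumes a simple target-decoy scheme in which FDR is estimated as decoys divided by targets among matches at or above a score cutoff. The function name `fdr_threshold` and the scores are invented for the example.

```python
def fdr_threshold(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: list of (score, is_decoy) pairs; a higher score is a better match.
    FDR is estimated as decoys / targets among PSMs at or above the cutoff
    (an assumed estimator, not necessarily the one used in the paper).
    Returns None if no cutoff satisfies the bound.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep lowering the cutoff while the running estimate stays in bounds.
        if targets and decoys / targets <= max_fdr:
            best = score
    return best

# Invented peptide-spectrum match scores; True marks a decoy hit.
psms = [(10, False), (9, False), (8, True), (7, False)]
strict_cutoff = fdr_threshold(psms, max_fdr=0.01)
loose_cutoff = fdr_threshold(psms, max_fdr=0.5)
```

A larger search database generally produces more high-scoring decoy hits, which pushes the cutoff upward. That is one reason the choice of error control matters for large proteogenomic searches.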
6.
J Proteome Res ; 13(1): 21-8, 2014 Jan 03.
Article in English | MEDLINE | ID: mdl-23802565

ABSTRACT

The advent of inexpensive RNA-seq technologies and other deep sequencing technologies for RNA has the promise to radically improve genomic annotation, providing information on transcribed regions and splicing events in a variety of cellular conditions. Using MS-based proteogenomics, many of these events can be confirmed directly at the protein level. However, the integration of large amounts of redundant RNA-seq data and mass spectrometry data poses a challenging problem. Our paper addresses this by construction of a compact database that contains all useful information expressed in RNA-seq reads. Applying our method to cumulative C. elegans data reduced 496.2 GB of aligned RNA-seq SAM files to 410 MB of splice graph database written in FASTA format. This corresponds to 1000× compression of data size, without loss of sensitivity. We performed a proteogenomics study using the custom data set, using a completely automated pipeline, and identified a total of 4044 novel events, including 215 novel genes, 808 novel exons, 12 alternative splicings, 618 gene-boundary corrections, 245 exon-boundary changes, 938 frame shifts, 1166 reverse strands, and 42 translated UTRs. Our results highlight the usefulness of transcript + proteomic integration for improved genome annotations.


Subjects
Caenorhabditis elegans/metabolism ; Databases, Genetic ; Databases, Protein ; Genome ; Proteome ; Sequence Analysis, RNA ; Amino Acid Sequence ; Animals ; Automation ; Caenorhabditis elegans/genetics ; Helminth Proteins/chemistry ; Helminth Proteins/genetics ; Helminth Proteins/metabolism ; Molecular Sequence Data
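
The figures quoted in the abstract above can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 GB = 1000 MB), which the abstract does not specify, and the dictionary keys are just labels chosen here.

```python
# Sizes quoted in the abstract.
sam_gb = 496.2   # aligned RNA-seq SAM files
fasta_mb = 410   # splice graph database in FASTA format

# Assumption: decimal units, 1 GB = 1000 MB.
ratio = sam_gb * 1000 / fasta_mb

# Novel event counts listed in the abstract.
events = {
    "novel genes": 215,
    "novel exons": 808,
    "alternative splicings": 12,
    "gene-boundary corrections": 618,
    "exon-boundary changes": 245,
    "frame shifts": 938,
    "reverse strands": 1166,
    "translated UTRs": 42,
}
total = sum(events.values())
```

Under that unit assumption the ratio comes out slightly above the rounded 1000× stated in the abstract, and the itemized events add up to the reported total of 4044.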
7.
Genome Biol ; 25(1): 163, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38902799

ABSTRACT

BACKGROUND: Copy number variation (CNV) is a key genetic characteristic for cancer diagnostics and can be used as a biomarker for the selection of therapeutic treatments. Using data sets established in our previous study, we benchmark the performance of cancer CNV calling by six recent and commonly used software tools on their detection accuracy, sensitivity, and reproducibility. In comparison to other orthogonal methods, such as microarray and Bionano, we also explore the consistency of CNV calling across different technologies on a challenging genome. RESULTS: While consistent results are observed for copy gain, loss, and loss of heterozygosity (LOH) calls across sequencing centers, CNV callers, and different technologies, variation in CNV calls is mostly affected by the determination of genome ploidy. Using consensus results from six CNV callers and confirmation from three orthogonal methods, we establish a high-confidence CNV call set for the reference cancer cell line (HCC1395). CONCLUSIONS: NGS technologies and current bioinformatics tools can offer reliable results for detection of copy gain, loss, and LOH. However, when working with a hyper-diploid genome, some software tools can call excessive copy gain or loss due to inaccurate assessment of genome ploidy. With performance matrices on various experimental conditions, this study raises awareness within the cancer research community for the selection of sequencing platforms, sample preparation, sequencing coverage, and the choice of CNV detection tools.


Subjects
Computational Biology ; DNA Copy Number Variations ; High-Throughput Nucleotide Sequencing ; Loss of Heterozygosity ; Neoplasms ; Software ; Humans ; High-Throughput Nucleotide Sequencing/methods ; Neoplasms/genetics ; Computational Biology/methods ; Diploidy ; Genome, Human ; Cell Line, Tumor ; Reproducibility of Results ; Sequence Analysis, DNA/methods
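
To make the ploidy point in the abstract above concrete, the sketch below labels a segment as gain, loss, or neutral relative to an assumed genome ploidy, then takes a simple majority vote across callers. The rounding rule, the vote threshold, and the names `classify` and `consensus` are illustrative assumptions, not the study's procedure.

```python
from collections import Counter

def classify(copy_number, ploidy):
    """Label a segment relative to an assumed genome ploidy (rounded)."""
    baseline = round(ploidy)
    if copy_number > baseline:
        return "gain"
    if copy_number < baseline:
        return "loss"
    return "neutral"

def consensus(calls, min_support):
    """Return the most common label if at least min_support callers agree."""
    label, count = Counter(calls).most_common(1)[0]
    return label if count >= min_support else None

# The same 3-copy segment under three different ploidy estimates.
as_diploid = classify(3, ploidy=2.0)
as_near_triploid = classify(3, ploidy=2.9)
as_near_tetraploid = classify(3, ploidy=3.6)

# Invented calls from six hypothetical callers for one segment.
calls = ["gain", "gain", "gain", "gain", "neutral", "loss"]
agreed = consensus(calls, min_support=4)
```

The same segment flips between gain, neutral, and loss depending only on the ploidy estimate, which is the failure mode the abstract describes for hyper-diploid genomes.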
8.
Oncoimmunology ; 11(1): 2052410, 2022.
Article in English | MEDLINE | ID: mdl-35371621

ABSTRACT

Major immunotherapy challenges include a limited number of predictive biomarkers and the unusual imaging features post-therapy, such as pseudo-progression, which denote immune infiltrate-mediated tumor enlargement. Such phenomena confound clinical decision-making, since the cancer may eventually regress, and the patient should stay on treatment. We prospectively evaluated serial, blood-derived cell-free DNA (cfDNA) (baseline and 2-3 weeks post-immune checkpoint inhibitors [ICIs]) for variant allele frequency (VAF) and blood tumor mutation burden (bTMB) changes (next-generation sequencing) (N = 84 evaluable patients, diverse cancers). Low vs. high cfDNA-derived average adjusted ΔVAF (calculated by a machine-learning model) was an independent predictor of higher clinical benefit rate (stable disease ≥6 months/complete/partial response) (69.2% vs. 22.5%), and longer median progression-free (10.1 vs. 2.25 months) and overall survival (not reached vs. 6.1 months) (all P < .001, multivariate). bTMB changes did not correlate with outcomes. Therefore, early dynamic changes in cfDNA-derived VAF were a powerful predictor of pan-cancer immunotherapy outcomes.


Subjects
Immune Checkpoint Inhibitors ; Neoplasms ; Gene Frequency ; Humans ; Immune Checkpoint Inhibitors/pharmacology ; Immune Checkpoint Inhibitors/therapeutic use ; Liquid Biopsy ; Mutation ; Neoplasms/drug therapy ; Neoplasms/genetics ; Neoplasms/pathology
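
Variant allele frequency is the fraction of sequencing reads at a site that carry the variant. The sketch below computes it, along with a plain, unadjusted mean change between two time points. It does not reproduce the study's machine-learning-adjusted ΔVAF; the function names, variant identifiers, and read counts are invented.

```python
def vaf(alt_reads, total_reads):
    """Variant allele frequency: fraction of reads carrying the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

def mean_delta_vaf(baseline, followup):
    """Unadjusted mean VAF change over variants seen at both time points.

    baseline, followup: dicts mapping a variant id to (alt_reads, total_reads).
    """
    shared = baseline.keys() & followup.keys()
    if not shared:
        raise ValueError("no shared variants")
    deltas = [vaf(*followup[v]) - vaf(*baseline[v]) for v in shared]
    return sum(deltas) / len(deltas)

# Invented read counts for two hypothetical variants.
baseline = {"variant_a": (20, 100), "variant_b": (10, 100)}
followup = {"variant_a": (10, 100), "variant_b": (5, 100)}
change = mean_delta_vaf(baseline, followup)
```

A negative value means the variant fraction fell between the baseline and follow-up draws.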
9.
Cell Syst ; 7(4): 412-421.e5, 2018 Oct 24.
Article in English | MEDLINE | ID: mdl-30172843

ABSTRACT

The increasing throughput and sharing of proteomics mass spectrometry data have now yielded over one-third of a million public mass spectrometry runs. However, these discoveries are not continuously aggregated in an open and error-controlled manner, which limits their utility. To facilitate the reusability of these data, we built the MassIVE Knowledge Base (MassIVE-KB), a community-wide, continuously updating knowledge base that aggregates proteomics mass spectrometry discoveries into an open reusable format with full provenance information for community scrutiny. Reusing >31 TB of public human data stored in a mass spectrometry interactive virtual environment (MassIVE), the MassIVE-KB contains >2.1 million precursors from 19,610 proteins (48% larger than before; 97% of the total) and doubles proteome coverage to 6 million amino acids (54% of the proteome) with strict library-scale false discovery controls, thereby providing evidence for 430 proteins for which sufficient protein-level evidence was previously missing. Furthermore, MassIVE-KB can inform experimental design, helps identify and quantify new data, and provides tools for community construction of specialized spectral libraries.


Subjects
Mass Spectrometry/methods ; Proteome/chemistry ; Proteomics/methods ; Algorithms ; Biological Variation, Population ; Databases, Protein ; Humans ; Proteome/genetics