Results 1 - 20 of 53
1.
Commun Biol ; 5(1): 1367, 2022 12 13.
Article in English | MEDLINE | ID: mdl-36513728

ABSTRACT

Cancer cell lines have been widely used for decades to study biological processes driving cancer development, and to identify biomarkers of response to therapeutic agents. Advances in genomic sequencing have made possible large-scale genomic characterizations of collections of cancer cell lines and primary tumors, such as the Cancer Cell Line Encyclopedia (CCLE) and The Cancer Genome Atlas (TCGA). These studies allow for the first time a comprehensive evaluation of the comparability of cancer cell lines and primary tumors on the genomic and proteomic level. Here we employ bulk mRNA and micro-RNA sequencing data from thousands of samples in CCLE and TCGA, and proteomic data from partner studies in the MD Anderson Cell Line Project (MCLP) and The Cancer Proteome Atlas (TCPA), to characterize the extent to which cancer cell lines recapitulate tumors. We identify dysregulation of a long non-coding RNA and microRNA regulatory network in cancer cell lines, associated with differential expression between cell lines and primary tumors in four key cancer driver pathways: KRAS signaling, NFKB signaling, IL2/STAT5 signaling and TP53 signaling. Our results emphasize the necessity for careful interpretation of cancer cell line experiments, particularly with respect to therapeutic treatments targeting these important cancer pathways.


Subjects
Neoplasms, Proteomics, Humans, Multiomics, Neoplasms/genetics, Neoplasms/metabolism, Machine Learning, Cell Line
2.
Clin Exp Metastasis ; 39(1): 85-99, 2022 02.
Article in English | MEDLINE | ID: mdl-33970362

ABSTRACT

Cancer heterogeneity is a result of genetic mutations within the cancer cells. Their proliferation is driven not only by autocrine functions but also by the cancer microenvironment, which consists of normal stromal cells such as infiltrating immune cells, cancer-associated fibroblasts, endothelial cells, and pericytes, as well as vascular and lymphatic channels. The relationship between cancer cells and the cancer microenvironment is critical, and we are only beginning to understand it at the molecular level. The cancer microenvironment may serve as a selective force that drives cancer cells to evolve into more aggressive clones able to invade the lymphatic or vascular channels and spread to regional lymph nodes and distant sites. It is important to understand these steps of cancer evolution toward invasion within the cancer microenvironment so that therapeutic strategies can be developed to control or stop these processes.


Subjects
Neoplasms, Tumor Microenvironment, Endothelial Cells, Genomics, Humans, Lymph Nodes/pathology, Neoplasms/blood supply, Tumor Microenvironment/genetics
3.
Genome Res ; 31(11): 2035-2049, 2021 11.
Article in English | MEDLINE | ID: mdl-34667117

ABSTRACT

Vocal learning, the ability to imitate sounds from conspecifics and the environment, is a key component of human spoken language and learned song in three independently evolved avian groups: oscine songbirds, parrots, and hummingbirds. Humans and each of these three bird clades exhibit specialized behavioral, neuroanatomical, and brain gene expression convergence related to vocal learning, speech, and song. To understand the evolutionary basis of vocal learning gene specializations and convergence, we searched for and identified accelerated genomic regions (ARs), a marker of positive selection, specific to vocal learning birds. We found avian vocal learner-specific ARs, and they were enriched in noncoding regions near genes with known speech functions or brain gene expression specializations in humans and vocal learning birds, including FOXP2, NEUROD6, ZEB2, and MEF2C, and near genes with major neurodevelopmental functions, including NR2F1, NRP2, and BCL11B. We also found enrichment near the SFARI class S genes associated with syndromic vocal communication forms of autism spectrum disorders. These findings reveal strong candidate noncoding regions near genes for the evolutionary adaptations that distinguish vocal learning species from their close vocal nonlearning relatives and provide further evidence of molecular convergence between birdsong and human spoken language.


Subjects
Songbirds, Speech, Animals, Brain/metabolism, Genomics, Humans, Learning, Repressor Proteins/metabolism, Songbirds/genetics, Tumor Suppressor Proteins/metabolism, Animal Vocalization
4.
Gigascience ; 10(3)2021 03 13.
Article in English | MEDLINE | ID: mdl-33712853

ABSTRACT

BACKGROUND: The reproducibility of gene expression measured by RNA sequencing (RNA-Seq) is dependent on the sequencing depth. While unmapped or non-exonic reads do not contribute to gene expression quantification, duplicate reads contribute to the quantification but are not informative for reproducibility. We show that mapped, exonic, non-duplicate (MEND) reads are a useful measure of reproducibility of RNA-Seq datasets used for gene expression analysis. FINDINGS: In bulk RNA-Seq datasets from 2,179 tumors in 48 cohorts, the fraction of reads that contribute to the reproducibility of gene expression analysis varies greatly. Unmapped reads constitute 1-77% of all reads (median [IQR], 3% [3-6%]); duplicate reads constitute 3-100% of mapped reads (median [IQR], 27% [13-43%]); and non-exonic reads constitute 4-97% of mapped, non-duplicate reads (median [IQR], 25% [16-37%]). MEND reads constitute 0-79% of total reads (median [IQR], 50% [30-61%]). CONCLUSIONS: Because not all reads in an RNA-Seq dataset are informative for reproducibility of gene expression measurements and the fraction of reads that are informative varies, we propose reporting a dataset's sequencing depth in MEND reads, which definitively inform the reproducibility of gene expression, rather than total, mapped, or exonic reads. We provide a Docker image containing (i) the existing required tools (RSeQC, sambamba, and samblaster) and (ii) a custom script to calculate MEND reads from RNA-Seq data files. We recommend that all RNA-Seq gene expression experiments, sensitivity studies, and depth recommendations use MEND units for sequencing depth.
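The nested read categories described in this abstract can be illustrated with a small calculation. The following is only a sketch of the arithmetic, not the authors' script or Docker image: it assumes the four read counts have already been obtained by other tools, and the names `ReadCounts` and `mend_summary` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ReadCounts:
    """Hypothetical container for counts produced upstream by other tools."""
    total: int                 # all sequenced reads
    mapped: int                # reads that mapped to the reference
    mapped_non_duplicate: int  # mapped reads left after duplicate removal
    mend: int                  # mapped, exonic, non-duplicate reads


def _ratio(part: int, whole: int) -> Optional[float]:
    # Return None rather than dividing by zero for empty categories.
    return part / whole if whole > 0 else None


def mend_summary(c: ReadCounts) -> Dict[str, Optional[float]]:
    """Compute the fractions reported in the abstract, each with its own denominator."""
    if not (0 <= c.mend <= c.mapped_non_duplicate <= c.mapped <= c.total):
        raise ValueError("counts must be nested: mend <= non-duplicate <= mapped <= total")
    return {
        "unmapped_of_total": _ratio(c.total - c.mapped, c.total),
        "duplicate_of_mapped": _ratio(c.mapped - c.mapped_non_duplicate, c.mapped),
        "non_exonic_of_mapped_non_duplicate": _ratio(
            c.mapped_non_duplicate - c.mend, c.mapped_non_duplicate
        ),
        "mend_of_total": _ratio(c.mend, c.total),
    }


# Illustrative numbers only (in millions of reads), not data from the study.
example = mend_summary(ReadCounts(total=100, mapped=97, mapped_non_duplicate=70, mend=50))
```

Note that each fraction uses a different denominator, mirroring the abstract: unmapped reads are a share of all reads, duplicates a share of mapped reads, and non-exonic reads a share of mapped, non-duplicate reads.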


Subjects
Neoplasms, RNA, Child, Gene Expression Profiling, High-Throughput Nucleotide Sequencing, Humans, Neoplasms/genetics, Reproducibility of Results, RNA Sequence Analysis, Exome Sequencing
5.
Gigascience ; 9(12)2020 12 15.
Article in English | MEDLINE | ID: mdl-33319914

ABSTRACT

BACKGROUND: Diffuse midline gliomas with histone H3 K27M (H3K27M) mutations occur in early childhood and are marked by an invasive phenotype and global decrease in H3K27me3, an epigenetic mark that regulates differentiation and development. H3K27M mutation timing and effect on early embryonic brain development are not fully characterized. RESULTS: We analyzed multiple publicly available RNA sequencing datasets to identify differentially expressed genes between H3K27M and non-K27M pediatric gliomas. We found that genes involved in the epithelial-mesenchymal transition (EMT) were significantly overrepresented among differentially expressed genes. Overall, the expression of pre-EMT genes was increased in the H3K27M tumors as compared to non-K27M tumors, while the expression of post-EMT genes was decreased. We hypothesized that H3K27M may contribute to gliomagenesis by stalling an EMT required for early brain development, and evaluated this hypothesis by using another publicly available dataset of single-cell and bulk RNA sequencing data from developing cerebral organoids. This analysis revealed similarities between H3K27M tumors and pre-EMT normal brain cells. Finally, a previously published single-cell RNA sequencing dataset of H3K27M and non-K27M gliomas revealed subgroups of cells at different stages of EMT. In particular, H3.1K27M tumors resemble a later EMT stage compared to H3.3K27M tumors. CONCLUSIONS: Our data analyses indicate that this mutation may be associated with a differentiation stall evident from the failure to proceed through the EMT-like developmental processes, and that H3K27M cells preferentially exist in a pre-EMT cell phenotype. This study demonstrates how novel biological insights could be derived from combined analysis of several previously published datasets, highlighting the importance of making genomic data available to the community in a timely manner.


Subjects
Glioma, Histones, Cell Differentiation/genetics, Child, Preschool Child, Epithelial-Mesenchymal Transition/genetics, Glioma/genetics, Histones/genetics, Humans, Mutation
6.
Front Immunol ; 11: 483296, 2020.
Article in English | MEDLINE | ID: mdl-33244314

ABSTRACT

Somatic mutations in cancers affecting protein coding genes can give rise to potentially therapeutic neoepitopes. These neoepitopes can guide Adoptive Cell Therapies and Peptide- and RNA-based Neoepitope Vaccines to selectively target tumor cells using autologous patient cytotoxic T-cells. Currently, researchers have to independently align their data, call somatic mutations and haplotype the patient's HLA to use existing neoepitope prediction tools. We present ProTECT, a fully automated, reproducible, scalable, and efficient end-to-end analysis pipeline to identify and rank therapeutically relevant tumor neoepitopes in terms of potential immunogenicity starting directly from raw patient sequencing data, or from pre-processed data. The ProTECT pipeline encompasses alignment, HLA haplotyping, mutation calling (single nucleotide variants, short insertions and deletions, and gene fusions), peptide:MHC binding prediction, and ranking of final candidates. We demonstrate the scalability, efficiency, and utility of ProTECT on 326 samples from the TCGA Prostate Adenocarcinoma cohort, identifying recurrent potential neoepitopes from TMPRSS2-ERG fusions, and from SNVs in SPOP. We also compare ProTECT with results from published tools. ProTECT can be run on a standalone computer, a local cluster, or on a compute cloud using a Mesos backend. ProTECT is highly scalable and can process TCGA data in under 30 min per sample (on average) when run in large batches. ProTECT is freely available at https://www.github.com/BD2KGenomics/protect.


Subjects
Neoplasm Antigens, T-Lymphocyte Epitopes, Immunotherapy, Neoplasms, Software, Cytotoxic T-Lymphocytes/immunology, Neoplasm Antigens/genetics, Neoplasm Antigens/immunology, T-Lymphocyte Epitopes/genetics, T-Lymphocyte Epitopes/immunology, Humans, Neoplasms/genetics, Neoplasms/immunology, Neoplasms/therapy, Predictive Value of Tests
7.
Nat Commun ; 11(1): 3400, 2020 07 07.
Article in English | MEDLINE | ID: mdl-32636365

ABSTRACT

The Pan-Cancer Analysis of Whole Genomes (PCAWG) project generated a vast amount of whole-genome cancer sequencing resource data. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we provide a user's guide to the five publicly available online data exploration and visualization tools introduced in the PCAWG marker paper. These tools are ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout. We detail use cases and analyses for each tool, show how they incorporate outside resources from the larger genomics ecosystem, and demonstrate how the tools can be used together to understand the biology of cancers more deeply. Together, the tools enable researchers to query the complex genomic PCAWG data dynamically and integrate external information, enabling and enhancing interpretation.


Subjects
Computational Biology/methods, Human Genome, Neoplasms/genetics, Chromothripsis, Data Analysis, Genetic Databases, Genomics, Humans, Internet, Mutation, Software, User-Computer Interface, Whole Genome Sequencing
9.
PLoS Comput Biol ; 16(4): e1007753, 2020 04.
Article in English | MEDLINE | ID: mdl-32275708

ABSTRACT

Precision oncology has primarily relied on coding mutations as biomarkers of response to therapies. While transcriptome analysis can provide valuable information, incorporation into workflows has been difficult. For example, the relative rather than absolute gene expression level needs to be considered, requiring differential expression analysis across samples. However, expression programs related to the cell-of-origin and tumor microenvironment effects confound the search for cancer-specific expression changes. To address these challenges, we developed an unsupervised clustering approach for discovering differential pathway expression within cancer cohorts using gene expression measurements. The hydra approach uses a Dirichlet process mixture model to automatically detect multimodally distributed genes and expression signatures without the need for matched normal tissue. We demonstrate that the hydra approach is more sensitive than widely-used gene set enrichment approaches for detecting multimodal expression signatures. Application of the hydra analysis framework to small blue round cell tumors (including rhabdomyosarcoma, synovial sarcoma, neuroblastoma, Ewing sarcoma, and osteosarcoma) identified expression signatures associated with changes in the tumor microenvironment. The hydra approach also identified an association between ATRX deletions and elevated immune marker expression in high-risk neuroblastoma. Notably, hydra analysis of all small blue round cell tumors revealed similar subtypes, characterized by changes to infiltrating immune and stromal expression signatures.


Subjects
Gene Expression Profiling/methods, Neoplasms/genetics, Transcriptome/genetics, Tumor Biomarkers, Child, Cluster Analysis, Computational Biology/methods, Neoplastic Gene Expression Regulation/genetics, Humans, Statistical Models, Neuroblastoma/genetics, Precision Medicine/methods, Tumor Microenvironment/genetics
10.
J Med Internet Res ; 22(3): e16810, 2020 03 20.
Article in English | MEDLINE | ID: mdl-32196460

ABSTRACT

BACKGROUND: Efficiently sharing health data produced during standard care could dramatically accelerate progress in cancer treatments, but various barriers make this difficult. Not sharing these data to ensure patient privacy is at the cost of little to no learning from real-world data produced during cancer care. Furthermore, recent research has demonstrated a willingness of patients with cancer to share their treatment experiences to fuel research, despite potential risks to privacy. OBJECTIVE: The objective of this study was to design, pilot, and release a decentralized, scalable, efficient, economical, and secure strategy for the dissemination of deidentified clinical and genomic data with a focus on late-stage cancer. METHODS: We created and piloted a blockchain-authenticated system to enable secure sharing of deidentified patient data derived from standard of care imaging, genomic testing, and electronic health records (EHRs), called the Cancer Gene Trust (CGT). We prospectively consented and collected data for a pilot cohort (N=18), which we uploaded to the CGT. EHR data were extracted from both a hospital cancer registry and a common data model (CDM) format to identify optimal data extraction and dissemination practices. Specifically, we scored and compared the level of completeness between two EHR data extraction formats against the gold standard source documentation for patients with available data (n=17). RESULTS: Although the total completeness scores were greater for the registry reports than those for the CDM, this difference was not statistically significant. We did find that some specific data fields, such as histology site, were better captured using the registry reports, which can be used to improve the continually adapting CDM. In terms of the overall pilot study, we found that CGT enables rapid integration of real-world data of patients with cancer in a more clinically useful time frame. 
We also developed an open-source Web application to allow users to seamlessly search, browse, explore, and download CGT data. CONCLUSIONS: Our pilot demonstrates the willingness of patients with cancer to participate in data sharing and how blockchain-enabled structures can maintain relationships between individual data elements while preserving patient privacy, empowering findings by third-party researchers and clinicians. We demonstrate the feasibility of CGT as a framework to share health data trapped in silos to further cancer research. Further studies to optimize data representation, stream, and integrity are required.


Subjects
Blockchain/standards, Genomics/methods, Neoplasms/genetics, Cohort Studies, Humans, Pilot Projects, Prospective Studies, Treatment Outcome
11.
Nucleic Acids Res ; 48(D1): D756-D761, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691824

ABSTRACT

The University of California Santa Cruz Genome Browser website (https://genome.ucsc.edu) enters its 20th year of providing high-quality genomics data visualization and genome annotations to the research community. In the past year, we have added a new option to our web BLAT tool that allows search against all genomes, a single-cell expression viewer (https://cells.ucsc.edu), a 'lollipop' plot display mode for high-density variation data, a RESTful API for data extraction and a custom-track backup feature. New datasets include Tabula Muris single-cell expression data, GeneHancer regulatory annotations, The Cancer Genome Atlas Pan-Cancer variants, Genome Reference Consortium Patch sequences, new ENCODE transcription factor binding site peaks and clusters, the Database of Genomic Variants Gold Standard Variants, Genomenon Mastermind variants and three new multi-species alignment tracks.
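As an illustration of what a request to the RESTful API mentioned above might look like, the sketch below only builds a request URL and performs no network access. The host `api.genome.ucsc.edu`, the `getData/sequence` endpoint, and the parameter names are given as I understand them from the public API documentation, not from this abstract, and should be verified there, along with the coordinate convention. The function name `sequence_url` is hypothetical.

```python
from urllib.parse import quote

# Assumed base URL of the UCSC Genome Browser REST API (verify against the docs).
API_BASE = "https://api.genome.ucsc.edu"


def sequence_url(genome: str, chrom: str, start: int, end: int) -> str:
    """Build a URL requesting the sequence of one region; does not send the request."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    params = {"genome": genome, "chrom": chrom, "start": start, "end": end}
    # Parameters are joined with semicolons, the separator assumed here
    # to be accepted by the API.
    query = ";".join(f"{key}={quote(str(value))}" for key, value in params.items())
    return f"{API_BASE}/getData/sequence?{query}"


url = sequence_url("hg38", "chrM", 4321, 5678)
```

The resulting URL could then be fetched with any HTTP client; the response format should be checked in the API documentation before parsing it.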


Subjects
Genetic Databases, Human Genome, Software, Genomics, Humans, Internet
12.
Article in English | MEDLINE | ID: mdl-31645344

ABSTRACT

Gliomatosis peritonei is a rare pathologic finding that is associated with ovarian teratomas and malignant mixed germ cell tumors. The occurrence of gliomatosis as a mature glial implant can impart an improved prognosis to patients with immature ovarian teratoma, making prompt and accurate diagnosis important. We describe a case of recurrent immature teratoma in a 10-yr-old female patient, in which comparative analysis of the RNA sequencing gene expression data from the patient's tumor was used effectively to aid in the diagnosis of gliomatosis peritonei.


Subjects
Peritoneal Neoplasms/diagnosis, Peritoneal Neoplasms/genetics, Teratoma/diagnosis, Base Sequence/genetics, Child, Female, Glioma/diagnosis, Glioma/genetics, Humans, Ovarian Neoplasms/diagnosis, Ovarian Neoplasms/genetics, Prognosis, RNA-Seq/methods, Rare Diseases/diagnosis, Rare Diseases/genetics, RNA Sequence Analysis/methods, Teratoma/genetics, Exome Sequencing
13.
Article in English | MEDLINE | ID: mdl-31645349

ABSTRACT

Genomic data offer valuable insights that can be used to help find treatments and cures for disease. Precision medicine, defined by the NIH as "an emerging approach for disease treatment and prevention that takes into account individual variability in genes, environment, and lifestyle for each person," is gaining acceptance among physicians, who are beginning to integrate patient-centric data analysis into clinical decision-making. Although precision medicine makes use of various types of data, this piece focuses on molecular characterization data specifically, as the discoveries yielded from these data can advance thinking around clinical care for cancer patients. Our pediatrics genomics team at the University of California Santa Cruz Genomics Institute is uniquely situated to discuss the use of shared genomic data for clinical benefit because our collaborations with hospital partners in the United States and internationally rely on big-data comparative genomic analysis. Using shared data, Treehouse Childhood Cancer Initiative develops methods for comparative analysis of tumor RNA sequencing profiles from single patients for the purposes of identifying overexpressed oncogenes that could be targeted by therapies in the clinic. To enable and improve this analysis, we continuously increase the size of our data compendium by adding public pediatric tumor RNA sequencing data sets. We developed an approach for assessing the quality of shared RNA sequencing data to ensure the integrity of the data. In this approach we calculate the number of mapped exonic nonduplicate (MEND) reads, applying a 10 million MEND read minimum threshold for inclusion in our comparative analysis. 
In collaboration with Stanford University and Lucile Packard Children's Hospital Stanford, our team at Treehouse Childhood Cancer Initiative explores the value to researchers everywhere of shared genomic data for clinical utility and the challenges of data sharing that threaten to impede otherwise rapid advances in precision medicine. This Perspective offers recommendations for maximizing the use of genomic data to make discoveries that will benefit patients.
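The 10 million MEND read minimum described in this abstract amounts to a simple inclusion filter. The sketch below assumes per-sample MEND read counts are already available and that the threshold is inclusive (a sample with exactly 10 million MEND reads is kept); both the inclusivity and the names used are assumptions, not details from the source.

```python
from typing import Dict, Tuple

# Threshold stated in the abstract; treating it as inclusive is an assumption.
MIN_MEND_READS = 10_000_000


def split_by_depth(
    mend_reads_by_sample: Dict[str, int], minimum: int = MIN_MEND_READS
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Separate samples into those meeting the MEND read minimum and those below it."""
    kept = {s: n for s, n in mend_reads_by_sample.items() if n >= minimum}
    excluded = {s: n for s, n in mend_reads_by_sample.items() if n < minimum}
    return kept, excluded


# Hypothetical sample identifiers and counts, for illustration only.
kept, excluded = split_by_depth(
    {"sample_A": 42_000_000, "sample_B": 9_999_999, "sample_C": 10_000_000}
)
```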


Subjects
Information Dissemination/methods, RNA Sequence Analysis/methods, Big Data, Clinical Decision-Making/methods, Genome/genetics, Genomics/methods, Humans, Neoplasms/genetics, Precision Medicine/methods
14.
JAMA Netw Open ; 2(10): e1913968, 2019 10 02.
Article in English | MEDLINE | ID: mdl-31651965

ABSTRACT

Importance: Pediatric cancers are epigenetic diseases; therefore, considering tumor gene expression information is necessary for a complete understanding of the tumorigenic processes. Objective: To evaluate the feasibility and utility of incorporating comparative gene expression information into the precision medicine framework for difficult-to-treat pediatric and young adult patients with cancer. Design, Setting, and Participants: This cohort study was conducted as a consortium between the University of California, Santa Cruz (UCSC) Treehouse Childhood Cancer Initiative and clinical genomic trials. RNA sequencing (RNA-Seq) data were obtained from the following 4 clinical sites and analyzed at UCSC: British Columbia Children's Hospital (n = 31), Lucile Packard Children's Hospital at Stanford University (n = 80), CHOC Children's Hospital and Hyundai Cancer Institute (n = 46), and the Pacific Pediatric Neuro-Oncology Consortium (n = 24). The study dates were January 1, 2016, to March 22, 2017. Exposures: Participants underwent tumor RNA-Seq profiling as part of 4 separate clinical trials at partner hospitals. The UCSC either downloaded RNA-Seq data from a partner institution for analysis in the cloud or provided a Docker pipeline that performed the same analysis at a partner institution. The UCSC then compared each participant's tumor RNA-Seq profile with more than 11 000 uniformly analyzed tumor profiles from pediatric and young adult patients with cancer, downloaded from public data repositories. These comparisons were used to identify genes and pathways that are significantly overexpressed in each patient's tumor. Results of the UCSC analysis were presented to clinical partners. 
Main Outcomes and Measures: Feasibility of a third-party institution (UCSC Treehouse Childhood Cancer Initiative) to obtain tumor RNA-Seq data from patients, conduct comparative analysis, and present analysis results to clinicians; and proportion of patients for whom comparative tumor gene expression analysis provided useful clinical and biological information. Results: Among 144 samples from children and young adults (median age at diagnosis, 9 years; range, 0-26 years; 72 of 118 [61.0%] male [26 patients sex unknown]) with a relapsed, refractory, or rare cancer treated on precision medicine protocols, RNA-Seq-derived gene expression was potentially useful for 99 of 144 samples (68.8%) compared with DNA mutation information that was potentially useful for only 34 of 74 samples (45.9%). Conclusions and Relevance: This study's findings suggest that tumor RNA-Seq comparisons may be feasible and highlight the potential clinical utility of incorporating such comparisons into the clinical genomic interpretation framework for difficult-to-treat pediatric and young adult patients with cancer. The study also highlights for the first time to date the potential clinical utility of harmonized publicly available genomic data sets.
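The percentages quoted in the Results can be reproduced from the stated counts. The short sketch below does only that arithmetic; the helper name `percent` is hypothetical, and rounding half up to one decimal place is an assumption about how the figures were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def percent(numerator: int, denominator: int, places: str = "0.1") -> str:
    """Return numerator/denominator as a percentage string, rounded half up."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    value = Decimal(numerator) * 100 / Decimal(denominator)
    return str(value.quantize(Decimal(places), rounding=ROUND_HALF_UP))


# Counts taken from the abstract.
rna_useful = percent(99, 144)   # samples where RNA-Seq expression was potentially useful
dna_useful = percent(34, 74)    # samples where DNA mutation information was potentially useful
male_share = percent(72, 118)   # male patients among those with known sex
```

The two usefulness proportions have different denominators (144 and 74 samples), so they describe different subsets rather than a paired comparison on identical samples.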


Subjects
Neoplasms/genetics, Neoplasm RNA/analysis, RNA Sequence Analysis, Canada, Child, Preschool Child, Female, Gene Expression, Humans, Infant, Newborn Infant, Male, Precision Medicine, United States, Young Adult
16.
Nat Biotechnol ; 37(4): 480, 2019 04.
Article in English | MEDLINE | ID: mdl-30894680

ABSTRACT

In the version of this article initially published, Lena Dolman's second affiliation was given as Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK. The correct second affiliation is Ontario Institute for Cancer Research, Toronto, Ontario, Canada. The error has been corrected in the HTML and PDF versions of the article.

18.
PLoS Genet ; 14(12): e1007752, 2018 12.
Article in English | MEDLINE | ID: mdl-30586411

ABSTRACT

The BRCA Challenge is a long-term data-sharing project initiated within the Global Alliance for Genomics and Health (GA4GH) to aggregate BRCA1 and BRCA2 data to support highly collaborative research activities. Its goal is to generate an informed and current understanding of the impact of genetic variation on cancer risk across the iconic cancer predisposition genes, BRCA1 and BRCA2. Initially, reported variants in BRCA1 and BRCA2 available from public databases were integrated into a single, newly created site, www.brcaexchange.org. The purpose of the BRCA Exchange is to provide the community with a reliable and easily accessible record of variants interpreted for a high-penetrance phenotype. More than 20,000 variants have been aggregated, three times the number found in the next-largest public database at the project's outset, of which approximately 7,250 have expert classifications. The data set is based on shared information from existing clinical databases, namely Breast Cancer Information Core (BIC), ClinVar, and the Leiden Open Variation Database (LOVD), as well as population databases, all linked to a single point of access. The BRCA Challenge has brought together the existing international Evidence-based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium expert panel, along with expert clinicians, diagnosticians, researchers, and database providers, all with a common goal of advancing our understanding of BRCA1 and BRCA2 variation. Ongoing work includes direct contact with national centers with access to BRCA1 and BRCA2 diagnostic data to encourage data sharing, development of methods suitable for extraction of genetic variation at the level of individual laboratory reports, and engagement with participant communities to enable a more comprehensive understanding of the clinical significance of genetic variation in BRCA1 and BRCA2.


Subjects
Genetic Databases, BRCA1 Genes, BRCA2 Genes, Genetic Variation, Alleles, Breast Neoplasms/genetics, Genetic Databases/ethics, Female, Gene Frequency, Genetic Predisposition to Disease, Humans, Information Dissemination/ethics, Information Dissemination/legislation & jurisprudence, Male, Mutation, Ovarian Neoplasms/genetics, Penetrance, Phenotype, Risk Factors
19.
Genome Biol ; 19(1): 188, 2018 11 06.
Article in English | MEDLINE | ID: mdl-30400818

ABSTRACT

BACKGROUND: The phenotypes of cancer cells are driven in part by somatic structural variants. Structural variants can initiate tumors, enhance their aggressiveness, and provide unique therapeutic opportunities. Whole-genome sequencing of tumors can allow exhaustive identification of the specific structural variants present in an individual cancer, facilitating both clinical diagnostics and the discovery of novel mutagenic mechanisms. A plethora of somatic structural variant detection algorithms have been created to enable these discoveries; however, there are no systematic benchmarks of them. Rigorous performance evaluation of somatic structural variant detection methods has been challenged by the lack of gold standards, extensive resource requirements, and difficulties arising from the need to share personal genomic information. RESULTS: To facilitate structural variant detection algorithm evaluations, we create a robust simulation framework for somatic structural variants by extending the BAMSurgeon algorithm. We then organize and enable a crowdsourced benchmarking within the ICGC-TCGA DREAM Somatic Mutation Calling Challenge (SMC-DNA). We report here the results of structural variant benchmarking on three different tumors, comprising 204 submissions from 15 teams. In addition to ranking methods, we identify characteristic error profiles of individual algorithms and general trends across them. Surprisingly, we find that ensembles of analysis pipelines do not always outperform the best individual method, indicating a need for new ways to aggregate somatic structural variant detection approaches. CONCLUSIONS: The synthetic tumors and somatic structural variant detection leaderboards remain available as a community benchmarking resource, and BAMSurgeon is available at https://github.com/adamewing/bamsurgeon .
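To make the idea of benchmarking structural variant calls against a simulated truth set concrete, here is a minimal scoring sketch. It is not the challenge's scoring code: the matching rule (same chromosome and type, both breakpoints within a fixed tolerance, each truth variant matched at most once) and all names are assumptions chosen for illustration.

```python
from typing import List, NamedTuple, Tuple


class SV(NamedTuple):
    """Hypothetical minimal structural variant record."""
    chrom: str
    start: int
    end: int
    kind: str  # e.g. "DEL", "DUP", "INV"


def matches(call: SV, truth: SV, tol: int) -> bool:
    # A call matches if type and chromosome agree and both breakpoints are within tol bases.
    return (
        call.chrom == truth.chrom
        and call.kind == truth.kind
        and abs(call.start - truth.start) <= tol
        and abs(call.end - truth.end) <= tol
    )


def score(calls: List[SV], truths: List[SV], tol: int = 100) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using greedy one-to-one matching."""
    unmatched = list(truths)
    true_positives = 0
    for call in calls:
        for i, truth in enumerate(unmatched):
            if matches(call, truth, tol):
                true_positives += 1
                del unmatched[i]  # each truth variant can be credited only once
                break
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative records only, not data from the challenge.
truths = [SV("1", 1000, 2000, "DEL"), SV("2", 5000, 9000, "DUP")]
calls = [SV("1", 1010, 1990, "DEL"), SV("3", 100, 200, "INV"), SV("2", 5000, 9500, "DUP")]
precision, recall, f1 = score(calls, truths)
```

In this example only the first call matches: the second is on a chromosome with no true variant, and the third misses the end breakpoint by more than the tolerance.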


Subjects
Benchmarking, Computer Simulation, Crowdsourcing, Genetic Variation, Human Genome, Genomics/methods, Neoplasms/genetics, Algorithms, Genetic Databases, High-Throughput Nucleotide Sequencing, Humans, Software
20.
Front Immunol ; 9: 99, 2018.
Article in English | MEDLINE | ID: mdl-29441070

ABSTRACT

The identification of recurrent human leukocyte antigen (HLA) neoepitopes driving T cell responses against tumors poses a significant bottleneck in the development of approaches for precision cancer therapeutics. Here, we employ a bioinformatics method, Prediction of T Cell Epitopes for Cancer Therapy, to analyze sequencing data from neuroblastoma patients and identify a recurrent anaplastic lymphoma kinase mutation (ALK R1275Q) that leads to two high affinity neoepitopes when expressed in complex with common HLA alleles. Analysis of the X-ray structures of the two peptides bound to HLA-B*15:01 reveals drastically different conformations with measurable changes in the stability of the protein complexes, while the self-epitope is excluded from binding due to steric hindrance in the MHC groove. To evaluate the range of HLA alleles that could display the ALK neoepitopes, we used structure-based Rosetta comparative modeling calculations, which accurately predict several additional high affinity interactions and compare our results with commonly used prediction tools. Subsequent determination of the X-ray structure of an HLA-A*01:01 bound neoepitope validates atomic features seen in our Rosetta models with respect to key residues relevant for MHC stability and T cell receptor recognition. Finally, MHC tetramer staining of peripheral blood mononuclear cells from HLA-matched donors shows that the two neoepitopes are recognized by CD8+ T cells. This work provides a rational approach toward high-throughput identification and further optimization of putative neoantigen/HLA targets with desired recognition features for cancer immunotherapy.


Subjects
Anaplastic Lymphoma Kinase/genetics, Anaplastic Lymphoma Kinase/immunology, Neoplasm Antigens/genetics, Neoplasm Antigens/immunology, Epitopes/genetics, Epitopes/immunology, Mutation, Alleles, Amino Acid Sequence, Anaplastic Lymphoma Kinase/metabolism, Neoplasm Antigens/metabolism, CD8-Positive T-Lymphocytes/immunology, CD8-Positive T-Lymphocytes/metabolism, Computational Biology/methods, Epitopes/chemistry, T-Lymphocyte Epitopes/chemistry, T-Lymphocyte Epitopes/genetics, T-Lymphocyte Epitopes/immunology, Class I Histocompatibility Antigens/chemistry, Class I Histocompatibility Antigens/immunology, Class I Histocompatibility Antigens/metabolism, Humans, Molecular Models, Peptides/genetics, Peptides/immunology, Peptides/metabolism, Protein Conformation, Protein Multimerization, Structure-Activity Relationship