1.
Immunity ; 48(4): 812-830.e14, 2018 04 17.
Article in English | MEDLINE | ID: mdl-29628290

ABSTRACT

We performed an extensive immunogenomic analysis of more than 10,000 tumors comprising 33 diverse cancer types by utilizing data compiled by TCGA. Across cancer types, we identified six immune subtypes (wound healing, IFN-γ dominant, inflammatory, lymphocyte depleted, immunologically quiet, and TGF-β dominant) characterized by differences in macrophage or lymphocyte signatures, Th1:Th2 cell ratio, extent of intratumoral heterogeneity, aneuploidy, extent of neoantigen load, overall cell proliferation, expression of immunomodulatory genes, and prognosis. Specific driver mutations correlated with lower (CTNNB1, NRAS, or IDH1) or higher (BRAF, TP53, or CASP8) leukocyte levels across all cancers. Multiple control modalities of the intracellular and extracellular networks (transcription, microRNAs, copy number, and epigenetic processes) were involved in tumor-immune cell interactions, both across and within immune subtypes. Our immunogenomics pipeline to characterize these heterogeneous tumors and the resulting data are intended to serve as a resource for future targeted studies to further advance the field.


Subject(s)
Genomics/methods , Neoplasms , Adolescent , Adult , Aged , Aged, 80 and over , Child , Female , Humans , Interferon-gamma/genetics , Interferon-gamma/immunology , Macrophages/immunology , Male , Middle Aged , Neoplasms/classification , Neoplasms/genetics , Neoplasms/immunology , Prognosis , Th1-Th2 Balance/physiology , Transforming Growth Factor beta/genetics , Transforming Growth Factor beta/immunology , Wound Healing/genetics , Wound Healing/immunology , Young Adult
3.
Proc Natl Acad Sci U S A ; 116(12): 5819-5827, 2019 03 19.
Article in English | MEDLINE | ID: mdl-30833390

ABSTRACT

Preterm birth (PTB) complications are the leading cause of long-term morbidity and mortality in children. By using whole blood samples, we integrated whole-genome sequencing (WGS), RNA sequencing (RNA-seq), and DNA methylation data for 270 PTB and 521 control families. We analyzed this combined dataset to identify genomic variants associated with PTB and performed secondary analyses to identify variants associated with very early PTB (VEPTB) as well as other subcategories of disease that may contribute to PTB. We identified differentially expressed genes (DEGs) and methylated genomic loci and performed expression and methylation quantitative trait loci analyses to link genomic variants to these expression and methylation changes. We performed enrichment tests to identify overlaps between new and known PTB candidate gene systems. We identified 160 significant genomic variants associated with PTB-related phenotypes. The most significant variants, DEGs, and differentially methylated loci were associated with VEPTB. Integration of all data types identified a set of 72 candidate biomarker genes for VEPTB, encompassing genes and those previously associated with PTB. Notably, PTB-associated genes RAB31 and RBPJ were identified by all three data types (WGS, RNA-seq, and methylation). Pathways associated with VEPTB include EGFR and prolactin signaling pathways, inflammation- and immunity-related pathways, chemokine signaling, IFN-γ signaling, and Notch1 signaling. Progress in identifying molecular components of a complex disease is aided by integrated analyses of multiple molecular data types and clinical data. With these data, and by stratifying PTB by subphenotype, we have identified associations between VEPTB and the underlying biology.


Subject(s)
Genetic Predisposition to Disease/genetics , Premature Birth/genetics , DNA Methylation/genetics , Female , Genomics/methods , Humans , Infant, Newborn , Male , Phenotype , Polymorphism, Single Nucleotide/genetics , Signal Transduction/genetics , Whole Genome Sequencing/methods
4.
PLoS Comput Biol ; 10(11): e1003922, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25375183

ABSTRACT

Herpes simplex virus-2 (HSV-2) is a chronic reactivating infection that leads to recurrent shedding episodes in the genital tract. A minority of episodes are prolonged and associated with development of painful ulcers. However, currently available tools poorly predict viral trajectories and timing of reactivations in infected individuals. We employed principal components analysis (PCA) and singular value decomposition (SVD) to interpret HSV-2 genital tract shedding time series data, as well as simulation output from a stochastic spatial mathematical model. Empirical and model-derived time-series data gathered over >30 days consisted of multiple complex episodes that could not be reduced to a manageable number of descriptive features with PCA and SVD. However, single HSV-2 shedding episodes, even those with prolonged duration and complex morphologies consisting of multiple erratic peaks, were consistently described using a maximum of four dominant features. Modeled and clinical episodes had equivalent distributions of dominant features, implying similar dynamics in real and simulated episodes. We applied linear discriminant analysis (LDA) to simulation output and identified that local immune cell density at the viral reactivation site had a predictive effect on episode duration, though longer term shedding suggested chaotic dynamics and could not be predicted based on spatial patterns of immune cell density. These findings suggest that HSV-2 shedding patterns within an individual are impossible to predict over weeks or months, and that even highly complex single HSV-2 episodes can only be partially predicted based on spatial distribution of immune cell density.


Subject(s)
Herpes Genitalis/virology , Herpesvirus 2, Human/physiology , Models, Biological , Virus Shedding/physiology , Computational Biology , Host-Pathogen Interactions , Humans , Principal Component Analysis
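
As a rough illustration of the "dominant features" idea in the HSV-2 abstract above, the sketch below estimates how much of the variance in a set of equally sampled episode curves is captured by a single principal component. It is not the authors' pipeline: it uses power iteration on the covariance matrix instead of a full SVD, and the function names and toy episode data are hypothetical.

```python
import math

def covariance(rows):
    """Sample covariance matrix of the columns (time points) across episodes."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov, found by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance at all, nothing to extract
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

def explained_fraction(rows):
    """Fraction of total variance captured by the first principal component."""
    cov = covariance(rows)
    total = sum(cov[i][i] for i in range(len(cov)))
    _, eigenvalue = top_component(cov)
    return eigenvalue / total if total > 0 else 0.0

# Hypothetical toy data: four episodes that are scaled copies of one shape,
# so a single component should explain essentially all of the variance.
shape = [0.0, 1.0, 3.0, 1.0, 0.0]
episodes = [[a * s for s in shape] for a in (1.0, 2.0, 4.0, 7.0)]
fraction = explained_fraction(episodes)
```

With real shedding curves one would compute several components and look at how quickly the cumulative explained fraction saturates; the abstract reports that at most four were needed for single episodes.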
5.
Cell Syst ; 6(3): 282-300.e2, 2018 03 28.
Article in English | MEDLINE | ID: mdl-29596783

ABSTRACT

Although the MYC oncogene has been implicated in cancer, a systematic assessment of alterations of MYC, related transcription factors, and co-regulatory proteins, forming the proximal MYC network (PMN), across human cancers is lacking. Using computational approaches, we define genomic and proteomic features associated with MYC and the PMN across the 33 cancers of The Cancer Genome Atlas. Pan-cancer, 28% of all samples had at least one of the MYC paralogs amplified. In contrast, the MYC antagonists MGA and MNT were the most frequently mutated or deleted members, suggesting a role as tumor suppressors. MYC alterations were mutually exclusive with PIK3CA, PTEN, APC, or BRAF alterations, suggesting that MYC is a distinct oncogenic driver. Expression analysis revealed MYC-associated pathways in tumor subtypes, such as immune response and growth factor signaling; chromatin, translation, and DNA replication/repair were conserved pan-cancer. This analysis reveals insights into MYC biology and is a reference for biomarkers and therapeutics for cancers with alterations of MYC or the PMN.


Subject(s)
Genes, myc/genetics , Genes, myc/physiology , Proto-Oncogene Proteins c-myc/genetics , Basic Helix-Loop-Helix Leucine Zipper Transcription Factors/genetics , Basic Helix-Loop-Helix Leucine Zipper Transcription Factors/metabolism , Basic Helix-Loop-Helix Transcription Factors/genetics , Basic Helix-Loop-Helix Transcription Factors/metabolism , Biomarkers, Tumor/genetics , Carcinogenesis/genetics , Chromatin , Computational Biology/methods , Genomics , Humans , Neoplasms/genetics , Neoplasms/physiopathology , Oncogenes , Proteomics , Proto-Oncogene Proteins c-myc/physiology , Repressor Proteins/genetics , Repressor Proteins/metabolism , Signal Transduction/genetics , Transcription Factors/genetics
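
The MYC abstract above reports mutual exclusivity between alterations but does not say how it was tested. One common way to quantify it, shown here purely as an assumption and not as the authors' method, is a one-sided hypergeometric (Fisher-type) test on the number of samples carrying both alterations. The function name and the example counts are hypothetical.

```python
from math import comb

def mutual_exclusivity_pvalue(n_total, n_a, n_b, n_both):
    """P(co-occurrence <= n_both) if alterations A and B were independent.

    n_total: samples in the cohort
    n_a, n_b: samples with alteration A and with alteration B
    n_both: samples observed with both alterations
    """
    lowest = max(0, n_a + n_b - n_total)  # smallest overlap that is possible
    favorable = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
                    for k in range(lowest, n_both + 1))
    return favorable / comb(n_total, n_b)

# Hypothetical cohort: 10 samples, 5 with A, 5 with B, none with both.
p = mutual_exclusivity_pvalue(10, 5, 5, 0)  # equals 1/252
```

A small p-value indicates fewer co-altered samples than independence would predict. A pan-cancer analysis would also need to account for cancer type and multiple testing, which this sketch ignores.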
6.
Cancer Res ; 77(21): e7-e10, 2017 11 01.
Article in English | MEDLINE | ID: mdl-29092928

ABSTRACT

The ISB Cancer Genomics Cloud (ISB-CGC) is one of three pilot projects funded by the National Cancer Institute to explore new approaches to computing on large cancer datasets in a cloud environment. With a focus on Data as a Service, the ISB-CGC offers multiple avenues for accessing and analyzing The Cancer Genome Atlas, TARGET, and other important references such as GENCODE and COSMIC using the Google Cloud Platform. The open approach allows researchers to choose approaches best suited to the task at hand: from analyzing terabytes of data using complex workflows to developing new analysis methods in common languages such as Python, R, and SQL; to using an interactive web application to create synthetic patient cohorts and to explore the wealth of available genomic data. Links to resources and documentation can be found at www.isb-cgc.org. Cancer Res; 77(21); e7-10. ©2017 AACR.


Subject(s)
Cloud Computing , Computational Biology , Genomics , Neoplasms/genetics , Datasets as Topic , Genome, Human , Humans , Internet , National Cancer Institute (U.S.) , Research/trends , Software , United States
7.
Front Genet ; 7: 34, 2016.
Article in English | MEDLINE | ID: mdl-27047537

ABSTRACT

Most currently available family-based association tests are designed to account only for nuclear families with complete genotypes for parents as well as offspring. As whole genome sequencing becomes increasingly less expensive, genetic studies are able to collect data for more families and from large family cohorts with the goal of improving statistical power. However, due to missing genotypes, many families are not included in the family-based association tests, negating the benefits of large-scale sequencing data. Here, we present the CIFBAT method to use incomplete families in the Family Based Association Test (FBAT) to evaluate robustness against missing data. CIFBAT uses quantile intervals of the FBAT statistic by randomly choosing valid completions of incomplete family genotypes based on Mendelian inheritance rules. By considering all valid completions equally likely and computing quantile intervals over many randomized iterations, CIFBAT avoids assuming a homogeneous population structure or any particular missingness pattern in the data. Using simulated data, we show that the quantile intervals computed by CIFBAT are useful in validating robustness of the FBAT statistic against missing data and in identifying genomic markers with higher precision. We also propose a novel set of candidate genomic markers for uterine related abnormalities from analysis of familial whole genome sequences, and provide validation for a previously established set of candidate markers for Type 1 diabetes. We have provided a software package that incorporates TDT, robustTDT, FBAT, and CIFBAT. The data format proposed for the software uses half the memory space that the standard FBAT format (PED) files use, making it efficient for large-scale genome-wide association studies.
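
The completion-and-quantile idea described in the CIFBAT abstract above can be sketched in a few lines. This is a simplified illustration and not the published CIFBAT software: it uses the classical TDT statistic in place of the general FBAT statistic, handles only parent-parent-child trios at one biallelic marker, and assumes the input is free of Mendelian errors. Genotypes are coded as the number of copies of the tested allele (0, 1, 2), with `None` for a missing parent; all function names are hypothetical.

```python
import random

# Alleles (0 or 1 copies of the tested allele) a parent genotype can transmit.
GAMETES = {0: (0,), 1: (0, 1), 2: (1,)}

def child_possible(father, mother, child):
    """True if the child genotype is consistent with Mendelian inheritance."""
    return any(a + b == child for a in GAMETES[father] for b in GAMETES[mother])

def valid_completions(father, mother, child):
    """All (father, mother) genotype pairs compatible with the observed data."""
    fathers = (0, 1, 2) if father is None else (father,)
    mothers = (0, 1, 2) if mother is None else (mother,)
    return [(f, m) for f in fathers for m in mothers if child_possible(f, m, child)]

def transmissions(father, mother, child):
    """(transmitted, untransmitted) counts of the allele from heterozygous parents."""
    if father == 1 and mother == 1:
        return child, 2 - child
    if father == 1:
        t = child - mother // 2  # homozygous mother transmits mother // 2
        return t, 1 - t
    if mother == 1:
        t = child - father // 2
        return t, 1 - t
    return 0, 0  # no heterozygous parent, trio is uninformative

def tdt_statistic(trios):
    """McNemar-style TDT statistic (b - c)^2 / (b + c)."""
    b = c = 0
    for father, mother, child in trios:
        t, u = transmissions(father, mother, child)
        b += t
        c += u
    return (b - c) ** 2 / (b + c) if b + c else 0.0

def quantile_interval(trios, iterations=1000, lower=0.025, upper=0.975, seed=0):
    """Quantile interval of the statistic over random valid completions."""
    rng = random.Random(seed)
    options = [valid_completions(f, m, c) for f, m, c in trios]
    stats = []
    for _ in range(iterations):
        # Every valid completion of a trio is treated as equally likely.
        completed = [rng.choice(opts) + (c,)
                     for opts, (_, _, c) in zip(options, trios)]
        stats.append(tdt_statistic(completed))
    stats.sort()
    return stats[int(lower * (iterations - 1))], stats[int(upper * (iterations - 1))]
```

For fully genotyped trios the interval collapses to a single value, the ordinary TDT statistic. A wide interval signals that the association evidence at that marker depends heavily on the unobserved genotypes.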

8.
Front Genet ; 6: 45, 2015.
Article in English | MEDLINE | ID: mdl-25741365

ABSTRACT

The identification of DNA copy numbers from short-read sequencing data remains a challenge for both technical and algorithmic reasons. The raw data for these analyses are measured in tens to hundreds of gigabytes per genome; transmitting, storing, and analyzing such large files is cumbersome, particularly for methods that analyze several samples simultaneously. We developed a very efficient representation of depth of coverage (150-1000× compression) that enables such analyses. Current methods for analyzing variants in whole-genome sequencing (WGS) data frequently miss copy number variants (CNVs), particularly hemizygous deletions in the 1-100 kb range. To fill this gap, we developed a method to identify CNVs in individual genomes, based on comparison to joint profiles pre-computed from a large set of genomes. We analyzed depth of coverage in over 6000 high quality (>40×) genomes. The depth of coverage has strong sequence-specific fluctuations only partially explained by global parameters like %GC. To account for these fluctuations, we constructed multi-genome profiles representing the observed or inferred diploid depth of coverage at each position along the genome. These Reference Coverage Profiles (RCPs) take into account the diverse technologies and pipeline versions used. Normalization of the scaled coverage to the RCP followed by hidden Markov model (HMM) segmentation enables efficient detection of CNVs and large deletions in individual genomes. Use of pre-computed multi-genome coverage profiles improves our ability to analyze each individual genome. We make available RCPs and tools for performing these analyses on personal genomes. We expect the increased sensitivity and specificity for individual genome analysis to be critical for achieving clinical-grade genome interpretation.
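
The normalization and segmentation steps described in the copy number abstract above can be illustrated with a toy example. The sketch divides per-bin sample coverage by a reference profile, rescales so that the median ratio is 1 (assuming most bins are diploid), and decodes a three-state hidden Markov model with the Viterbi algorithm. The states, the Gaussian emission width `sd`, the self-transition probability `stay`, and all function names are assumptions made for illustration, not parameters from the paper or its tools.

```python
import math
import statistics

COPY_NUMBERS = (1, 2, 3)  # hemizygous deletion, diploid, single-copy gain

def normalized_ratio(sample, reference):
    """Sample/reference coverage per bin, scaled so the median ratio is 1."""
    raw = [s / r for s, r in zip(sample, reference)]
    scale = statistics.median(raw)
    return [x / scale for x in raw]

def log_gauss(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

def viterbi_copy_number(ratios, sd=0.15, stay=0.99):
    """Most likely copy number per bin; expects a non-empty list of ratios."""
    k = len(COPY_NUMBERS)
    log_stay = math.log(stay)
    log_switch = math.log((1 - stay) / (k - 1))

    def trans(i, j):
        return log_stay if i == j else log_switch

    # A state with copy number cn is expected to show ratio cn / 2.
    score = [math.log(1 / k) + log_gauss(ratios[0], cn / 2, sd)
             for cn in COPY_NUMBERS]
    back = []
    for x in ratios[1:]:
        new_score, pointers = [], []
        for j, cn in enumerate(COPY_NUMBERS):
            best = max(range(k), key=lambda i: score[i] + trans(i, j))
            pointers.append(best)
            new_score.append(score[best] + trans(best, j) + log_gauss(x, cn / 2, sd))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(k), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [COPY_NUMBERS[i] for i in path]

# Hypothetical data: uneven reference coverage, sample halved in bins 4 to 7.
reference = [40.0, 42.0, 38.0, 41.0, 39.0, 40.0, 43.0, 37.0, 40.0, 41.0, 39.0, 40.0]
sample = [r * (0.5 if 4 <= i < 8 else 1.0) for i, r in enumerate(reference)]
calls = viterbi_copy_number(normalized_ratio(sample, reference))
```

Dividing by the reference profile removes the position-specific fluctuations that both tracks share, so the deleted bins stand out as a run of ratios near 0.5 that the HMM labels as copy number 1.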
