Search | VHL CLAP/WR-PAHO/WHO

1.

BRCA1 and RNAi factors promote repair mediated by small RNAs and PALB2-RAD52.

Hatchi, Elodie; Goehring, Liana; Landini, Serena; Skourti-Stathaki, Konstantina; DeConti, Derrick K; Abderazzaq, Fieda O; Banerjee, Priyankana; Demers, Timothy M; Wang, Yaoyu E; Quackenbush, John; Livingston, David M.

Nature ; 591(7851): 665-670, 2021 03.

Article in English | MEDLINE | ID: mdl-33536619

ABSTRACT

Strong connections exist between R-loops (three-stranded structures harbouring an RNA:DNA hybrid and a displaced single-strand DNA), genome instability and human disease [1-5]. Indeed, R-loops are favoured in relevant genomic regions as regulators of certain physiological processes through which homeostasis is typically maintained. For example, transcription termination pause sites regulated by R-loops can induce the synthesis of antisense transcripts that enable the formation of local, RNA interference (RNAi)-driven heterochromatin [6]. Pause sites are also protected against endogenous single-stranded DNA breaks by BRCA1 [7]. Hypotheses about how DNA repair is enacted at pause sites include a role for RNA, which is emerging as a normal, albeit unexplained, regulator of genome integrity [8]. Here we report that a species of single-stranded, DNA-damage-associated small RNA (sdRNA) is generated by a BRCA1-RNAi protein complex. sdRNAs promote DNA repair driven by the PALB2-RAD52 complex at transcriptional termination pause sites that form R-loops and are rich in single-stranded DNA breaks. sdRNA repair operates in both quiescent (G0) and proliferating cells. Thus, sdRNA repair can occur in intact tissue and/or stem cells, and may contribute to tumour suppression mediated by BRCA1.

Subject(s)

BRCA1 Protein/metabolism , DNA Repair , Fanconi Anemia Complementation Group N Protein/metabolism , RNA Interference , Rad52 DNA Repair and Recombination Protein/metabolism , Argonaute Proteins/metabolism , Cell Cycle Proteins/metabolism , DNA Damage , Eukaryotic Initiation Factors/metabolism , HeLa Cells , Humans , RNA, Small Interfering/genetics , RNA, Small Interfering/metabolism , Resting Phase, Cell Cycle , Ribonuclease III/metabolism

2.

Predicting genotype-specific gene regulatory networks.

Weighill, Deborah; Ben Guebila, Marouen; Glass, Kimberly; Quackenbush, John; Platig, John.

Genome Res ; 32(3): 524-533, 2022 03.

Article in English | MEDLINE | ID: mdl-35193937

ABSTRACT

Understanding how each person's unique genotype influences their individual patterns of gene regulation has the potential to improve our understanding of human health and development, and to refine genotype-specific disease risk assessments and treatments. However, the effects of genetic variants are not typically considered when constructing gene regulatory networks, despite the fact that many disease-associated genetic variants are thought to have regulatory effects, including the disruption of transcription factor (TF) binding. We developed EGRET (Estimating the Genetic Regulatory Effect on TFs), which infers a genotype-specific gene regulatory network for each individual in a study population. EGRET begins by constructing a genotype-informed TF-gene prior network derived using TF motif predictions, expression quantitative trait locus (eQTL) data, individual genotypes, and the predicted effects of genetic variants on TF binding. It then uses a technique known as message passing to integrate this prior network with gene expression and TF protein-protein interaction data to produce a refined, genotype-specific regulatory network. We used EGRET to infer gene regulatory networks for two blood-derived cell lines and identified genotype-associated, cell line-specific regulatory differences that we subsequently validated using allele-specific expression, chromatin accessibility QTLs, and differential ChIP-seq TF binding. We also inferred EGRET networks for three cell types from each of 119 individuals and identified cell type-specific regulatory differences associated with diseases related to those cell types. EGRET is, to our knowledge, the first method that infers networks reflective of individual genetic variation in a way that provides insight into the genetic regulatory associations driving complex phenotypes.

Subject(s)

Gene Regulatory Networks , Transcription Factors , Chromatin , Chromatin Immunoprecipitation , Genotype , Humans , Transcription Factors/genetics , Transcription Factors/metabolism

3.

DRAGON: Determining Regulatory Associations using Graphical models on multi-Omic Networks.

Shutta, Katherine H; Weighill, Deborah; Burkholz, Rebekka; Guebila, Marouen Ben; DeMeo, Dawn L; Zacharias, Helena U; Quackenbush, John; Altenbuchinger, Michael.

Nucleic Acids Res ; 51(3): e15, 2023 02 22.

Article in English | MEDLINE | ID: mdl-36533448

ABSTRACT

The increasing quantity of multi-omic data, such as methylomic and transcriptomic profiles collected on the same specimen or even on the same cell, provides a unique opportunity to explore the complex interactions that define cell phenotype and govern cellular responses to perturbations. We propose a network approach based on Gaussian Graphical Models (GGMs) that facilitates the joint analysis of paired omics data. This method, called DRAGON (Determining Regulatory Associations using Graphical models on multi-Omic Networks), calibrates its parameters to achieve an optimal trade-off between the network's complexity and estimation accuracy, while explicitly accounting for the characteristics of each of the assessed omics 'layers.' In simulation studies, we show that DRAGON adapts to edge density and feature size differences between omics layers, improving model inference and edge recovery compared to state-of-the-art methods. We further demonstrate in an analysis of joint transcriptome - methylome data from TCGA breast cancer specimens that DRAGON can identify key molecular mechanisms such as gene regulation via promoter methylation. In particular, we identify Transcription Factor AP-2 Beta (TFAP2B) as a potential multi-omic biomarker for basal-type breast cancer. DRAGON is available as open-source code in Python through the Network Zoo package (netZooPy v0.8; netzoo.github.io).
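
The Gaussian graphical model idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not DRAGON's implementation (that is distributed in netZooPy); it only shows how partial correlations are read off a regularized inverse covariance matrix. The function names and the single `shrinkage` parameter are assumptions made for illustration, whereas the abstract describes parameters calibrated separately for each omics layer.

```python
import math


def covariance(data):
    """Sample covariance matrix; rows of `data` are samples, columns are features."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    p = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(p)]
        for i, row in enumerate(matrix)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular; try a larger shrinkage value")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]


def partial_correlations(data, shrinkage=0.1):
    """Partial correlation matrix from a shrunken covariance estimate.

    Off-diagonal covariances are scaled by (1 - shrinkage), i.e. the estimate
    is pulled toward its own diagonal. This single-parameter scheme is a
    simplification chosen for the sketch, not the published method.
    """
    cov = covariance(data)
    p = len(cov)
    shrunk = [
        [cov[i][j] if i == j else (1.0 - shrinkage) * cov[i][j] for j in range(p)]
        for i in range(p)
    ]
    prec = invert(shrunk)
    return [
        [
            1.0 if i == j else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
            for j in range(p)
        ]
        for i in range(p)
    ]
```

In a network built this way, an edge between two features reflects their association after conditioning on all other features, so concatenating the columns of two paired omics layers into one `data` matrix would give both within-layer and between-layer edges.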

Subject(s)

Multiomics , Neoplasms , Humans , Software , Computer Simulation , Transcriptome , Neoplasms/genetics , Gene Regulatory Networks

4.

GRAND: a database of gene regulatory network models across human conditions.

Ben Guebila, Marouen; Lopes-Ramos, Camila M; Weighill, Deborah; Sonawane, Abhijeet Rajendra; Burkholz, Rebekka; Shamsaei, Behrouz; Platig, John; Glass, Kimberly; Kuijjer, Marieke L; Quackenbush, John.

Nucleic Acids Res ; 50(D1): D610-D621, 2022 01 07.

Article in English | MEDLINE | ID: mdl-34508353

ABSTRACT

Gene regulation plays a fundamental role in shaping tissue identity, function, and response to perturbation. Regulatory processes are controlled by complex networks of interacting elements, including transcription factors, miRNAs and their target genes. The structure of these networks helps to determine phenotypes and can ultimately influence the development of disease or response to therapy. We developed GRAND (https://grand.networkmedicine.org) as a database for computationally-inferred, context-specific gene regulatory network models that can be compared between biological states, or used to predict which drugs produce changes in regulatory network structure. The database includes 12 468 genome-scale networks covering 36 human tissues, 28 cancers, 1378 unperturbed cell lines, as well as 173 013 TF and gene targeting scores for 2858 small molecule-induced cell line perturbations paired with phenotypic information. GRAND allows the networks to be queried using phenotypic information and visualized using a variety of interactive tools. In addition, it includes a web application that matches disease states to potentially therapeutic small molecule drugs using regulatory network properties.

Subject(s)

Databases, Genetic , Databases, Pharmaceutical , Gene Regulatory Networks/genetics , Software , Gene Expression Regulation/genetics , Genome, Human/genetics , Humans , MicroRNAs/classification , MicroRNAs/genetics , Transcription Factors/classification , Transcription Factors/genetics

5.

Proceedings of the fifth international Molecular Pathological Epidemiology (MPE) meeting.

Yao, Song; Campbell, Peter T; Ugai, Tomotaka; Gierach, Gretchen; Abubakar, Mustapha; Adalsteinsson, Viktor; Almeida, Jonas; Brennan, Paul; Chanock, Stephen; Golub, Todd; Hanash, Samir; Harris, Curtis; Hathaway, Cassandra A; Kelsey, Karl; Landi, Maria Teresa; Mahmood, Faisal; Newton, Christina; Quackenbush, John; Rodig, Scott; Schultz, Nikolaus; Tearney, Guillermo; Tworoger, Shelley S; Wang, Molin; Zhang, Xuehong; Garcia-Closas, Montserrat; Rebbeck, Timothy R; Ambrosone, Christine B; Ogino, Shuji.

Cancer Causes Control ; 33(8): 1107-1120, 2022 Aug.

Article in English | MEDLINE | ID: mdl-35759080

ABSTRACT

Cancer heterogeneities hold the key to a deeper understanding of cancer etiology and progression and the discovery of more precise cancer therapy. Modern pathological and molecular technologies offer a powerful set of tools to profile tumor heterogeneities at multiple levels in large patient populations, from DNA to RNA, protein and epigenetics, and from tumor tissues to tumor microenvironment and liquid biopsy. When coupled with well-validated epidemiologic methodology and well-characterized epidemiologic resources, the rich tumor pathological and molecular tumor information provide new research opportunities at an unprecedented breadth and depth. This is the research space where Molecular Pathological Epidemiology (MPE) emerged over a decade ago and has been thriving since then. As a truly multidisciplinary field, MPE embraces collaborations from diverse fields including epidemiology, pathology, immunology, genetics, biostatistics, bioinformatics, and data science. Since first convened in 2013, the International MPE Meeting series has grown into a dynamic and dedicated platform for experts from these disciplines to communicate novel findings, discuss new research opportunities and challenges, build professional networks, and educate the next-generation scientists. Herein, we share the proceedings of the Fifth International MPE meeting, held virtually online, on May 24 and 25, 2021. The meeting consisted of 21 presentations organized into the three main themes, which were recent integrative MPE studies, novel cancer profiling technologies, and new statistical and data science approaches. Looking forward to the near future, the meeting attendees anticipated continuous expansion and fruition of MPE research in many research fronts, particularly immune-epidemiology, mutational signatures, liquid biopsy, and health disparities.

Subject(s)

Neoplasms , Pathology, Molecular , Humans , Mutation , Neoplasms/epidemiology , Neoplasms/genetics , Neoplasms/therapy , Pathology, Molecular/methods , Tumor Microenvironment

6.

Blood gene expression risk profiles and interstitial lung abnormalities: COPDGene and ECLIPSE cohort studies.

Moll, Matthew; Hobbs, Brian D; Menon, Aravind; Ghosh, Auyon J; Putman, Rachel K; Hino, Takuya; Hata, Akinori; Silverman, Edwin K; Quackenbush, John; Castaldi, Peter J; Hersh, Craig P; McGeachie, Michael J; Sin, Don D; Tal-Singer, Ruth; Nishino, Mizuki; Hatabu, Hiroto; Hunninghake, Gary M; Cho, Michael H.

Respir Res ; 23(1): 157, 2022 Jun 17.

Article in English | MEDLINE | ID: mdl-35715807

ABSTRACT

BACKGROUND: Interstitial lung abnormalities (ILA) are radiologic findings that may progress to idiopathic pulmonary fibrosis (IPF). Blood gene expression profiles can predict IPF mortality, but whether these same genes associate with ILA and ILA outcomes is unknown. This study evaluated whether a previously described blood gene expression profile associated with IPF mortality is associated with ILA and all-cause mortality. METHODS: In COPDGene and ECLIPSE study participants with visual scoring of ILA and gene expression data, we evaluated the association of a previously described IPF mortality score with ILA and mortality. We also trained a new ILA score, derived using genes from the IPF score, in a subset of COPDGene. We tested the association with ILA and mortality on the remainder of COPDGene and ECLIPSE. RESULTS: In 1469 COPDGene (training n = 734; testing n = 735) and 571 ECLIPSE participants, the IPF score was not associated with ILA or mortality. However, an ILA score derived from IPF score genes was associated with ILA (meta-analysis of test datasets OR 1.4 [95% CI: 1.2-1.6]) and mortality (HR 1.25 [95% CI: 1.12-1.41]). Six of the 11 genes in the ILA score had discordant directions of effects compared to the IPF score. The ILA score partially mediated the effects of age on mortality (11.8% proportion mediated). CONCLUSIONS: An ILA gene expression score, derived from IPF mortality-associated genes, identified genes with concordant and discordant effects on IPF mortality and ILA. These results suggest shared and unique biologic processes among those with ILA, IPF, aging, and death.

Subject(s)

Idiopathic Pulmonary Fibrosis , Lung Diseases, Interstitial , Cohort Studies , Humans , Idiopathic Pulmonary Fibrosis/diagnosis , Idiopathic Pulmonary Fibrosis/genetics , Lung , Lung Diseases, Interstitial/diagnosis , Lung Diseases, Interstitial/genetics , Tomography, X-Ray Computed , Transcriptome/genetics

7.

PUMA: PANDA Using MicroRNA Associations.

Kuijjer, Marieke L; Fagny, Maud; Marin, Alessandro; Quackenbush, John; Glass, Kimberly.

Bioinformatics ; 36(18): 4765-4773, 2020 09 15.

Article in English | MEDLINE | ID: mdl-32860050

ABSTRACT

MOTIVATION: Conventional methods to analyze genomic data do not make use of the interplay between multiple factors, such as between microRNAs (miRNAs) and the messenger RNA (mRNA) transcripts they regulate, and thereby often fail to identify the cellular processes that are unique to specific tissues. We developed PUMA (PANDA Using MicroRNA Associations), a computational tool that uses message passing to integrate a prior network of miRNA target predictions with target gene co-expression information to model genome-wide gene regulation by miRNAs. We applied PUMA to 38 tissues from the Genotype-Tissue Expression project, integrating RNA-Seq data with two different miRNA target predictions priors, built on predictions from TargetScan and miRanda, respectively. We found that while target predictions obtained from these two different resources are considerably different, PUMA captures similar tissue-specific miRNA-target regulatory interactions in the different network models. Furthermore, the tissue-specific functions of miRNAs we identified based on regulatory profiles (available at: https://kuijjer.shinyapps.io/puma_gtex/) are highly similar between networks modeled on the two target prediction resources. This indicates that PUMA consistently captures important tissue-specific miRNA regulatory processes. In addition, using PUMA we identified miRNAs regulating important tissue-specific processes that, when mutated, may result in disease development in the same tissue. AVAILABILITY AND IMPLEMENTATION: PUMA is available in C++, MATLAB and Python on GitHub (https://github.com/kuijjerlab and https://netzoo.github.io/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

MicroRNAs , Apoptosis Regulatory Proteins/genetics , Computational Biology , Gene Expression Regulation , Gene Regulatory Networks , MicroRNAs/genetics , RNA, Messenger , RNA-Seq

8.

Transparency and reproducibility in artificial intelligence.

Haibe-Kains, Benjamin; Adam, George Alexandru; Hosny, Ahmed; Khodakarami, Farnoosh; Waldron, Levi; Wang, Bo; McIntosh, Chris; Goldenberg, Anna; Kundaje, Anshul; Greene, Casey S; Broderick, Tamara; Hoffman, Michael M; Leek, Jeffrey T; Korthauer, Keegan; Huber, Wolfgang; Brazma, Alvis; Pineau, Joelle; Tibshirani, Robert; Hastie, Trevor; Ioannidis, John P A; Quackenbush, John; Aerts, Hugo J W L.

Nature ; 586(7829): E14-E16, 2020 10.

Article in English | MEDLINE | ID: mdl-33057217

Subject(s)

Algorithms , Artificial Intelligence , Reproducibility of Results

9.

DNA Methylation Is Predictive of Mortality in Current and Former Smokers.

Morrow, Jarrett D; Make, Barry; Regan, Elizabeth; Han, MeiLan; Hersh, Craig P; Tal-Singer, Ruth; Quackenbush, John; Choi, Augustine M K; Silverman, Edwin K; DeMeo, Dawn L.

Am J Respir Crit Care Med ; 201(9): 1099-1109, 2020 05 01.

Article in English | MEDLINE | ID: mdl-31995399

ABSTRACT

Rationale: Smoking results in at least a decade lower life expectancy. Mortality among current smokers is two to three times as high as among never smokers. DNA methylation is an epigenetic modification of the human genome that has been associated with both cigarette smoking and mortality. Objectives: We sought to identify DNA methylation marks in blood that are predictive of mortality in a subset of the COPDGene (Genetic Epidemiology of COPD) study, representing 101 deaths among 667 current and former smokers. Methods: We assayed genome-wide DNA methylation in non-Hispanic white smokers with and without chronic obstructive pulmonary disease (COPD) using blood samples from the COPDGene enrollment visit. We tested whether DNA methylation was associated with mortality in models adjusted for COPD status, age, sex, current smoking status, and pack-years of cigarette smoking. Replication was performed in a subset of 231 individuals from the ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) study. Measurements and Main Results: We identified seven CpG sites associated with mortality (false discovery rate < 20%) that replicated in the ECLIPSE cohort (P < 0.05). None of these marks were associated with longitudinal lung function decline in survivors, smoking history, or current smoking status. However, differential methylation of two replicated PIK3CD (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit delta) sites was associated with lung function at enrollment (P < 0.05). We also observed associations between DNA methylation and gene expression for the PIK3CD sites. Conclusions: This study is the first to identify variable DNA methylation associated with all-cause mortality in smokers with and without COPD. Evaluating predictive epigenomic marks of smokers in peripheral blood may allow for targeted risk stratification and aid in delivery of future tailored therapeutic interventions.

Subject(s)

Biomarkers, Tumor/blood , DNA Methylation , Predictive Value of Tests , Pulmonary Disease, Chronic Obstructive/genetics , Pulmonary Disease, Chronic Obstructive/mortality , Smoking/genetics , Smoking/mortality , Adult , Aged , Aged, 80 and over , Cohort Studies , Epigenesis, Genetic , Female , Humans , Male , Middle Aged

10.

Nongenic cancer-risk SNPs affect oncogenes, tumour-suppressor genes, and immune function.

Fagny, Maud; Platig, John; Kuijjer, Marieke Lydia; Lin, Xihong; Quackenbush, John.

Br J Cancer ; 122(4): 569-577, 2020 02.

Article in English | MEDLINE | ID: mdl-31806877

ABSTRACT

BACKGROUND: Genome-wide association studies (GWASes) have identified many noncoding germline single-nucleotide polymorphisms (SNPs) that are associated with an increased risk of developing cancer. However, how these SNPs affect cancer risk is still largely unknown. METHODS: We used a systems biology approach to analyse the regulatory role of cancer-risk SNPs in thirteen tissues. By using data from the Genotype-Tissue Expression (GTEx) project, we performed an expression quantitative trait locus (eQTL) analysis. We represented both significant cis- and trans-eQTLs as edges in tissue-specific eQTL bipartite networks. RESULTS: Each tissue-specific eQTL network is organised into communities that group sets of SNPs and functionally related genes. When mapping cancer-risk SNPs to these networks, we find that in each tissue, these SNPs are significantly overrepresented in communities enriched for immune response processes, as well as tissue-specific functions. Moreover, cancer-risk SNPs are more likely to be 'cores' of their communities, influencing the expression of many genes within the same biological processes. Finally, cancer-risk SNPs preferentially target oncogenes and tumour-suppressor genes, suggesting that they may alter the expression of these key cancer genes. CONCLUSIONS: This approach provides a new way of understanding genetic effects on cancer risk and provides a biological context for interpreting the results of GWAS cancer studies.

Subject(s)

Genes, Tumor Suppressor , Genetic Predisposition to Disease/genetics , Neoplasms/genetics , Neoplasms/immunology , Oncogenes/genetics , Polymorphism, Single Nucleotide , Humans , Quantitative Trait Loci

11.

An online notebook resource for reproducible inference, analysis and publication of gene regulatory networks.

Ben Guebila, Marouen; Weighill, Deborah; Lopes-Ramos, Camila M; Burkholz, Rebekka; Pop, Romana T; Palepu, Kalyan; Shapoval, Mia; Fagny, Maud; Schlauch, Daniel; Glass, Kimberly; Altenbuchinger, Michael; Kuijjer, Marieke L; Platig, John; Quackenbush, John.

Nat Methods ; 19(5): 511-513, 2022 05.

Article in English | MEDLINE | ID: mdl-35459940

Subject(s)

Gene Regulatory Networks , Software , Algorithms , Gene Expression Profiling

12.

Identification of differentially expressed gene sets using the Generalized Berk-Jones statistic.

Gaynor, Sheila M; Sun, Ryan; Lin, Xihong; Quackenbush, John.

Bioinformatics ; 35(22): 4568-4576, 2019 11 01.

Article in English | MEDLINE | ID: mdl-31062858

ABSTRACT

MOTIVATION: Cancer genomics studies frequently aim to identify genes that are differentially expressed between clinically distinct patient subgroups, generally by testing single genes one at a time. However, the results of any individual transcriptomic study are often not fully reproducible. A particular challenge impeding statistical analysis is the difficulty of distinguishing between differential expression comprising part of the genomic disease etiology and that induced by downstream effects. More robust analytical approaches that are well-powered to detect potentially causative genes, are less prone to discovering spurious associations, and can deliver reproducible findings across different studies are needed. RESULTS: We propose a set-based procedure for testing of differential expression and show that this set-based approach can produce more robust results by aggregating information across multiple, correlated genomic markers. Specifically, we adapt the Generalized Berk-Jones statistic to test for the transcription factors that may contribute to the progression of estrogen receptor positive breast cancer. We demonstrate the ability of our method to produce reproducible findings by applying the same analysis to 21 publicly available datasets, producing a similar list of significant transcription factors across most studies. Our Generalized Berk-Jones approach produces results that show improved consistency over three set-based testing algorithms: Generalized Higher Criticism, Gene Set Analysis and Gene Set Enrichment Analysis. AVAILABILITY AND IMPLEMENTATION: Data are in the MetaGxBreast R package. Code is available at github.com/ryanrsun/gaynor_sun_GBJ_breast_cancer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Gene Expression Profiling , Algorithms , Breast Neoplasms , Genome , Humans , Transcriptome

13.

Reverse transcriptase kinetics for one-step RT-PCR.

Rejali, Nick A; Zuiter, Aisha M; Quackenbush, John F; Wittwer, Carl T.

Anal Biochem ; 601: 113768, 2020 07 15.

Article in English | MEDLINE | ID: mdl-32416095

ABSTRACT

Understanding reverse transcriptase (RT) activity is critical for designing fast one-step RT-PCRs. We report a stopped-flow assay that monitors SYBR Green I fluorescence to investigate RT activity in PCR conditions. We studied the influence of PCR conditions on RT activity and assessed the accuracy of cDNA synthesis predictions for one-step RT-PCR. Nucleotide incorporation increased from 26 to 89 s⁻¹ between 1.5 and 6 mM MgCl₂ but was largely unaffected by changes in KCl. Conversely, increasing KCl from 15 to 75 mM increased apparent rate constants for RT-oligonucleotide binding (0.010-0.026 nM⁻¹ s⁻¹) and unbinding (0.2-1.5 s⁻¹). All rate constants increased between 22 and 42 °C. When evaluated by PCR quantification cycle, cDNA predictions differed from experiments using RNase H+ RT (average 1.7 cycles) and RNase H- (average 4.5 cycles). Decreasing H+ RT concentrations 10 to 10⁴-fold from manufacturer recommendations improved cDNA predictions (average 0.8 cycles) and increased RT-PCR assay efficiency. RT activity assays and models can be used to aid assay design and improve the speed of RT-PCRs. RT type and concentration must be selected to promote rapid cDNA synthesis but minimize nonspecific amplification. We demonstrate 2-min one-step RT-PCR of a Zika virus target using reduced RT concentrations and extreme PCR.

Subject(s)

RNA-Directed DNA Polymerase/genetics , RNA-Directed DNA Polymerase/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Benzothiazoles , Diamines , Fluorescence , Humans , Kinetics , Organic Chemicals/chemistry , Quinolines

14.

Exploring regulation in tissues with eQTL networks.

Fagny, Maud; Paulson, Joseph N; Kuijjer, Marieke L; Sonawane, Abhijeet R; Chen, Cho-Yi; Lopes-Ramos, Camila M; Glass, Kimberly; Quackenbush, John; Platig, John.

Proc Natl Acad Sci U S A ; 114(37): E7841-E7850, 2017 09 12.

Article in English | MEDLINE | ID: mdl-28851834

ABSTRACT

Characterizing the collective regulatory impact of genetic variants on complex phenotypes is a major challenge in developing a genotype to phenotype map. Using expression quantitative trait locus (eQTL) analyses, we constructed bipartite networks in which edges represent significant associations between genetic variants and gene expression levels and found that the network structure informs regulatory function. We show, in 13 tissues, that these eQTL networks are organized into dense, highly modular communities grouping genes often involved in coherent biological processes. We find communities representing shared processes across tissues, as well as communities associated with tissue-specific processes that coalesce around variants in tissue-specific active chromatin regions. Node centrality is also highly informative, with the global and community hubs differing in regulatory potential and likelihood of being disease associated.

Subject(s)

Genome-Wide Association Study/methods , Organ Specificity/genetics , Quantitative Trait Loci/genetics , Gene Expression/genetics , Gene Expression Regulation/genetics , Gene Regulatory Networks/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/physiology , Transcriptome/genetics

15.

Smooth quantile normalization.

Hicks, Stephanie C; Okrah, Kwame; Paulson, Joseph N; Quackenbush, John; Irizarry, Rafael A; Bravo, Héctor Corrada.

Biostatistics ; 19(2): 185-198, 2018 04 01.

Article in English | MEDLINE | ID: mdl-29036413

ABSTRACT

Between-sample normalization is a critical step in genomic data analysis to remove systematic bias and unwanted technical variation in high-throughput data. Global normalization methods are based on the assumption that observed variability in global properties is due to technical reasons and are unrelated to the biology of interest. For example, some methods correct for differences in sequencing read counts by scaling features to have similar median values across samples, but these fail to reduce other forms of unwanted technical variation. Methods such as quantile normalization transform the statistical distributions across samples to be the same and assume global differences in the distribution are induced by only technical variation. However, it remains unclear how to proceed with normalization if these assumptions are violated, for example, if there are global differences in the statistical distributions between biological conditions or groups, and external information, such as negative or control features, is not available. Here, we introduce a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions, but allowing that they may differ between groups. We illustrate the advantages of our method on several high-throughput datasets with global differences in distributions corresponding to different biological conditions. We also perform a Monte Carlo simulation study to illustrate the bias-variance tradeoff and root mean squared error of qsmooth compared to other global normalization methods. A software implementation is available from https://github.com/stephaniehicks/qsmooth.
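
The contrast this abstract draws between global and group-aware normalization can be made concrete with a minimal Python sketch. The first function is ordinary quantile normalization, which forces every sample onto one shared distribution. The second applies it separately within each biological group, the opposite extreme. As the name suggests, qsmooth sits between the two, and its weighting is not reproduced here. The function names are hypothetical, and ties are handled naively.

```python
def quantile_normalize(samples):
    """Ordinary quantile normalization.

    `samples` is a list of equal-length lists, one list per sample.
    Each value is replaced by the across-sample mean of the values
    that share its rank. Ties are broken by position (no averaging).
    """
    n_features = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n_features)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n_features), key=lambda i: s[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def groupwise_quantile_normalize(samples, groups):
    """Quantile-normalize within each group only.

    `groups` holds one label per sample. Distributions become identical
    within a group but are allowed to differ between groups.
    """
    result = [None] * len(samples)
    for label in set(groups):
        members = [i for i, g in enumerate(groups) if g == label]
        normed = quantile_normalize([samples[i] for i in members])
        for i, row in zip(members, normed):
            result[i] = row
    return result
```

For example, `quantile_normalize([[5, 2, 3], [4, 1, 6]])` maps both samples onto the reference values 1.5, 3.5 and 5.5 while preserving each sample's own ranking.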

Subject(s)

Biostatistics/methods , Data Interpretation, Statistical , Genomics/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Models, Statistical , Humans

16.

Proceedings of the fourth international molecular pathological epidemiology (MPE) meeting.

Campbell, Peter T; Ambrosone, Christine B; Nishihara, Reiko; Aerts, Hugo J W L; Bondy, Melissa; Chatterjee, Nilanjan; Garcia-Closas, Montserrat; Giannakis, Marios; Golden, Jeffrey A; Heng, Yujing J; Kip, N Sertac; Koshiol, Jill; Liu, X Shirley; Lopes-Ramos, Camila M; Mucci, Lorelei A; Nowak, Jonathan A; Phipps, Amanda I; Quackenbush, John; Schoen, Robert E; Sholl, Lynette M; Tamimi, Rulla M; Wang, Molin; Weijenberg, Matty P; Wu, Catherine J; Wu, Kana; Yao, Song; Yu, Kun-Hsing; Zhang, Xuehong; Rebbeck, Timothy R; Ogino, Shuji.

Cancer Causes Control ; 30(8): 799-811, 2019 Aug.

Article in English | MEDLINE | ID: mdl-31069578

ABSTRACT

An important premise of epidemiology is that individuals with the same disease share similar underlying etiologies and clinical outcomes. In the past few decades, our knowledge of disease pathogenesis has improved, and disease classification systems have evolved to the point where no complex disease processes are considered homogenous. As a result, pathology and epidemiology have been integrated into the single, unified field of molecular pathological epidemiology (MPE). Advancing integrative molecular and population-level health sciences and addressing the unique research challenges specific to the field of MPE necessitates assembling experts in diverse fields, including epidemiology, pathology, biostatistics, computational biology, bioinformatics, genomics, immunology, and nutritional and environmental sciences. Integrating these seemingly divergent fields can lead to a greater understanding of pathogenic processes. The International MPE Meeting Series fosters discussion that addresses the specific research questions and challenges in this emerging field. The purpose of the meeting series is to: discuss novel methods to integrate pathology and epidemiology; discuss studies that provide pathogenic insights into population impact; and educate next-generation scientists. Herein, we share the proceedings of the Fourth International MPE Meeting, held in Boston, MA, USA, on 30 May-1 June, 2018. Major themes of this meeting included 'integrated genetic and molecular pathologic epidemiology', 'immunology-MPE', and 'novel disease phenotyping'. The key priority areas for future research identified by meeting attendees included integration of tumor immunology and cancer disparities into epidemiologic studies, further collaboration between computational and population-level scientists to gain new insight on exposure-disease associations, and future pooling projects of studies with comparable data.

Subject(s)

Epidemiology , Pathology, Molecular , Humans , Neoplasms/epidemiology , Neoplasms/genetics , Neoplasms/immunology , Neoplasms/pathology

17.

lionessR: single sample network inference in R.

Kuijjer, Marieke L; Hsieh, Ping-Han; Quackenbush, John; Glass, Kimberly.

BMC Cancer ; 19(1): 1003, 2019 Oct 25.

Article in English | MEDLINE | ID: mdl-31653243

ABSTRACT

BACKGROUND: In biomedical research, network inference algorithms are typically used to infer complex association patterns between biological entities, such as between genes or proteins, using data from a population. This resulting aggregate network, in essence, averages over the networks of those individuals in the population. LIONESS (Linear Interpolation to Obtain Network Estimates for Single Samples) is a method that can be used together with a network inference algorithm to extract networks for individual samples in a population. The method's key characteristic is that, by modeling networks for individual samples in a data set, it can capture network heterogeneity in a population. LIONESS was originally made available as a function within the PANDA (Passing Attributes between Networks for Data Assimilation) regulatory network reconstruction framework. However, the LIONESS algorithm is generalizable and can be used to model single sample networks based on a wide range of network inference algorithms. RESULTS: In this software article, we describe lionessR, an R implementation of LIONESS that can be applied to any network inference method in R that outputs a complete, weighted adjacency matrix. As an example, we provide a vignette of an application of lionessR to model single sample networks based on correlated gene expression in a bone cancer dataset. We show how the tool can be used to identify differential patterns of correlation between two groups of patients. CONCLUSIONS: We developed lionessR, an open source R package to model single sample networks. We show how lionessR can be used to inform us on potential precision medicine applications in cancer. The lionessR package is a user-friendly tool to perform such analyses. The package, which includes a vignette describing the application, is freely available at: https://github.com/kuijjerlab/lionessR and at: http://bioconductor.org/packages/lionessR .

Subject(s)

Algorithms , Computational Biology/methods , Computer Simulation , Precision Medicine/methods , Software , Biopsy , Bone Neoplasms/genetics , Bone Neoplasms/pathology , Gene Regulatory Networks , Humans , Neoplasms/therapy , Osteosarcoma/genetics , Osteosarcoma/pathology , Survival Analysis , Transcriptome

18.

Ensemble genomic analysis in human lung tissue identifies novel genes for chronic obstructive pulmonary disease.

Morrow, Jarrett D; Cho, Michael H; Platig, John; Zhou, Xiaobo; DeMeo, Dawn L; Qiu, Weiliang; Celli, Bartholome; Marchetti, Nathaniel; Criner, Gerard J; Bueno, Raphael; Washko, George R; Glass, Kimberly; Quackenbush, John; Silverman, Edwin K; Hersh, Craig P.

Hum Genomics ; 12(1): 1, 2018 01 15.

Article in English | MEDLINE | ID: mdl-29335020

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) significantly associated with chronic obstructive pulmonary disease (COPD). However, many genetic variants show suggestive evidence for association but do not meet the strict threshold for genome-wide significance. Integrative analysis of multiple omics datasets has the potential to identify novel genes involved in disease pathogenesis by leveraging these variants in a functional, regulatory context. RESULTS: We performed expression quantitative trait locus (eQTL) analysis using genome-wide SNP genotyping and gene expression profiling of lung tissue samples from 86 COPD cases and 31 controls, testing for SNPs associated with gene expression levels. These results were integrated with a prior COPD GWAS using an ensemble statistical and network methods approach to identify relevant genes and observe them in the context of overall genetic control of gene expression to highlight co-regulated genes and disease pathways. We identified 250,312 unique SNPs and 4997 genes in the cis(local)-eQTL analysis (5% false discovery rate). The top gene from the integrative analysis was MAPT, a gene recently identified in an independent GWAS of lung function. The genes HNRNPAB and PCBP2 with RNA binding activity and the gene ACVR1B were identified in network communities with validated disease relevance. CONCLUSIONS: The integration of lung tissue gene expression with genome-wide SNP genotyping and subsequent intersection with prior GWAS and omics studies highlighted candidate genes within COPD loci and in communities harboring known COPD genes. This integration also identified novel disease genes in sub-threshold regions that would otherwise have been missed through GWAS.

Subject(s)

Genetic Predisposition to Disease , Genome, Human/genetics , Genome-Wide Association Study , Pulmonary Disease, Chronic Obstructive/genetics , Activin Receptors, Type I/genetics , Adult , Aged , Female , Gene Expression Regulation , Genomics , Heterogeneous-Nuclear Ribonucleoprotein Group A-B/genetics , Humans , Lung/metabolism , Male , Middle Aged , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/pathology , Quantitative Trait Loci/genetics , RNA-Binding Proteins/genetics , tau Proteins/genetics

19.

Inconsistency in large pharmacogenomic studies.

Haibe-Kains, Benjamin; El-Hachem, Nehme; Birkbak, Nicolai Juul; Jin, Andrew C; Beck, Andrew H; Aerts, Hugo J W L; Quackenbush, John.

Nature ; 504(7480): 389-93, 2013 Dec 19.

Article in English | MEDLINE | ID: mdl-24284626

ABSTRACT

Two large-scale pharmacogenomic studies were published recently in this journal. Genomic data are well correlated between studies; however, the measured drug response data are highly discordant. Although the source of inconsistencies remains uncertain, it has potential implications for using these outcome measures to assess gene-drug associations or select potential anticancer drugs on the basis of their reported results.

Subject(s)

Antineoplastic Agents/pharmacology , Pharmacogenetics , Area Under Curve , Cell Line , Drug Resistance, Neoplasm/drug effects , Drug Resistance, Neoplasm/genetics , Gene Expression Profiling , Genome, Human/genetics , Humans , Inhibitory Concentration 50 , Neoplasms/drug therapy , Neoplasms/genetics , Neoplasms/pathology , Reproducibility of Results

20.

Human Lung DNA Methylation Quantitative Trait Loci Colocalize with Chronic Obstructive Pulmonary Disease Genome-Wide Association Loci.

Morrow, Jarrett D; Glass, Kimberly; Cho, Michael H; Hersh, Craig P; Pinto-Plata, Victor; Celli, Bartolome; Marchetti, Nathaniel; Criner, Gerard; Bueno, Raphael; Washko, George; Choi, Augustine M K; Quackenbush, John; Silverman, Edwin K; DeMeo, Dawn L.

Am J Respir Crit Care Med ; 197(10): 1275-1284, 2018 05 15.

Article in English | MEDLINE | ID: mdl-29313708

ABSTRACT

RATIONALE: As the third leading cause of death in the United States, the impact of chronic obstructive pulmonary disease (COPD) makes identification of its molecular mechanisms of great importance. Genome-wide association studies (GWASs) have identified multiple genomic regions associated with COPD. However, genetic variation only explains a small fraction of the susceptibility to COPD, and sub-genome-wide significant loci may play a role in pathogenesis. OBJECTIVES: Regulatory annotation with epigenetic evidence may give priority for further investigation, particularly for GWAS associations in noncoding regions. We performed integrative genomics analyses using DNA methylation profiling and genome-wide SNP genotyping from lung tissue samples from 90 subjects with COPD and 36 control subjects. METHODS: We performed methylation quantitative trait loci (mQTL) analyses, testing for SNPs associated with percent DNA methylation and assessed the colocalization of these results with previous COPD GWAS findings using Bayesian methods in the R package coloc to highlight potential regulatory features of the loci. MEASUREMENTS AND MAIN RESULTS: We identified 942,068 unique SNPs and 33,996 unique CpG sites among the significant (5% false discovery rate) cis-mQTL results. The genome-wide significant and subthreshold (P < 10⁻⁴) GWAS SNPs were enriched in the significant mQTL SNPs (hypergeometric test P < 0.00001). We observed enrichment for sites located in CpG shores and shelves, but not CpG islands. Using Bayesian colocalization, we identified loci in regions near KCNK3, EEFSEC, PIK3CD, DCDC2C, TCERG1L, FRMD4B, and IL27. CONCLUSIONS: Colocalization of mQTL and GWAS loci provides regulatory characterization of significant and subthreshold GWAS findings, supporting a role for genetic control of methylation in COPD pathogenesis.

Subject(s)

DNA Methylation/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Lung/physiopathology , Pulmonary Disease, Chronic Obstructive/genetics , Adult , Aged , Aged, 80 and over , Epigenomics , Female , Gene Expression Regulation , Humans , Male , Middle Aged , Pulmonary Disease, Chronic Obstructive/epidemiology , Quantitative Trait Loci , United States/epidemiology
