Bayesian copy number detection and association in large-scale studies.
Cristiano, Stephen; McKean, David; Carey, Jacob; Bracci, Paige; Brennan, Paul; Chou, Michael; Du, Mengmeng; Gallinger, Steven; Goggins, Michael G; Hassan, Manal M; Hung, Rayjean J; Kurtz, Robert C; Li, Donghui; Lu, Lingeng; Neale, Rachel; Olson, Sara; Petersen, Gloria; Rabe, Kari G; Fu, Jack; Risch, Harvey; Rosner, Gary L; Ruczinski, Ingo; Klein, Alison P; Scharpf, Robert B.
Affiliation
  • Cristiano S; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
  • McKean D; Department of Oncology The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
  • Carey J; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
  • Bracci P; Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA.
  • Brennan P; Genetics Section, International Agency for Research on Cancer, Lyon, France.
  • Chou M; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
  • Du M; Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, 10065, NY, USA.
  • Gallinger S; Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, M5G 1X5, Ontario, Canada.
  • Goggins MG; Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
  • Hassan MM; Department of Pathology, Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins School of Medicine, Baltimore, MD, USA.
  • Hung RJ; Department of Epidemiology, Cancer Prevention & Population Sciences, UT MD Anderson Cancer Center, Houston, 77030, TX, USA.
  • Kurtz RC; Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, M5G 1X5, Ontario, Canada.
  • Li D; Department of Gastroenterology, Hepatology, and Nutrition Service, Memorial Sloan Kettering Cancer Center, New York, 10065, NY, USA.
  • Lu L; Department of Gastrointestinal Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, 77030, TX, USA.
  • Neale R; Department of Chronic Disease Epidemiology, Yale School of Public Health, Yale Cancer Center, New Haven, CT, USA.
  • Olson S; Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, 4029, Australia.
  • Petersen G; Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, 10065, NY, USA.
  • Rabe KG; Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, 55905, MN, USA.
  • Fu J; Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, 55905, MN, USA.
  • Risch H; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
  • Rosner GL; Department of Chronic Disease Epidemiology, Yale School of Public Health, Yale Cancer Center, New Haven, CT, USA.
  • Ruczinski I; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
  • Klein AP; Department of Epidemiology, Cancer Prevention & Population Sciences, UT MD Anderson Cancer Center, Houston, 77030, TX, USA.
  • Scharpf RB; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
BMC Cancer; 20(1): 856, 2020 Sep 07.
Article in En | MEDLINE | ID: mdl-32894098
ABSTRACT

BACKGROUND:

Germline copy number variants (CNVs) increase risk for many diseases, yet detecting CNVs and quantifying their contribution to disease risk in large-scale studies is challenging because biological and technical sources of heterogeneity vary across the genome, both within and between samples.

METHODS:

We developed an approach called CNPBayes to identify latent batch effects in genome-wide association studies involving copy number, to provide probabilistic estimates of integer copy number across the estimated batches, and to fully integrate the copy number uncertainty in the association model for disease.
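
To make the idea of probabilistic copy number estimates across batches concrete, the following is a minimal sketch, not the CNPBayes implementation. It assumes a Gaussian mixture at a single CNV region with batch-specific component means; the function names, batch labels, and all numeric parameters are hypothetical and chosen only for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def copy_number_posterior(x, batch, means, sds, weights):
    """Posterior P(copy number = k | x, batch) under a Gaussian mixture.

    means[batch][k] and sds[batch][k] are batch-specific component
    parameters; weights[k] is the prior frequency of copy number k.
    """
    unnorm = {
        k: weights[k] * normal_pdf(x, means[batch][k], sds[batch][k])
        for k in weights
    }
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}

# Hypothetical parameters: batch "B" is shifted upward relative to batch "A".
MEANS = {
    "A": {1: -0.50, 2: 0.00, 3: 0.30},
    "B": {1: -0.35, 2: 0.15, 3: 0.45},
}
SDS = {
    "A": {1: 0.10, 2: 0.10, 3: 0.10},
    "B": {1: 0.10, 2: 0.10, 3: 0.10},
}
WEIGHTS = {1: 0.05, 2: 0.90, 3: 0.05}

# The same observed intensity leads to different calls in the two batches.
x_obs = -0.25
post_a = copy_number_posterior(x_obs, "A", MEANS, SDS, WEIGHTS)
post_b = copy_number_posterior(x_obs, "B", MEANS, SDS, WEIGHTS)
print("batch A:", {k: round(p, 3) for k, p in post_a.items()})
print("batch B:", {k: round(p, 3) for k, p in post_b.items()})
```

Under these assumed parameters, the same measurement is most consistent with two copies in batch A and with a single-copy deletion in batch B, which is why ignoring a latent batch shift could bias copy number calls.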

RESULTS:

Applying a hidden Markov model (HMM) to identify CNVs in a large multi-site Pancreatic Cancer Case Control study (PanC4) of 7598 participants, we found CNV inference was highly sensitive to technical noise that varied appreciably among participants. Applying CNPBayes to this dataset, we found that the major sources of technical variation were linked to sample processing by the centralized laboratory and not the individual study sites. Modeling the latent batch effects at each CNV region hierarchically, we developed probabilistic estimates of copy number that were directly incorporated in a Bayesian regression model for pancreatic cancer risk. Candidate associations aided by this approach include deletions of 8q24 near regulatory elements of the tumor oncogene MYC and of Tumor Suppressor Candidate 3 (TUSC3).
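
One way to picture how copy number uncertainty can be carried into an association model is to marginalize the disease likelihood over each participant's posterior deletion probability. The sketch below is an assumption-laden illustration, not the authors' model: it uses a logistic model with a fixed intercept, a normal prior on the deletion effect, a grid approximation to the posterior, and a hypothetical toy dataset.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def marginal_log_lik(beta0, beta1, data):
    """Log-likelihood that sums over the unknown deletion status.

    data is a list of (y, p_del) pairs: y is 1 for a case and 0 for a
    control; p_del is the posterior probability that the participant
    carries the deletion.
    """
    p_case_del = logistic(beta0 + beta1)  # P(case | deletion)
    p_case_ref = logistic(beta0)          # P(case | no deletion)
    ll = 0.0
    for y, p_del in data:
        lik_del = p_case_del if y == 1 else 1.0 - p_case_del
        lik_ref = p_case_ref if y == 1 else 1.0 - p_case_ref
        ll += math.log(p_del * lik_del + (1.0 - p_del) * lik_ref)
    return ll

def grid_posterior(data, beta0, grid, prior_sd):
    """Normalized posterior weights for beta1 on a grid, with a N(0, prior_sd^2) prior."""
    log_post = [
        marginal_log_lik(beta0, b, data) - 0.5 * (b / prior_sd) ** 2
        for b in grid
    ]
    m = max(log_post)  # subtract the maximum for numerical stability
    w = [math.exp(lp - m) for lp in log_post]
    total = sum(w)
    return [v / total for v in w]

# Hypothetical toy data: (case status, posterior probability of deletion).
toy_data = [(1, 0.90), (1, 0.80), (1, 0.10), (1, 0.95),
            (0, 0.10), (0, 0.05), (0, 0.20), (0, 0.00)]

beta1_grid = [i * 0.25 for i in range(-12, 13)]  # -3.0 to 3.0
post = grid_posterior(toy_data, beta0=0.0, grid=beta1_grid, prior_sd=2.0)
post_mean = sum(b * p for b, p in zip(beta1_grid, post))
print("posterior mean of deletion log-odds ratio:", round(post_mean, 2))
```

Because each participant contributes a weighted mixture of the "deletion" and "no deletion" likelihoods, ambiguous calls pull the estimate less strongly than confident ones. A full Bayesian analysis would typically also estimate the intercept and sample copy number jointly with the regression parameters.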

CONCLUSIONS:

Laboratory effects may not account for the major sources of technical variation in genome-wide association studies. This study provides a robust Bayesian inferential framework for identifying latent batch effects, estimating copy number, and evaluating the role of copy number in heritable diseases.

Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Pancreatic Neoplasms / Genome, Human / Genetic Predisposition to Disease / DNA Copy Number Variations
Type of study: Diagnostic studies / Observational studies / Prognostic studies / Risk factors studies
Limits: Humans
Language: En
Journal: BMC Cancer
Journal subject: Neoplasms
Year: 2020
Type: Article
Affiliation country: United States