ABSTRACT
BACKGROUND: The use of cell-free DNA (cfDNA) as a biomarker in cancer is challenging due to the genetic heterogeneity of malignancies and the rarity of tumor-derived molecules. Here we describe and demonstrate a novel machine-learning guided panel design strategy for improving the detection of tumor variants in cfDNA. Using this approach, we first generated a model to classify and score candidate variants for inclusion on a prostate cancer targeted sequencing panel. We then used this panel to screen tumor variants from prostate cancer patients with localized disease in both in silico and hybrid capture settings. METHODS: Whole Genome Sequence (WGS) data from 550 prostate tumors were analyzed to build a targeted sequencing panel of single-point mutations and small (< 200 bp) indels, which was subsequently screened in silico against prostate tumor sequences from 5 patients to assess performance against commonly used alternative panel designs. The panel's ability to detect tumor-derived cfDNA variants was then assessed using prospectively collected cfDNA and tumor foci from a test set of 18 prostate cancer patients with localized disease undergoing radical prostatectomy. RESULTS: The panel generated from this approach identified mutations in known driver genes (e.g. HRAS) and prostate cancer related transcription factor binding sites (e.g. MYC, AR) as top candidates. It outperformed two commonly used designs in detecting somatic mutations found in the cfDNA of 5 prostate cancer patients when analyzed in an in silico setting. Additionally, hybrid capture and 2500X sequencing of cfDNA molecules using the panel resulted in detection of tumor variants in all 18 patients of a test set, and in 15 of the 18 patients the detected variants were found in multiple foci. CONCLUSION: Machine learning-prioritized targeted sequencing panels may prove useful for broad and sensitive variant detection in the cfDNA of heterogeneous diseases.
This strategy has implications for disease detection and monitoring when applied to the cfDNA isolated from prostate cancer patients.
Subject(s)
Base Sequence/genetics , Circulating Tumor DNA/genetics , Genome, Human , Machine Learning , Prostatic Neoplasms/genetics , Adult , Aged , Aged, 80 and over , Biomarkers, Tumor/genetics , Biomarkers, Tumor/isolation & purification , Circulating Tumor DNA/isolation & purification , Cohort Studies , Humans , Male , Middle Aged , Mutation , Sequence Analysis, DNA/methods , Whole Genome Sequencing/methods
ABSTRACT
Breast cancer is the most common solid organ malignancy and the most frequent cause of cancer death among women worldwide. Previous research has yielded insights into its genetic etiology, but there remains a gap in the understanding of genetic factors that contribute to risk, and particularly in the biological mechanisms by which genetic variation modulates risk. The National Cancer Institute's "Up for a Challenge" (U4C) competition provided an opportunity to further elucidate the genetic basis of the disease. Our group leveraged the seven datasets made available by the U4C organizers and data from the publicly available UK Biobank cohort to examine associations between imputed gene expression and breast cancer risk. In particular, we used reference datasets describing the breast tissue and whole blood transcriptomes to impute expression levels in breast cancer cases and controls. In trans-ethnic meta-analyses of U4C and UK Biobank data, we found significant associations between breast cancer risk and the expression of RCCD1 (joint p-value: 3.6 × 10⁻⁶) and DHODH (p-value: 7.1 × 10⁻⁶) in breast tissue, as well as a suggestive association for ANKLE1 (p-value: 9.3 × 10⁻⁵). Expression of RCCD1 in whole blood was also suggestively associated with disease risk (p-value: 1.2 × 10⁻⁵), as were expression of ACAP1 (p-value: 1.9 × 10⁻⁵) and LRRC25 (p-value: 5.2 × 10⁻⁵). While genome-wide association studies (GWAS) have implicated RCCD1 and ANKLE1 in breast cancer risk, they have not identified the remaining three genes. Among the genetic variants that contributed to the predicted expression of the five genes, we found 23 nominally (p-value < 0.05) associated with breast cancer risk, among which 15 are not in high linkage disequilibrium with risk variants previously identified by GWAS. In summary, we used a transcriptome-based approach to investigate the genetic underpinnings of breast carcinogenesis.
This approach provided an avenue for deciphering the functional relevance of genes and genetic variants involved in breast cancer.
Subject(s)
Breast Neoplasms/genetics , Carrier Proteins/genetics , GTPase-Activating Proteins/genetics , Genetic Predisposition to Disease , Membrane Proteins/genetics , Quantitative Trait Loci/genetics , Breast/metabolism , Breast/pathology , Breast Neoplasms/blood , Breast Neoplasms/pathology , Carrier Proteins/blood , Endonucleases/blood , Endonucleases/genetics , Ethnicity , Female , GTPase-Activating Proteins/blood , Genome-Wide Association Study , Humans , Membrane Proteins/blood , Polymorphism, Single Nucleotide , Risk Factors , Transcriptome/genetics
ABSTRACT
BACKGROUND: Benign tissue from a tumor-containing organ is commonly the only available source for obtaining a patient's unmutated genome for use in cancer research. While it is critical to identify histologically normal tissue that is independent of the tumor lineage, few additional considerations are applied to the choice of this material for such measurements. METHODS: Normal formalin-fixed, paraffin-embedded seminal vesicle and urethral tissues, in addition to whole blood, were collected from 31 prostate cancer patients who had undergone radical prostatectomy. Genotype concordance was evaluated for DNA from each tissue source in relation to whole blood. RESULTS: Overall, there was a greater genotype call rate for DNA derived from urethral tissue (97.0%) in comparison with patient-matched seminal vesicle tissues (95.9%, P = 0.0015). Furthermore, with reference to patient-matched whole blood, urethral samples exhibited higher genotype concordance (94.1%) than seminal vesicle samples (92.5%, P = 0.035). CONCLUSIONS: These findings highlight the heterogeneity between diverse sources of DNA in genotype measurement and motivate the consideration of normal tissue biases in tumor-normal analyses. Prostate 77: 425-434, 2017. © 2016 Wiley Periodicals, Inc.
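The two metrics reported above, genotype call rate and concordance with whole blood, can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions used, so this assumes that call rate is the fraction of assayed markers with a call and that concordance is computed only over markers called in both sources. The function names and the toy genotypes are hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence

Genotype = Optional[str]  # e.g. "AA", "AG", "GG"; None marks a no-call


def call_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of assayed markers that received a genotype call."""
    if not calls:
        raise ValueError("no markers supplied")
    return sum(g is not None for g in calls) / len(calls)


def concordance(sample: Sequence[Genotype], reference: Sequence[Genotype]) -> float:
    """Fraction of markers called in both sources whose genotypes agree.

    Assumption: markers with a no-call in either source are excluded.
    """
    if len(sample) != len(reference):
        raise ValueError("genotype lists must cover the same markers")
    both = [(s, r) for s, r in zip(sample, reference)
            if s is not None and r is not None]
    if not both:
        raise ValueError("no markers called in both sources")
    return sum(s == r for s, r in both) / len(both)


# Toy data, invented for illustration only.
blood = ["AA", "AG", "GG", "AG", "AA"]
urethra = ["AA", "AG", "GG", None, "AG"]

print(call_rate(urethra))           # 4 of 5 markers called
print(concordance(urethra, blood))  # 3 of 4 jointly called markers agree
```

Restricting concordance to jointly called markers keeps missing data from being counted as disagreement; a different denominator would give different percentages.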
Subject(s)
DNA/genetics , Genotype , Prostatectomy/standards , Prostatic Neoplasms/genetics , Prostatic Neoplasms/surgery , Adult Germline Stem Cells/physiology , Aged , Humans , Male , Middle Aged , Prostatic Neoplasms/diagnosis , Seminal Vesicles/pathology , Seminal Vesicles/physiology , Seminal Vesicles/surgery , Treatment Outcome , Urethra/pathology , Urethra/physiology , Urethra/surgery
ABSTRACT
Prostate cancer is the most commonly diagnosed neoplasm in American men. Although existing biomarkers may detect localized prostate cancer, additional strategies are necessary for improving detection and identifying aggressive disease that may require further intervention. One promising, minimally invasive biomarker is cell-free DNA (cfDNA), which consists of short DNA fragments released into circulation by dying or lysed cells that may reflect underlying cancer. Here we investigated whether differences in cfDNA concentration and cfDNA fragment size could improve the sensitivity for detecting more advanced and aggressive prostate cancer. This study included 268 individuals: 34 healthy controls, 112 men with localized prostate cancer who underwent radical prostatectomy (RP), and 122 men with metastatic castration-resistant prostate cancer (mCRPC). Plasma cfDNA concentration and fragment size were quantified with the Qubit 3.0 and the 2100 Bioanalyzer. The potential relationship between cfDNA concentration or fragment size and localized prostate cancer or mCRPC was evaluated with descriptive statistics, logistic regression, and area under the curve analysis with cross-validation. Plasma cfDNA concentrations were elevated in mCRPC patients in comparison to patients with localized disease (OR per 5 ng/mL = 1.34, P = 0.027) or to controls (OR per 5 ng/mL = 1.69, P = 0.034). Decreased average fragment size was associated with an increased risk of localized disease compared to controls (OR per 5 bp = 0.77, P = 0.0008). This study suggests that while cfDNA concentration can identify mCRPC patients, it is unable to distinguish between healthy individuals and patients with localized prostate cancer. In addition to PSA, average cfDNA fragment size may be an alternative that can differentiate between healthy individuals and those with localized disease, but its low sensitivity and specificity make it an imperfect diagnostic marker.
While quantification of cfDNA may provide a quick, cost-effective approach to help guide treatment decisions in advanced disease, its use is limited in the setting of localized prostate cancer.
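The odds ratios in this abstract are quoted per 5-unit increment (5 ng/mL of cfDNA or 5 bp of fragment size). As a minimal sketch, assuming the usual logistic model in which the log-odds are linear in the predictor, an odds ratio for an increment of k units equals exp(k × β), where β is the per-unit coefficient. The helper below is hypothetical and only re-expresses a reported odds ratio on a different increment; it is not the authors' analysis code.

```python
import math


def rescale_odds_ratio(odds_ratio: float, from_units: float, to_units: float) -> float:
    """Re-express an odds ratio quoted per `from_units` as one per `to_units`.

    Assumes log-odds that are linear in the predictor (standard logistic regression).
    """
    if odds_ratio <= 0 or from_units <= 0:
        raise ValueError("odds_ratio and from_units must be positive")
    beta_per_unit = math.log(odds_ratio) / from_units  # per-unit log odds ratio
    return math.exp(beta_per_unit * to_units)


# Reported: OR = 1.34 per 5 ng/mL (mCRPC versus localized disease).
print(rescale_odds_ratio(1.34, from_units=5, to_units=1))   # per 1 ng/mL
print(rescale_odds_ratio(1.34, from_units=5, to_units=10))  # per 10 ng/mL, i.e. 1.34 squared
```

The same conversion applies to the fragment-size estimate: an odds ratio of 0.77 per 5 bp means the odds of localized disease fall as average fragment size rises, which matches the statement that shorter fragments carry higher risk.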
Subject(s)
Biomarkers, Tumor/genetics , Cell-Free Nucleic Acids/genetics , Kallikreins/genetics , Prostate-Specific Antigen/genetics , Prostatectomy/methods , Prostatic Neoplasms, Castration-Resistant/diagnosis , Prostatic Neoplasms/diagnosis , Adult , Aged , Aged, 80 and over , Area Under Curve , Biomarkers, Tumor/blood , Case-Control Studies , Cell-Free Nucleic Acids/blood , Humans , Kallikreins/blood , Logistic Models , Male , Middle Aged , Prostate/metabolism , Prostate/pathology , Prostate/surgery , Prostate-Specific Antigen/blood , Prostatic Neoplasms/blood , Prostatic Neoplasms/genetics , Prostatic Neoplasms/surgery , Prostatic Neoplasms, Castration-Resistant/blood , Prostatic Neoplasms, Castration-Resistant/genetics , Prostatic Neoplasms, Castration-Resistant/surgery , ROC Curve
ABSTRACT
Cell-free DNA (cfDNA) may allow for minimally invasive identification of biologically relevant genomic alterations and genetically distinct tumor subclones. Although existing biomarkers may detect localized prostate cancer, additional strategies interrogating genomic heterogeneity are necessary for identifying and monitoring aggressive disease. In this study, we aimed to evaluate whether circulating tumor DNA can detect genomic alterations present in multiple regions of localized prostate tumor tissue. METHODS: Low-pass whole-genome and targeted sequencing with a machine-learning guided 2.5-Mb targeted panel were used to identify single nucleotide variants, small insertions and deletions (indels), and copy-number alterations in cfDNA. The majority of this study focuses on the subset of 21 patients with localized disease, although 45 total individuals were evaluated, including 15 healthy controls and nine men with metastatic castration-resistant prostate cancer. Plasma cfDNA was barcoded with duplex unique molecular identifiers. For localized cases, matched tumor tissue was collected from multiple regions (one to nine samples per patient) for comparison. RESULTS: Somatic tumor variants present in heterogeneous tumor foci from patients with localized disease were detected in cfDNA, and cfDNA mutational burden was found to track with disease severity. Somatic tissue alterations were identified in cfDNA, including nonsynonymous variants in FOXA1, PTEN, MED12, and ATM. Detection of these overlapping variants was associated with seminal vesicle invasion (P = .019) and with the number of variants initially found in the matched tumor tissue samples (P = .0005). CONCLUSION: Our findings demonstrate the potential of targeted cfDNA sequencing to detect somatic tissue alterations in heterogeneous, localized prostate cancer, especially in a setting where matched tumor tissue may be unavailable (ie, active surveillance or treatment monitoring).
Subject(s)
Cell-Free Nucleic Acids/blood , Cell-Free Nucleic Acids/genetics , Mutation , Prostatic Neoplasms/blood , Prostatic Neoplasms/genetics , Adult , Aged , Genome , Humans , Male , Middle Aged , Sequence Analysis, DNA , Young Adult
ABSTRACT
Even distinct cancer types share biological hallmarks. Here, we investigate polygenic risk score (PRS)-specific pleiotropy across 16 cancers in European ancestry individuals from the Genetic Epidemiology Research on Adult Health and Aging cohort (16,012 cases, 50,552 controls) and UK Biobank (48,969 cases, 359,802 controls). Within cohorts, each PRS is evaluated in multivariable logistic regression models against all other cancer types. Results are then meta-analyzed across cohorts. Ten positive and one inverse cross-cancer associations are found after multiple testing correction. Two pairs show bidirectional associations; the melanoma PRS is positively associated with oral cavity/pharyngeal cancer and vice versa, whereas the lung cancer PRS is positively associated with oral cavity/pharyngeal cancer, and the oral cavity/pharyngeal cancer PRS is inversely associated with lung cancer. Overall, we validate known, and uncover previously unreported, patterns of pleiotropy that have the potential to inform investigations of risk prediction, shared etiology, and precision cancer prevention strategies.
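The two computational steps named above, scoring individuals with a PRS and meta-analyzing per-cohort estimates, can be sketched as follows. The abstract does not state how the scores were built or which meta-analysis model was used, so this is a minimal illustration under two assumptions: the PRS is a weighted sum of risk-allele dosages, and cohorts are pooled with fixed-effect inverse-variance weighting. All function names and numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0 to 2 per variant).

    Assumption: weights are per-allele log odds ratios from prior studies.
    """
    if len(dosages) != len(weights):
        raise ValueError("one weight is required per variant")
    return sum(d * w for d, w in zip(dosages, weights))


def fixed_effect_meta(estimates: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Pool (log odds ratio, standard error) pairs by inverse-variance weighting."""
    if not estimates:
        raise ValueError("at least one cohort estimate is required")
    inv_var = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * beta for w, (beta, _) in zip(inv_var, estimates)) / sum(inv_var)
    pooled_se = math.sqrt(1.0 / sum(inv_var))
    return pooled, pooled_se


# Invented example: one person's dosages at three variants and their weights.
score = polygenic_risk_score([0, 1, 2], [0.10, 0.20, -0.05])

# Invented example: the same PRS-cancer association estimated in two cohorts.
pooled_beta, pooled_se = fixed_effect_meta([(0.10, 0.05), (0.20, 0.05)])
print(score, math.exp(pooled_beta), pooled_se)
```

With equal standard errors the pooled log odds ratio is the simple average of the two cohort estimates; a larger cohort with a smaller standard error would pull the pooled estimate toward its own value.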