ABSTRACT
Cell-free DNA in the blood provides a non-invasive diagnostic avenue for patients with cancer¹. However, characteristics of the origins and molecular features of cell-free DNA are poorly understood. Here we developed an approach to evaluate fragmentation patterns of cell-free DNA across the genome, and found that profiles of healthy individuals reflected nucleosomal patterns of white blood cells, whereas patients with cancer had altered fragmentation profiles. We used this method to analyse the fragmentation profiles of 236 patients with breast, colorectal, lung, ovarian, pancreatic, gastric or bile duct cancer and 245 healthy individuals. A machine learning model that incorporated genome-wide fragmentation features had sensitivities of detection ranging from 57% to more than 99% among the seven cancer types at 98% specificity, with an overall area under the curve value of 0.94. Fragmentation profiles could be used to identify the tissue of origin of the cancers to a limited number of sites in 75% of cases. Combining our approach with mutation-based cell-free DNA analyses detected 91% of patients with cancer. The results of these analyses highlight important properties of cell-free DNA and provide a proof-of-principle approach for the screening, early detection and monitoring of human cancer.
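The abstract reports sensitivity at a fixed 98% specificity. As a minimal sketch of how such a figure can be computed from classifier scores, the Python below picks the score threshold from the healthy group and then counts the cancer samples above it. The function name, the "higher score means more cancer-like" convention and the example scores are assumptions made for illustration; they are not taken from the study.

```python
import math


def sensitivity_at_specificity(control_scores, case_scores, specificity=0.98):
    """Return (threshold, sensitivity) at the requested specificity.

    Assumption: a higher score means "more cancer-like", and a sample is
    called positive when its score is strictly above the threshold.
    """
    if not control_scores or not case_scores:
        raise ValueError("both score lists must be non-empty")
    ranked = sorted(control_scores)
    # Number of controls that must fall at or below the threshold.
    # The small epsilon guards against floating-point noise in the product.
    n_needed = max(1, math.ceil(specificity * len(ranked) - 1e-9))
    threshold = ranked[n_needed - 1]
    detected = sum(1 for score in case_scores if score > threshold)
    return threshold, detected / len(case_scores)


# Hypothetical scores: 100 controls and 4 cases.
controls = list(range(100))
cases = [50, 98, 99, 150]
threshold, sensitivity = sensitivity_at_specificity(controls, cases)
```

With these made-up inputs, two of the 100 controls score above the chosen threshold (98% specificity) and three of the four cases are detected. In practice the threshold would be fixed on training data and then applied unchanged to held-out samples.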
Subjects
Circulating Tumor DNA/blood, Circulating Tumor DNA/genetics, DNA Fragmentation, Human Genome/genetics, Neoplasms/diagnosis, Neoplasms/genetics, Case-Control Studies, Cohort Studies, DNA Mutational Analysis, Humans, Machine Learning, Mutation, Neoplasms/blood, Neoplasms/pathology
ABSTRACT
BACKGROUND: Germline copy number variants (CNVs) increase risk for many diseases, yet detection of CNVs and quantifying their contribution to disease risk in large-scale studies is challenging due to biological and technical sources of heterogeneity that vary across the genome within and between samples. METHODS: We developed an approach called CNPBayes to identify latent batch effects in genome-wide association studies involving copy number, to provide probabilistic estimates of integer copy number across the estimated batches, and to fully integrate the copy number uncertainty in the association model for disease. RESULTS: Applying a hidden Markov model (HMM) to identify CNVs in a large multi-site Pancreatic Cancer Case Control study (PanC4) of 7598 participants, we found CNV inference was highly sensitive to technical noise that varied appreciably among participants. Applying CNPBayes to this dataset, we found that the major sources of technical variation were linked to sample processing by the centralized laboratory and not the individual study sites. Modeling the latent batch effects at each CNV region hierarchically, we developed probabilistic estimates of copy number that were directly incorporated in a Bayesian regression model for pancreatic cancer risk. Candidate associations aided by this approach include deletions of 8q24 near regulatory elements of the tumor oncogene MYC and of Tumor Suppressor Candidate 3 (TUSC3). CONCLUSIONS: Laboratory effects may not account for the major sources of technical variation in genome-wide association studies. This study provides a robust Bayesian inferential framework for identifying latent batch effects, estimating copy number, and evaluating the role of copy number in heritable diseases.
Subjects
DNA Copy Number Variations/genetics, Genetic Predisposition to Disease, Human Genome/genetics, Pancreatic Neoplasms/genetics, Bayes Theorem, Case-Control Studies, Genome-Wide Association Study, Humans, Membrane Proteins/genetics, Pancreatic Neoplasms/pathology, Proto-Oncogene Proteins c-myc/genetics, Tumor Suppressor Proteins/genetics
ABSTRACT
By sequencing the exomes of distantly related individuals in multiplex families, rare mutational and structural changes to coding DNA can be characterized and their relationship to disease risk can be assessed. Recently, several rare single nucleotide variants (SNVs) were associated with an increased risk of nonsyndromic oral cleft, highlighting the importance of rare sequence variants in oral clefts and illustrating the strength of family-based study designs. However, the extent to which rare deletions in coding regions of the genome occur and contribute to risk of nonsyndromic clefts is not well understood. To identify putative structural variants underlying risk, we developed a pipeline for rare hemizygous deletions in families from whole exome sequencing and statistical inference based on rare variant sharing. Among 56 multiplex families with 115 individuals, we identified 53 regions with one or more rare hemizygous deletions. We found 45 of the 53 regions contained rare deletions occurring in only one family member. Members of the same family shared a rare deletion in only eight regions. We also devised a scalable global test for enrichment of shared rare deletions.
Subjects
Biomarkers/analysis, Cleft Palate/genetics, Exome/genetics, Gene Deletion, Genetic Variation/genetics, Algorithms, Family, Female, Human Genome, High-Throughput Nucleotide Sequencing, Humans, Male
ABSTRACT
BACKGROUND: Hyperuricemia is associated with multiple diseases, including gout, cardiovascular disease, and renal disease. Serum urate is highly heritable, yet association studies of single nucleotide polymorphisms (SNPs) and serum uric acid explain a small fraction of the heritability. Whether copy number polymorphisms (CNPs) contribute to uric acid levels is unknown. RESULTS: We assessed copy number on a genome-wide scale among 8,411 individuals of European ancestry (EA) who participated in the Atherosclerosis Risk in Communities (ARIC) study. CNPs upstream of the urate transporter SLC2A9 on chromosome 4p16.1 are associated with uric acid (χ²(df = 2) = 3545, p = 3.19×10^-23). Effect sizes, expressed as the percentage change in uric acid per deleted copy, are most pronounced among women (50th percentile 4.93; 2.5th and 97.5th percentiles 3.97 and 5.87; p = 4.57×10^-23) and independent of previously reported SNPs in SLC2A9 as assessed by SNP and CNP regression models and the phasing of SNP and CNP haplotypes (χ²(df = 2) = 3190, p = 7.23×10^-08). Our finding is replicated in the Framingham Heart Study (FHS), where the effect size estimated from 4,089 women is comparable to ARIC in direction and magnitude (50th percentile 4.70; 2.5th and 97.5th percentiles 1.41 and 7.88; p = 5.46×10^-03). CONCLUSIONS: This is the first study to characterize CNPs in ARIC and the first genome-wide analysis of CNPs and uric acid. Our findings suggest a novel, non-coding regulatory mechanism for SLC2A9-mediated modulation of serum uric acid, and detail a bioinformatic approach for assessing the contribution of CNPs to heritable traits in large population-based studies where technical sources of variation are substantial.
Subjects
DNA Copy Number Variations, Facilitative Glucose Transport Proteins/genetics, Uric Acid/blood, Female, Gene Frequency, Genome-Wide Association Study, Genotype, Humans, Male, Middle Aged, Statistical Models, Organic Anion Transporters/genetics, Single Nucleotide Polymorphism, Regression Analysis, White People/genetics
ABSTRACT
Circulating cell-free DNA (cfDNA) is emerging as an avenue for cancer detection, but the characteristics of cfDNA fragmentation in the blood are poorly understood. We evaluate the effect of DNA methylation and gene expression on genome-wide cfDNA fragmentation through analysis of 969 individuals. cfDNA fragment ends more frequently contained CCs or CGs, and fragments ending with CGs or CCGs are enriched or depleted, respectively, at methylated CpG positions. Higher levels and larger sizes of cfDNA fragments are associated with CpG methylation and reduced gene expression. These effects are validated in mice with isogenic tumors with or without the mutant IDH1, and are associated with genome-wide changes in cfDNA fragmentation in patients with cancer. Tumor-related hypomethylation and increased gene expression are associated with a decrease in cfDNA fragment size that may explain smaller cfDNA fragments in human cancers. These results provide a connection between epigenetic changes and cfDNA fragmentation with implications for disease detection.
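The observation that fragment ends more often contain CC or CG rests on tallying short motifs at fragment ends. The following Python is a minimal sketch of such a tally, not the study's pipeline. It assumes each fragment is supplied as a 5' to 3' sequence string and that only the first k bases are counted; the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def end_motif_frequencies(fragments, k=2):
    """Return the relative frequency of each k-base motif at fragment 5' ends.

    Assumption: every fragment is a 5'->3' sequence string. Fragments shorter
    than k, or whose end contains bases other than A/C/G/T, are skipped.
    """
    counts = Counter()
    for sequence in fragments:
        motif = sequence.upper()[:k]
        if len(motif) == k and set(motif) <= set("ACGT"):
            counts[motif] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {motif: count / total for motif, count in counts.items()}


# Toy input: four made-up fragment sequences, one with an ambiguous base.
toy_fragments = ["CCGTA", "CGTTA", "ccaaa", "ATGCA", "NNACG"]
frequencies = end_motif_frequencies(toy_fragments)
```

Here the fragment beginning with an ambiguous base is ignored, so CC accounts for half of the four counted ends. A real analysis would read fragment coordinates from aligned sequencing data and look up the reference bases, which this sketch leaves out.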
Subjects
Cell-Free Nucleic Acids, CpG Islands, DNA Fragmentation, DNA Methylation, Neoplasms, Humans, Cell-Free Nucleic Acids/genetics, Cell-Free Nucleic Acids/blood, Animals, Mice, CpG Islands/genetics, Neoplasms/genetics, Genetic Epigenesis, Female, Isocitrate Dehydrogenase/genetics, Male, Neoplastic Gene Expression Regulation
ABSTRACT
Circulating cell-free DNA (cfDNA) assays for monitoring individuals with cancer typically rely on prior identification of tumor-specific mutations. Here, we develop a tumor-independent and mutation-independent approach (DELFI-tumor fraction, DELFI-TF) using low-coverage whole genome sequencing to determine the cfDNA tumor fraction and validate the method in two independent cohorts of patients with colorectal or lung cancer. DELFI-TF scores strongly correlate with circulating tumor DNA levels (ctDNA) (r = 0.90, p < 0.0001, Pearson correlation) even in cases where mutations are undetectable. DELFI-TF scores prior to therapy initiation are associated with clinical response and are independent predictors of overall survival (HR = 9.84, 95% CI = 1.72-56.10, p < 0.0001). Patients with lower DELFI-TF scores during treatment have longer overall survival (62.8 vs 29.1 months, HR = 3.12, 95% CI 1.62-6.00, p < 0.001) and the approach predicts clinical outcomes more accurately than imaging. These results demonstrate the potential of using cfDNA fragmentomes to estimate tumor burden in cfDNA for treatment response monitoring and clinical outcome prediction.
Subjects
Tumor Biomarkers, Cell-Free Nucleic Acids, Circulating Tumor DNA, Colorectal Neoplasms, Lung Neoplasms, Humans, Circulating Tumor DNA/genetics, Circulating Tumor DNA/blood, Female, Male, Tumor Biomarkers/genetics, Cell-Free Nucleic Acids/genetics, Cell-Free Nucleic Acids/blood, Colorectal Neoplasms/genetics, Colorectal Neoplasms/mortality, Lung Neoplasms/genetics, Lung Neoplasms/mortality, Middle Aged, Mutation, Aged, Whole Genome Sequencing/methods, Prognosis, Neoplasms/genetics, Neoplasms/therapy, Neoplasms/mortality
ABSTRACT
Ovarian cancer is a leading cause of death for women worldwide in part due to ineffective screening methods. In this study, we used whole-genome cell-free DNA (cfDNA) fragmentome and protein biomarker (CA-125 and HE4) analyses to evaluate 591 women with ovarian cancer, benign adnexal masses, or without ovarian lesions. Using a machine learning model with the combined features, we detected ovarian cancer with specificity >99% and sensitivity of 72%, 69%, 87%, and 100% for stages I-IV, respectively. At the same specificity, CA-125 alone detected 34%, 62%, 63%, and 100% of ovarian cancers for stages I-IV. Our approach differentiated benign masses from ovarian cancers with high accuracy (AUC=0.88, 95% CI=0.83-0.92). These results were validated in an independent population. These findings show that integrated cfDNA fragmentome and protein analyses detect ovarian cancers with high performance, enabling a new accessible approach for noninvasive ovarian cancer screening and diagnostic evaluation.
ABSTRACT
BACKGROUND: The diagnostic workup of individuals suspected of having lung cancer can be complex and protracted because conventional symptoms of lung cancer have low specificity and sensitivity. RESEARCH QUESTION: Among individuals with symptoms of lung cancer, can a blood-based approach to analyze cell-free DNA (cfDNA) fragmentation (the DNA evaluation of fragments for early interception [DELFI] score) enhance evaluation for the possible presence of lung cancer? STUDY DESIGN AND METHODS: Adults were referred to Bispebjerg Hospital (Copenhagen, Denmark) for diagnostic evaluation of initial imaging anomalies and symptoms consistent with lung cancer. Numbers and types of symptoms were extracted from medical records. cfDNA from plasma samples obtained at the prediagnostic visit was isolated, sequenced, and analyzed for genome-wide cfDNA fragmentation patterns. The relationships among clinical presentation, cancer status, and DELFI score were examined. RESULTS: A total of 296 individuals were analyzed. Median DELFI scores were higher for those with lung cancer (n = 98) than those without cancer (n = 198; 0.94 vs 0.19; P < .001). In a multivariate model adjusted for age, smoking history, and presenting symptoms, the addition of the DELFI score improved the prediction of lung cancer for those who demonstrated symptoms (area under the receiver operating characteristic curve, 0.74-0.94). INTERPRETATION: The DELFI score distinguishes individuals with lung cancer from those without cancer better than suspicious symptoms do. These results represent proof-of-concept support that fragmentation-based biomarker approaches may facilitate diagnostic resolution for patients with concerning symptoms of lung cancer.
Subjects
Cell-Free Nucleic Acids, Lung Neoplasms, Adult, Humans, Lung Neoplasms/genetics, Biomarkers, DNA, ROC Curve, Tumor Biomarkers
ABSTRACT
Somatic mutations are a hallmark of tumorigenesis and may be useful for non-invasive diagnosis of cancer. We analyzed whole-genome sequencing data from 2,511 individuals in the Pan-Cancer Analysis of Whole Genomes (PCAWG) study as well as 489 individuals from four prospective cohorts and found distinct regional mutation type-specific frequencies in tissue and cell-free DNA from patients with cancer that were associated with replication timing and other chromatin features. A machine-learning model using genome-wide mutational profiles combined with other features and followed by CT imaging detected >90% of patients with lung cancer, including those with stage I and II disease. The fixed model was validated in an independent cohort, detected patients with cancer earlier than standard approaches and could be used to monitor response to therapy. This approach lays the groundwork for non-invasive cancer detection using genome-wide mutation features that may facilitate cancer screening and monitoring.
Subjects
Cell-Free Nucleic Acids, Lung Neoplasms, Neoplasms, Humans, Prospective Studies, Mutation, Neoplasms/diagnosis, Neoplasms/genetics, Mutation Rate, Lung Neoplasms/diagnosis, Lung Neoplasms/genetics
ABSTRACT
Liver cancer is a major cause of cancer mortality worldwide. Screening individuals at high risk, including those with cirrhosis and viral hepatitis, provides an avenue for improved survival, but current screening methods are inadequate. In this study, we used whole-genome cell-free DNA (cfDNA) fragmentome analyses to evaluate 724 individuals from the United States, the European Union, or Hong Kong with hepatocellular carcinoma (HCC) or who were at average or high risk for HCC. Using a machine learning model that incorporated multifeature fragmentome data, the sensitivity for detecting cancer was 88% in an average-risk population at 98% specificity and 85% among high-risk individuals at 80% specificity. We validated these results in an independent population. cfDNA fragmentation changes reflected genomic and chromatin changes in liver cancer, including from transcription factor binding sites. These findings provide a biological basis for changes in cfDNA fragmentation in patients with liver cancer and provide an accessible approach for noninvasive cancer detection. SIGNIFICANCE: There is a great need for accessible and sensitive screening approaches for HCC worldwide. We have developed an approach for examining genome-wide cfDNA fragmentation features to provide a high-performing and cost-effective approach for liver cancer detection. See related commentary by Rolfo and Russo, p. 532. This article is highlighted in the In This Issue feature, p. 517.
Subjects
Hepatocellular Carcinoma, Cell-Free Nucleic Acids, Liver Neoplasms, Humans, Liver Neoplasms/diagnosis, Liver Neoplasms/genetics, Liver Neoplasms/pathology, Hepatocellular Carcinoma/diagnosis, Hepatocellular Carcinoma/genetics, Hepatocellular Carcinoma/pathology, Cell-Free Nucleic Acids/genetics, Liver Cirrhosis/genetics, Liver Cirrhosis/pathology
ABSTRACT
Non-invasive approaches for cell-free DNA (cfDNA) assessment provide an opportunity for cancer detection and intervention. Here, we use a machine learning model for detecting tumor-derived cfDNA through genome-wide analyses of cfDNA fragmentation in a prospective study of 365 individuals at risk for lung cancer. We validate the cancer detection model using an independent cohort of 385 non-cancer individuals and 46 lung cancer patients. Combining fragmentation features, clinical risk factors, and CEA levels, followed by CT imaging, detected 94% of patients with cancer across stages and subtypes, including 91% of stage I/II and 96% of stage III/IV, at 80% specificity. Genome-wide fragmentation profiles across ~13,000 ASCL1 transcription factor binding sites distinguished individuals with small cell lung cancer from those with non-small cell lung cancer with high accuracy (AUC = 0.98). A higher fragmentation score represented an independent prognostic indicator of survival. This approach provides a facile avenue for non-invasive detection of lung cancer.
Subjects
Circulating Tumor DNA/metabolism, DNA Fragmentation, Lung Neoplasms/diagnosis, Lung Neoplasms/genetics, Adolescent, Adult, Aged, Aged 80 and over, Apoptosis, Non-Small-Cell Lung Carcinoma/diagnosis, Non-Small-Cell Lung Carcinoma/genetics, Non-Small-Cell Lung Carcinoma/pathology, Tumor Cell Line, Differential Diagnosis, Early Detection of Cancer, Female, Human Genome, Humans, Lung Neoplasms/pathology, Male, Middle Aged, Biological Models, Neoplasm Metastasis, Neoplasm Staging, Small Cell Lung Carcinoma/diagnosis, Small Cell Lung Carcinoma/genetics, Small Cell Lung Carcinoma/pathology, Young Adult
ABSTRACT
With the advent of precision oncology, there is an urgent need to develop improved methods for rapidly detecting responses to targeted therapies. Here, we have developed an ultrasensitive measure of cell-free tumor load using targeted and whole-genome sequencing approaches to assess responses to tyrosine kinase inhibitors in patients with advanced lung cancer. Analyses of 28 patients treated with anti-EGFR or HER2 therapies revealed a bimodal distribution of cell-free circulating tumor DNA (ctDNA) after therapy initiation, with molecular responders having nearly complete elimination of ctDNA (>98%). Molecular nonresponders displayed limited changes in ctDNA levels posttreatment and experienced significantly shorter progression-free survival (median 1.6 vs. 13.7 months, P < 0.0001; HR = 66.6; 95% confidence interval, 13.0-341.7), which was detected on average 4 weeks earlier than CT imaging. ctDNA analyses of patients with radiographic stable or nonmeasurable disease improved prediction of clinical outcome compared with CT imaging. These analyses provide a rapid approach for evaluating therapeutic response to targeted therapies and have important implications for the management of patients with cancer and the development of new therapeutics. SIGNIFICANCE: Cell-free tumor load provides a novel approach for evaluating longitudinal changes in ctDNA during systemic treatment with tyrosine kinase inhibitors and serves an unmet clinical need for real-time, noninvasive detection of tumor response to targeted therapies before radiographic assessment. See related commentary by Zou and Meyerson, p. 1038.
Subjects
Tumor Biomarkers/analysis, Non-Small-Cell Lung Carcinoma/pathology, Circulating Tumor DNA/analysis, Neoplasm DNA/analysis, Molecular Targeted Therapy, Mutation, Protein Kinase Inhibitors/therapeutic use, Adenocarcinoma/drug therapy, Adenocarcinoma/genetics, Adenocarcinoma/pathology, Aged, Aged 80 and over, Tumor Biomarkers/genetics, Non-Small-Cell Lung Carcinoma/drug therapy, Non-Small-Cell Lung Carcinoma/genetics, Squamous Cell Carcinoma/drug therapy, Squamous Cell Carcinoma/genetics, Squamous Cell Carcinoma/pathology, Circulating Tumor DNA/genetics, Neoplasm DNA/genetics, Female, Follow-Up Studies, Neoplastic Gene Expression Regulation/drug effects, Humans, Lung Neoplasms/drug therapy, Lung Neoplasms/genetics, Lung Neoplasms/pathology, Male, Middle Aged, Prognosis, Survival Rate, Tumor Burden
ABSTRACT
Genome-wide association studies (GWAS) using single nucleotide polymorphisms (SNPs) have identified more than 50 loci associated with estimated glomerular filtration rate (eGFR), a measure of kidney function. However, significant SNPs account for a small proportion of eGFR variability. Other forms of genetic variation have not been comprehensively evaluated for association with eGFR. In this study, we assess whether changes in germline DNA copy number are associated with GFR estimated from serum creatinine, eGFRcrea. We used hidden Markov models (HMMs) to identify copy number polymorphic regions (CNPs) from high-throughput SNP arrays for 2,514 African (AA) and 8,645 European ancestry (EA) participants in the Atherosclerosis Risk in Communities (ARIC) study. Separately for the EA and AA cohorts, we used Bayesian Gaussian mixture models to estimate copy number at regions identified by the HMM or previously reported in the HapMap Project. We identified 312 and 464 autosomal CNPs among individuals of EA and AA, respectively. Multivariate models adjusted for SNP-derived covariates of population structure identified one CNP in the EA cohort near genome-wide statistical significance (Bonferroni-adjusted p = 0.067) located on chromosome 5 (876-880kb). Overall, our findings suggest a limited role of CNPs in explaining eGFR variability.
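The abstract quotes a Bonferroni-adjusted p-value for the chromosome 5 CNP. For readers unfamiliar with the correction, the Python below is a minimal sketch of the standard rule: each raw p-value is multiplied by the number of tests and capped at 1. The function name and the example p-values are invented for illustration, and the number of tests actually used in the study is not stated in the abstract.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: min(1, p * number_of_tests)."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical raw p-values for three tested regions.
raw_p = [0.01, 0.5, 0.0002]
adjusted_p = bonferroni_adjust(raw_p)
```

With three tests, the smallest raw p-value of 0.0002 becomes 0.0006 after adjustment, and 0.5 is capped at 1. The correction is conservative when tests are correlated, as neighbouring genomic regions often are.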
Subjects
DNA Copy Number Variations/genetics, Genome-Wide Association Study, Kidney/physiology, Single Nucleotide Polymorphism/genetics, Atherosclerosis/genetics, Black People/genetics, Female, Genetic Predisposition to Disease, Glomerular Filtration Rate/genetics, Humans, Kidney Function Tests, Linear Models, Male, Middle Aged, Risk Factors, White People/genetics
ABSTRACT
Early detection and intervention are likely to be the most effective means for reducing morbidity and mortality of human cancer. However, development of methods for noninvasive detection of early-stage tumors has remained a challenge. We have developed an approach called targeted error correction sequencing (TEC-Seq) that allows ultrasensitive direct evaluation of sequence changes in circulating cell-free DNA using massively parallel sequencing. We have used this approach to examine 58 cancer-related genes encompassing 81 kb. Analysis of plasma from 44 healthy individuals identified genomic changes related to clonal hematopoiesis in 16% of asymptomatic individuals but no alterations in driver genes related to solid cancers. Evaluation of 200 patients with colorectal, breast, lung, or ovarian cancer detected somatic mutations in the plasma of 71, 59, 59, and 68%, respectively, of patients with stage I or II disease. Analyses of mutations in the circulation revealed high concordance with alterations in the tumors of these patients. In patients with resectable colorectal cancers, higher amounts of preoperative circulating tumor DNA were associated with disease recurrence and decreased overall survival. These analyses provide a broadly applicable approach for noninvasive detection of early-stage tumors that may be useful for screening and management of patients with cancer.