ABSTRACT
BACKGROUND: Cell-free DNA (cfDNA)-based assays hold great potential for detecting early cancer signals, yet determining the tissue of origin (TOO) of those signals remains challenging. Here, we investigated the contribution of a methylation atlas to TOO detection in low-depth cfDNA samples. METHODS: We constructed a tumor-specific methylation atlas (TSMA) using whole-genome bisulfite sequencing (WGBS) data from five types of tumor tissue (breast, colorectal, gastric, liver, and lung cancer) and paired white blood cells (WBC). The TSMA was used with a non-negative least squares (NNLS) matrix factorization deconvolution algorithm to estimate the abundance of tumor tissue types in a WGBS sample. We showed that the TSMA worked well with tumor tissue but struggled with cfDNA samples because of the overwhelming amount of WBC-derived DNA. To construct a TOO model, we adopted a multimodal strategy, combining TSMA deconvolution scores with other cfDNA features as inputs. RESULTS: Our final model consisted of a graph convolutional neural network using deconvolution scores and genome-wide methylation density features, and achieved an accuracy of 69% in a held-out validation dataset of 239 low-depth cfDNA samples. CONCLUSIONS: We have demonstrated that our TSMA, in combination with other cfDNA features, can improve TOO detection in low-depth cfDNA samples.
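As an illustration of the deconvolution step described above, the following is a minimal sketch of non-negative least squares on a toy atlas. It is not the authors' implementation: the atlas values, the three tissue labels, and the function name `nnls_projected_gradient` are all hypothetical, and plain projected gradient descent stands in for whatever NNLS solver the study used.

```python
def nnls_projected_gradient(atlas, sample, steps=2000):
    """Minimise ||atlas @ x - sample||^2 subject to x >= 0.

    atlas  : list of rows, one per genomic region; each row holds the
             reference methylation level of every tissue type.
    sample : observed methylation level of the mixed sample per region.
    """
    n_regions, n_types = len(atlas), len(atlas[0])
    # trace(A^T A) bounds the largest eigenvalue of A^T A, so 1/trace
    # is a conservative step size that cannot diverge.
    step = 1.0 / sum(v * v for row in atlas for v in row)
    x = [0.0] * n_types
    for _ in range(steps):
        residual = [
            sum(atlas[i][j] * x[j] for j in range(n_types)) - sample[i]
            for i in range(n_regions)
        ]
        gradient = [
            sum(atlas[i][j] * residual[i] for i in range(n_regions))
            for j in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * gradient[j]) for j in range(n_types)]
    return x


# Hypothetical atlas: columns are WBC, liver tumor, lung tumor.
toy_atlas = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
    [0.5, 0.5, 0.5],
]
# Simulated cfDNA-like mixture dominated by WBC-derived DNA.
true_fractions = [0.9, 0.1, 0.0]
toy_sample = [sum(a * f for a, f in zip(row, true_fractions)) for row in toy_atlas]

estimated = nnls_projected_gradient(toy_atlas, toy_sample)
```

In this noise-free toy case the estimated coefficients recover the simulated mixture. With real low-depth data the small tumor fraction would be much harder to separate from the WBC background, which is the limitation the abstract reports.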
Subjects
DNA Methylation , Genome, Human , Neoplasms , Neural Networks, Computer , Humans , DNA Methylation/genetics , Neoplasms/genetics , Neoplasms/blood , Neoplasms/diagnosis , Cell-Free Nucleic Acids/blood , Cell-Free Nucleic Acids/genetics , Organ Specificity/genetics , Algorithms
ABSTRACT
Despite their promise, circulating tumor DNA (ctDNA)-based assays for multi-cancer early detection face challenges in test performance, mostly because of the limited abundance of ctDNA and its inherent variability. To address these challenges, assays published to date have required very high-depth sequencing, resulting in high test costs. Herein, we developed a multimodal assay called SPOT-MAS (screening for the presence of tumor by methylation and size) to simultaneously profile methylomics, fragmentomics, copy number, and end motifs in a single workflow using targeted and shallow genome-wide sequencing (~0.55×) of cell-free DNA. We applied SPOT-MAS to 738 non-metastatic patients with breast, colorectal, gastric, lung, and liver cancer, and 1550 healthy controls. We then employed machine learning to extract multiple cancer- and tissue-specific signatures for detecting and locating cancer. SPOT-MAS detected the five cancer types with a sensitivity of 72.4% at 97.0% specificity. The sensitivities for detecting early-stage cancers were 73.9% and 62.3% for stages I and II, respectively, increasing to 88.3% for non-metastatic stage IIIA. For tumor-of-origin prediction, our assay achieved an accuracy of 0.7. Our study demonstrates performance comparable to other ctDNA-based assays while requiring significantly lower sequencing depth, making it economically feasible for population-wide screening.
Subjects
Circulating Tumor DNA , Early Detection of Cancer , Neoplasms , Humans , Biomarkers, Tumor/blood , Biomarkers, Tumor/genetics , Cell-Free Nucleic Acids/blood , Cell-Free Nucleic Acids/genetics , Circulating Tumor DNA/blood , Circulating Tumor DNA/genetics , DNA, Neoplasm/blood , DNA, Neoplasm/genetics , Early Detection of Cancer/methods , Liver Neoplasms , Neoplasms/blood , Neoplasms/diagnosis , Neoplasms/genetics
ABSTRACT
Introduction: Breast cancer causes the most cancer-related deaths in women and is the costliest cancer in the US in terms of medical service and prescription drug expenses. Breast cancer screening is recommended by health authorities in the US, but current screening efforts are often compromised by high false positive rates. Liquid biopsy based on circulating tumor DNA (ctDNA) has emerged as a potential approach to screen for cancer. However, the detection of breast cancer, particularly in early stages, is challenging because of the low amount of ctDNA and the heterogeneity of molecular subtypes. Methods: Here, we employed a multimodal approach, namely Screen for the Presence of Tumor by DNA Methylation and Size (SPOT-MAS), to simultaneously analyze multiple signatures of cell-free DNA (cfDNA) in plasma samples of 239 nonmetastatic breast cancer patients and 278 healthy subjects. Results: We identified distinct profiles of genome-wide methylation changes (GWM), copy number alterations (CNA), and 4-nucleotide oligomer (4-mer) end motifs (EM) in cfDNA of breast cancer patients. We further used all three signatures to construct a multi-featured machine learning model and showed that the combination model outperformed base models built from individual features, achieving an AUC of 0.91 (95% CI: 0.87-0.95) and a sensitivity of 65% at 96% specificity. Discussion: Our findings showed that a multimodal liquid biopsy assay based on analysis of cfDNA methylation, CNA, and EM could enhance the accuracy of early-stage breast cancer detection.
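The phrase "sensitivity at a given specificity" used in these abstracts can be made concrete with a small sketch. It assumes a classifier that outputs one score per sample and calls a sample positive when its score exceeds a threshold. The scores below are invented for illustration and the function name `sensitivity_at_specificity` is hypothetical; none of this reproduces the studies' actual pipelines.

```python
def sensitivity_at_specificity(control_scores, cancer_scores, target_specificity):
    """Return (threshold, sensitivity, specificity).

    The threshold is the lowest control score at which the fraction of
    controls scoring at or below it reaches the target specificity.
    A sample is called positive when its score is strictly above the threshold.
    """
    for threshold in sorted(set(control_scores)):
        true_negatives = sum(score <= threshold for score in control_scores)
        specificity = true_negatives / len(control_scores)
        if specificity >= target_specificity:
            true_positives = sum(score > threshold for score in cancer_scores)
            sensitivity = true_positives / len(cancer_scores)
            return threshold, sensitivity, specificity
    raise ValueError("target specificity must be at most 1.0")


# Invented model scores for ten healthy controls and five cancer patients.
controls = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.31, 0.35, 0.40, 0.80]
cancers = [0.30, 0.45, 0.60, 0.70, 0.90]

threshold, sensitivity, specificity = sensitivity_at_specificity(
    controls, cancers, target_specificity=0.9
)
```

Fixing specificity first reflects the screening setting, where the false positive rate among healthy people has to be capped before sensitivity is reported.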
ABSTRACT
BACKGROUND: Late detection of hepatocellular carcinoma (HCC) results in an overall 5-year survival rate of less than 16%. Liquid biopsy (LB) assays based on detecting circulating tumor DNA (ctDNA) might provide an opportunity to detect HCC early and noninvasively. Increasing evidence indicates that ctDNA detection using mutation-based assays is significantly challenged by the abundance of white blood cell-derived mutations, non-tumor tissue-derived somatic mutations in plasma, and mutational tumor heterogeneity. METHODS: Here, we employed concurrent analysis of cancer-related mutations and their fragment length profiles to differentiate mutations from different sources. To distinguish persons with HCC (PwHCC) from healthy participants, we built a classification model using three fragmentomic features of ctDNA obtained through deep sequencing of thirteen genes associated with HCC. RESULTS: Our model achieved an area under the curve (AUC) of 0.88, a sensitivity of 89%, and a specificity of 82% in the discovery cohort consisting of 55 PwHCC and 55 healthy participants. In an independent validation cohort of 54 PwHCC and 53 healthy participants, the established model achieved comparable classification performance with an AUC of 0.86 and yielded a sensitivity and specificity of 81%. CONCLUSIONS: Our study provides a rationale for subsequent clinical evaluation of our assay performance in a large-scale prospective study.
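For readers less familiar with the AUC figures reported above, the sketch below computes the area under the ROC curve as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as half. The scores are invented and the function name `pairwise_auc` is hypothetical. This is a generic illustration, not the study's evaluation code.

```python
def pairwise_auc(control_scores, case_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties contribute 0.5, which makes this equal to the normalised
    Mann-Whitney U statistic.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented classifier scores: four healthy participants, four cases.
healthy = [0.1, 0.2, 0.3, 0.6]
cases = [0.25, 0.5, 0.7, 0.9]

auc_value = pairwise_auc(healthy, cases)  # 13 of 16 pairs ranked correctly
```

Unlike sensitivity and specificity, this quantity does not depend on any single decision threshold, which is why abstracts typically report it alongside one chosen operating point.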