ABSTRACT
We report a personalized tumor-informed technology, Patient-specific pROgnostic and Potential tHErapeutic marker Tracking (PROPHET) using deep sequencing of 50 patient-specific variants to detect molecular residual disease (MRD) with a limit of detection of 0.004%. PROPHET and state-of-the-art fixed-panel assays were applied to 760 plasma samples from 181 prospectively enrolled early stage non-small cell lung cancer patients. PROPHET shows higher sensitivity of 45% at baseline with circulating tumor DNA (ctDNA). It outperforms fixed-panel assays in prognostic analysis and demonstrates a median lead-time of 299 days to radiologically confirmed recurrence. Personalized non-canonical variants account for 98.2% with prognostic effects similar to canonical variants. The proposed tumor-node-metastasis-blood (TNMB) classification surpasses TNM staging for prognostic prediction at the decision point of adjuvant treatment. PROPHET shows potential to evaluate the effect of adjuvant therapy and serve as an arbiter of the equivocal radiological diagnosis. These findings highlight the potential advantages of personalized cancer techniques in MRD detection.
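Why tracking many patient-specific variants can push the limit of detection so low can be illustrated with a simple sampling model. The sketch below is not the PROPHET algorithm. It assumes that every tracked site is sampled independently and that each sampled fragment is tumor-derived with probability equal to the tumor fraction. The function name and the input of 5,000 genome equivalents are hypothetical values chosen for illustration.

```python
def detection_probability(tumor_fraction, n_variants, genome_equivalents):
    """Probability of sampling at least one mutant fragment across all tracked sites.

    Illustrative model only: n_variants * genome_equivalents fragments are
    sampled independently, each mutant with probability tumor_fraction.
    """
    n_fragments = n_variants * genome_equivalents
    return 1.0 - (1.0 - tumor_fraction) ** n_fragments


# Hypothetical plasma input of 5,000 genome equivalents at a tumor fraction of 0.004%.
tumor_fraction = 0.004 / 100
single_site = detection_probability(tumor_fraction, 1, 5000)
fifty_sites = detection_probability(tumor_fraction, 50, 5000)
print(f"1 variant: {single_site:.3f}  50 variants: {fifty_sites:.5f}")
```

Under these assumptions a single site is usually missed at this tumor fraction, while pooling evidence over 50 sites makes sampling at least one mutant fragment very likely. Real assays must also separate true signal from sequencing error, which this model ignores.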
Subjects
Carcinoma, Non-Small-Cell Lung , Cell-Free Nucleic Acids , Circulating Tumor DNA , Lung Neoplasms , Small Cell Lung Carcinoma , Humans , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/surgery , Carcinoma, Non-Small-Cell Lung/pathology , Circulating Tumor DNA/analysis , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Lung Neoplasms/surgery , DNA, Neoplasm , Neoplasm, Residual/genetics , Biomarkers, Tumor/genetics , Neoplasm Recurrence, Local/genetics
ABSTRACT
Copy number variation (CNV) is a kind of chromosomal structural reorganization that has been detected, in this decade, mainly by high-throughput biological technology. Researchers have found that CNVs are ubiquitous in many species and accumulating evidence indicates that CNVs are closely related with complex diseases. The investigation of chromosomal structural alterations has begun to reveal some important clues to the pathologic causes of diseases and to the disease process. However, many of the published studies have focused on a single disease and, so far, the experimental results have not been systematically collected or organized. Manual text mining from 6301 published papers was used to build the Copy Number Variation in Disease database (CNVD). CNVD contains CNV information for 792 diseases in 22 species from diverse types of experiments, thus, ensuring high confidence and comprehensive representation of the relationship between the CNVs and the diseases. In addition, multiple query modes and visualized results are provided in the CNVD database. With its user-friendly interface and the integrated CNV information for different diseases, CNVD will offer a truly comprehensive platform for disease research based on chromosomal structural variations. The CNVD interface is accessible at http://bioinfo.hrbmu.edu.cn/CNVD.
Subjects
DNA Copy Number Variations , Data Mining , Databases, Nucleic Acid , Disease/genetics , Female , Genome, Human , Humans , Male , Pregnancy , Software , User-Computer Interface
ABSTRACT
Copy number variations (CNVs) are one type of human genetic variation and are pervasive in the human genome. It has been confirmed that they can play a causal role in complex diseases. Previous studies of CNVs focused more on identifying disease-specific CNV regions or candidate genes in these regions, and less on the synergistic actions between genes in CNV regions and other genes. Our research combined CNVs with gene co-expression relations to reconstruct a gene co-expression network from single nucleotide polymorphism microarray datasets and gene microarray datasets of breast cancer, then extracted the densely connected modules and analyzed their functions. Interestingly, all of these modules' functions were related to breast cancer according to our enrichment analysis, and most of the genes in these modules have been reported to be involved in breast cancer. Our findings suggest that integrating CNVs and gene co-expression relations is a feasible way to analyze the roles of CNV genes and their synergistic genes in breast cancer, and they provide novel insight into the pathological mechanism of breast cancer.
Subjects
Breast Neoplasms/genetics , DNA Copy Number Variations/genetics , Gene Expression Regulation, Neoplastic/genetics , Gene Regulatory Networks/genetics , Genes/genetics , Breast Neoplasms/metabolism , Female , Humans , Microarray Analysis , Models, Genetic , Polymorphism, Single Nucleotide/genetics
ABSTRACT
BACKGROUND: Clinical laboratories routinely use formalin-fixed paraffin-embedded (FFPE) tissue or cell block cytology samples in oncology panel sequencing to identify mutations that can predict patient response to targeted therapy. To understand the technical error due to FFPE processing, a robustly characterized diploid cell line was used to create FFPE samples with four different pre-tissue processing formalin fixation times. A total of 96 FFPE sections were then distributed to different laboratories for targeted sequencing analysis by four oncopanels, and variants resulting from technical error were identified. RESULTS: Tissue sections that fail more frequently show low cellularity, lower than recommended library preparation DNA input, or target sequencing depth. Importantly, sections from block surfaces are more likely to show FFPE-specific errors, akin to "edge effects" seen in histology, while the inner samples display no quality degradation related to fixation time. CONCLUSIONS: To assure reliable results, we recommend avoiding the block surface portion and restricting mutation detection to genomic regions of high confidence.
Subjects
Formaldehyde , High-Throughput Nucleotide Sequencing , Humans , Paraffin Embedding , Sequence Analysis, DNA , Tissue Fixation
ABSTRACT
The low abundance of circulating tumour DNA (ctDNA) in plasma samples makes the analysis of ctDNA biomarkers for the detection or monitoring of early-stage cancers challenging. Here we show that deep methylation sequencing aided by a machine-learning classifier of methylation patterns enables the detection of tumour-derived signals at dilution factors as low as 1 in 10,000. For a total of 308 patients with surgery-resectable lung cancer and 261 age- and sex-matched non-cancer control individuals recruited from two hospitals, the assay detected 52-81% of the patients at disease stages IA to III with a specificity of 96% (95% confidence interval (CI) 93-98%). In a subgroup of 115 individuals, the assay identified, at 100% specificity (95% CI 91-100%), nearly twice as many patients with cancer as those identified by ultradeep mutation sequencing analysis. The low amounts of ctDNA permitted by machine-learning-aided deep methylation sequencing could provide advantages in cancer screening and the assessment of treatment efficacy.
Subjects
Biomarkers, Tumor/genetics , Circulating Tumor DNA/genetics , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Machine Learning/statistics & numerical data , Adult , Biomarkers, Tumor/blood , Case-Control Studies , Circulating Tumor DNA/blood , DNA Methylation , Early Detection of Cancer/methods , Female , High-Throughput Nucleotide Sequencing , Humans , Lung Neoplasms/blood , Lung Neoplasms/pathology , Male , Middle Aged , Sequence Analysis, DNA/methods
ABSTRACT
BACKGROUND: Targeted sequencing using oncopanels requires comprehensive assessments of accuracy and detection sensitivity to ensure analytical validity. By employing reference materials characterized by the U.S. Food and Drug Administration-led SEquence Quality Control project phase2 (SEQC2) effort, we perform a cross-platform multi-lab evaluation of eight Pan-Cancer panels to assess best practices for oncopanel sequencing. RESULTS: All panels demonstrate high sensitivity across targeted high-confidence coding regions and variant types for the variants previously verified to have variant allele frequency (VAF) in the 5-20% range. Sensitivity is reduced by utilizing VAF thresholds due to inherent variability in VAF measurements. Enforcing a VAF threshold for reporting has a positive impact on reducing false positive calls. Importantly, the false positive rate is found to be significantly higher outside the high-confidence coding regions, resulting in lower reproducibility. Thus, region restriction and VAF thresholds lead to low relative technical variability in estimating promising biomarkers and tumor mutational burden. CONCLUSION: This comprehensive study provides actionable guidelines for oncopanel sequencing and clear evidence that supports a simplified approach to assess the analytical performance of oncopanels. It will facilitate the rapid implementation, validation, and quality control of oncopanels in clinical use.
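The trade-off described above, where a VAF reporting threshold and region restriction remove false positives at some cost in sensitivity, can be sketched on toy data. This is a minimal illustration, not the SEQC2 evaluation pipeline. The `Call` record, the `evaluate` helper, the 2.5% threshold and every variant below are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    key: str               # variant identifier (invented labels)
    alt_reads: int         # reads supporting the alternate allele
    depth: int             # total reads at the site
    high_confidence: bool  # True if the site lies in a high-confidence region

    @property
    def vaf(self):
        return self.alt_reads / self.depth if self.depth else 0.0


def evaluate(calls, truth, min_vaf=0.0, restrict=False):
    """Return (sensitivity, false_positive_count) after optional filtering."""
    kept = {c.key for c in calls
            if c.vaf >= min_vaf and (c.high_confidence or not restrict)}
    true_positives = len(kept & truth)
    false_positives = len(kept - truth)
    sensitivity = true_positives / len(truth) if truth else 0.0
    return sensitivity, false_positives


truth = {"v1", "v2", "v3"}
calls = [
    Call("v1", 100, 1000, True),   # true variant, measured VAF 10%
    Call("v2", 60, 1000, True),    # true variant, measured VAF 6%
    Call("v3", 24, 1000, True),    # true variant measured just under 2.5%
    Call("n1", 10, 1000, True),    # artifact at 1% VAF
    Call("n2", 30, 1000, False),   # artifact outside the high-confidence region
]

for label, kwargs in [("no filter", {}),
                      ("VAF >= 2.5%", {"min_vaf": 0.025}),
                      ("VAF >= 2.5% + region", {"min_vaf": 0.025, "restrict": True})]:
    sens, fp = evaluate(calls, truth, **kwargs)
    print(f"{label}: sensitivity={sens:.2f}, false positives={fp}")
```

In this toy set the threshold removes one artifact but also drops the true variant whose measured VAF fell just below the cutoff, and region restriction removes the remaining artifact.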
Subjects
Biomarkers, Tumor , Genetic Testing/methods , Genomics/methods , Neoplasms/genetics , Oncogenes , DNA Copy Number Variations , Genetic Testing/standards , Genomics/standards , Humans , Molecular Diagnostic Techniques/methods , Molecular Diagnostic Techniques/standards , Mutation , Neoplasms/diagnosis , Polymorphism, Single Nucleotide , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
Circulating tumor DNA (ctDNA) sequencing is being rapidly adopted in precision oncology, but the accuracy, sensitivity and reproducibility of ctDNA assays are poorly understood. Here we report the findings of a multi-site, cross-platform evaluation of the analytical performance of five industry-leading ctDNA assays. We evaluated each stage of the ctDNA sequencing workflow with simulations, synthetic DNA spike-in experiments and proficiency testing on standardized, cell-line-derived reference samples. Above 0.5% variant allele frequency, ctDNA mutations were detected with high sensitivity, precision and reproducibility by all five assays, whereas, below this limit, detection became unreliable and varied widely between assays, especially when input material was limited. Missed mutations (false negatives) were more common than erroneous candidates (false positives), indicating that the reliable sampling of rare ctDNA fragments is the key challenge for ctDNA assays. This comprehensive evaluation of the analytical performance of ctDNA assays serves to inform best practice guidelines and provides a resource for precision oncology.
Subjects
Circulating Tumor DNA/genetics , Medical Oncology , Neoplasms/genetics , Precision Medicine , Sequence Analysis, DNA/standards , High-Throughput Nucleotide Sequencing/methods , Humans , Limit of Detection , Practice Guidelines as Topic , Reproducibility of Results
ABSTRACT
LncRNAs are involved in a wide range of biological processes, such as chromatin remodeling, mRNA splicing, mRNA editing and translation. They can either upregulate or downregulate gene expression, and play key roles in the progression of various human cancers. However, the functional mechanisms of most lncRNAs remain unknown. This paper aims to advance the understanding of lncRNAs by proposing a new method to obtain the protein-coding genes (PCGs) regulated by lncRNAs, and thereby to identify candidate cancer-related lncRNAs using bioinformatics approaches. The method is based on sample correlation: it is applied to the expression profiles of lncRNAs and PCGs in prostate cancer, in combination with protein interaction data, to build a lncRNA-PCG bipartite network. Candidate cancer-related lncRNAs were extracted from the bipartite network by using a random walk. Fourteen prostate cancer-related lncRNAs were acquired from the LncRNADisease database and MNDR, of which six were present in our network. As one of the seed nodes, ENSG00000234741 achieved the highest score among them. The other two cancer-related lncRNAs (ENSG00000225937 and ENSG00000236830) were ranked within the top 30. In addition, the top candidate lncRNA ENSG00000261777 shares an intron with DDX19, and interacts with IGF2 P1, indicating its involvement in prostate cancer. In this paper, we described a new method for predicting candidate lncRNA targets, and obtained candidate therapeutic targets using this method. We hope that this study will bring a new perspective to future lncRNA studies.
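Ranking candidates by a random walk from known seed nodes can be sketched as follows. The abstract does not say which formulation was used, so this example assumes a random walk with restart, one common variant. The function name, the restart probability of 0.7 and the six-node toy network are hypothetical and carry no biological meaning.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adjacency maps each node to a list of neighbours (undirected, so symmetric).
    Each node spreads its probability evenly over its neighbours.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy lncRNA-PCG bipartite network (invented labels).
network = {
    "lnc_seed": ["gene1", "gene2"],
    "lnc_A": ["gene1", "gene2"],
    "lnc_B": ["gene2", "gene3"],
    "gene1": ["lnc_seed", "lnc_A"],
    "gene2": ["lnc_seed", "lnc_A", "lnc_B"],
    "gene3": ["lnc_B"],
}
scores = random_walk_with_restart(network, seeds={"lnc_seed"})
ranked = sorted((n for n in scores if n.startswith("lnc_")),
                key=scores.get, reverse=True)
print(ranked)
```

Here `lnc_A` shares both of its genes with the seed and therefore outranks `lnc_B`, which shares only one. The seed itself keeps the highest score, which matches the abstract's observation that a seed node scored highest.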
Subjects
Gene Regulatory Networks , Open Reading Frames/genetics , Prostatic Neoplasms/genetics , RNA, Long Noncoding/genetics , Data Mining , Gene Expression Regulation, Neoplastic , Genome, Human , Humans , Male , RNA, Long Noncoding/metabolism
ABSTRACT
BACKGROUND: Over the last decade, an increasing number of integrative studies on cancer-related genes have been published. Integrative analyses aim to overcome the limitation of a single data type, and provide a more complete view of carcinogenesis. The vast majority of these studies used sample-matched data of gene expression and copy number to investigate the impact of copy number alteration on gene expression, and to predict and prioritize candidate oncogenes and tumor suppressor genes. However, correlations between genes were neglected in these studies. Our work aimed to evaluate the co-alteration of copy number, methylation and expression, allowing us to identify cancer-related genes and essential functional modules in cancer. RESULTS: We built the Integrated Co-alteration network (ICan) based on multi-omics data, and analyzed the network to uncover cancer-related genes. After comparison with random networks, we identified 155 ovarian cancer-related genes, including well-known (TP53, BRCA1, RB1 and PTEN) and also novel cancer-related genes, such as PDPN and EphA2. We compared the results with a conventional method: CNAmet, and obtained a significantly better area under the curve value (ICan: 0.8179, CNAmet: 0.5183). CONCLUSION: In this paper, we describe a framework to find cancer-related genes based on an Integrated Co-alteration network. Our results proved that ICan could precisely identify candidate cancer genes and provide increased mechanistic understanding of carcinogenesis. This work suggested a new research direction for biological network analyses involving multi-omics data.
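The area under the curve used to compare ICan with CNAmet can be computed directly from a ranking of genes. The sketch below uses the rank-based (Mann-Whitney) definition of AUC. The function name and the toy scores are illustrative assumptions and do not reproduce the reported values.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example: 1 marks a known cancer gene, 0 marks any other gene.
labels = [1, 1, 0, 0]
good_ranking = auc([0.9, 0.8, 0.3, 0.2], labels)
poor_ranking = auc([0.9, 0.2, 0.8, 0.3], labels)
print(good_ranking, poor_ranking)
```

A value near 0.5, like the one reported for CNAmet, means the ranking is close to random with respect to the reference gene set, while values closer to 1 indicate that known cancer genes are ranked near the top.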
Subjects
Computational Biology/methods , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Oncogenes , Ovarian Neoplasms/genetics , Algorithms , DNA Copy Number Variations , DNA Methylation , Databases, Genetic , Female , Humans , Kaplan-Meier Estimate , Ovarian Neoplasms/mortality , Prognosis , Workflow
ABSTRACT
Changes in intermolecular interactions (differential interactions) may influence the progression of cancer. Specific genes and their regulatory networks may be more closely associated with cancer when taking their transcriptional and post-transcriptional levels and dynamic and static interactions into account simultaneously. In this paper, a differential interaction analysis was performed to detect lung adenocarcinoma-related genes. Furthermore, a miRNA-TF (transcription factor) synergistic regulation network was constructed to identify three kinds of co-regulated motifs, namely, triplet, crosstalk and joint. Not only were the known cancer-related miRNAs and TFs (let-7, miR-15a, miR-17, TP53, ETS1, and so on) detected in the motifs, but the miR-15, let-7 and miR-17 families also showed a tendency to regulate the triplet, crosstalk and joint motifs, respectively. Moreover, several biological functions (i.e., cell cycle, signaling pathways and hemopoiesis) associated with the three motifs were found to be frequently targeted by the drugs for lung adenocarcinoma. Specifically, the two 4-node motifs (crosstalk and joint) based on co-expression and interaction had a closer relationship to lung adenocarcinoma, and so further research was performed on them. A 10-gene biomarker (UBC, SRC, SP1, MYC, STAT3, JUN, NR3C1, RB1, GRB2 and MAPK1) was selected from the joint motif, and a survival analysis indicated its significant association with survival. Among the ten genes, JUN, NR3C1 and GRB2 are our newly detected candidate lung adenocarcinoma-related genes. The genes, regulators and regulatory motifs detected in this work will provide potential drug targets and new strategies for individual therapy.
Subjects
Adenocarcinoma/genetics , Biomarkers, Tumor/genetics , Gene Regulatory Networks , Lung Neoplasms/genetics , Adenocarcinoma of Lung , Gene Expression Regulation, Neoplastic , Gene Ontology , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Prognosis , Transcription Factors/metabolism
ABSTRACT
Copy number alteration (CNA) is known to induce gene expression changes mainly through a dosage effect, and therefore to affect the initiation and progression of tumors. However, tumor samples exhibit heterogeneity in gene dosage sensitivity due to the complicated mechanisms of transcriptional regulation. Currently, no high-throughput method is available for identifying the regulatory factors that affect the functional consequences of CNA and for determining their effects on cancer. In view of the important regulatory role of miRNAs, we investigated the influence of miRNAs on the dosage sensitivities of genes within CNA regions. By integrating copy number, mRNA expression and miRNA expression profiles of three kinds of cancer, we observed a tendency for high dosage-sensitivity genes to be more targeted by miRNAs in cancer, and identified the miRNAs regulating the dosage sensitivity of amplified/deleted target genes. The results show that miRNAs can modulate oncogenic biological functions by regulating the genes within CNA regions, and thus play a role as a trigger or balancer in cancer, affecting cancer processes and even survival. This work provides a framework for analyzing the regulation of the dosage effect, which will shed light on the oncogenic and tumor-suppressive mechanisms of CNA. In addition, new cancer-related miRNAs were identified.
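The abstract does not define how dosage sensitivity was quantified. One common proxy, assumed here purely for illustration, is the Pearson correlation between a gene's copy number and its mRNA expression across samples: a dosage-sensitive gene tracks its copy number, while a buffered gene does not. The helper name and all values below are invented toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Toy profiles across five samples (invented numbers).
copy_number = [1, 2, 2, 3, 4]
expr_sensitive = [2.0, 4.1, 3.9, 6.2, 8.0]   # expression follows copy number
expr_buffered = [5.0, 5.2, 4.9, 5.1, 5.0]    # expression stays flat

sensitive_score = pearson(copy_number, expr_sensitive)
buffered_score = pearson(copy_number, expr_buffered)
print(f"sensitive: {sensitive_score:.2f}, buffered: {buffered_score:.2f}")
```

With a score like this per gene, one could then ask whether high-scoring genes are targeted by more miRNAs, which is the kind of tendency the abstract reports.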
Subjects
DNA Copy Number Variations/genetics , MicroRNAs/genetics , Neoplasms/genetics , Breast Neoplasms/genetics , Female , Gene Dosage , Gene Expression Regulation, Neoplastic , Genes, Neoplasm , Humans , MicroRNAs/metabolism , Signal Transduction/genetics
ABSTRACT
BACKGROUND: Lung cancer, especially non-small cell lung cancer, is a leading cause of malignant tumor death worldwide. The mechanisms employed by the main regulators, such as microRNAs (miRNAs) and transcription factors (TFs), remain elusive. The patterns of their cooperation and their biological functions in the synergistic regulatory network have rarely been studied. RESULTS: Here, we describe the first miRNA-TF synergistic regulation network in human lung cancer. We identified important regulators (MYC, NFKB1, miR-590, and miR-570) and significant miRNA-TF synergistic regulatory motifs by random simulations. The two most significant motifs were the co-regulation of miRNAs and TFs, and TF-mediated cascade regulation. We also developed an algorithm to uncover the biological functions of the human lung cancer miRNA-TF synergistic regulatory network (regulation of apoptosis, cellular protein metabolic process, and cell cycle), and the specific functions of each miRNA-TF synergistic subnetwork. We found that the miR-17 family exerted important effects in the regulation of non-small cell lung cancer, such as in proliferation and cell cycle regulation, by targeting the retinoblastoma protein (RB1) and forming a feed-forward loop with the E2F1 TF. We proposed a model for the miR-17 family, E2F1, and RB1 to demonstrate their potential roles in the occurrence and development of non-small cell lung cancer. CONCLUSIONS: This work provides a framework for constructing miRNA-TF synergistic regulatory networks, analyzing their functions in diseases, and identifying the main regulators and regulatory motifs, which will be useful for understanding the putative regulatory motifs involving miRNAs and TFs, and for predicting new targets for cancer studies.
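A feed-forward loop of the kind mentioned above is a three-node pattern in which a regulator A targets both an intermediate regulator B and a gene C, and B also targets C. The sketch below enumerates such loops in a directed regulatory network. It is not the authors' motif-detection algorithm, and the node names and edge directions are placeholders invented for the example rather than interactions reported in the study.

```python
def feed_forward_loops(edges):
    """Return sorted (a, b, c) triples with edges a->b, b->c and a->c."""
    targets = {}
    for source, target in edges:
        targets.setdefault(source, set()).add(target)
    loops = []
    for a, a_targets in targets.items():
        for b in a_targets:
            # c must be regulated by both a and b
            for c in targets.get(b, set()) & a_targets:
                if len({a, b, c}) == 3:
                    loops.append((a, b, c))
    return sorted(loops)


# Placeholder network: regulator -> target (directions are illustrative only).
edges = [
    ("miRNA_x", "TF_y"),
    ("miRNA_x", "gene_z"),
    ("TF_y", "gene_z"),
    ("TF_w", "TF_y"),      # TF_w regulates TF_y but not gene_z, so no loop
]
loops = feed_forward_loops(edges)
print(loops)
```

Assessing whether a motif is significant, as done in the study by random simulations, would additionally require counting the same pattern in randomized networks, which is omitted here.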
Subjects
Carcinoma, Non-Small-Cell Lung/genetics , Computational Biology , Gene Regulatory Networks , Lung Neoplasms/genetics , MicroRNAs/genetics , Nucleotide Motifs , Transcription Factors/metabolism , Carcinoma, Non-Small-Cell Lung/metabolism , Carcinoma, Non-Small-Cell Lung/pathology , Cell Proliferation , Humans , Lung Neoplasms/metabolism , Lung Neoplasms/pathology , Reproducibility of Results
ABSTRACT
Phenotypic similarity is correlated with a number of measures of gene function, such as relatedness at the level of direct protein-protein interaction. The phenotypic effect of a deleted or mutated gene, which is one part of gene annotation, has attracted broad attention. However, few measures have been used to study phenotypic similarity with data from the Human Phenotype Ontology (HPO) database; therefore, more analogous measures should be developed and investigated. We used five semantic similarity-based measures (Jiang and Conrath, Lin, Schlicker, Yu and Wu) to calculate the human phenotypic similarity between genes (PSG) with data from the HPO database, and evaluated their accuracy with information on protein-protein interaction, protein complex, protein family, gene function or DNA sequence. Compared with randomly selected gene pairs, the results of these methods were statistically significant (all P<0.001). Furthermore, we assessed the performance of these five measures by receiver operating characteristic (ROC) curve analysis, and found that most of them performed better than the previous methods. This work showed that these semantic similarity-based measures for calculating PSG are effective for hierarchically structured data. Our study contributes to the development and optimization of novel algorithms for PSG calculation and provides researchers with more alternative methods, as well as tools and directions for PSG study.
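Two of the measures named above, Lin and Jiang and Conrath, are built on the information content (IC) of ontology terms, where IC(t) = -log p(t) and p(t) is the annotation frequency of term t. The sketch below shows the term-level formulas on a tiny invented hierarchy. The term names and frequencies are placeholders, not HPO identifiers or HPO statistics, and combining term-level scores into a gene-level PSG is not shown.

```python
import math

# Toy ontology: term -> list of parents (invented terms, not real HPO entries).
PARENTS = {
    "root": [],
    "abnormal_heart": ["root"],
    "abnormal_limb": ["root"],
    "septal_defect": ["abnormal_heart"],
    "arrhythmia": ["abnormal_heart"],
}
# Invented annotation frequencies p(t); the root covers every annotation.
FREQ = {"root": 1.0, "abnormal_heart": 0.4, "abnormal_limb": 0.3,
        "septal_defect": 0.1, "arrhythmia": 0.2}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ic(term):
    """Information content, IC(t) = -log p(t)."""
    return math.log(1.0 / FREQ[term])


def mica_ic(a, b):
    """IC of the most informative common ancestor of a and b."""
    return max(ic(t) for t in ancestors(a) & ancestors(b))


def lin_similarity(a, b):
    denominator = ic(a) + ic(b)
    return 2.0 * mica_ic(a, b) / denominator if denominator else 1.0


def jiang_conrath_distance(a, b):
    return ic(a) + ic(b) - 2.0 * mica_ic(a, b)


print(lin_similarity("septal_defect", "arrhythmia"))
print(lin_similarity("septal_defect", "abnormal_limb"))
print(jiang_conrath_distance("septal_defect", "arrhythmia"))
```

The two heart-related terms share an informative ancestor and so receive a positive Lin similarity, while the heart and limb terms meet only at the root, whose information content is zero.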