ABSTRACT
The Long-Read Personalized OncoGenomics (POG) dataset comprises a cohort of 189 patient tumors and 41 matched normal samples sequenced using the Oxford Nanopore Technologies PromethION platform. This dataset from the POG program and the Marathon of Hope Cancer Centres Network includes DNA and RNA short-read sequence data, analytics, and clinical information. We show the potential of long-read sequencing for resolving complex cancer-related structural variants, viral integrations, and extrachromosomal circular DNA. Long-range phasing facilitates the discovery of allelically differentially methylated regions (aDMRs) and allele-specific expression, including recurrent aDMRs in the cancer genes RET and CDKN2A. Germline promoter methylation in MLH1 can be directly observed in Lynch syndrome. Promoter methylation in BRCA1 and RAD51C is a likely driver behind homologous recombination deficiency where no coding driver mutation was found. This dataset demonstrates applications for long-read sequencing in precision medicine and is available as a resource for developing analytical approaches using this technology.
ABSTRACT
The complexities of cancer genomes are becoming more easily interpreted due to advancements in sequencing technologies and improved bioinformatic analysis. Structural variants (SVs) represent an important subset of somatic events in tumors. While detection of SVs has been markedly improved by the development of long-read sequencing, somatic variant identification and annotation remain challenging. We hypothesized that use of a completed human reference genome (CHM13-T2T) would improve somatic SV calling. Our findings in a tumour/normal matched benchmark sample and two patient samples show that CHM13-T2T improves SV detection and prioritization accuracy compared to GRCh38, with a notable reduction in false-positive calls. We also overcame the lack of annotation resources for CHM13-T2T by lifting over CHM13-T2T-aligned reads to the GRCh38 genome, thereby combining both improved alignment and advanced annotations. In this process, we assessed the current SV benchmark set for COLO829/COLO829BL across four replicates sequenced at different centers with different long-read technologies. We discovered instability of this cell line across these replicates; 346 SVs (1.13%) were only discoverable in a single replicate. We identified 49 somatic SVs, which appear to be stable as they are consistently present across the four replicates. As such, we propose this consensus set as an updated benchmark for somatic SV calling and include both GRCh38 and CHM13-T2T coordinates in our benchmark. The benchmark is available at: 10.5281/zenodo.10819636. Our work demonstrates new approaches to optimize somatic SV prioritization in cancer with potential improvements in other genetic diseases.
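The idea of a consensus set across replicates can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `SV` record, the `matches` rule, and the 500 bp breakpoint tolerance are all hypothetical choices made here for illustration, and the abstract does not state how SVs were matched between replicates.

```python
from typing import NamedTuple


class SV(NamedTuple):
    """Hypothetical minimal SV record: chromosome, breakpoint position, type."""
    chrom: str
    pos: int
    svtype: str


def matches(a: SV, b: SV, tol: int = 500) -> bool:
    # Assumed matching rule: same chromosome and type, breakpoints within tol bp.
    return a.chrom == b.chrom and a.svtype == b.svtype and abs(a.pos - b.pos) <= tol


def found_in(sv: SV, replicate: list[SV], tol: int = 500) -> bool:
    return any(matches(sv, other, tol) for other in replicate)


def consensus(replicates: list[list[SV]], tol: int = 500) -> list[SV]:
    """SVs from the first replicate that have a match in every other replicate."""
    first, rest = replicates[0], replicates[1:]
    return [sv for sv in first if all(found_in(sv, rep, tol) for rep in rest)]


def singletons(replicates: list[list[SV]], tol: int = 500) -> list[SV]:
    """SVs that have no match in any replicate other than their own."""
    unique = []
    for i, rep in enumerate(replicates):
        others = [r for j, r in enumerate(replicates) if j != i]
        for sv in rep:
            if not any(found_in(sv, other, tol) for other in others):
                unique.append(sv)
    return unique
```

Under these assumptions, `consensus` corresponds to the stable calls present in all replicates, while `singletons` corresponds to calls discoverable in only one replicate.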
ABSTRACT
Human papillomavirus (HPV) integration has been implicated in transforming HPV infection into cancer, but its genomic consequences have been difficult to study using short-read technologies. To resolve the dysregulation associated with HPV integration, we performed long-read sequencing on 63 cervical cancer genomes. We identified six categories of integration events based on HPV-human genomic structures. Of all HPV integrants, defined as two HPV-human breakpoints bridged by an HPV sequence, 24% contained variable copies of HPV between the breakpoints, a phenomenon we termed heterologous integration. Analysis of DNA methylation within and in proximity to the HPV genome at individual integration events revealed relationships between methylation status of the integrant and its orientation and structure. Dysregulation of the human epigenome and neighboring gene expression in cis with the HPV-integrated allele was observed over megabase ranges of the genome. By elucidating the structural, epigenetic, and allele-specific impacts of HPV integration, we provide insight into the role of integrated HPV in cervical cancer.
ABSTRACT
BACKGROUND & AIMS: Despite the widespread increase in the incidence of early-onset colorectal cancer (EoCRC), the reasons for this increase remain unclear. The objective of this study was to determine risk factors for the development of EoCRC. METHODS: We conducted a systematic literature review and meta-analysis of studies examining non-genetic risk factors for EoCRC, including demographic factors, comorbidities, and lifestyle factors. Random effects meta-analyses were conducted for risk factors that were examined in at least three studies. Heterogeneity was investigated using the Q-test and I² statistic. RESULTS: From 3304 initial citations, 20 studies were included in this review. Significant risk factors for EoCRC included CRC history in a first-degree relative (RR 4.21, 95% CI 2.61-6.79), hyperlipidemia (RR 1.62, 95% CI 1.22-2.13), obesity (RR 1.54, 95% CI 1.01-2.35), and alcohol consumption (high vs. non-drinkers) (RR 1.71, 95% CI 1.62-1.80). While smoking was suggestive as a risk factor, the association was not statistically significant (RR 1.35, 95% CI 0.81-2.25). With the exception of alcohol consumption, there was considerable heterogeneity among studies (I² > 60%). Other potential risk factors included hypertension, metabolic syndrome, ulcerative colitis, chronic kidney disease, dietary factors, sedentary behaviour, and occupational exposure to organic dusts, but these were only examined in one or two studies. CONCLUSIONS: The results of this study advance the understanding of the etiology of EoCRC. High-quality studies conducted on generalizable populations and that comprehensively examine risk factors for EoCRC are required to inform primary and secondary prevention strategies.
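For readers unfamiliar with the statistics named in the methods, the following Python sketch shows one common way to pool relative risks under a random-effects model and to compute Cochran's Q and I². It assumes the DerSimonian-Laird estimator, which the abstract does not specify, and the function name `pool_random_effects` and its input format are hypothetical. Standard errors are back-calculated from the reported 95% confidence intervals on the log scale.

```python
import math
from typing import NamedTuple

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


class Pooled(NamedTuple):
    rr: float       # pooled relative risk
    ci_low: float   # lower 95% confidence limit
    ci_high: float  # upper 95% confidence limit
    q: float        # Cochran's Q
    i2: float       # I² as a percentage


def pool_random_effects(studies: list[tuple[float, float, float]]) -> Pooled:
    """Pool (RR, CI low, CI high) tuples with the DerSimonian-Laird estimator."""
    y = [math.log(rr) for rr, _, _ in studies]
    # Standard error of log RR, recovered from the width of the 95% CI.
    se = [(math.log(hi) - math.log(lo)) / (2 * Z95) for _, lo, hi in studies]

    # Fixed-effect (inverse-variance) estimate, needed to compute Q.
    w = [1 / s**2 for s in se]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1

    # Between-study variance tau² (truncated at zero) and I².
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau² to each study's variance.
    w_re = [1 / (s**2 + tau2) for s in se]
    sum_w_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum_w_re
    se_mu = math.sqrt(1 / sum_w_re)
    return Pooled(math.exp(mu), math.exp(mu - Z95 * se_mu),
                  math.exp(mu + Z95 * se_mu), q, i2)
```

Calling `pool_random_effects` with per-study estimates returns the pooled RR with its confidence interval alongside Q and I², so an I² above 60% would correspond to the "considerable heterogeneity" described in the results.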
Subject(s)
Colorectal Neoplasms, Colorectal Neoplasms/epidemiology, Colorectal Neoplasms/etiology, Colorectal Neoplasms/prevention & control, Comorbidity, Humans, Incidence, Obesity/epidemiology, Risk Factors
ABSTRACT
BACKGROUND: As the use of nanopore sequencing for metagenomic analysis increases, tools capable of performing long-read taxonomic classification (i.e., determining the composition of a sample) in a fast and accurate manner are needed. Existing tools were either designed for short-read data (e.g., Centrifuge), take days to analyse modern sequencer outputs (e.g., MetaMaps) or suffer from suboptimal accuracy (e.g., CDKAM). Additionally, all tools require command line expertise and do not scale in the cloud. RESULTS: We present BugSeq, a novel, highly accurate metagenomic classifier for nanopore reads. We evaluate BugSeq on simulated data, mock microbial communities and real clinical samples. On the ZymoBIOMICS Even and Log communities, BugSeq (F1 = 0.95 at species level) offers better read classification than MetaMaps (F1 = 0.89-0.94) in a fraction of the time. BugSeq significantly improves on the accuracy of Centrifuge (F1 = 0.79-0.93) and CDKAM (F1 = 0.91-0.94) while offering competitive run times. When applied to 41 samples from patients with lower respiratory tract infections, BugSeq produces greater concordance with microbiological culture and qPCR compared with "What's In My Pot" analysis. CONCLUSION: BugSeq is deployed to the cloud for easy and scalable long-read metagenomic analyses. BugSeq is freely available for non-commercial use at https://bugseq.com/free.
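The F1 values quoted above are the harmonic mean of precision and recall. The short Python sketch below shows that calculation from read-level counts; the function name `f1_score` and the counting of true positives, false positives, and false negatives per read are assumptions made for illustration, since the abstract does not describe how the counts were derived.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall from classification counts."""
    called = true_pos + false_pos      # reads assigned to the taxon
    expected = true_pos + false_neg    # reads that truly belong to the taxon
    precision = true_pos / called if called else 0.0
    recall = true_pos / expected if expected else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is pulled toward the lower of its two inputs, a classifier cannot reach a high F1 by maximizing precision alone or recall alone.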
Subject(s)
Cloud Computing, Metagenomics, Nanopore Sequencing, High-Throughput Nucleotide Sequencing, Humans, Metagenome
ABSTRACT
OBJECTIVE: The mechanisms driving the associations between body weight and physical activity levels and multiple types of cancer are not yet well understood. The purpose of this review was to examine the effects of lifestyle interventions on proposed biomarkers of lifestyle and cancer risk at the level of adipose tissue in humans. METHODS: Embase, MEDLINE, and CINAHL were searched using keywords relating to exercise or diet interventions, adipose tissue biology, and outcomes of interest. Eligible studies included randomized clinical trials of exercise and/or dietary interventions in humans compared with control or other interventions, reporting the collection of subcutaneous abdominal adipose tissue. RESULTS: Nineteen studies met criteria for inclusion. Eight studies modified dietary intake, five altered exercise levels, and six studies used a combination of both. Change in subcutaneous adipose tissue gene expression was most commonly observed with dietary weight loss, with a pattern of decrease in leptin, tumor necrosis factor alpha, and interleukin 6, along with an increase in adiponectin. There was limited change with exercise-only interventions or study arms. CONCLUSIONS: Interventions leading to weight loss result in altered gene expression of adipokines and inflammatory markers in subcutaneous adipose tissue, while less change in gene expression was noted with exercise alone.