ABSTRACT
We present Epiclomal, a probabilistic clustering method arising from a hierarchical mixture model to simultaneously cluster sparse single-cell DNA methylation data and impute missing values. Using synthetic and published single-cell CpG datasets, we show that Epiclomal outperforms non-probabilistic methods and can handle the pervasive missing data inherent to single-cell CpG sequencing. Using newly generated single-cell 5mCpG sequencing data, we show that Epiclomal discovers sub-clonal methylation patterns in aneuploid tumour genomes, thus defining epiclones that can match or transcend copy number-determined clonal lineages and opening up an important form of clonal analysis in cancer. Epiclomal is written in R and Python and is available at https://github.com/shahcompbio/Epiclomal.
Subject(s)
DNA Methylation; Single-Cell Analysis; Cluster Analysis; CpG Islands; Humans; Probability; Sequence Analysis, DNA/methods
ABSTRACT
ARID1A is the core DNA-binding subunit of the BAF chromatin remodeling complex and is mutated in about 8% of all cancers. The frequency of ARID1A loss varies between cancer subtypes, with clear cell ovarian carcinoma (CCOC) presenting the highest incidence at > 50% of cases. Despite a growing understanding of the consequences of ARID1A loss in cancer, there remain limited targeted therapeutic options for ARID1A-deficient cancers. Using a genome-wide CRISPR screening approach, we identify KEAP1 as a genetic dependency of ARID1A in CCOC. Depletion or chemical perturbation of KEAP1 results in selective growth inhibition of ARID1A-KO cell lines and edited primary endometrial epithelial cells. While we confirm that KEAP1-NRF2 signalling is dysregulated in ARID1A-KO cells, we suggest that this synthetic lethality is not due to aberrant NRF2 signalling. Rather, we find that KEAP1 perturbation exacerbates genome instability phenotypes associated with ARID1A deficiency. Together, our findings identify a potentially novel synthetic lethal interaction of ARID1A-deficient cells.
ABSTRACT
Synovial sarcoma (SyS) is an aggressive soft-tissue malignancy characterized by a pathognomonic chromosomal translocation leading to the formation of the SS18::SSX fusion oncoprotein. SS18::SSX associates with mammalian BAF complexes, suggesting deregulation of chromatin architecture as the oncogenic driver in this tumour type. To examine the epigenomic state of SyS, we performed comprehensive multi-omics analysis on 52 primary pre-treatment human SyS tumours. Our analysis revealed a continuum of epigenomic states across the cohort at fusion target genes, independent of rare somatic genetic lesions. We identify cell-of-origin signatures defined by enhancer states and reveal unexpected relationships between H2AK119Ub1 and active marks. The number of bivalent promoters, dually marked by the repressive H3K27me3 and activating H3K4me3 marks, has strong prognostic value and outperforms tumour grade in predicting patient outcome. Finally, we identify SyS-defining epigenomic features, including H3K4me3 expansion associated with striking promoter DNA hypomethylation, in which SyS displays the lowest mean methylation level of any sarcoma subtype. We explore these distinctive features as potential vulnerabilities in SyS and identify H3K4me3 inhibition as a promising therapeutic strategy.
ABSTRACT
The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN - VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts. The goal of VirusSeq was to allow open access to Canadian SARS-CoV-2 genomic sequences and enhanced, standardized contextual data that were unavailable in other repositories and that meet FAIR standards (Findable, Accessible, Interoperable and Reusable). In addition, the Portal data submission pipeline contains data quality checking procedures and appropriate acknowledgement of data generators that encourages collaboration. From inception to execution, the portal was developed with a conscientious focus on strong data governance principles and practices. Extensive efforts ensured a commitment to Canadian privacy laws, data security standards, and organizational processes. This Portal has been coupled with other resources like Viral AI and was further leveraged by the Coronavirus Variants Rapid Response Network (CoVaRR-Net) to produce a suite of continually updated analytical tools and notebooks. Here we highlight this Portal, including its contextual data not available elsewhere, and the 'Duotang', a web platform that presents key genomic epidemiology and modeling analyses on circulating and emerging SARS-CoV-2 variants in Canada. Duotang presents dynamic changes in variant composition of SARS-CoV-2 in Canada and by province, estimates variant growth, and displays complementary interactive visualizations, with a text overview of the current situation. 
The VirusSeq Data Portal and Duotang resources, alongside additional analyses and resources computed from the Portal (COVID-MVP, CoVizu), are all open-source and freely available. Together, they provide an updated picture of SARS-CoV-2 evolution to spur scientific discussions, inform public discourse, and support communication with and within public health authorities. They also serve as a framework for other jurisdictions interested in open, collaborative sequence data sharing and analyses.
ABSTRACT
The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform the public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN - VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts. The goal of VirusSeq was to allow open access to Canadian SARS-CoV-2 genomic sequences and enhanced, standardized contextual data that were unavailable in other repositories and that meet FAIR standards (Findable, Accessible, Interoperable and Reusable). In addition, the portal data submission pipeline contains data quality checking procedures and appropriate acknowledgement of data generators that encourages collaboration. From inception to execution, the portal was developed with a conscientious focus on strong data governance principles and practices. Extensive efforts ensured a commitment to Canadian privacy laws, data security standards, and organizational processes. This portal has been coupled with other resources, such as Viral AI, and was further leveraged by the Coronavirus Variants Rapid Response Network (CoVaRR-Net) to produce a suite of continually updated analytical tools and notebooks. Here we highlight this portal (https://virusseq-dataportal.ca/), including its contextual data not available elsewhere, and the Duotang (https://covarr-net.github.io/duotang/duotang.html), a web platform that presents key genomic epidemiology and modelling analyses on circulating and emerging SARS-CoV-2 variants in Canada. 
Duotang presents dynamic changes in variant composition of SARS-CoV-2 in Canada and by province, estimates variant growth, and displays complementary interactive visualizations, with a text overview of the current situation. The VirusSeq Data Portal and Duotang resources, alongside additional analyses and resources computed from the portal (COVID-MVP, CoVizu), are all open source and freely available. Together, they provide an updated picture of SARS-CoV-2 evolution to spur scientific discussions, inform public discourse, and support communication with and within public health authorities. They also serve as a framework for other jurisdictions interested in open, collaborative sequence data sharing and analyses.
Subject(s)
COVID-19; Genome, Viral; SARS-CoV-2; Canada/epidemiology; SARS-CoV-2/genetics; Humans; COVID-19/epidemiology; COVID-19/virology; Genomics/methods; Pandemics; Databases, Genetic
ABSTRACT
The COVID-19 pandemic has highlighted the need for generic reagents and flexible systems in diagnostic testing. Magnetic bead-based nucleic acid extraction protocols using 96-well plates on open liquid handlers are readily amenable to meet this need. Here, one such approach is rigorously optimized to minimize cross-well contamination while maintaining sensitivity.