ABSTRACT
Deconvolution methods infer quantitative cell type estimates from bulk measurements of mixed samples, including blood and tissue. DNA methylation sequencing measures multiple CpGs per read, but few existing deconvolution methods leverage this within-read information. We develop CelFiE-ISH, which extends an existing method (CelFiE) to use within-read haplotype information. CelFiE-ISH outperforms CelFiE and other existing methods, achieving 30% better accuracy and more sensitive detection of rare cell types. We also demonstrate the importance of marker selection and of tailoring markers for haplotype-aware methods. While here we use gold-standard short-read sequencing data, haplotype-aware methods will be well-suited for long-read sequencing.
Subject(s)
DNA Methylation, Haplotypes, Humans, Models, Statistical, Sequence Analysis, DNA/methods, CpG Islands
ABSTRACT
Data from both bulk and single-cell whole-genome DNA methylation experiments are under-utilized in many ways. This is attributable to inefficient mapping of methylation sequencing reads, routinely discarded genetic information, and neglected read-level epigenetic and genetic linkage information. We introduce the BISulfite-seq Command line User Interface Toolkit (BISCUIT) and its companion R/Bioconductor package, biscuiteer, for simultaneous extraction of genetic and epigenetic information from bulk and single-cell DNA methylation sequencing. BISCUIT's performance, flexibility and standards-compliant output allow large, complex experimental designs to be characterized on clinical timescales. BISCUIT is particularly suited for processing data from single-cell DNA methylation assays, with its excellent scalability, efficiency, and ability to greatly enhance mappability, a key challenge for single-cell studies. We also introduce the epiBED format for single-molecule analysis of coupled epigenetic and genetic information, facilitating the study of cellular and tissue heterogeneity from DNA methylation sequencing.
Subject(s)
DNA Methylation, Epigenesis, Genetic, High-Throughput Nucleotide Sequencing, Software, Epigenomics, Sequence Analysis, DNA, Sulfites
ABSTRACT
The Oxford Nanopore (ONT) platform provides portable and rapid genome sequencing, and its ability to natively profile DNA methylation without complex sample processing is attractive for point-of-care real-time sequencing. We recently demonstrated ONT shallow whole-genome sequencing to detect copy number alterations (CNAs) from the circulating tumor DNA (ctDNA) of cancer patients. Here, we show that cell type and cancer-specific methylation changes can also be detected, as well as cancer-associated fragmentation signatures. This feasibility study suggests that ONT shallow WGS could be a powerful tool for liquid biopsy.
Subject(s)
Cell-Free Nucleic Acids, Circulating Tumor DNA, Nanopore Sequencing, Neoplasms, DNA Methylation, High-Throughput Nucleotide Sequencing, Humans, Neoplasms/genetics
ABSTRACT
Base editors are dedicated engineered deaminases that enable directed conversion of specific bases in the genome or transcriptome in a precise and efficient manner, and hold promise for correcting pathogenic mutations. A major concern limiting application of this powerful approach is off-target editing. Several recent studies have shown substantial off-target RNA activity induced by base editors and demonstrated that off-target mutations may be suppressed by improved deaminase versions or optimized guide RNAs. Here, we describe a new class of off-target events that are invisible to the established methods for detection of genomic variations and have thus far been overlooked. We show that nonspecific, seemingly stochastic, off-target events affect a large number of sites throughout the genome or the transcriptome, and account for the majority of off-target activity. We develop and employ a different, complementary approach that is sensitive to the stochastic off-target activity and use it to quantify the abundant off-target RNA mutations due to current, optimized deaminase editors. We provide a computational tool to quantify global off-target activity, which can be used to optimize future base editors. Engineered base editors enable directed manipulation of the genome or transcriptome at single-base resolution. We believe that implementation of this computational approach would facilitate design of more specific base editors.
ABSTRACT
Crystallographic structures of protein complexes are essential for developing proteomic and structural biology methods, such as prediction of protein-protein interaction (PPI) sites and protein-protein docking. Such structures can aid the development of protein complexation inhibitors. Complex DataBase (CDB), accessible at www.jct-bioinfo.com/cdb/search, is a database web application for heterodimeric protein crystallographic complexes along with the crystallographic structures of each individual unbound protein. Direct access to crystallographic structures of protein complexes, along with the provided annotations, can serve as a starting point for constructing new experimental protein complex sets of any type, for protein binding studies, and for the development and evaluation of PPI prediction methods.