ABSTRACT
Zinc finger of the cerebellum (Zic) proteins act as classic transcription factors to promote transcription of the Foxd3 gene during neural crest cell specification. Additionally, they can act as co-factors that bind proteins from the T-cell factor/lymphoid enhancing factor (TCF/LEF) family (TCFs) to repress WNT-β-catenin-dependent transcription without contacting DNA. Here, we show that ZIC activity at the neural plate border is influenced by WNT-dependent SUMOylation. In the presence of high canonical WNT activity, a lysine residue within the highly conserved zinc finger N-terminally conserved (ZF-NC) domain of ZIC5 is SUMOylated, which reduces formation of the ZIC-TCF co-repressor complex and shifts the balance towards transcription factor function. The modification is crucial in vivo, as a ZIC5 SUMO-incompetent mouse strain exhibits neural crest specification defects. This work reveals the function of the ZF-NC domain within ZIC, provides in vivo validation of target protein SUMOylation and demonstrates that WNT-β-catenin signalling directs transcription at non-TCF DNA-binding sites. Furthermore, it can explain how WNT signals convert a broad region of Zic ectodermal expression into a restricted region of neural crest cell specification.
Subjects
Neural Crest, Sumoylation, Animals, Cell Differentiation, Mice, Neural Crest/metabolism, TCF Transcription Factors/metabolism, Transcription Factors/genetics, Transcription Factors/metabolism, beta Catenin/genetics, beta Catenin/metabolism
ABSTRACT
Aging is associated with significant changes in the hematopoietic system, including increased inflammation, impaired hematopoietic stem cell (HSC) function, and increased incidence of myeloid malignancy. Inflammation of aging ("inflammaging") has been proposed as a driver of age-related changes in HSC function and myeloid malignancy, but mechanisms linking these phenomena remain poorly defined. We identified loss of miR-146a as driving aging-associated inflammation in AML patients. miR-146a expression declined in old wild-type mice, and loss of miR-146a promoted premature HSC aging and inflammation in young miR-146a-null mice, preceding development of aging-associated myeloid malignancy. Using single-cell assays of HSC quiescence, stemness, differentiation potential, and epigenetic state to probe HSC function and population structure, we found that loss of miR-146a depleted a subpopulation of primitive, quiescent HSCs. DNA methylation and transcriptome profiling implicated NF-κB, IL6, and TNF as potential drivers of HSC dysfunction, activating an inflammatory signaling relay promoting IL6 and TNF secretion from mature miR-146a-/- myeloid and lymphoid cells. Reducing inflammation by targeting Il6 or Tnf was sufficient to restore single-cell measures of miR-146a-/- HSC function and subpopulation structure and reduced the incidence of hematological malignancy in miR-146a-/- mice. miR-146a-/- HSCs exhibited enhanced sensitivity to IL6 stimulation, indicating that loss of miR-146a affects HSC function via both cell-extrinsic inflammatory signals and increased cell-intrinsic sensitivity to inflammation. Thus, loss of miR-146a regulates cell-extrinsic and -intrinsic mechanisms linking HSC inflammaging to the development of myeloid malignancy.
Subjects
Aging/genetics, Inflammation/genetics, Interleukin-6/physiology, Acute Myeloid Leukemia/etiology, MicroRNAs/genetics, Tumor Necrosis Factor-alpha/physiology, Adolescent, Adult, Aged, Aging/immunology, Animals, Cell Differentiation, Cell Self Renewal, Cellular Senescence, Cytokines/biosynthesis, DNA Methylation, Female, Hematopoietic Stem Cells/metabolism, Hematopoietic Stem Cells/pathology, Humans, Inflammation/physiopathology, Interleukin-6/antagonists & inhibitors, Male, Mice, Knockout Mice, MicroRNAs/biosynthesis, Middle Aged, NF-kappa B/physiology, Single-Cell Analysis, Transcriptome, Tumor Necrosis Factor-alpha/antagonists & inhibitors, Young Adult
ABSTRACT
Overcoming drug resistance and targeting cancer stem cells remain challenges for curative cancer treatment. To investigate the role of microRNAs (miRNAs) in regulating drug resistance and leukemic stem cell (LSC) fate, we performed global transcriptome profiling in treatment-naive chronic myeloid leukemia (CML) stem/progenitor cells and identified that miR-185 levels anticipate their response to ABL tyrosine kinase inhibitors (TKIs). miR-185 functions as a tumor suppressor: its restored expression impaired survival of drug-resistant cells, sensitized them to TKIs in vitro, and markedly eliminated long-term repopulating LSCs and infiltrating blast cells, conferring a survival advantage in preclinical xenotransplantation models. Integrative analysis with mRNA profiles uncovered PAK6 as a crucial target of miR-185, and pharmacological inhibition of PAK6 perturbed the RAS/MAPK pathway and mitochondrial activity, sensitizing therapy-resistant cells to TKIs. Thus, miR-185 presents as a potential predictive biomarker, and dual targeting of miR-185-mediated PAK6 activity and BCR-ABL1 may provide a valuable strategy for overcoming drug resistance in patients.
Subjects
Antineoplastic Drug Resistance/genetics, BCR-ABL Positive Chronic Myelogenous Leukemia/genetics, MicroRNAs/genetics, Neoplastic Stem Cells/pathology, p21-Activated Kinases/genetics, Animals, Leukemic Gene Expression Regulation/genetics, Heterografts, Humans, BCR-ABL Positive Chronic Myelogenous Leukemia/drug therapy, BCR-ABL Positive Chronic Myelogenous Leukemia/metabolism, Mice, SCID Mice, MicroRNAs/metabolism, Neoplastic Stem Cells/metabolism, Protein Kinase Inhibitors/therapeutic use, Signal Transduction/physiology, p21-Activated Kinases/metabolism
ABSTRACT
Prenatal detection of structural variants of uncertain significance, including copy number variants (CNV), challenges genetic counseling, and creates ambiguity for expectant parents. In Duchenne muscular dystrophy, variant classification and phenotypic severity of CNVs are currently assessed by familial segregation, prediction of the effect on the reading frame, and precedent data. Delineation of pathogenicity by familial segregation is limited by time and suitable family members, whereas analytical tools can rapidly delineate potential consequences of variants. We identified a duplication of uncertain significance encompassing a portion of the dystrophin gene (DMD) in an unaffected mother and her male fetus. Using long-read whole genome sequencing and alignment of short reads, we rapidly defined the precise breakpoints of this variant in DMD and could provide timely counseling. The benign nature of the variant was substantiated, more slowly, by familial segregation to a healthy maternal uncle. We find long-read whole genome sequencing of clinical utility in a prenatal setting for accurate and rapid characterization of structural variants, specifically a duplication involving DMD.
Subjects
DNA Copy Number Variations, Dystrophin/genetics, Genetic Testing/methods, Duchenne Muscular Dystrophy/diagnosis, Duchenne Muscular Dystrophy/genetics, Prenatal Diagnosis/methods, Adult, Chromosome Breakpoints, Chromosome Duplication, Human X Chromosomes, Comparative Genomic Hybridization, Exons, Female, Genetic Association Studies/methods, Genetic Predisposition to Disease, Humans, Male, Pedigree, Pregnancy, DNA Sequence Analysis
ABSTRACT
Accurate reference genome sequences provide the foundation for modern molecular biology and genomics as the interpretation of sequence data to study evolution, gene expression, and epigenetics depends heavily on the quality of the genome assembly used for its alignment. Correctly organising sequenced fragments such as contigs and scaffolds in relation to each other is a critical and often challenging step in the construction of robust genome references. We previously identified misoriented regions in the mouse and human reference assemblies using Strand-seq, a single cell sequencing technique that preserves DNA directionality. Here we demonstrate the ability of Strand-seq to build and correct full-length chromosomes by identifying which scaffolds belong to the same chromosome and determining their correct order and orientation, without the need for overlapping sequences. We demonstrate that Strand-seq exquisitely maps assembly fragments into large related groups and chromosome-sized clusters without using new assembly data. Using template strand inheritance as a bi-allelic marker, we employ genetic mapping principles to cluster scaffolds that are derived from the same chromosome and order them within the chromosome based solely on directionality of DNA strand inheritance. We prove the utility of our approach by generating improved genome assemblies for several model organisms including the ferret, pig, Xenopus, zebrafish, Tasmanian devil and the guinea pig.
Subjects
High-Throughput Nucleotide Sequencing/methods, Single-Cell Analysis/methods, Whole Genome Sequencing/methods, Algorithms, Alleles, Animals, Base Sequence, Chromosome Mapping/methods, Chromosomes, Genomics/methods, Humans, DNA Sequence Analysis/methods, Software
ABSTRACT
PURPOSE: Structural variants (SVs) may be an underestimated cause of hereditary cancer syndromes given the current limitations of short-read next-generation sequencing. Here we investigated the utility of long-read sequencing in resolving germline SVs in cancer susceptibility genes detected through short-read genome sequencing. METHODS: Known or suspected deleterious germline SVs were identified using Illumina genome sequencing across a cohort of 669 advanced cancer patients with paired tumor genome and transcriptome sequencing. Candidate SVs were subsequently assessed by Oxford Nanopore long-read sequencing. RESULTS: Nanopore sequencing confirmed eight simple pathogenic or likely pathogenic SVs, resolving three additional variants whose impact could not be fully elucidated through short-read sequencing. A recurrent sequencing artifact on chromosome 16p13 and one complex rearrangement on chromosome 5q35 were subsequently classified as likely benign, obviating the need for further clinical assessment. Variant configuration was further resolved in one case with a complex pathogenic rearrangement affecting TSC2. CONCLUSION: Our findings demonstrate that long-read sequencing can improve the validation, resolution, and classification of germline SVs. This has important implications for return of results, cascade carrier testing, cancer screening, and prophylactic interventions.
Subjects
Genetic Predisposition to Disease, Neoplasms, Base Sequence, Genome, High-Throughput Nucleotide Sequencing, Humans
ABSTRACT
SUMMARY: Massively parallel sequencing is now widely used, but data interpretation is only as good as the reference assembly to which it is aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds, or as chromosome builds (consisting of correctly ordered, but not necessarily correctly oriented scaffolds separated by gaps). Completion of de novo assemblies remains difficult, as regions that are repetitive or hard to sequence prevent the accumulation of larger scaffolds, and create errors such as misorientations and mislocalizations. Thus, complementary methods for determining the orientation and positioning of fragments are important for finishing assemblies. Strand-seq is a method for determining template strand inheritance in single cells, information that can be used to determine relative genomic distance and orientation between scaffolds, and find errors within them. We present contiBAIT, an R/Bioconductor package which uses Strand-seq data to repair and improve existing assemblies. AVAILABILITY AND IMPLEMENTATION: contiBAIT is available on Bioconductor. Source files available from GitHub. CONTACT: koneill@bcgsc.ca or mark.hills@stemcell.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Human Genome, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods, Software, Algorithms, Genomics/methods, Humans
ABSTRACT
MOTIVATION: Deep profiling the phenotypic landscape of tissues using high-throughput flow cytometry (FCM) can provide important new insights into the interplay of cells in both healthy and diseased tissue. But often, especially in clinical settings, the cytometer cannot measure all the desired markers in a single aliquot. In these cases, tissue is separated into independently analysed samples, leaving a need to electronically recombine these to increase dimensionality. Nearest-neighbour (NN) based imputation fulfils this need but can produce artificial subpopulations. Clustering-based NN imputation can reduce these, but it requires prior domain knowledge to parameterize the clustering, so it is unsuited to discovery settings. RESULTS: We present flowBin, a parameterization-free method for combining multitube FCM data into a higher-dimensional form suitable for deep profiling and discovery. FlowBin allocates cells to bins defined by the common markers across tubes in a multitube experiment, then computes aggregate expression for each bin within each tube, to create a matrix of expression of all markers assayed in each tube. We show, using simulated multitube data, that flowType analysis of flowBin output reproduces the results of that same analysis on the original data for cell types of >10% abundance. We used flowBin in conjunction with classifiers to distinguish normal from cancerous cells. We used flowBin together with flowType and RchyOptimyx to profile the immunophenotypic landscape of NPM1-mutated acute myeloid leukemia, and present a series of novel cell types associated with that mutation.
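The binning step described in this abstract can be illustrated with a toy sketch. This is not the flowBin package (which is distributed for R/Bioconductor) and does not reproduce its API. The function names `assign_bin` and `combine_tubes`, the median split on each common marker, and the use of the mean as the aggregate are all assumptions made for illustration; the abstract does not specify how bins are drawn or which aggregate is used.

```python
from statistics import mean, median


def assign_bin(cell, common, cutoffs):
    """Hypothetical bin label: one 0/1 flag per common marker (at/below or above the cutoff)."""
    return tuple(int(cell[m] > cutoffs[m]) for m in common)


def combine_tubes(tubes, common):
    """Combine tubes that share the markers in `common`.

    tubes: dict mapping tube name -> list of cells (dict marker -> value).
    Returns dict mapping bin label -> dict of tube-specific marker -> mean value.
    """
    # Assumption: cutoffs are medians of the pooled common-marker values,
    # so every tube is binned with the same boundaries.
    cutoffs = {
        m: median(c[m] for cells in tubes.values() for c in cells)
        for m in common
    }
    combined = {}
    for cells in tubes.values():
        # Group this tube's cells by bin.
        by_bin = {}
        for c in cells:
            by_bin.setdefault(assign_bin(c, common, cutoffs), []).append(c)
        # Aggregate each tube-specific marker within each bin.
        for label, members in by_bin.items():
            row = combined.setdefault(label, {})
            for marker in members[0]:
                if marker not in common:
                    row[marker] = mean(c[marker] for c in members)
    return combined


# Toy data: both tubes measure CD45; CD3 and CD19 are tube-specific.
tubes = {
    "A": [{"CD45": 1.0, "CD3": 10.0}, {"CD45": 2.0, "CD3": 20.0},
          {"CD45": 8.0, "CD3": 2.0}, {"CD45": 9.0, "CD3": 4.0}],
    "B": [{"CD45": 1.5, "CD19": 5.0}, {"CD45": 2.5, "CD19": 7.0},
          {"CD45": 8.5, "CD19": 50.0}, {"CD45": 9.5, "CD19": 70.0}],
}
matrix = combine_tubes(tubes, common=["CD45"])
```

Each row of `matrix` is a bin and each column a tube-specific marker, so markers that were never measured on the same cell end up side by side at bin resolution rather than cell resolution.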
Subjects
Tumor Biomarkers/genetics, Flow Cytometry/methods, Acute Myeloid Leukemia/genetics, Mononuclear Leukocytes/metabolism, Mutation/genetics, Software, Case-Control Studies, Cell Lineage, Cell Separation, Humans, Immunophenotyping, Acute Myeloid Leukemia/pathology, Mononuclear Leukocytes/cytology, Nuclear Proteins/genetics, Nucleophosmin
ABSTRACT
We present a significantly improved version of the flowType and RchyOptimyx BioConductor-based pipeline that is both 14 times faster and can accommodate multiple levels of biomarker expression for up to 96 markers. With these improvements, the pipeline is positioned to be an integral part of data analysis for high-throughput experiments on high-dimensional single-cell assay platforms, including flow cytometry, mass cytometry and single-cell RT-qPCR.
Subjects
Flow Cytometry/methods, CD Antigens/analysis, Biomarkers/analysis, Software
ABSTRACT
Flow cytometry bioinformatics is the application of bioinformatics to flow cytometry data, which involves storing, retrieving, organizing, and analyzing flow cytometry data using extensive computational resources and tools. Flow cytometry bioinformatics requires extensive use of and contributes to the development of techniques from computational statistics and machine learning. Flow cytometry and related methods allow the quantification of multiple independent biomarkers on large numbers of single cells. The rapid growth in the multidimensionality and throughput of flow cytometry data, particularly in the 2000s, has led to the creation of a variety of computational analysis methods, data standards, and public databases for the sharing of results. Computational methods exist to assist in the preprocessing of flow cytometry data, identifying cell populations within it, matching those cell populations across samples, and performing diagnosis and discovery using the results of previous steps. For preprocessing, this includes compensating for spectral overlap, transforming data onto scales conducive to visualization and analysis, assessing data for quality, and normalizing data across samples and experiments. For population identification, tools are available to aid traditional manual identification of populations in two-dimensional scatter plots (gating), to use dimensionality reduction to aid gating, and to find populations automatically in higher dimensional space in a variety of ways. It is also possible to characterize data in more comprehensive ways, such as the density-guided binary space partitioning technique known as probability binning, or by combinatorial gating. Finally, diagnosis using flow cytometry data can be aided by supervised learning techniques, and discovery of new cell types of biological importance by high-throughput statistical methods, as part of pipelines incorporating all of the aforementioned methods. 
Open standards, data, and software are also key parts of flow cytometry bioinformatics. Data standards include the widely adopted Flow Cytometry Standard (FCS) defining how data from cytometers should be stored, but also several new standards under development by the International Society for Advancement of Cytometry (ISAC) to aid in storing more detailed information about experimental design and analytical steps. Open data is slowly growing with the opening of the CytoBank database in 2010 and FlowRepository in 2012, both of which allow users to freely distribute their data, and the latter of which has been recommended as the preferred repository for MIFlowCyt-compliant data by ISAC. Open software is most widely available in the form of a suite of Bioconductor packages, but is also available for web execution on the GenePattern platform.
Subjects
Computational Biology, Flow Cytometry, Cell Separation
ABSTRACT
The complexities of cancer genomes are becoming more easily interpreted due to advancements in sequencing technologies and improved bioinformatic analysis. Structural variants (SVs) represent an important subset of somatic events in tumors. While detection of SVs has been markedly improved by the development of long-read sequencing, somatic variant identification and annotation remains challenging. We hypothesized that use of a completed human reference genome (CHM13-T2T) would improve somatic SV calling. Our findings in a tumour/normal matched benchmark sample and two patient samples show that the CHM13-T2T improves SV detection and prioritization accuracy compared to GRCh38, with a notable reduction in false positive calls. We also overcame the lack of annotation resources for CHM13-T2T by lifting over CHM13-T2T-aligned reads to the GRCh38 genome, therefore combining both improved alignment and advanced annotations. In this process, we assessed the current SV benchmark set for COLO829/COLO829BL across four replicates sequenced at different centers with different long-read technologies. We discovered instability of this cell line across these replicates; 346 SVs (1.13%) were only discoverable in a single replicate. We identify 49 somatic SVs, which appear to be stable as they are consistently present across the four replicates. As such, we propose this consensus set as an updated benchmark for somatic SV calling and include both GRCh38 and CHM13-T2T coordinates in our benchmark. The benchmark is available at DOI 10.5281/zenodo.10819636. Our work demonstrates new approaches to optimize somatic SV prioritization in cancer with potential improvements in other genetic diseases.
ABSTRACT
The Long-Read Personalized OncoGenomics (POG) dataset comprises a cohort of 189 patient tumors and 41 matched normal samples sequenced using the Oxford Nanopore Technologies PromethION platform. This dataset from the POG program and the Marathon of Hope Cancer Centres Network includes DNA and RNA short-read sequence data, analytics, and clinical information. We show the potential of long-read sequencing for resolving complex cancer-related structural variants, viral integrations, and extrachromosomal circular DNA. Long-range phasing facilitates the discovery of allelically differentially methylated regions (aDMRs) and allele-specific expression, including recurrent aDMRs in the cancer genes RET and CDKN2A. Germline promoter methylation in MLH1 can be directly observed in Lynch syndrome. Promoter methylation in BRCA1 and RAD51C is a likely driver behind homologous recombination deficiency where no coding driver mutation was found. This dataset demonstrates applications for long-read sequencing in precision medicine and is available as a resource for developing analytical approaches using this technology.
ABSTRACT
MOTIVATION: Polychromatic flow cytometry (PFC) has enormous power as a tool to dissect complex immune responses (such as those observed in HIV disease) at a single cell level. However, analysis tools are severely lacking. Although high-throughput systems allow rapid data collection from large cohorts, manual data analysis can take months. Moreover, identification of cell populations can be subjective and analysts rarely examine the entirety of the multidimensional dataset (focusing instead on a limited number of subsets, the biology of which has usually already been well-described). Thus, the value of PFC as a discovery tool is largely wasted. RESULTS: To address this problem, we developed a computational approach that automatically reveals all possible cell subsets. From tens of thousands of subsets, those that correlate strongly with clinical outcome are selected and grouped. Within each group, markers that have minimal relevance to the biological outcome are removed, thereby distilling the complex dataset into the simplest, most clinically relevant subsets. This allows complex information from PFC studies to be translated into clinical or resource-poor settings, where multiparametric analysis is less feasible. We demonstrate the utility of this approach in a large (n=466), retrospective, 14-parameter PFC study of early HIV infection, where we identify three T-cell subsets that strongly predict progression to AIDS (only one of which was identified by an initial manual analysis). AVAILABILITY: The 'flowType: Phenotyping Multivariate PFC Assays' package is available through Bioconductor. Additional documentation and examples are available at: www.terryfoxlab.ca/flowsite/flowType/ SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: rbrinkman@bccrc.ca.
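The idea of revealing "all possible cell subsets" can be sketched as an exhaustive enumeration. The version below assumes each marker has already been thresholded to positive or negative for every cell, and that a subset is defined by marking each marker as positive, negative, or ignored, which gives 3^k candidate phenotypes for k markers. That scheme and the function name `enumerate_phenotypes` are assumptions made for illustration; this is not the flowType package API, and the later steps in the abstract (correlation with outcome, grouping, marker removal) are not shown.

```python
from itertools import product


def enumerate_phenotypes(cells, markers):
    """Count cells in every positive/negative/ignored marker combination.

    cells: list of dicts mapping marker -> bool (True = positive).
    Returns dict mapping phenotype label (e.g. "CD4+CD8-") -> cell count.
    """
    counts = {}
    # True = require positive, False = require negative, None = ignore marker.
    for states in product((True, False, None), repeat=len(markers)):
        label = "".join(
            m + ("+" if s else "-")
            for m, s in zip(markers, states)
            if s is not None
        )
        counts[label or "all cells"] = sum(
            all(s is None or cell[m] == s for m, s in zip(markers, states))
            for cell in cells
        )
    return counts


# Toy example with two markers, giving 3**2 = 9 phenotypes.
cells = [
    {"CD4": True, "CD8": False},
    {"CD4": True, "CD8": True},
    {"CD4": False, "CD8": False},
]
counts = enumerate_phenotypes(cells, ["CD4", "CD8"])
```

Because the number of phenotypes grows as 3^k, this brute-force loop is only practical for small panels; it is meant to show the combinatorial structure, not an efficient implementation.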
Subjects
Computational Biology/methods, Flow Cytometry, HIV Infections/immunology, T-Lymphocyte Subsets/immunology, Biomarkers/analysis, Humans, Immunophenotyping/methods, Predictive Value of Tests, Proportional Hazards Models, Retrospective Studies, T-Lymphocyte Subsets/cytology
ABSTRACT
Hundreds of loci in human genomes have alleles that are methylated differentially according to their parent of origin. These imprinted loci generally show little variation across tissues, individuals, and populations. We show that such loci can be used to distinguish the maternal and paternal homologs for all human autosomes without the need for the parental DNA. We integrate methylation-detecting nanopore sequencing with the long-range phase information in Strand-seq data to determine the parent of origin of chromosome-length haplotypes for both DNA sequence and DNA methylation in five trios with diverse genetic backgrounds. The parent of origin was correctly inferred for all autosomes with an average mismatch error rate of 0.31% for SNVs and 1.89% for insertions or deletions (indels). Because our method can determine whether an inherited disease allele originated from the mother or the father, we predict that it will improve the diagnosis and management of many genetic diseases.
ABSTRACT
We present a genome assembly of Caretta caretta (the loggerhead sea turtle; Chordata, Testudines, Cheloniidae), generated from genomic data from two unrelated females. The genome sequence is 2.13 gigabases in size. The assembly has a BUSCO completion score of 96.1% and N50 of 130.95 Mb. The majority of the assembly is scaffolded into 28 chromosomal representations with a remaining 2% of the assembly being excluded from these.
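For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of the standard definition follows: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the example lengths are illustrative only; the abstract does not say whether its figure refers to contigs or scaffolds.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Illustrative lengths only, not taken from the assembly described above.
example = n50([80, 70, 50, 40, 30, 20])
```

Comparing `2 * running` against `total` keeps the arithmetic in integers, which avoids rounding issues when the total length is odd.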
Subjects
Turtles, Animals, Female, Turtles/genetics, Reptiles, Genome, Genomics
ABSTRACT
Germline structural variants (SVs) are challenging to resolve by conventional genetic testing assays. Long-read sequencing has improved the global characterization of SVs, but its sensitivity at cancer susceptibility loci has not been reported. Nanopore long-read genome sequencing was performed for nineteen individuals with pathogenic copy number alterations in BRCA1, BRCA2, CHEK2 and PALB2 identified by prior clinical testing. Fourteen variants, which spanned single exons to whole genes and included a tandem duplication, were accurately represented. Defining the precise breakpoints of SVs in BRCA1 and CHEK2 revealed unforeseen allelic heterogeneity and informed the mechanisms underlying the formation of recurrent deletions. Integrating read-based and statistical phasing further helped define extended haplotypes associated with founder alleles. Long-read sequencing is a sensitive method for characterizing private, recurrent and founder SVs underlying breast cancer susceptibility. Our findings demonstrate the potential for nanopore sequencing as a powerful genetic testing assay in the hereditary cancer setting.
Subjects
Breast Neoplasms, Nanopore Sequencing, Nanopores, Humans, Female, Breast Neoplasms/genetics, Breast Neoplasms/pathology, Genetic Predisposition to Disease, Genetic Testing/methods
ABSTRACT
Human papillomavirus (HPV) integration has been implicated in transforming HPV infection into cancer, but its genomic consequences have been difficult to study using short-read technologies. To resolve the dysregulation associated with HPV integration, we performed long-read sequencing on 63 cervical cancer genomes. We identified six categories of integration events based on HPV-human genomic structures. Of all HPV integrants, defined as two HPV-human breakpoints bridged by an HPV sequence, 24% contained variable copies of HPV between the breakpoints, a phenomenon we termed heterologous integration. Analysis of DNA methylation within and in proximity to the HPV genome at individual integration events revealed relationships between methylation status of the integrant and its orientation and structure. Dysregulation of the human epigenome and neighboring gene expression in cis with the HPV-integrated allele was observed over megabase-ranges of the genome. By elucidating the structural, epigenetic, and allele-specific impacts of HPV integration, we provide insight into the role of integrated HPV in cervical cancer.
ABSTRACT
Analysis of high-dimensional flow cytometry datasets can reveal novel cell populations with poorly understood biology. Following discovery, characterization of these populations in terms of the critical markers involved is an important step, as this can help to both better understand the biology of these populations and aid in designing simpler marker panels to identify them on simpler instruments and with fewer reagents (i.e., in resource poor or highly regulated clinical settings). However, current tools to design panels based on the biological characteristics of the target cell populations work exclusively based on technical parameters (e.g., instrument configurations, spectral overlap, and reagent availability). To address this shortcoming, we developed RchyOptimyx (cellular hieraRCHY OPTIMization), a computational tool that constructs cellular hierarchies by combining automated gating with dynamic programming and graph theory to provide the best gating strategies to identify a target population to a desired level of purity or correlation with a clinical outcome, using the simplest possible marker panels. RchyOptimyx can assess and graphically present the trade-offs between marker choice and population specificity in high-dimensional flow or mass cytometry datasets. We present three proof-of-concept use cases for RchyOptimyx that involve 1) designing a panel of surface markers for identification of rare populations that are primarily characterized using their intracellular signature; 2) simplifying the gating strategy for identification of a target cell population; 3) identification of a non-redundant marker set to identify a target cell population.
Subjects
Bone Marrow Cells/cytology, Flow Cytometry/methods, Software, Algorithms, CD Antigens/analysis, CD Antigens/immunology, Biomarkers/analysis, Bone Marrow Cells/immunology, Computational Biology/methods, HIV Infections/immunology, Humans, Immunophenotyping/methods, Interleukin-7/immunology, Lipopolysaccharides/immunology, Phenotype, Staining and Labeling, T-Lymphocytes/cytology, T-Lymphocytes/immunology
ABSTRACT
Imprinting is a critical part of normal embryonic development in mammals, controlled by defined parent-of-origin (PofO) differentially methylated regions (DMRs) known as imprinting control regions. Direct nanopore sequencing of DNA provides a means to detect allelic methylation and to overcome the drawbacks of methylation array and short-read technologies. Here, we used publicly available nanopore sequencing data for 12 standard B-lymphocyte cell lines to acquire the genome-wide mapping of imprinted intervals in humans. Using the sequencing data, we were able to phase 95% of the human methylome and detect 94% of the previously well-characterized, imprinted DMRs. In addition, we found 42 novel imprinted DMRs (16 germline and 26 somatic), which were confirmed using whole-genome bisulfite sequencing (WGBS) data. Analysis of WGBS data in mouse (Mus musculus), rhesus monkey (Macaca mulatta), and chimpanzee (Pan troglodytes) suggested that 17 of these imprinted DMRs are conserved. Some of the novel imprinted intervals are within or close to imprinted genes without a known DMR. We also detected subtle parental methylation bias, spanning several kilobases at seven known imprinted clusters. At these blocks, hypermethylation occurs at the gene body of expressed allele(s) with mutually exclusive H3K36me3 and H3K27me3 allelic histone marks. These results expand upon our current knowledge of imprinting and the potential of nanopore sequencing to identify imprinting regions using only parent-offspring trios, as opposed to the large multi-generational pedigrees that have previously been required.