ABSTRACT
Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions <2 kbp form by twin-priming during L1 retrotransposition; 80% of the larger inversions are balanced and affect twice as many nucleotides as CNVs. Balanced inversions show an excess of common variants, and 72% are flanked by segmental duplications (SDs) or retrotransposons. Since flanking repeats promote non-allelic homologous recombination, we developed complementary approaches to identify recurrent inversion formation. We describe 40 recurrent inversions encompassing 0.6% of the genome, showing inversion rates up to 2.7 × 10⁻⁴ per locus per generation. Recurrent inversions exhibit a sex-chromosomal bias and co-localize with genomic disorder critical regions. We propose that inversion recurrence results in an elevated number of heterozygous carriers and structural SD diversity, which increases mutability in the population and predisposes specific haplotypes to disease-causing CNVs.
Subject(s)
Chromosome Inversion , Segmental Duplications, Genomic , Chromosome Inversion/genetics , DNA Copy Number Variations/genetics , Genome, Human , Genomics , Humans
ABSTRACT
The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date1 and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes2. Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established1 boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.
Subject(s)
Chromosomes, Human, Y , Evolution, Molecular , Humans , Male , Chromosomes, Human, Y/genetics , Genome, Human/genetics , Genomics , Mutation Rate , Phenotype , Euchromatin/genetics , Pseudogenes , Genetic Variation/genetics , Chromosomes, Human, X/genetics , Pseudoautosomal Regions/genetics
ABSTRACT
The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society1,2. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals3,4. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome5. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity6. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yields the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent-child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.
Subject(s)
Chromosome Mapping , Diploidy , Genome, Human , Genomics , Humans , Chromosome Mapping/standards , Genome, Human/genetics , Haplotypes/genetics , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards , Reference Standards , Genomics/methods , Genomics/standards , Chromosomes, Human/genetics , Genetic Variation/genetics
ABSTRACT
Advances in long-read sequencing (LRS) technologies continue to make whole-genome sequencing more complete, affordable, and accurate. LRS provides significant advantages over short-read sequencing approaches, including phased de novo genome assembly, access to previously excluded genomic regions, and discovery of more complex structural variants (SVs) associated with disease. Limitations remain with respect to cost, scalability, and platform-dependent read accuracy, and the tradeoffs between sequence coverage and sensitivity of variant discovery are important experimental considerations for the application of LRS. We compare the genetic variant-calling precision and recall of Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) HiFi platforms over a range of sequence coverages. For read-based applications, LRS sensitivity begins to plateau around 12-fold coverage with a majority of variants called with reasonable accuracy (F1 score above 0.5), and both platforms perform well for SV detection. Genome assembly increases variant-calling precision and recall of SVs and indels in HiFi data sets, with HiFi outperforming ONT in quality as measured by the F1 score of assembly-based variant call sets. While both technologies continue to evolve, our work offers guidance to design cost-effective experimental strategies that do not compromise on discovering novel biology.
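The F1 score cited in this abstract is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not the authors' benchmarking code; the function name and the count-based inputs are illustrative assumptions.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall for a variant call set.

    Hypothetical helper: the inputs are counts of calls compared
    against a truth set (names chosen here for illustration).
    """
    called = true_positives + false_positives   # all variants the caller reported
    truth = true_positives + false_negatives    # all variants in the truth set
    precision = true_positives / called if called else 0.0
    recall = true_positives / truth if truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Under this definition, a call set with precision 0.5 and recall 0.5 sits exactly at the 0.5 threshold mentioned in the abstract.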
Subject(s)
Genomics , Nanopores , INDEL Mutation , Whole Genome Sequencing
ABSTRACT
There has been tremendous progress in phased genome assembly production by combining long-read data with parental information or linked-read data. Nevertheless, a typical phased genome assembly generated by trio-hifiasm still generates more than 140 gaps. We perform a detailed analysis of gaps, assembly breaks, and misorientations from 182 haploid assemblies obtained from a diversity panel of 77 unique human samples. Although trio-based approaches using HiFi are the current gold standard, chromosome-wide phasing accuracy is comparable when using Strand-seq instead of parental data. Importantly, the majority of assembly gaps cluster near the largest and most identical repeats (including segmental duplications [35.4%], satellite DNA [22.3%], or regions enriched in GA/AT-rich DNA [27.4%]). Consequently, 1513 protein-coding genes overlap assembly gaps in at least one haplotype, and 231 are recurrently disrupted or missing from five or more haplotypes. Furthermore, we estimate that 6-7 Mbp of DNA are misorientated per haplotype irrespective of whether trio-free or trio-based approaches are used. Of these misorientations, 81% correspond to bona fide large inversion polymorphisms in the human species, most of which are flanked by large segmental duplications. We also identify large-scale alignment discontinuities consistent with 11.9 Mbp of deletions and 161.4 Mbp of insertions per haploid genome. Although 99% of this variation corresponds to satellite DNA, we identify 230 regions of euchromatic DNA with frequent expansions and contractions, nearly half of which overlap with 197 protein-coding genes. Such variable and incompletely assembled regions are important targets for future algorithmic development and pangenome representation.
Subject(s)
DNA, Satellite , Polymorphism, Genetic , Humans , DNA, Satellite/genetics , Haplotypes , Segmental Duplications, Genomic , Sequence Analysis, DNA
ABSTRACT
Targeted inhibition of mitogen-activated protein kinase (MAPK) kinase (MEK) can induce regression of tumors bearing activating mutations in the Ras pathway but rarely leads to tumor eradication. Although combining MEK inhibition with T-cell-directed immunotherapy might lead to more durable efficacy, T cell responses are themselves at least partially dependent on MEK activity. We show here that MEK inhibition did profoundly block naive CD8(+) T cell priming in tumor-bearing mice, but actually increased the number of effector-phenotype antigen-specific CD8(+) T cells within the tumor. MEK inhibition protected tumor-infiltrating CD8(+) T cells from death driven by chronic TCR stimulation while sparing cytotoxic activity. Combining MEK inhibition with anti-programmed death-ligand 1 (PD-L1) resulted in synergistic and durable tumor regression even where either agent alone was only modestly effective. Thus, despite the central importance of the MAP kinase pathway in some aspects of T cell function, MEK-targeted agents can be compatible with T-cell-dependent immunotherapy.
Subject(s)
B7-H1 Antigen/immunology , CD8-Positive T-Lymphocytes/immunology , Carcinoma/therapy , Colonic Neoplasms/therapy , Immunotherapy , Animals , Antibodies, Monoclonal/administration & dosage , Apoptosis , Azetidines/administration & dosage , Azetidines/pharmacology , CD8-Positive T-Lymphocytes/drug effects , Carcinoma/immunology , Cell Cycle Checkpoints/drug effects , Cell Line, Tumor , Colonic Neoplasms/immunology , Drug Synergism , Drug Therapy , Drug Therapy, Combination , Extracellular Signal-Regulated MAP Kinases , Humans , Lymphocyte Activation/drug effects , Mice , Mice, Inbred BALB C , Molecular Targeted Therapy , Neoplasm Transplantation , Piperidines/administration & dosage , Piperidines/pharmacology
ABSTRACT
The impact of epigenetics on the differentiation of memory T (Tmem) cells is poorly defined. We generated deep epigenomes comprising genome-wide profiles of DNA methylation, histone modifications, DNA accessibility, and coding and non-coding RNA expression in naive, central-, effector-, and terminally differentiated CD45RA+ CD4+ Tmem cells from blood and CD69+ Tmem cells from bone marrow (BM-Tmem). We observed a progressive and proliferation-associated global loss of DNA methylation in heterochromatic parts of the genome during Tmem cell differentiation. Furthermore, distinct gradually changing signatures in the epigenome and the transcriptome supported a linear model of memory development in circulating T cells, while tissue-resident BM-Tmem branched off with a unique epigenetic profile. Integrative analyses identified candidate master regulators of Tmem cell differentiation, including the transcription factor FOXP1. This study highlights the importance of epigenomic changes for Tmem cell biology and demonstrates the value of epigenetic data for the identification of lineage regulators.
Subject(s)
CD4-Positive T-Lymphocytes/immunology , Cell Differentiation/immunology , Epigenesis, Genetic/immunology , Epigenomics/methods , Immunologic Memory/immunology , Female , Flow Cytometry , Gene Expression Profiling/methods , Humans , Machine Learning , Polymerase Chain Reaction , Transcriptome
ABSTRACT
The binding of T cell antigen receptors (TCRs) to specific complexes of peptide and major histocompatibility complex (pMHC) is typically of very low affinity, which necessitates the use of multimeric pMHC complexes to label T lymphocytes stably. We report here the development of pMHC complexes able to be crosslinked by ultraviolet irradiation; even as monomers, these efficiently and specifically stained cognate T cells. We also used this reagent to probe T cell activation and found that a covalently bound pMHC was more stimulatory than an agonist pMHC on lipid bilayers. This finding suggested that serial engagement of TCRs is dispensable for activation when a substantial fraction of TCRs are stably engaged. Finally, pMHC-bound TCRs were 'preferentially' transported into the central supramolecular activation cluster after activation, which suggested that ligand engagement enabled linkage of the TCR and its associated CD3 signaling molecules to the cytoskeleton.
Subject(s)
Cross-Linking Reagents/chemistry , Major Histocompatibility Complex/immunology , Receptors, Antigen, T-Cell/chemistry , T-Lymphocytes/chemistry , Animals , CD3 Complex/chemistry , CD3 Complex/immunology , Cells, Cultured , Coloring Agents/chemistry , Cytoskeleton/chemistry , Cytoskeleton/immunology , Lymphocyte Activation , Mice , Mice, Transgenic , Receptors, Antigen, T-Cell/immunology , Signal Transduction/immunology , T-Lymphocytes/immunology
ABSTRACT
It has long been thought that clonal deletion efficiently removes almost all self-specific T cells from the peripheral repertoire. We found that self-peptide MHC-specific CD8(+) T cells in the blood of healthy humans were present in frequencies similar to those specific for non-self antigens. For the Y chromosome-encoded SMCY antigen, self-specific T cells exhibited only a 3-fold lower average frequency in males versus females and were anergic with respect to peptide activation, although this inhibition could be overcome by a stronger stimulus. We conclude that clonal deletion prunes but does not eliminate self-specific T cells and suggest that to do so would create holes in the repertoire that pathogens could readily exploit. In support of this hypothesis, we detected T cells specific for all 20 amino acid variants at the p5 position of a hepatitis C virus epitope in a random group of blood donors.
Subject(s)
CD8-Positive T-Lymphocytes/cytology , CD8-Positive T-Lymphocytes/immunology , Clonal Deletion , Animals , Antigenic Variation , Female , Flow Cytometry , Humans , Male , Mice , Receptors, Antigen, T-Cell, alpha-beta/genetics , Self Tolerance/immunology
ABSTRACT
MOTIVATION: The generation of genome-wide maps of histone modifications using chromatin immunoprecipitation sequencing is a standard approach to dissect the complexity of the epigenome. Interpretation and differential analysis of histone datasets remain challenging due to regulatorily meaningful co-occurrences of histone marks and differences in their genomic spread. To ease interpretation, chromatin state segmentation maps are a commonly employed abstraction combining individual histone marks. We developed the tool SCIDDO as a fast, flexible and statistically sound method for the differential analysis of chromatin state segmentation maps. RESULTS: We demonstrate the utility of SCIDDO in a comparative analysis that identifies differential chromatin domains (DCDs) in various regulatory contexts and with only moderate computational resources. We show that the identified DCDs correlate well with observed changes in gene expression and can recover a substantial number of differentially expressed genes (DEGs). We showcase SCIDDO's ability to directly interrogate chromatin dynamics, such as enhancer switches in downstream analysis, which simplifies exploring specific questions about regulatory changes in chromatin. By comparing SCIDDO to competing methods, we provide evidence that SCIDDO's performance in identifying DEGs via differential chromatin marking is more stable across a range of cell-type comparisons and parameter cut-offs. AVAILABILITY AND IMPLEMENTATION: The SCIDDO source code is openly available under github.com/ptrebert/sciddo. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Chromatin , Chromosomes , Chromatin Immunoprecipitation , Genome , Histone Code
ABSTRACT
SUMMARY: Single-cell DNA template strand sequencing (Strand-seq) enables chromosome length haplotype phasing, construction of phased assemblies, mapping sister-chromatid exchange events and structural variant discovery. The initial quality control of potentially thousands of single-cell libraries is still done manually by domain experts. ASHLEYS automates this tedious task, delivers near-expert performance and labels even large datasets in seconds. AVAILABILITY AND IMPLEMENTATION: github.com/friendsofstrandseq/ashleys-qc, MIT license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT
Thymic positive selection is based on the interactions of T cell antigen receptors (TCRs) with self peptide-major histocompatibility complex (MHC) ligands, but the identity of selecting peptides for MHC class II-restricted TCRs and the functional consequences of this peptide specificity are not clear. Here we identify several endogenous self peptides that positively selected the MHC class II-restricted 5C.C7 TCR. The most potent of these also enhanced mature T cell activation, which supports the hypothesis that one function of positive selection is to produce T cells that can use particular self peptide-MHC complexes for activation and/or homeostasis. We also show that inhibiting the microRNA miR-181a resulted in maturation of T cells that overtly reacted toward these erstwhile positively selecting peptides. Therefore, miR-181a helps to guarantee the clonal deletion of particular moderate-affinity clones by modulating the TCR signaling threshold of thymocytes.
Subject(s)
Histocompatibility Antigens Class II/immunology , Lymphocyte Activation , MicroRNAs/immunology , Peptides/immunology , T-Lymphocytes/immunology , Animals , Cells, Cultured , Clonal Deletion , Gene Expression Regulation , Mice , Mice, Knockout , Receptors, Antigen, T-Cell/immunology , T-Lymphocytes/cytology , Thymus Gland/cytology , Thymus Gland/immunology
ABSTRACT
Temporal data on gene expression and context-specific open chromatin states can improve identification of key transcription factors (TFs) and the gene regulatory networks (GRNs) controlling cellular differentiation. However, their integration remains challenging. Here, we delineate a general approach for data-driven and unbiased identification of key TFs and dynamic GRNs, called EPIC-DREM. We generated time-series transcriptomic and epigenomic profiles during differentiation of the mouse multipotent bone marrow stromal cell line (ST2) toward adipocytes and osteoblasts. Using our novel approach, we constructed time-resolved GRNs for both lineages and identified the shared TFs involved in both differentiation processes. To take an alternative approach to prioritize the identified shared regulators, we mapped dynamic super-enhancers in both lineages and associated them with target genes with correlated expression profiles. The combination of the two approaches identified aryl hydrocarbon receptor (AHR) and Glis family zinc finger 1 (GLIS1) as mesenchymal key TFs controlled by dynamic cell type-specific super-enhancers that become repressed in both lineages. AHR and GLIS1 control differentiation-induced genes and their overexpression can inhibit the lineage commitment of the multipotent bone marrow-derived ST2 cells.
Subject(s)
DNA-Binding Proteins/metabolism , Enhancer Elements, Genetic , Mesenchymal Stem Cells/metabolism , Receptors, Aryl Hydrocarbon/metabolism , Transcription Factors/metabolism , Adipocytes/metabolism , Animals , Cell Differentiation/genetics , Cell Line , Cell Lineage/genetics , Gene Regulatory Networks , Mesenchymal Stem Cells/cytology , Mice , Osteoblasts/metabolism
ABSTRACT
Chromatin accessibility maps are important for the functional interpretation of the genome. Here, we systematically analysed assay-specific differences between DNase I-seq, ATAC-seq and NOMe-seq in a side-by-side experimental and bioinformatic setup. We observe that the most prominent nucleosome-depleted regions (NDRs, e.g. in promoters) are robustly called by all three or at least two assays. However, we also find a high proportion of assay-specific NDRs that are often 'called' by only one of the assays. We show evidence that these assay-specific NDRs are indeed genuine open chromatin sites and contribute important information for accurate gene expression prediction. While technically ATAC-seq and DNase I-seq provide a high NDR calling rate for relatively low sequencing costs in comparison to NOMe-seq, NOMe-seq stands out for its genome-wide coverage, which allows detection not only of NDRs but also of endogenous DNA methylation and, as we show here, genome-wide segmentation into heterochromatic B domains and local phasing of nucleosomes outside of NDRs. In summary, our comparisons strongly suggest considering assay-specific differences for the experimental design and for generalized and comparative functional interpretations.
Subject(s)
Chromatin Immunoprecipitation Sequencing/methods , Chromatin Immunoprecipitation Sequencing/standards , Hep G2 Cells , Humans , Nucleosomes/chemistry , Nucleosomes/metabolism , Promoter Regions, Genetic
ABSTRACT
SUMMARY: Prediction of transcription factor (TF) binding from epigenetics data and integrative analysis thereof are challenging. Here, we present TEPIC 2, a framework allowing for fast, accurate and versatile prediction and analysis of TF binding from epigenetics data: it supports 30 species with binding motifs, computes TF gene and scores up to two orders of magnitude faster than before due to improved implementation, and offers easy-to-use machine learning pipelines for integrated analysis of TF binding predictions with gene expression data, allowing the identification of important TFs. AVAILABILITY AND IMPLEMENTATION: TEPIC is implemented in C++, R, and Python. It is freely available at https://github.com/SchulzLab/TEPIC and can be used on Linux-based systems. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Epigenomics , Animals , Binding Sites , Humans , Mice , Protein Binding , Transcription Factors , Triazines
ABSTRACT
The binding and contribution of transcription factors (TFs) to cell-specific gene expression are often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, histone marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find that low-affinity binding sites improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively.
Subject(s)
Chromatin/metabolism , DNA/genetics , Gene Expression Regulation , Histones/genetics , Machine Learning , Transcription Factors/genetics , Algorithms , Binding Sites , CD4-Positive T-Lymphocytes/cytology , CD4-Positive T-Lymphocytes/metabolism , Cell Line , Cell Line, Tumor , Chromatin/chemistry , Chromatin Assembly and Disassembly , DNA/metabolism , Hep G2 Cells , Hepatocytes/cytology , Hepatocytes/metabolism , Histones/metabolism , Human Embryonic Stem Cells/cytology , Human Embryonic Stem Cells/metabolism , Humans , K562 Cells , Organ Specificity , Primary Cell Culture , Principal Component Analysis , Protein Binding , Transcription Factors/metabolism
ABSTRACT
Immature double-positive (CD4(+)CD8(+)) thymocytes respond to negatively selecting peptide-MHC ligands by forming an immune synapse that sustains contact with the antigen-presenting cell (APC). Using fluorescently labeled peptides, we showed that as few as two agonist ligands could promote APC contact and subsequent apoptosis in reactive thymocytes. Furthermore, we showed that productive signaling for positive selection, as gauged by nuclear translocation of a green fluorescent protein (GFP)-labeled NFATc construct, did not involve formation of a synapse between thymocytes and selecting epithelial cells in reaggregate thymus cultures. Antibody blockade of endogenous positively selecting ligands prevented NFAT nuclear accumulation in such cultures and reversed NFAT accumulation in previously stimulated thymocytes. Together, these data suggest a "gauntlet" model in which thymocytes mature by continually acquiring and reacquiring positively selecting signals without sustained contact with epithelial cells, thereby allowing them to sample many cell surfaces for potentially negatively selecting ligands.
Subject(s)
Antigen-Presenting Cells/immunology , Immunological Synapses , NFATC Transcription Factors/immunology , Receptors, Antigen, T-Cell/immunology , T-Lymphocyte Subsets/immunology , Active Transport, Cell Nucleus , Animals , Antigen-Presenting Cells/metabolism , Apoptosis , Cell Nucleus/metabolism , Gene Knockdown Techniques , Ligands , Lymphocyte Activation , Major Histocompatibility Complex/immunology , Mice , NFATC Transcription Factors/metabolism , Receptors, Antigen, T-Cell/metabolism , Signal Transduction , T-Lymphocyte Subsets/cytology , T-Lymphocyte Subsets/metabolism , Thymus Gland/cytology , Thymus Gland/immunology , Thymus Gland/metabolism
ABSTRACT
Recent data suggest important biological roles for oxidative modifications of methylated cytosines, specifically hydroxymethylation, formylation and carboxylation. Several assays are now available for profiling these DNA modifications genome-wide as well as in targeted, locus-specific settings. Here we present BiQ Analyzer HiMod, a user-friendly software tool for sequence alignment, quality control and initial analysis of locus-specific DNA modification data. The software supports four different assay types, and it leads the user from raw sequence reads to DNA modification statistics and publication-quality plots. BiQ Analyzer HiMod combines the well-established graphical user interface of its predecessor tool, BiQ Analyzer HT, with new and extended analysis modes. BiQ Analyzer HiMod also includes updates of the analysis workspace, an intuitive interface, a custom vector graphics engine and support for additional input and output data formats. The tool is freely available as a stand-alone installation package from http://biq-analyzer-himod.bioinf.mpi-inf.mpg.de/.
Subject(s)
5-Methylcytosine/analysis , DNA Methylation , Software , Animals , DNA/chemistry , Genetic Loci , High-Throughput Nucleotide Sequencing , Internet , Mice , Oxidation-Reduction , Sequence Analysis, DNA
ABSTRACT
Naïve T cells can be induced to differentiate into Foxp3(+) regulatory T cells (iTregs) upon suboptimal T cell receptor (TCR) stimulus or TCR stimulus in conjunction with TGF-β signaling; however, we do not fully understand how these signals coordinately control foxp3 expression. Here, we show that strong TCR activation, in terms of both duration and ligand affinity, causes the accumulation of DNA (cytosine-5)-methyltransferase 1 (DNMT1) and DNMT3b and their specific enrichment at the foxp3 locus, which leads to increased CpG methylation and inhibits foxp3 transcription. During this process the augmentation of DNMT1 is regulated through at least two post-transcriptional mechanisms; that is, strong TCR signal inactivates GSK3β to rescue DNMT1 protein from proteasomal degradation, and strong TCR signal suppresses miR-148a to derepress DNMT1 mRNA translation. Meanwhile, TGF-β signaling antagonizes DNMT1 accumulation via activation of p38 MAP kinase. Thus, independent of transcription factor activation, TCR and TGF-β signals converge on DNMT1 to modulate the expression of foxp3 epigenetically, which marks mother cell iTreg lineage choice within the genome of differentiating daughter cells.
Subject(s)
Cell Differentiation , DNA (Cytosine-5-)-Methyltransferases/metabolism , Forkhead Transcription Factors/metabolism , Receptors, Antigen, T-Cell/metabolism , Signal Transduction , Transforming Growth Factor beta/metabolism , Animals , Cell Proliferation , CpG Islands , Epigenesis, Genetic , Humans , Mice , Mice, Inbred C57BL , Mice, Transgenic , MicroRNAs/metabolism , Promoter Regions, Genetic , T-Lymphocytes, Regulatory/metabolism
ABSTRACT
Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de-novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de-novo haplotypes for diploid genomes. Graphasing readily integrates with any assembly workflow that both outputs an assembly graph and has a haplotype assembly mode. Graphasing performs comparably to trio-phasing in contiguity, phasing accuracy, and assembly quality, outperforms Hi-C in phasing accuracy, and generates human assemblies with over 18 chromosome-spanning haplotypes.