Results 1 - 20 of 222
1.
Cell ; 185(20): 3689-3704.e21, 2022 09 29.
Article in English | MEDLINE | ID: mdl-36179666

ABSTRACT

Regulatory landscapes drive complex developmental gene expression, but it remains unclear how their integrity is maintained when incorporating novel genes and functions during evolution. Here, we investigated how a placental mammal-specific gene, Zfp42, emerged in an ancient vertebrate topologically associated domain (TAD) without adopting or disrupting the conserved expression of its gene, Fat1. In ESCs, physical TAD partitioning separates Zfp42 and Fat1 with distinct local enhancers that drive their independent expression. This separation is driven by chromatin activity and not CTCF/cohesin. In contrast, in embryonic limbs, inactive Zfp42 shares Fat1's intact TAD without responding to active Fat1 enhancers. However, neither Fat1 enhancer-incompatibility nor nuclear envelope-attachment account for Zfp42's unresponsiveness. Rather, Zfp42's promoter is rendered inert to enhancers by context-dependent DNA methylation. Thus, diverse mechanisms enabled the integration of independent Zfp42 regulation in the Fat1 locus. Critically, such regulatory complexity appears common in evolution as, genome wide, most TADs contain multiple independently expressed genes.


Subject(s)
Chromatin , Placenta , Animals , CCCTC-Binding Factor/metabolism , Chromatin Assembly and Disassembly , Enhancer Elements, Genetic , Evolution, Molecular , Female , Genome , Mammals/metabolism , Placenta/metabolism , Pregnancy , Promoter Regions, Genetic , Transcription Factors/genetics , Transcription Factors/metabolism
2.
Cell ; 178(1): 242-260.e29, 2019 06 27.
Article in English | MEDLINE | ID: mdl-31155234

ABSTRACT

Gene expression in human tissue has primarily been studied on the transcriptional level, largely neglecting translational regulation. Here, we analyze the translatomes of 80 human hearts to identify new translation events and quantify the effect of translational regulation. We show extensive translational control of cardiac gene expression, which is orchestrated in a process-specific manner. Translation downstream of predicted disease-causing protein-truncating variants appears to be frequent, suggesting inefficient translation termination. We identify hundreds of previously undetected microproteins, expressed from lncRNAs and circRNAs, for which we validate the protein products in vivo. The translation of microproteins is not restricted to the heart and is prominent in the translatomes of human kidney and liver. We associate these microproteins with diverse cellular processes and compartments and find that many locate to the mitochondria. Importantly, dozens of microproteins are translated from lncRNAs with well-characterized noncoding functions, indicating previously unrecognized biology.


Subject(s)
Myocardium/metabolism , Protein Biosynthesis , Adolescent , Adult , Aged , Animals , Codon/genetics , Female , Gene Expression Regulation , HEK293 Cells , Humans , Infant , Male , Mice , Mice, Inbred C57BL , Middle Aged , Open Reading Frames/genetics , RNA, Circular/genetics , RNA, Circular/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Rats , Ribosomes/genetics , Ribosomes/metabolism , Young Adult
3.
Development ; 150(17)2023 09 01.
Article in English | MEDLINE | ID: mdl-37519269

ABSTRACT

Changes in gene expression represent an important source of phenotypic innovation. Yet how such changes emerge and impact the evolution of traits remains elusive. Here, we explore the molecular mechanisms associated with the development of masculinizing ovotestes in female moles. By performing integrative analyses of epigenetic and transcriptional data in mole and mouse, we identified the co-option of SALL1 expression in mole ovotestes formation. Chromosome conformation capture analyses highlight a striking conservation of the 3D organization at the SALL1 locus, but an evolutionary divergence of enhancer activity. Interspecies reporter assays support the capability of mole-specific enhancers to activate transcription in urogenital tissues. Through overexpression experiments in transgenic mice, we further demonstrate the capability of SALL1 to induce kidney-related gene programs, which are a signature of mole ovotestes. Our results highlight the co-option of gene expression, through changes in enhancer activity, as a plausible mechanism for the evolution of traits.


Subject(s)
Kidney , Moles , Animals , Female , Mice , Kidney/metabolism , Mice, Transgenic , Moles/genetics
4.
Nucleic Acids Res ; 52(13): e57, 2024 Jul 22.
Article in English | MEDLINE | ID: mdl-38850160

ABSTRACT

A fundamental analysis task for single-cell transcriptomics data is clustering with subsequent visualization of cell clusters. The genes responsible for the clustering are only inferred in a subsequent step. Clustering cells and genes together would be the remit of biclustering algorithms, which are often bogged down by the size of single-cell data. Here we present 'Correspondence Analysis based Biclustering on Networks' (CAbiNet) for joint clustering and visualization of single-cell RNA-sequencing data. CAbiNet performs efficient co-clustering of cells and their respective marker genes and jointly visualizes the biclusters in a non-linear embedding for easy and interactive visual exploration of the data.


Subject(s)
Algorithms , Gene Expression Profiling , Single-Cell Analysis , Software , Transcriptome , Single-Cell Analysis/methods , Cluster Analysis , Gene Expression Profiling/methods , Humans , Sequence Analysis, RNA/methods
5.
EMBO J ; 40(24): e105862, 2021 12 15.
Article in English | MEDLINE | ID: mdl-34786738

ABSTRACT

The onset of random X chromosome inactivation in mouse requires the switch from a symmetric to an asymmetric state, where the identities of the future inactive and active X chromosomes are assigned. This process is known as X chromosome choice. Here, we show that RIF1 and KAP1 are two fundamental factors for the definition of this transcriptional asymmetry. We found that at the onset of differentiation of mouse embryonic stem cells (mESCs), biallelic up-regulation of the long non-coding RNA Tsix weakens the symmetric association of RIF1 with the Xist promoter. The Xist allele maintaining the association with RIF1 goes on to up-regulate Xist RNA expression in a RIF1-dependent manner. Conversely, the promoter that loses RIF1 gains binding of KAP1, and KAP1 is required for the increase in Tsix levels preceding the choice. We propose that the mutual exclusion of Tsix and RIF1, and of RIF1 and KAP1, at the Xist promoters establish a self-sustaining loop that transforms an initially stochastic event into a stably inherited asymmetric X-chromosome state.


Subject(s)
Mouse Embryonic Stem Cells/cytology , RNA, Long Noncoding/genetics , Telomere-Binding Proteins/metabolism , Tripartite Motif-Containing Protein 28/metabolism , Animals , Cell Differentiation , Cell Line , Female , Mice , Promoter Regions, Genetic , Stochastic Processes , Up-Regulation , X Chromosome Inactivation
6.
Am J Hum Genet ; 108(9): 1725-1734, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34433009

ABSTRACT

Copy-number variations (CNVs) are a common cause of congenital limb malformations and are interpreted primarily on the basis of their effect on gene dosage. However, recent studies show that CNVs also influence the 3D genome chromatin organization. The functional interpretation of whether a phenotype is the result of gene dosage or a regulatory position effect remains challenging. Here, we report on two unrelated families with individuals affected by bilateral hypoplasia of the femoral bones, both harboring de novo duplications on chromosome 10q24.32. The ∼0.5 Mb duplications include FGF8, a key regulator of limb development, and several limb enhancer elements. To functionally characterize these variants, we analyzed the local chromatin architecture in the affected individuals' cells and re-engineered the duplications in mice by using CRISPR-Cas9 genome editing. We found that the duplications were associated with ectopic chromatin contacts and increased FGF8 expression. Transgenic mice carrying the heterozygous tandem duplication including Fgf8 exhibited proximal shortening of the limbs, resembling the human phenotype. To evaluate whether the phenotype was a result of gene dosage, we generated another transgenic mouse line, carrying the duplication on one allele and a concurrent Fgf8 deletion on the other allele, as a control. Surprisingly, the same malformations were observed. Capture Hi-C experiments revealed ectopic interaction with the duplicated region and Fgf8, indicating a position effect. In summary, we show that duplications at the FGF8 locus are associated with femoral hypoplasia and that the phenotype is most likely the result of position effects altering FGF8 expression rather than gene dosage effects.


Subject(s)
Chromosome Duplication , Chromosomes, Human, Pair 10/chemistry , DNA Copy Number Variations , Fibroblast Growth Factor 8/genetics , Lower Extremity Deformities, Congenital/genetics , Adolescent , Alleles , Animals , CRISPR-Cas Systems , Child, Preschool , Chromatin/chemistry , Chromatin/metabolism , Chromosomes, Human, Pair 10/metabolism , Enhancer Elements, Genetic , Family , Female , Femur/abnormalities , Femur/diagnostic imaging , Femur/metabolism , Fibroblast Growth Factor 8/metabolism , Gene Editing , Heterozygote , Humans , Infant , Lower Extremity Deformities, Congenital/diagnostic imaging , Lower Extremity Deformities, Congenital/metabolism , Lower Extremity Deformities, Congenital/pathology , Male , Mice , Mice, Transgenic , Pedigree , Phenotype
7.
Bioinformatics ; 39(6)2023 06 01.
Article in English | MEDLINE | ID: mdl-37267159

ABSTRACT

MOTIVATION: Long-read transcriptome sequencing (LRTS) has the potential to enhance our understanding of alternative splicing and the complexity of this process requires the use of versatile computational tools, with the ability to accommodate various stages of the workflow with maximum flexibility. RESULTS: We introduce IsoTools, a Python-based LRTS analysis framework that offers a wide range of functionality for transcriptome reconstruction and quantification of transcripts. Furthermore, we integrate a graph-based method for identifying alternative splicing events and a statistical approach based on the beta-binomial distribution for detecting differential events. To demonstrate the effectiveness of our methods, we applied IsoTools to PacBio LRTS data of human hepatocytes treated with the histone deacetylase inhibitor valproic acid. Our results indicate that LRTS can provide valuable insights into alternative splicing, particularly in terms of complex and differential splicing patterns, in comparison to short-read RNA-seq. AVAILABILITY AND IMPLEMENTATION: IsoTools is available on GitHub and PyPI, and its documentation, including tutorials, CLI, and API references, can be found at https://isotools.readthedocs.io/.


Subject(s)
Alternative Splicing , Transcriptome , Humans , Workflow , Gene Expression Profiling , RNA Splicing , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA/methods
8.
Bioinformatics ; 39(11)2023 11 01.
Article in English | MEDLINE | ID: mdl-37982748

ABSTRACT

MOTIVATION: Identifying target promoters of active enhancers is a crucial step for realizing gene regulation and deciphering phenotypes and diseases. Up to now, several computational methods were developed to predict enhancer gene interactions, but they require either many epigenomic and transcriptomic experimental assays to generate cell-type (CT)-specific predictions or a single experiment applied to a large cohort of CTs to extract correlations between activities of regulatory elements. Thus, inferring CT-specific enhancer gene interactions in unstudied or poorly annotated CTs becomes a laborious and costly task. RESULTS: Here, we aim to infer CT-specific enhancer target interactions, using minimal experimental input. We introduce Cell-specific ENhancer Target pREdiction (CENTRE), a machine learning framework that predicts enhancer target interactions in a CT-specific manner, using only gene expression and ChIP-seq data for three histone modifications for the CT of interest. CENTRE exploits the wealth of available datasets and extracts cell-type agnostic statistics to complement the CT-specific information. CENTRE is thoroughly tested across many datasets and CTs and achieves performance equivalent or superior to that of existing algorithms that require massive experimental data. AVAILABILITY AND IMPLEMENTATION: CENTRE's open-source code is available at GitHub via https://github.com/slrvv/CENTRE.


Subject(s)
Algorithms , Enhancer Elements, Genetic , Humans , Gene Expression Regulation , Promoter Regions, Genetic , Epigenomics
9.
Proteins ; 91(7): 980-990, 2023 07.
Article in English | MEDLINE | ID: mdl-36908253

ABSTRACT

Protein-protein interactions (PPIs) play a crucial role in numerous molecular processes. Despite many efforts, mechanisms governing molecular recognition between interacting proteins remain poorly understood and it is particularly challenging to predict from sequence whether two proteins can interact. Here we present a new method to tackle this challenge using intrinsically disordered regions (IDRs). IDRs are protein segments that are functional despite lacking a single invariant three-dimensional structure. The prevalence of IDRs in eukaryotic proteins suggests that IDRs are critical for interactions. To test this hypothesis, we predicted PPIs using IDR sequences in candidate proteins in humans. Moreover, we divide the PPI prediction problem into two specific subproblems and adapt appropriate training and test strategies based on problem type. Our findings underline the importance of defining clearly the problem type and show that sequences encoding IDRs can aid in predicting specific features of the protein interaction network of intrinsically disordered proteins. Our findings further suggest that accounting for IDRs in future analyses should accelerate efforts to elucidate the eukaryotic PPI network.


Subject(s)
Intrinsically Disordered Proteins , Humans , Intrinsically Disordered Proteins/chemistry , Eukaryota , Protein Interaction Maps , Protein Conformation
10.
Am J Hum Genet ; 106(6): 872-884, 2020 06 04.
Article in English | MEDLINE | ID: mdl-32470376

ABSTRACT

Genome-wide analysis methods, such as array comparative genomic hybridization (CGH) and whole-genome sequencing (WGS), have greatly advanced the identification of structural variants (SVs) in the human genome. However, even with standard high-throughput sequencing techniques, complex rearrangements with multiple breakpoints are often difficult to resolve, and predicting their effects on gene expression and phenotype remains a challenge. Here, we address these problems by using high-throughput chromosome conformation capture (Hi-C) generated from cultured cells of nine individuals with developmental disorders (DDs). Three individuals had previously been identified as harboring duplications at the SOX9 locus and six had been identified with translocations. Hi-C resolved the positions of the duplications and was instructive in interpreting their distinct pathogenic effects, including the formation of new topologically associating domains (neo-TADs). Hi-C was very sensitive in detecting translocations, and it revealed previously unrecognized complex rearrangements at the breakpoints. In several cases, we observed the formation of fused-TADs promoting ectopic enhancer-promoter interactions that were likely to be involved in the disease pathology. In summary, we show that Hi-C is a sensible method for the detection of complex SVs in a clinical setting. The results help interpret the possible pathogenic effects of the SVs in individuals with DDs.


Subject(s)
Chromosomes, Human/genetics , Developmental Disabilities/genetics , Genome, Human/genetics , Molecular Conformation , Translocation, Genetic/genetics , Chromatin Assembly and Disassembly/genetics , Chromosome Breakpoints , Cohort Studies , Humans , SOX9 Transcription Factor/genetics , Segmental Duplications, Genomic/genetics
11.
Nucleic Acids Res ; 49(8): 4402-4420, 2021 05 07.
Article in English | MEDLINE | ID: mdl-33788942

ABSTRACT

Pausing of transcribing RNA polymerase is regulated and creates opportunities to control gene expression. Research in metazoans has so far mainly focused on RNA polymerase II (Pol II) promoter-proximal pausing, leaving the pervasive nature of pausing and its regulatory potential in mammalian cells unclear. Here, we developed a pause detecting algorithm (PDA) for nucleotide-resolution occupancy data and a new native elongating transcript sequencing approach, termed nested NET-seq, that strongly reduces artifactual peaks commonly misinterpreted as pausing sites. Leveraging PDA and nested NET-seq reveals widespread genome-wide Pol II pausing at single-nucleotide resolution in human cells. Notably, the majority of Pol II pauses occur outside of promoter-proximal gene regions, primarily along the gene-body of transcribed genes. Sequence analysis combined with machine learning modeling reveals DNA sequence properties underlying widespread transcriptional pausing including a new pause motif. Interestingly, key sequence determinants of RNA polymerase pausing are conserved between human cells and bacteria. These studies indicate pervasive sequence-induced transcriptional pausing in human cells and the knowledge of exact pause locations implies potential functional roles in gene expression.


Subject(s)
Conserved Sequence , RNA Polymerase II/metabolism , RNA-Seq/methods , Transcription, Genetic , Algorithms , Base Sequence , DNA/chemistry , DNA/metabolism , HEK293 Cells , HeLa Cells , Humans , RNA Polymerase II/chemistry
12.
Nucleic Acids Res ; 49(21): 12178-12195, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34850108

ABSTRACT

Embryonic stem cells (ESCs) can differentiate into any given cell type and therefore represent a versatile model to study the link between gene regulation and differentiation. To quantitatively assess the dynamics of enhancer activity during the early stages of murine ESC differentiation, we analyzed accessible genomic regions using STARR-seq, a massively parallel reporter assay. This resulted in a genome-wide quantitative map of active mESC enhancers, in pluripotency and during the early stages of differentiation. We find that only a minority of accessible regions is active and that such regions are enriched near promoters, characterized by specific chromatin marks, enriched for distinct sequence motifs, and modeling shows that active regions can be predicted from sequence alone. Regions that change their activity upon retinoic acid-induced differentiation are more prevalent at distal intergenic regions when compared to constitutively active enhancers. Further, analysis of differentially active enhancers verified the contribution of individual TF motifs toward activity and inducibility as well as their role in regulating endogenous genes. Notably, the activity of retinoic acid receptor alpha (RARα) occupied regions can either increase or decrease upon the addition of its ligand, retinoic acid, with the direction of the change correlating with spacing and orientation of the RARα consensus motif and the co-occurrence of additional sequence motifs. Together, our genome-wide enhancer activity map elucidates features associated with enhancer activity levels, identifies regulatory regions disregarded by computational prediction tools, and provides a resource for future studies into regulatory elements in mESCs.


Subject(s)
Mouse Embryonic Stem Cells/cytology , Receptors, Retinoic Acid/metabolism , Animals , Cell Differentiation , Chromosome Mapping , Enhancer Elements, Genetic , Mice
13.
Bioinformatics ; 36(22-23): 5519-5521, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33346817

ABSTRACT

MOTIVATION: With the availability of new sequencing technologies, the generation of haplotype-resolved genome assemblies up to chromosome scale has become feasible. These assemblies capture the complete genetic information of both parental haplotypes, increase structural variant (SV) calling sensitivity and enable direct genotyping and phasing of SVs. Yet, existing SV callers are designed for haploid genome assemblies only, do not support genotyping or detect only a limited set of SV classes. RESULTS: We introduce our method SVIM-asm for the detection and genotyping of six common classes of SVs from haploid and diploid genome assemblies. Compared against the only other existing SV caller for diploid assemblies, DipCall, SVIM-asm detects more SV classes and reached higher F1 scores for the detection of insertions and deletions on two recently published assemblies of the HG002 individual. AVAILABILITY AND IMPLEMENTATION: SVIM-asm has been implemented in Python and can be easily installed via bioconda. Its source code is available at github.com/eldariont/svim-asm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

14.
Bioinformatics ; 37(Suppl_1): i1-i6, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252962

ABSTRACT

Annually, the International Society for Computational Biology (ISCB) recognizes three outstanding researchers for significant scientific contributions to the field of bioinformatics and computational biology, as well as one individual for exemplary service to the field. ISCB is honored to announce the 2021 Accomplishments by a Senior Scientist Awardee, Overton Prize recipient, Innovator Awardee and Outstanding Contributions to ISCB Awardee. Peer Bork, EMBL Heidelberg, is the winner of the Accomplishments by a Senior Scientist Award. Barbara Engelhardt, Princeton University, is the Overton Prize winner. Ben Raphael, Princeton University, is the winner of the ISCB Innovator Award. Teresa Attwood, Manchester University, has been selected as the winner of the Outstanding Contributions to ISCB Award. Martin Vingron, Chair, ISCB Awards Committee noted, 'As chair of the Awards Committee it gives me great pleasure to convey my heart-felt congratulations to this year's awardees. Our community, as represented by the committee, admires these individuals' outstanding achievements in research, training, and outreach.'


Subject(s)
Awards and Prizes , Computational Biology , Heart , Humans , Societies, Scientific
15.
Nature ; 538(7624): 265-269, 2016 Oct 13.
Article in English | MEDLINE | ID: mdl-27706140

ABSTRACT

Chromosome conformation capture methods have identified subchromosomal structures of higher-order chromatin interactions called topologically associated domains (TADs) that are separated from each other by boundary regions. By subdividing the genome into discrete regulatory units, TADs restrict the contacts that enhancers establish with their target genes. However, the mechanisms that underlie partitioning of the genome into TADs remain poorly understood. Here we show by chromosome conformation capture (capture Hi-C and 4C-seq methods) that genomic duplications in patient cells and genetically modified mice can result in the formation of new chromatin domains (neo-TADs) and that this process determines their molecular pathology. Duplications of non-coding DNA within the mouse Sox9 TAD (intra-TAD) that cause female to male sex reversal in humans, showed increased contact of the duplicated regions within the TAD, but no change in the overall TAD structure. In contrast, overlapping duplications that extended over the next boundary into the neighbouring TAD (inter-TAD), resulted in the formation of a new chromatin domain (neo-TAD) that was isolated from the rest of the genome. As a consequence of this insulation, inter-TAD duplications had no phenotypic effect. However, incorporation of the next flanking gene, Kcnj2, in the neo-TAD resulted in ectopic contacts of Kcnj2 with the duplicated part of the Sox9 regulatory region, consecutive misexpression of Kcnj2, and a limb malformation phenotype. Our findings provide evidence that TADs are genomic regulatory units with a high degree of internal stability that can be sculptured by structural genomic variations. This process is important for the interpretation of copy number variations, as these variations are routinely detected in diagnostic tests for genetic disease and cancer. This finding also has relevance in an evolutionary setting because copy-number differences are thought to have a crucial role in the evolution of genome complexity.


Subject(s)
Chromatin Assembly and Disassembly/genetics , DNA Copy Number Variations/genetics , Disease/genetics , Gene Duplication/genetics , Animals , DNA/genetics , Facies , Female , Fibroblasts , Fingers/abnormalities , Foot Deformities, Congenital/genetics , Gene Expression , Genomics , Hand Deformities, Congenital/genetics , Male , Mice , Phenotype , SOX9 Transcription Factor/genetics
16.
Proc Natl Acad Sci U S A ; 116(25): 12390-12399, 2019 06 18.
Article in English | MEDLINE | ID: mdl-31147463

ABSTRACT

Long-range gene regulation involves physical proximity between enhancers and promoters to generate precise patterns of gene expression in space and time. However, in some cases, proximity coincides with gene activation, whereas, in others, preformed topologies already exist before activation. In this study, we investigate the preformed configuration underlying the regulation of the Shh gene by its unique limb enhancer, the ZRS, in vivo during mouse development. Abrogating the constitutive transcription covering the ZRS region led to a shift within the Shh-ZRS contacts and a moderate reduction in Shh transcription. Deletion of the CTCF binding sites around the ZRS resulted in the loss of the Shh-ZRS preformed interaction and a 50% decrease in Shh expression but no phenotype, suggesting an additional, CTCF-independent mechanism of promoter-enhancer communication. This residual activity, however, was diminished by combining the loss of CTCF binding with a hypomorphic ZRS allele, resulting in severe Shh loss of function and digit agenesis. Our results indicate that the preformed chromatin structure of the Shh locus is sustained by multiple components and acts to reinforce enhancer-promoter communication for robust transcription.


Subject(s)
Chromatin/metabolism , Extremities/embryology , Hedgehog Proteins/genetics , Transcription, Genetic , Animals , Binding Sites , CCCTC-Binding Factor/metabolism , Cell Cycle Proteins/metabolism , Chromosomal Proteins, Non-Histone/metabolism , Down-Regulation , Enhancer Elements, Genetic , Membrane Proteins/genetics , Mice , Promoter Regions, Genetic , Cohesins
17.
PLoS Comput Biol ; 16(5): e1007843, 2020 05.
Article in English | MEDLINE | ID: mdl-32469863

ABSTRACT

Reconstructing haplotypes from sequencing data is one of the major challenges in genetics. Haplotypes play a crucial role in many analyses, including genome-wide association studies and population genetics. Haplotype reconstruction becomes more difficult for higher numbers of homologous chromosomes, as is often the case for polyploid plants. This complexity is compounded further by higher heterozygosity, which denotes the frequent presence of variants between haplotypes. We have designed Ranbow, a new tool for haplotype reconstruction of polyploid genomes from short read sequencing data. Ranbow integrates all types of small variants in bi- and multi-allelic sites to reconstruct haplotypes. To evaluate Ranbow and currently available competing methods on real data, we have created and released a real gold standard dataset from sweet potato sequencing data. Our evaluations on real and simulated data clearly show Ranbow's superior performance in terms of accuracy, haplotype length, memory usage, and running time. Specifically, Ranbow is one order of magnitude faster than the next best method. The efficiency and accuracy of Ranbow make whole genome haplotype reconstruction of complex genomes with higher ploidy feasible.


Subject(s)
Haplotypes , Polyploidy , Algorithms , Datasets as Topic , Heterozygote , Humans
18.
Mol Cell ; 50(6): 844-55, 2013 Jun 27.
Article in English | MEDLINE | ID: mdl-23727019

ABSTRACT

The extracellular signal-regulated kinase (ERK)/mitogen-activated protein kinase signal-transduction cascade is one of the key pathways regulating proliferation and differentiation in development and disease. ERK signaling is required for human embryonic stem cells' (hESCs') self-renewing property. Here, we studied the convergence of the ERK signaling cascade at the DNA by mapping genome-wide kinase-chromatin interactions for ERK2 in hESCs. We observed that ERK2 binding occurs near noncoding genes and histone, cell-cycle, metabolism, and pluripotency-associated genes. We find that the transcription factor ELK1 is essential in hESCs and that ERK2 co-occupies promoters bound by ELK1. Strikingly, promoters bound by ELK1 without ERK2 are occupied by Polycomb group proteins that repress genes involved in lineage commitment. In summary, we propose a model wherein extracellular-signaling-stimulated proliferation and intrinsic repression of differentiation are integrated to maintain the identity of hESCs.


Subject(s)
Chromatin/enzymology , Embryonic Stem Cells/enzymology , MAP Kinase Signaling System , Mitogen-Activated Protein Kinase 1/metabolism , ets-Domain Protein Elk-1/metabolism , Base Sequence , Cell Differentiation , Cell Lineage , Cells, Cultured , Consensus Sequence , Embryonic Stem Cells/physiology , Gene Expression Regulation , Gene Knockdown Techniques , Genome, Human , Humans , Mitogen-Activated Protein Kinase 1/genetics , Polycomb-Group Proteins/genetics , Polycomb-Group Proteins/metabolism , Promoter Regions, Genetic , Protein Binding , RNA, Small Interfering/genetics , RNA, Small Nuclear/genetics , RNA, Small Nuclear/metabolism , Transcription, Genetic , Transcriptome , ets-Domain Protein Elk-1/genetics
19.
PLoS Genet ; 14(11): e1007793, 2018 11.
Article in English | MEDLINE | ID: mdl-30427832

ABSTRACT

The binding of transcription factors to short recognition sequences plays a pivotal role in controlling the expression of genes. The sequence and shape characteristics of binding sites influence DNA binding specificity and have also been implicated in modulating the activity of transcription factors downstream of binding. To quantitatively assess the transcriptional activity of tens of thousands of designed synthetic sites in parallel, we developed a synthetic version of STARR-seq (synSTARR-seq). We used the approach to systematically analyze how variations in the recognition sequence of the glucocorticoid receptor (GR) affect transcriptional regulation. Our approach resulted in the identification of a novel highly active functional GR binding sequence and revealed that sequence variation both within and flanking GR's core binding site can modulate GR activity without apparent changes in DNA binding affinity. Notably, we found that the sequence composition of variants with similar activity profiles was highly diverse. In contrast, groups of variants with similar activity profiles showed specific DNA shape characteristics indicating that DNA shape may be a better predictor of activity than DNA sequence. Finally, using single cell experiments with individual enhancer variants, we obtained clues indicating that the architecture of the response element can independently tune expression mean and cell-to-cell variability in gene expression (noise). Together, our studies establish synSTARR as a powerful method to systematically study how DNA sequence and shape modulate transcriptional output and noise.


Subject(s)
DNA/genetics , Sequence Analysis, DNA/methods , Transcription, Genetic , Binding Sites/genetics , DNA/chemistry , DNA/metabolism , Enhancer Elements, Genetic , Gene Expression Regulation , Genes, Reporter , Genes, Synthetic , Genetic Variation , Humans , Nucleic Acid Conformation , Protein Conformation , Receptors, Glucocorticoid/chemistry , Receptors, Glucocorticoid/genetics , Receptors, Glucocorticoid/metabolism , Response Elements , Sequence Analysis, DNA/statistics & numerical data , Transcription Factors/genetics , Transcription Factors/metabolism
20.
Genome Res ; 27(2): 223-233, 2017 02.
Article in English | MEDLINE | ID: mdl-27923844

ABSTRACT

Complex regulatory landscapes control the pleiotropic transcriptional activities of developmental genes. For most genes, the number, location, and dynamics of their associated regulatory elements are unknown. In this work, we characterized the three-dimensional chromatin microarchitecture and regulatory landscape of 446 limb-associated gene loci in mouse using Capture-C, ChIP-seq, and RNA-seq in forelimb, hindlimb at three developmental stages, and midbrain. The fine mapping of chromatin interactions revealed a strong preference for functional genomic regions such as repressed or active domains. By combining chromatin marks and interaction peaks, we annotated more than 1000 putative limb enhancers and their associated genes. Moreover, the analysis of chromatin interactions revealed two regimes of chromatin folding, one producing interactions stable across tissues and stages and another one associated with tissue and/or stage-specific interactions. Whereas stable interactions associate strongly with CTCF/RAD21 binding, the intensity of variable interactions correlates with changes in underlying chromatin modifications, specifically at the viewpoint and at the interaction site. In conclusion, this comprehensive data set provides a resource for the characterization of hundreds of limb-associated regulatory landscapes and a framework to interpret the chromatin folding dynamics observed during embryogenesis.


Subject(s)
Chromatin/genetics , Enhancer Elements, Genetic , Transcription Factors/genetics , Transcriptional Activation/genetics , Animals , Binding Sites , Chromatin Immunoprecipitation , Extremities/growth & development , Gene Expression Regulation, Developmental , Histones/genetics , Mice , Promoter Regions, Genetic