1.
bioRxiv ; 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38352499

ABSTRACT

The challenge of systematically modifying and optimizing regulatory elements for precise gene expression control is central to modern genomics and synthetic biology. Advancements in generative AI have paved the way for designing synthetic sequences with the aim of safely and accurately modulating gene expression. We leverage diffusion models to design context-specific DNA regulatory sequences, which hold significant potential toward enabling novel therapeutic applications requiring precise modulation of gene expression. Our framework uses a cell type-specific diffusion model to generate synthetic 200 bp regulatory elements based on chromatin accessibility across different cell types. We evaluate the generated sequences based on key metrics to ensure they retain properties of endogenous sequences: transcription factor binding site composition, potential for cell type-specific chromatin accessibility, and capacity for sequences generated by DNA diffusion to activate gene expression in different cell contexts using state-of-the-art prediction models. Our results demonstrate the ability to robustly generate DNA sequences with cell type-specific regulatory potential. DNA-Diffusion paves the way for revolutionizing a regulatory modulation approach to mammalian synthetic biology and precision gene therapy.

3.
Nat Genet ; 53(5): 638-649, 2021 05.
Article in English | MEDLINE | ID: mdl-33859415

ABSTRACT

A central question in the post-genomic era is how genes interact to form biological pathways. Measurements of gene dependency across hundreds of cell lines have been used to cluster genes into 'co-essential' pathways, but this approach has been limited by ubiquitous false positives. In the present study, we develop a statistical method that enables robust identification of gene co-essentiality and yields a genome-wide set of functional modules. This atlas recapitulates diverse pathways and protein complexes, and predicts the functions of 108 uncharacterized genes. Validating top predictions, we show that TMEM189 encodes plasmanylethanolamine desaturase, a key enzyme for plasmalogen synthesis. We also show that C15orf57 encodes a protein that binds the AP2 complex, localizes to clathrin-coated pits and enables efficient transferrin uptake. Finally, we provide an interactive webtool for the community to explore our results, which establish co-essentiality profiling as a powerful resource for biological pathway identification and discovery of new gene functions.


Subject(s)
Gene Regulatory Networks , Genes , Genome , Clathrin/metabolism , Endocytosis , Epigenesis, Genetic , Gene Expression Regulation , HeLa Cells , Humans , Molecular Sequence Annotation , Neoplasms/genetics , Plasmalogens/biosynthesis , Signal Transduction/genetics
4.
Stem Cell Reports ; 16(4): 717-726, 2021 04 13.
Article in English | MEDLINE | ID: mdl-33770495

ABSTRACT

T cell development is restricted to the thymus and is dependent on high levels of Notch signaling induced within the thymic microenvironment. To understand Notch function in thymic restriction, we investigated the basis for target gene selectivity in response to quantitative differences in Notch signal strength, focusing on the chromatin architecture of genes essential for T cell differentiation. We find that high Notch signal strength is required to activate promoters of known targets essential for T cell commitment, including Il2ra, Cd3ε, and Rag1, which feature low CpG content (LCG) and DNA inaccessibility in hematopoietic stem progenitor cells. Our findings suggest that promoter DNA inaccessibility at LCG T lineage genes provides robust protection against stochastic activation in inappropriate Notch signaling contexts, limiting T cell development to the thymus.


Subject(s)
CpG Islands/genetics , Promoter Regions, Genetic/genetics , Receptors, Notch/metabolism , Signal Transduction , T-Lymphocytes/metabolism , Animals , DNA/metabolism , Deoxyribonuclease I/metabolism , Mice, Inbred C57BL
5.
Nature ; 590(7845): 300-307, 2021 02.
Article in English | MEDLINE | ID: mdl-33536621

ABSTRACT

Annotating the molecular basis of human disease remains an unsolved challenge, as 93% of disease loci are non-coding and gene-regulatory annotations are highly incomplete [1-3]. Here we present EpiMap, a compendium comprising 10,000 epigenomic maps across 800 samples, which we used to define chromatin states, high-resolution enhancers, enhancer modules, upstream regulators and downstream target genes. We used this resource to annotate 30,000 genetic loci that were associated with 540 traits [4], predicting trait-relevant tissues, putative causal nucleotide variants in enriched tissue enhancers and candidate tissue-specific target genes for each. We partitioned multifactorial traits into tissue-specific contributing factors with distinct functional enrichments and disease comorbidity patterns, and revealed both single-factor monotropic and multifactor pleiotropic loci. Top-scoring loci frequently had multiple predicted driver variants, converging through multiple enhancers with a common target gene, multiple genes in common tissues, or multiple genes and multiple tissues, indicating extensive pleiotropy. Our results demonstrate the importance of dense, rich, high-resolution epigenomic annotations for the investigation of complex traits.


Subject(s)
Disease/genetics , Epigenesis, Genetic/genetics , Epigenomics , Gene Regulatory Networks/genetics , Genetic Loci/genetics , Chromatin/genetics , Enhancer Elements, Genetic/genetics , Female , Genome-Wide Association Study , Humans , Male , Multifactorial Inheritance/genetics , Organ Specificity/genetics , Reproducibility of Results
6.
Nature ; 584(7820): 244-251, 2020 08.
Article in English | MEDLINE | ID: mdl-32728217

ABSTRACT

DNase I hypersensitive sites (DHSs) are generic markers of regulatory DNA [1-5] and contain genetic variations associated with diseases and phenotypic traits [6-8]. We created high-resolution maps of DHSs from 733 human biosamples encompassing 438 cell and tissue types and states, and integrated these to delineate and numerically index approximately 3.6 million DHSs within the human genome sequence, providing a common coordinate system for regulatory DNA. Here we show that these maps highly resolve the cis-regulatory compartment of the human genome, which encodes unexpectedly diverse cell- and tissue-selective regulatory programs at very high density. These programs can be captured comprehensively by a simple vocabulary that enables the assignment to each DHS of a regulatory barcode that encapsulates its tissue manifestations, and global annotation of protein-coding and non-coding RNA genes in a manner orthogonal to gene expression. Finally, we show that sharply resolved DHSs markedly enhance the genetic association and heritability signals of diseases and traits. Rather than being confined to a small number of distal elements or promoters, we find that genetic signals converge on congruently regulated sets of DHSs that decorate entire gene bodies. Together, our results create a universal, extensible coordinate system and vocabulary for human regulatory DNA marked by DHSs, and provide a new global perspective on the architecture of human gene regulation.


Subject(s)
Chromatin/genetics , DNA/metabolism , Deoxyribonuclease I/metabolism , Molecular Sequence Annotation , Chromatin/chemistry , Chromatin/metabolism , DNA/chemistry , DNA/genetics , Gene Expression Regulation , Genes/genetics , Genome, Human/genetics , Humans , Promoter Regions, Genetic/genetics , Regulatory Sequences, Nucleic Acid/genetics
7.
Nature ; 583(7818): 729-736, 2020 07.
Article in English | MEDLINE | ID: mdl-32728250

ABSTRACT

Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits [1], but it remains challenging to distinguish variants that affect regulatory function [2]. Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin [3-6]. However, only a small fraction of such sites have been precisely resolved on the human genome sequence [6]. Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate about 4.5 million compact genomic elements that encode transcription factor occupancy at nucleotide resolution. We map the fine-scale structure within about 1.6 million DNase I-hypersensitive sites and show that the overwhelming majority are populated by well-spaced sites of single transcription factor-DNA interaction. Cell-context-dependent cis-regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions [1,7] is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.


Subject(s)
DNA Footprinting/standards , Genome, Human/genetics , Transcription Factors/metabolism , Consensus Sequence , DNA/genetics , DNA/metabolism , Deoxyribonuclease I/metabolism , Genetics, Population , Genome-Wide Association Study , Humans , Models, Molecular , Polymorphism, Single Nucleotide , Regulatory Sequences, Nucleic Acid/genetics
8.
Nat Commun ; 8: 15011, 2017 04 07.
Article in English | MEDLINE | ID: mdl-28387224

ABSTRACT

Chromatin-state analysis is widely applied in the studies of development and diseases. However, existing methods operate at a single length scale, and therefore cannot distinguish large domains from isolated elements of the same type. To overcome this limitation, we present a hierarchical hidden Markov model, diHMM, to systematically annotate chromatin states at multiple length scales. We apply diHMM to analyse a public ChIP-seq data set. diHMM not only accurately captures nucleosome-level information, but identifies domain-level states that vary in nucleosome-level state composition, spatial distribution and functionality. The domain-level states recapitulate known patterns such as super-enhancers, bivalent promoters and Polycomb repressed regions, and identify additional patterns whose biological functions are not yet characterized. By integrating chromatin-state information with gene expression and Hi-C data, we identify context-dependent functions of nucleosome-level states. Thus, diHMM provides a powerful tool for investigating the role of higher-order chromatin structure in gene regulation.


Subject(s)
Algorithms , Chromatin/genetics , Gene Expression Regulation , Markov Chains , Nucleosomes/genetics , Cell Line , Chromatin/metabolism , Histones/metabolism , Humans , K562 Cells , Nucleosomes/metabolism , Polycomb-Group Proteins/genetics , Polycomb-Group Proteins/metabolism , Promoter Regions, Genetic/genetics
9.
Genome Res ; 27(6): 922-933, 2017 06.
Article in English | MEDLINE | ID: mdl-28341771

ABSTRACT

The spatial arrangement of chromatin is linked to the regulation of nuclear processes. One striking aspect of nuclear organization is the spatial segregation of heterochromatic and euchromatic domains. The mechanisms of this chromatin segregation are still poorly understood. In this work, we investigated the link between the primary genomic sequence and chromatin domains. We analyzed the spatial intranuclear arrangement of a human artificial chromosome (HAC) in a xenospecific mouse background in comparison to an orthologous region of native mouse chromosome. The two orthologous regions include segments that can be assigned to three major chromatin classes according to their gene abundance and repeat repertoire: (1) gene-rich and SINE-rich euchromatin; (2) gene-poor and LINE/LTR-rich heterochromatin; and (3) gene-depleted and satellite DNA-containing constitutive heterochromatin. We show, using fluorescence in situ hybridization (FISH) and 4C-seq technologies, that chromatin segments ranging from 0.6 to 3 Mb cluster with segments of the same chromatin class. As a consequence, the chromatin segments acquire corresponding positions in the nucleus irrespective of their chromosomal context, thereby strongly suggesting that this is their autonomous property. Interactions with the nuclear lamina, although largely retained in the HAC, reveal less autonomy. Taken together, our results suggest that building of a functional nucleus is largely a self-organizing process based on mutual recognition of chromosome segments belonging to the major chromatin classes.


Subject(s)
Cell Nucleus/genetics , Chromosomes, Artificial, Human/metabolism , Euchromatin/metabolism , Fibroblasts/metabolism , Heterochromatin/metabolism , Retina/metabolism , Animals , Cell Line, Transformed , Cell Nucleus/metabolism , Cell Nucleus/ultrastructure , Chromosomes, Artificial, Human/ultrastructure , Euchromatin/classification , Euchromatin/ultrastructure , Fibroblasts/ultrastructure , Gene Expression Profiling , Gene Expression Regulation , Heterochromatin/classification , Heterochromatin/ultrastructure , Humans , In Situ Hybridization, Fluorescence , Mice , Primary Cell Culture , Retina/ultrastructure
10.
N Engl J Med ; 373(10): 895-907, 2015 Sep 03.
Article in English | MEDLINE | ID: mdl-26287746

ABSTRACT

BACKGROUND: Genomewide association studies can be used to identify disease-relevant genomic regions, but interpretation of the data is challenging. The FTO region harbors the strongest genetic association with obesity, yet the mechanistic basis of this association remains elusive. METHODS: We examined epigenomic data, allelic activity, motif conservation, regulator expression, and gene coexpression patterns, with the aim of dissecting the regulatory circuitry and mechanistic basis of the association between the FTO region and obesity. We validated our predictions with the use of directed perturbations in samples from patients and from mice and with endogenous CRISPR-Cas9 genome editing in samples from patients. RESULTS: Our data indicate that the FTO allele associated with obesity represses mitochondrial thermogenesis in adipocyte precursor cells in a tissue-autonomous manner. The rs1421085 T-to-C single-nucleotide variant disrupts a conserved motif for the ARID5B repressor, which leads to derepression of a potent preadipocyte enhancer and a doubling of IRX3 and IRX5 expression during early adipocyte differentiation. This results in a cell-autonomous developmental shift from energy-dissipating beige (brite) adipocytes to energy-storing white adipocytes, with a reduction in mitochondrial thermogenesis by a factor of 5, as well as an increase in lipid storage. Inhibition of Irx3 in adipose tissue in mice reduced body weight and increased energy dissipation without a change in physical activity or appetite. Knockdown of IRX3 or IRX5 in primary adipocytes from participants with the risk allele restored thermogenesis, increasing it by a factor of 7, and overexpression of these genes had the opposite effect in adipocytes from nonrisk-allele carriers. Repair of the ARID5B motif by CRISPR-Cas9 editing of rs1421085 in primary adipocytes from a patient with the risk allele restored IRX3 and IRX5 repression, activated browning expression programs, and restored thermogenesis, increasing it by a factor of 7. CONCLUSIONS: Our results point to a pathway for adipocyte thermogenesis regulation involving ARID5B, rs1421085, IRX3, and IRX5, which, when manipulated, had pronounced pro-obesity and anti-obesity effects. (Funded by the German Research Center for Environmental Health and others.).


Subject(s)
Adipocytes/metabolism , Obesity/genetics , Proteins/genetics , Thermogenesis/genetics , Alleles , Alpha-Ketoglutarate-Dependent Dioxygenase FTO , Animals , Base Sequence , Clustered Regularly Interspaced Short Palindromic Repeats , Epigenomics , Gene Expression , Genetic Engineering , Humans , Mice , Mitochondria/metabolism , Molecular Sequence Data , Obesity/metabolism , Phenotype , RNA Editing , Risk , Thermogenesis/physiology
11.
Nature ; 518(7539): 317-30, 2015 Feb 19.
Article in English | MEDLINE | ID: mdl-25693563

ABSTRACT

The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.


Subject(s)
Epigenesis, Genetic/genetics , Epigenomics , Genome, Human/genetics , Base Sequence , Cell Lineage/genetics , Cells, Cultured , Chromatin/chemistry , Chromatin/genetics , Chromatin/metabolism , Chromosomes, Human/chemistry , Chromosomes, Human/genetics , Chromosomes, Human/metabolism , DNA/chemistry , DNA/genetics , DNA/metabolism , DNA Methylation , Datasets as Topic , Enhancer Elements, Genetic/genetics , Genetic Variation/genetics , Genome-Wide Association Study , Histones/metabolism , Humans , Organ Specificity/genetics , RNA/genetics , Reference Values
12.
Dis Model Mech ; 8(4): 373-84, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25713299

ABSTRACT

E-cadherin inactivation underpins the progression of invasive lobular breast carcinoma (ILC). In ILC, p120-catenin (p120) translocates to the cytosol where it controls anchorage independence through the Rho-Rock signaling pathway, a key mechanism driving tumor growth and metastasis. We now demonstrate that anchorage-independent ILC cells show an increase in nuclear p120, which results in relief of transcriptional repression by Kaiso. To identify the Kaiso target genes that control anchorage independence we performed genome-wide mRNA profiling on anoikis-resistant mouse ILC cells, and identified 29 candidate target genes, including the established Kaiso target Wnt11. Our data indicate that anchorage-independent upregulation of Wnt11 in ILC cells is controlled by nuclear p120 through inhibition of Kaiso-mediated transcriptional repression. Finally, we show that Wnt11 promotes activation of RhoA, which causes ILC anoikis resistance. Our findings thereby establish a mechanistic link between E-cadherin loss and subsequent control of Rho-driven anoikis resistance through p120- and Kaiso-dependent expression of Wnt11.


Subject(s)
Anoikis , Carcinoma, Lobular/pathology , Catenins/metabolism , Cell Nucleus/metabolism , Mammary Neoplasms, Animal/pathology , Transcription Factors/metabolism , Wnt Proteins/metabolism , Animals , Anoikis/genetics , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Carcinoma, Lobular/genetics , Cell Adhesion , Cytosol/metabolism , Female , Genetic Association Studies , Humans , Mammary Neoplasms, Animal/genetics , Mice , Neoplasm Invasiveness , Protein Transport , Repressor Proteins/metabolism , Transcription, Genetic , Up-Regulation/genetics , rhoA GTP-Binding Protein/metabolism , Delta Catenin
13.
Cell ; 154(4): 914-27, 2013 Aug 15.
Article in English | MEDLINE | ID: mdl-23953119

ABSTRACT

Reporter genes integrated into the genome are a powerful tool to reveal effects of regulatory elements and local chromatin context on gene expression. However, so far such reporter assays have been of low throughput. Here, we describe a multiplexing approach for the parallel monitoring of transcriptional activity of thousands of randomly integrated reporters. More than 27,000 distinct reporter integrations in mouse embryonic stem cells, obtained with two different promoters, show ∼1,000-fold variation in expression levels. Data analysis indicates that lamina-associated domains act as attenuators of transcription, likely by reducing access of transcription factors to binding sites. Furthermore, chromatin compaction is predictive of reporter activity. We also found evidence for crosstalk between neighboring genes and estimate that enhancers can influence gene expression on average over ∼20 kb. The multiplexed reporter assay is highly flexible in design and can be modified to query a wide range of aspects of gene regulation.


Subject(s)
Chromosomal Position Effects , Genetic Techniques , Animals , Chromatin/metabolism , Embryonic Stem Cells/metabolism , Genes, Reporter , High-Throughput Nucleotide Sequencing , Mice , Promoter Regions, Genetic
14.
Genome Res ; 23(2): 270-80, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23124521

ABSTRACT

In metazoans, the nuclear lamina is thought to play an important role in the spatial organization of interphase chromosomes, by providing anchoring sites for large genomic segments named lamina-associated domains (LADs). Some of these LADs are cell-type specific, while many others appear constitutively associated with the lamina. Constitutive LADs (cLADs) may contribute to a basal chromosome architecture. By comparison of mouse and human lamina interaction maps, we find that the sizes and genomic positions of cLADs are strongly conserved. Moreover, cLADs are depleted of synteny breakpoints, pointing to evolutionary selective pressure to keep cLADs intact. Paradoxically, the overall sequence conservation is low for cLADs. Instead, cLADs are universally characterized by long stretches of DNA of high A/T content. Cell-type specific LADs also tend to adhere to this "A/T rule" in embryonic stem cells, but not in differentiated cells. This suggests that the A/T rule represents a default positioning mechanism that is locally overruled during lineage commitment. Analysis of paralogs suggests that during evolution changes in A/T content have driven the relocation of genes to and from the nuclear lamina, in tight association with changes in expression level. Taken together, these results reveal that the spatial organization of mammalian genomes is highly conserved and tightly linked to local nucleotide composition.


Subject(s)
AT Rich Sequence , Conserved Sequence , Genome , Nuclear Lamina/metabolism , Animals , Caenorhabditis elegans , Conserved Sequence/genetics , Drosophila melanogaster , Embryonic Stem Cells/metabolism , Humans , Lamin Type A/metabolism , Lamin Type B/metabolism , Mice , Octamer Transcription Factor-1/metabolism
15.
Chromosoma ; 121(5): 447-64, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22610065

ABSTRACT

Mutations in the A-type lamins A and C, two major components of the nuclear lamina, cause a large group of phenotypically diverse diseases collectively referred to as laminopathies. These conditions often involve defects in chromatin organization. However, it is unclear whether A-type lamins interact with chromatin in vivo and whether aberrant chromatin-lamin interactions contribute to disease. Here, we have used an unbiased approach to comparatively map genome-wide interactions of gene promoters with lamin A and progerin, the mutated lamin A isoform responsible for the premature aging disorder Hutchinson-Gilford progeria syndrome (HGPS), in mouse cardiac myocytes and embryonic fibroblasts. We find that lamin A-associated genes are predominantly transcriptionally silent and that loss of lamin association leads to the relocation of peripherally localized genes, but not necessarily to their activation. We demonstrate that progerin induces global changes in chromatin organization by enhancing interactions with a specific subset of genes in addition to the identified lamin A-associated genes. These observations demonstrate disease-related changes in higher order genome organization in HGPS and provide novel insights into the role of lamin-chromatin interactions in chromatin organization.


Subject(s)
Lamin Type A/metabolism , Nuclear Proteins/metabolism , Progeria/metabolism , Protein Precursors/metabolism , Animals , Cell Line , Chromosome Mapping , Fibroblasts/metabolism , Humans , Lamin Type A/genetics , Mice , Muscle Cells/metabolism , Nuclear Proteins/genetics , Progeria/genetics , Protein Binding , Protein Precursors/genetics
16.
PLoS One ; 5(11): e15013, 2010 Nov 24.
Article in English | MEDLINE | ID: mdl-21124834

ABSTRACT

Specific interactions of the genome with the nuclear lamina (NL) are thought to assist chromosome folding inside the nucleus and to contribute to the regulation of gene expression. High-resolution mapping has recently identified hundreds of large, sharply defined lamina-associated domains (LADs) in the human genome, and suggested that the insulator protein CTCF may help to demarcate these domains. Here, we report the detailed structure of LADs in Drosophila cells, and investigate the putative roles of five insulator proteins in LAD organization. We found that the Drosophila genome is also organized in discrete LADs, which are about five times smaller than human LADs but contain on average a similar number of genes. Systematic comparison to new and published insulator binding maps shows that only SU(HW) binds preferentially at LAD borders and at specific positions inside LADs, while GAF, CTCF, BEAF-32 and DWG are mostly absent from these regions. By knockdown and overexpression studies we demonstrate that SU(HW) weakens genome - NL interactions through a local antagonistic effect, but we did not obtain evidence that it is essential for border formation. Our results provide insights into the evolution of LAD organization and identify SU(HW) as a fine-tuner of genome - NL interactions.


Subject(s)
Drosophila Proteins/metabolism , Drosophila melanogaster/metabolism , Genome, Insect , Nuclear Lamina/metabolism , Repressor Proteins/metabolism , Animals , Binding Sites/genetics , Blotting, Western , Cell Line , Chromatin/metabolism , Drosophila Proteins/genetics , Drosophila melanogaster/cytology , Drosophila melanogaster/genetics , Gene Expression Profiling , Humans , Insulator Elements/genetics , Protein Binding , RNA Interference , Repressor Proteins/genetics
17.
Mol Cell ; 38(4): 603-13, 2010 May 28.
Article in English | MEDLINE | ID: mdl-20513434

ABSTRACT

The three-dimensional organization of chromosomes within the nucleus and its dynamics during differentiation are largely unknown. To visualize this process in molecular detail, we generated high-resolution maps of genome-nuclear lamina interactions during subsequent differentiation of mouse embryonic stem cells via lineage-committed neural precursor cells into terminally differentiated astrocytes. This reveals that a basal chromosome architecture present in embryonic stem cells is cumulatively altered at hundreds of sites during lineage commitment and subsequent terminal differentiation. This remodeling involves both individual transcription units and multigene regions and affects many genes that determine cellular identity. Often, genes that move away from the lamina are concomitantly activated; many others, however, remain inactive yet become unlocked for activation in a next differentiation step. These results suggest that lamina-genome interactions are widely involved in the control of gene expression programs during lineage commitment and terminal differentiation.


Subject(s)
Cell Differentiation , Chromosome Positioning , Embryonic Stem Cells/cytology , Genome , Nuclear Lamina/metabolism , Animals , Astrocytes/cytology , Cell Lineage , Drosophila , Humans , Mice , Neurons/cytology
18.
BMC Bioinformatics ; 10 Suppl 1: S51, 2009 Jan 30.
Article in English | MEDLINE | ID: mdl-19208154

ABSTRACT

BACKGROUND: Spectra resulting from Surface-Enhanced Laser Desorption/Ionisation (SELDI) mass spectrometry measurements are constructed by combining sub-spectra, each of which is the result of a single firing of the laser responsible for the process of desorption/ionisation. These firings are performed at different locations of the spot on which the sample is analysed. The final spectrum is then constructed by summing over all these sub-spectra. This process is sub-optimal in that it can average out peaks from peptides that are present in low abundance or are unevenly distributed across the spot, particularly because the amount of noise varies considerably between sub-spectra. This argues for analysing sub-spectra separately and combining results afterwards. RESULTS: Here, we propose to analyse these sub-spectra one-by-one and combine the results using a framework which includes a significance test. This allows one to, for the first time, attach a confidence measure to detected peaks, based on the signal strength of a peak across sub-spectra. In a comparison with three other approaches the sub-spectral approach achieves a higher sensitivity and a low FDR. We further introduce the notion of peak-bags, which provide rich information about the sub-spectral contributions to a given peak. CONCLUSION: The proposed procedure offers better control over the process of distinguishing signal from noise, resulting in an improved performance over other available methods. Moreover, our method provides an implicit deconvolution of peaks, yielding insight into the actual shape of a peak, potentially aiding in a deeper understanding of peak distribution. AVAILABILITY: Implementations of the algorithm in R are available upon request.


Subject(s)
Peptides/chemistry , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Algorithms , Pattern Recognition, Automated , Peptides/analysis , Signal Processing, Computer-Assisted
19.
J Comput Biol ; 15(10): 1329-45, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19040367

ABSTRACT

Genomic datasets often consist of large, binary, sparse data matrices. In such a dataset, one is often interested in finding contiguous blocks that (mostly) contain ones. This is a biclustering problem, and while many algorithms have been proposed to deal with gene expression data, only two algorithms have been proposed that specifically deal with binary matrices. None of the gene expression biclustering algorithms can handle the large number of zeros in sparse binary matrices. The two proposed binary algorithms failed to produce meaningful results. In this article, we present a new algorithm that is able to extract biclusters from sparse, binary datasets. A powerful feature is that biclusters with different numbers of rows and columns can be detected, varying from many rows to few columns and few rows to many columns. It allows the user to guide the search towards biclusters of specific dimensions. When applying our algorithm to an input matrix derived from TRANSFAC, we find transcription factors with distinctly dissimilar binding motifs, but a clear set of common targets that are significantly enriched for GO categories.


Subject(s)
Algorithms , Cluster Analysis , Computational Biology/methods , Genome , Databases, Genetic , Models, Genetic , Oligonucleotide Array Sequence Analysis/methods , Software , Transcription Factors/genetics , Transcription Factors/metabolism
20.
Nature ; 453(7197): 948-51, 2008 Jun 12.
Article in English | MEDLINE | ID: mdl-18463634

ABSTRACT

The architecture of human chromosomes in interphase nuclei is still largely unknown. Microscopy studies have indicated that specific regions of chromosomes are located in close proximity to the nuclear lamina (NL). This has led to the idea that certain genomic elements may be attached to the NL, which may contribute to the spatial organization of chromosomes inside the nucleus. However, sequences in the human genome that interact with the NL in vivo have not been identified. Here we construct a high-resolution map of the interaction sites of the entire genome with NL components in human fibroblasts. This map shows that genome-lamina interactions occur through more than 1,300 sharply defined large domains 0.1-10 megabases in size. These lamina-associated domains (LADs) are typified by low gene-expression levels, indicating that LADs represent a repressive chromatin environment. The borders of LADs are demarcated by the insulator protein CTCF, by promoters that are oriented away from LADs, or by CpG islands, suggesting possible mechanisms of LAD confinement. Taken together, these results demonstrate that the human genome is divided into large, discrete domains that are units of chromosome organization within the nucleus.


Subject(s)
Chromosome Positioning , Chromosomes, Human/metabolism , Nuclear Lamina/metabolism , CCCTC-Binding Factor , Cell Line , Chromatin/genetics , Chromatin/metabolism , Chromosomes, Human/genetics , CpG Islands/genetics , DNA-Binding Proteins/metabolism , Fibroblasts , Genome, Human , Humans , Lamin Type B/metabolism , Nuclear Lamina/chemistry , Promoter Regions, Genetic/genetics , Protein Binding , Repressor Proteins/metabolism