ABSTRACT
Higher-order chromosomal organization for transcription regulation is poorly understood in eukaryotes. Using genome-wide Chromatin Interaction Analysis with Paired-End-Tag sequencing (ChIA-PET), we mapped long-range chromatin interactions associated with RNA polymerase II in human cells and uncovered widespread promoter-centered intragenic, extragenic, and intergenic interactions. These interactions further aggregated into higher-order clusters, wherein proximal and distal genes were engaged through promoter-promoter interactions. Most genes with promoter-promoter interactions were active and transcribed cooperatively, and some interacting promoters could influence each other, implying combinatorial complexity of transcriptional controls. Comparative analyses of different cell lines showed that cell-specific chromatin interactions could provide structural frameworks for cell-specific transcription, and suggested significant enrichment of enhancer-promoter interactions for cell-specific functions. Furthermore, genetically identified disease-associated noncoding elements were found to be spatially engaged with corresponding genes through long-range interactions. Overall, our study provides insights into transcription regulation by three-dimensional chromatin interactions for both housekeeping and cell-specific genes in human cells.
Subject(s)
Chromatin/metabolism , Gene Expression Regulation , Promoter Regions, Genetic , RNA Polymerase II/metabolism , Transcription, Genetic , Cell Line, Tumor , Chromatin Immunoprecipitation , Enhancer Elements, Genetic , Genome-Wide Association Study , Humans
ABSTRACT
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.
Subject(s)
Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Gene Expression Profiling , Transcriptome/genetics , Animals , Caenorhabditis elegans/embryology , Caenorhabditis elegans/growth & development , Chromatin/genetics , Cluster Analysis , Drosophila melanogaster/growth & development , Gene Expression Regulation, Developmental/genetics , Histones/metabolism , Humans , Larva/genetics , Larva/growth & development , Models, Genetic , Molecular Sequence Annotation , Promoter Regions, Genetic/genetics , Pupa/genetics , Pupa/growth & development , RNA, Untranslated/genetics , Sequence Analysis, RNA
ABSTRACT
Accurate chromosome segregation requires centromeres (CENs), the DNA sequences where kinetochores form, to attach chromosomes to microtubules. In contrast to most eukaryotes, which have broad centromeres, Saccharomyces cerevisiae possesses sequence-defined point CENs. Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) reveals colocalization of four kinetochore proteins at novel, discrete, non-centromeric regions, especially when levels of the centromeric histone H3 variant, Cse4 (a.k.a. CENP-A or CenH3), are elevated. These regions of overlapping protein binding enhance the segregation of plasmids and chromosomes and have thus been termed Centromere-Like Regions (CLRs). CLRs form in close proximity to S. cerevisiae CENs and share characteristics typical of both point and regional CENs. CLR sequences are conserved among related budding yeasts. Many genomic features characteristic of CLRs are also associated with these conserved homologous sequences from closely related budding yeasts. These studies provide general and important insights into the origin and evolution of centromeres.
Subject(s)
Centromere/genetics , Chromosome Segregation/genetics , Genome, Fungal , Microtubules/genetics , Autoantigens/genetics , Autoantigens/metabolism , Base Sequence , Centromere Protein A , Chromatin/genetics , Chromosomal Proteins, Non-Histone/genetics , Chromosomal Proteins, Non-Histone/metabolism , DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Evolution, Molecular , Histones/genetics , Histones/metabolism , Kinetochores/metabolism , Nucleosomes/genetics , Nucleosomes/metabolism , Protein Binding , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism
ABSTRACT
We present an integrative machine learning method, incRNA, for whole-genome identification of noncoding RNAs (ncRNAs). It combines a large amount of expression data, RNA secondary-structure stability, and evolutionary conservation at the protein and nucleic-acid level. Using the incRNA model and data from the modENCODE consortium, we are able to separate known C. elegans ncRNAs from coding sequences and other genomic elements with a high level of accuracy (97% AUC on an independent validation set), and find more than 7000 novel ncRNA candidates, among which more than 1000 are located in the intergenic regions of the C. elegans genome. Based on the validation set, we estimate that 91% of the approximately 7000 novel ncRNA candidates are true positives. We then analyze 15 novel ncRNA candidates by RT-PCR, detecting expression for 14. In addition, we characterize the properties of all the novel ncRNA candidates and find that they have distinct expression patterns across developmental stages and tend to use novel RNA structural families. We also find that they are often targeted by specific transcription factors (~59% of intergenic novel ncRNA candidates). Overall, our study identifies many new potential ncRNAs in C. elegans and provides a method that can be adapted to other organisms.
Subject(s)
Caenorhabditis elegans/genetics , High-Throughput Nucleotide Sequencing , Oligonucleotide Array Sequence Analysis , RNA, Untranslated/chemistry , RNA, Untranslated/genetics , Algorithms , Animals , Binding Sites/genetics , DNA, Intergenic/genetics , Gene Expression Profiling , Molecular Sequence Annotation , Nucleic Acid Conformation , RNA Polymerase II/metabolism , Transcription Factors/metabolism
ABSTRACT
MOTIVATION: Biological analysis has shifted from identifying genes and transcripts to mapping these genes and transcripts to biological functions. The ENCODE Project has generated hundreds of ChIP-Seq experiments spanning multiple transcription factors and cell lines for public use, but tools for a biomedical scientist to analyze these data are either non-existent or tailored to narrow biological questions. We present the ENCODE ChIP-Seq Significance Tool, a flexible web application leveraging public ENCODE data to identify enriched transcription factors in a gene or transcript list for comparative analyses. IMPLEMENTATION: The ENCODE ChIP-Seq Significance Tool is written in JavaScript on the client side and has been tested on Google Chrome, Apple Safari and Mozilla Firefox browsers. Server-side scripts are written in PHP and leverage R and a MySQL database. The tool is available at http://encodeqt.stanford.edu. CONTACT: abutte@stanford.edu SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.
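The abstract above does not say which statistic the tool uses to call a transcription factor "enriched" in a gene list. As a rough illustration only, the sketch below assumes a one-sided hypergeometric test, a common choice for this kind of over-representation question. The function names, gene identifiers, and target sets are hypothetical and are not taken from the tool itself.

```python
from math import comb


def hypergeom_enrichment_p(background_size, targets_in_background,
                           list_size, targets_in_list):
    """Return P(X >= targets_in_list) for a hypergeometric draw.

    Models picking `list_size` genes without replacement from a background
    of `background_size` genes, of which `targets_in_background` are bound
    by the transcription factor.
    """
    upper = min(list_size, targets_in_background)
    total = comb(background_size, list_size)
    tail = sum(
        comb(targets_in_background, k)
        * comb(background_size - targets_in_background, list_size - k)
        for k in range(targets_in_list, upper + 1)
    )
    return tail / total


def tf_enrichment(gene_list, tf_targets, background):
    """Enrichment p-value for one factor, restricted to the background set."""
    background = set(background)
    genes = set(gene_list) & background
    targets = set(tf_targets) & background
    return hypergeom_enrichment_p(
        len(background), len(targets), len(genes), len(genes & targets)
    )


# Hypothetical toy data: 10 background genes, 4 bound by the factor,
# and a user list of 3 genes of which 2 are bound.
background = [f"gene{i}" for i in range(10)]
tf_targets = ["gene0", "gene1", "gene2", "gene3"]
gene_list = ["gene0", "gene1", "gene9"]
p_value = tf_enrichment(gene_list, tf_targets, background)
```

In practice a tool of this kind would repeat the test for every factor and cell line and correct for multiple testing; that step is omitted here.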
Subject(s)
Chromatin Immunoprecipitation , Software , Transcription Factors/metabolism , Genes , Internet , Sequence Analysis, DNA
ABSTRACT
A systems understanding of nuclear organization and events is critical for determining how cells divide, differentiate, and respond to stimuli and for identifying the causes of diseases. Chromatin remodeling complexes such as SWI/SNF have been implicated in a wide variety of cellular processes including gene expression, nuclear organization, centromere function, and chromosomal stability, and mutations in SWI/SNF components have been linked to several types of cancer. To better understand the biological processes in which chromatin remodeling proteins participate, we globally mapped binding regions for several components of the SWI/SNF complex throughout the human genome using ChIP-Seq. SWI/SNF components were found to lie near regulatory elements integral to transcription (e.g. 5' ends, RNA Polymerases II and III, and enhancers) as well as regions critical for chromosome organization (e.g. CTCF, lamins, and DNA replication origins). Interestingly, we also find that certain configurations of SWI/SNF subunits are associated with transcripts that have higher levels of expression, whereas other configurations of SWI/SNF factors are associated with transcripts that have lower levels of expression. To further elucidate the association of SWI/SNF subunits with each other as well as with other nuclear proteins, we also analyzed SWI/SNF immunoprecipitated complexes by mass spectrometry. Individual SWI/SNF factors are associated with their own family members, as well as with cellular constituents such as nuclear matrix proteins, key transcription factors, and centromere components, implying a ubiquitous role in gene regulation and nuclear function. We find an overrepresentation of both SWI/SNF-associated regions and proteins in cell cycle and chromosome organization. Taken together, the results from our ChIP and immunoprecipitation experiments suggest that SWI/SNF facilitates gene regulation and genome function more broadly and through a greater diversity of interactions than previously appreciated.
Subject(s)
Cell Cycle/genetics , Chromatin Assembly and Disassembly/genetics , Chromatin , Chromosomal Proteins, Non-Histone , Transcription Factors , Chromatin/genetics , Chromatin/metabolism , Chromatin Immunoprecipitation/methods , Chromosomal Proteins, Non-Histone/genetics , Chromosomal Proteins, Non-Histone/metabolism , HeLa Cells , High-Throughput Nucleotide Sequencing/methods , Humans , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Protein Binding/genetics , Sequence Analysis, DNA/methods , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
Chromatin-remodeling enzymes play essential roles in many biological processes, including gene expression, DNA replication and repair, and cell division. Although one such complex, SWI/SNF, has been extensively studied, new discoveries are still being made. Here, we review SWI/SNF biochemistry; highlight recent genomic and proteomic advances; and address the role of SWI/SNF in human diseases, including cancer and viral infections. These studies have greatly increased our understanding of complex nuclear processes.
Subject(s)
Chromosomal Proteins, Non-Histone/metabolism , DNA Repair , DNA Replication , Gene Expression Regulation , Neoplasms/metabolism , Transcription Factors/metabolism , Virus Diseases/metabolism , Chromosomal Proteins, Non-Histone/genetics , Humans , Neoplasms/genetics , Transcription Factors/genetics , Virus Diseases/genetics
ABSTRACT
Transcription factors are key components of regulatory networks that control development, as well as the response to environmental stimuli. We have established an experimental pipeline in Caenorhabditis elegans that permits global identification of the binding sites for transcription factors using chromatin immunoprecipitation and deep sequencing. We describe and validate this strategy, and apply it to the transcription factor PHA-4, which plays critical roles in organ development and other cellular processes. We identified thousands of binding sites for PHA-4 during formation of the embryonic pharynx, and also found a role for this factor during the starvation response. Many binding sites were found to shift dramatically between embryos and starved larvae, from developmentally regulated genes to genes involved in metabolism. These results indicate distinct roles for this regulator in two different biological processes and demonstrate the versatility of transcription factors in mediating diverse biological roles.
Subject(s)
Caenorhabditis elegans Proteins/metabolism , Caenorhabditis elegans/growth & development , Caenorhabditis elegans/genetics , Environment , Genome, Helminth/genetics , Trans-Activators/metabolism , Animals , Binding Sites , Caenorhabditis elegans Proteins/genetics , Chromatin Immunoprecipitation , Embryo, Nonmammalian/metabolism , Gene Expression Regulation, Developmental , Genes, Helminth/genetics , Green Fluorescent Proteins/metabolism , Larva/metabolism , Protein Binding , RNA Polymerase II/metabolism , Recombinant Fusion Proteins/metabolism , Starvation , Survival Analysis , Trans-Activators/genetics , Transcription Factors/metabolism
ABSTRACT
Disruptions in local chromatin structure often indicate features of biological interest such as regulatory regions. We find that sonication of cross-linked chromatin, when combined with a size-selection step and massively parallel short-read sequencing, can be used as a method (Sono-Seq) to map locations of high chromatin accessibility in promoter regions. Sono-Seq sites frequently correspond to actively transcribed promoter regions, as evidenced by their co-association with RNA Polymerase II ChIP regions, transcription start sites, histone H3 lysine 4 trimethylation (H3K4me3) marks, and CpG islands; signals over other sites, such as those bound by the CTCF insulator, are also observed. The pattern of breakage by Sono-Seq overlaps with, but is distinct from, that observed for FAIRE and DNase I hypersensitive sites. Our results demonstrate that Sono-Seq can be a useful and simple method by which to map many local alterations in chromatin structure. Furthermore, our results provide insights into the mapping of binding sites by using ChIP-Seq experiments and the value of reference samples that should be used in such experiments.
Subject(s)
Chromatin , Chromosome Mapping/methods , Oligonucleotide Array Sequence Analysis/methods , Sequence Analysis, DNA/methods , Animals , Base Sequence , Gene Expression , Genetic Markers , HeLa Cells , Histones/metabolism , Humans , Methylation , Mice
ABSTRACT
The sciences have seen a large increase in demand for students in bioinformatics and multidisciplinary fields in general. Many new educational programs have been created to satisfy this demand, but navigating these programs requires a non-traditional outlook and emphasizes working in teams of individuals with distinct yet complementary skill sets. Written from the perspective of a current bioinformatics student, this article seeks to offer advice to prospective and current students in bioinformatics regarding what to expect in their educational program, how multidisciplinary fields differ from more traditional paths, and decisions that they will face on the road to becoming successful, productive bioinformaticists.
Subject(s)
Biomedical Engineering/education , Computational Biology/education , Interdisciplinary Communication , Students , Algorithms , Computational Biology/methods , Humans , Learning , Publishing/standards , Workforce
ABSTRACT
In parallel to the growth in bioscience databases, biomedical publications have increased exponentially in the past decade. However, the extraction of high-quality information from the corpus of scientific literature has been hampered by the lack of machine-interpretable content, despite text-mining advances. To address this, we propose creating a structured digital table as part of an overall effort in developing machine-readable, structured digital literature. In particular, we envision transforming publication tables into standardized triples using Semantic Web approaches. We identify three canonical types of tables (conveying information about properties, networks, and concept hierarchies) and show how more complex tables can be built from these basic types. We envision that authors would create tables initially using the structured triples for canonical types and then have them visually rendered for publication, and we present examples for converting representative tables into triples. Finally, we discuss how 'stub' versions of structured digital tables could be a useful bridge for connecting the literature with databases, allowing the former to more precisely document the latter.
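To make the idea of turning the three canonical table types into triples concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the predicate labels (`interactsWith`, `subClassOf`), and the example rows are all hypothetical, and plain tuples stand in for a real Semantic Web serialization such as RDF.

```python
def property_table_to_triples(rows, key):
    """Property table: one row per subject, one column per property.

    Every non-key cell becomes a (subject, column name, value) triple.
    """
    triples = []
    for row in rows:
        subject = row[key]
        for column, value in row.items():
            if column != key:
                triples.append((subject, column, value))
    return triples


def network_table_to_triples(edges, predicate):
    """Network table: each row is a pair of linked entities."""
    return [(source, predicate, target) for source, target in edges]


def hierarchy_table_to_triples(parent_of, predicate="subClassOf"):
    """Concept-hierarchy table: each child concept points to its parent."""
    return [(child, predicate, parent) for child, parent in parent_of.items()]


# Hypothetical example tables (values invented for illustration only).
property_rows = [
    {"protein": "ProteinA", "length": 120, "organism": "yeast"},
    {"protein": "ProteinB", "length": 305, "organism": "yeast"},
]
interaction_rows = [("ProteinA", "ProteinB")]
hierarchy_rows = {"kinase": "enzyme", "enzyme": "protein"}

all_triples = (
    property_table_to_triples(property_rows, key="protein")
    + network_table_to_triples(interaction_rows, predicate="interactsWith")
    + hierarchy_table_to_triples(hierarchy_rows)
)
```

A more complex table could then be expressed as the union of triples from several such conversions, which mirrors the abstract's point that complex tables are built from the basic types.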
Subject(s)
Database Management Systems , Internet , Peer Review, Research , Publishing , Semantics , Protein Structure, Tertiary , Saccharomyces cerevisiae/metabolism
ABSTRACT
BACKGROUND: Phylogeographic reconstruction of some bacterial populations is hindered by low diversity coupled with high levels of lateral gene transfer. A comparison of recombination levels and diversity at seven housekeeping genes for eleven bacterial species, most of which are commonly cited as having high levels of lateral gene transfer, shows that the relative contribution of homologous recombination versus mutation for Burkholderia pseudomallei is over two times higher than for Streptococcus pneumoniae and is thus the highest value yet reported in bacteria. Despite the potential for homologous recombination to increase diversity, B. pseudomallei exhibits a relative lack of diversity at these loci. In these situations, whole genome genotyping of orthologous shared single nucleotide polymorphism loci, discovered using next generation sequencing technologies, can provide very large data sets capable of estimating core phylogenetic relationships. We compared and searched 43 whole genome sequences of B. pseudomallei and its closest relatives for single nucleotide polymorphisms in orthologous shared regions to use in phylogenetic reconstruction. RESULTS: Bayesian phylogenetic analyses of >14,000 single nucleotide polymorphisms yielded completely resolved trees for these 43 strains with high levels of statistical support. These results enable a better understanding of a separate analysis of population differentiation among >1,700 B. pseudomallei isolates as defined by sequence data from seven housekeeping genes. We analyzed this larger data set for population structure and allele sharing that can be attributed to lateral gene transfer. Our results suggest that despite an almost panmictic population, we can detect two distinct populations of B. pseudomallei that conform to biogeographic patterns found in many plant and animal species, namely separation along Wallace's Line, a biogeographic boundary between Southeast Asia and Australia.
CONCLUSION: We describe an Australian origin for B. pseudomallei, characterized by a single introduction event into Southeast Asia during a recent glacial period, and variable levels of lateral gene transfer within populations. These patterns provide insights into mechanisms of genetic diversification in B. pseudomallei and its closest relatives, and provide a framework for integrating the traditionally separate fields of population genetics and phylogenetics for other bacterial species with high levels of lateral gene transfer.
Subject(s)
Burkholderia pseudomallei/genetics , Gene Transfer, Horizontal/physiology , Genes, Bacterial , Genetics, Population , Australia , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Genome, Bacterial , Humans , Molecular Epidemiology , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Sequence Homology
ABSTRACT
Francisella tularensis contains several highly pathogenic subspecies, including Francisella tularensis subsp. holarctica, whose distribution is circumpolar in the northern hemisphere. The phylogeography of these subspecies and their subclades was examined using whole-genome single nucleotide polymorphism (SNP) analysis, high-density microarray SNP genotyping, and real-time-PCR-based canonical SNP (canSNP) assays. Almost 30,000 SNPs were identified among 13 whole genomes for phylogenetic analysis. We selected 1,655 SNPs to genotype 95 isolates on a high-density microarray platform. Finally, 23 clade- and subclade-specific canSNPs were identified and used to genotype 496 isolates to establish global geographic genetic patterns. We confirm previous findings concerning the four subspecies and two Francisella tularensis subsp. tularensis subpopulations and identify additional structure within these groups. We identify 11 subclades within F. tularensis subsp. holarctica, including a new, genetically distinct subclade that appears intermediate between Japanese F. tularensis subsp. holarctica isolates and the common F. tularensis subsp. holarctica isolates associated with the radiation event (the B radiation) wherein this subspecies spread throughout the northern hemisphere. Phylogenetic analyses suggest a North American origin for this B-radiation clade and multiple dispersal events between North America and Eurasia. These findings indicate a complex transmission history for F. tularensis subsp. holarctica.
Subject(s)
DNA, Bacterial/genetics , Francisella tularensis/classification , Francisella tularensis/isolation & purification , Geography , Polymorphism, Single Nucleotide , Tularemia/epidemiology , Tularemia/microbiology , Asia/epidemiology , Bacterial Typing Techniques , Cluster Analysis , Europe/epidemiology , Francisella tularensis/genetics , Genome, Bacterial , Genotype , Microarray Analysis/methods , Molecular Epidemiology , North America/epidemiology , Phylogeny
ABSTRACT
BACKGROUND: Short-read high-throughput DNA sequencing technologies provide new tools to answer biological questions. However, high cost and low throughput limit their widespread use, particularly in organisms with smaller genomes such as S. cerevisiae. Although ChIP-Seq in mammalian cell lines is replacing array-based ChIP-chip as the standard for transcription factor binding studies, ChIP-Seq in yeast is still underutilized compared to ChIP-chip. We developed a multiplex barcoding system that allows simultaneous sequencing and analysis of multiple samples using Illumina's platform. We applied this method to analyze the chromosomal distributions of three yeast DNA binding proteins (Ste12, Cse4 and RNA Pol II) and a reference sample (input DNA) in a single experiment and demonstrate its utility for rapid and accurate results at reduced costs. RESULTS: We developed a barcoding ChIP-Seq method for the concurrent analysis of transcription factor binding sites in yeast. Our multiplex strategy generated high-quality data that was indistinguishable from data obtained with non-barcoded libraries. None of the barcoded adapters induced differences relative to a non-barcoded adapter when applied to the same DNA sample. We used this method to map the binding sites for Cse4, Ste12 and Pol II throughout the yeast genome and we found 148 binding targets for Cse4, 823 targets for Ste12 and 2508 targets for Pol II. Cse4 was strongly bound to all yeast centromeres as expected and the remaining non-centromeric targets correspond to highly expressed genes in rich media. The presence of Cse4 non-centromeric binding sites was not reported previously. CONCLUSION: We designed a multiplex short-read DNA sequencing method to perform efficient ChIP-Seq in yeast and other small genome model organisms. This method produces accurate results with higher throughput and reduced cost. Given constant improvements in high-throughput sequencing technologies, increasing multiplexing will be possible to further decrease costs per sample and to accelerate the completion of large consortium projects such as modENCODE.
Subject(s)
Oligonucleotide Array Sequence Analysis/methods , Saccharomyces cerevisiae/genetics , Sequence Analysis, DNA/methods , Binding Sites , Centromere/metabolism , Chromatin Immunoprecipitation , Chromosome Mapping , DNA, Fungal/genetics , Genome, Fungal , Genomic Library , Genomics/methods , Transcription Factors/metabolism
ABSTRACT
BACKGROUND: Burkholderia pseudomallei is the etiologic agent of melioidosis, a significant cause of morbidity and mortality where this infection is endemic. Genomic differences among strains of B. pseudomallei are predicted to be one of the major causes of the diverse clinical manifestations observed among patients with melioidosis. The purpose of this study was to examine the role of genomic islands (GIs) as sources of genomic diversity in this species. RESULTS: We found that genomic islands (GIs) vary greatly among B. pseudomallei strains. We identified 71 distinct GIs from the genome sequences of five reference strains of B. pseudomallei: K96243, 1710b, 1106a, MSHR668, and MSHR305. The genomic positions of these GIs are not random, as many of them are associated with tRNA gene loci. In particular, the 3' end sequences of tRNA genes are predicted to be involved in the integration of GIs. We propose the term "tRNA-mediated site-specific recombination" (tRNA-SSR) for this mechanism. In addition, we provide a GI nomenclature that is based upon integration hotspots identified here or previously described. CONCLUSION: Our data suggest that acquisition of GIs is one of the major sources of genomic diversity within B. pseudomallei and the molecular mechanisms that facilitate horizontally-acquired GIs are common across multiple strains of B. pseudomallei. The differential presence of the 71 GIs across multiple strains demonstrates the importance of these mobile elements for shaping the genetic composition of individual strains and populations within this bacterial species.
Subject(s)
Burkholderia mallei/genetics , Genetic Variation , Genomic Islands , Gene Transfer, Horizontal , RNA, Transfer/genetics , Terminology as Topic
ABSTRACT
We present TaqMan-minor groove binding (MGB) assays for an SNP that separates the Yersinia pestis strain CO92 from all other strains and for another SNP that separates North American strains from all other global strains.
Subject(s)
Bacterial Typing Techniques/methods , Yersinia pestis/classification , Yersinia pestis/isolation & purification , Alleles , DNA, Bacterial/analysis , DNA, Bacterial/genetics , Genotype , North America , Polymorphism, Single Nucleotide/genetics , Reproducibility of Results , Time Factors , Yersinia pestis/genetics
ABSTRACT
Burkholderia pseudomallei is the etiologic agent of melioidosis. Many disease manifestations are associated with melioidosis, and the mechanisms causing this variation are unknown; genomic differences among strains offer one explanation. We compared the genome sequences of two strains of B. pseudomallei: the original reference strain K96243 from Thailand and strain MSHR305 from Australia. We identified a variable homologous region between the two strains. This region was previously identified in comparisons of the genome of B. pseudomallei strain K96243 with the genome of strain E264 from the closely related B. thailandensis. In that comparison, K96243 was shown to possess a horizontally acquired Yersinia-like fimbrial (YLF) gene cluster. Here, we show that the homologous genomic region in B. pseudomallei strain 305 is similar to that previously identified in B. thailandensis strain E264. We have named this region in B. pseudomallei strain 305 the B. thailandensis-like flagellum and chemotaxis (BTFC) gene cluster. We screened for these different genomic components across additional genome sequences and 571 B. pseudomallei DNA extracts obtained from regions of endemicity. These alternate genomic states define two distinct groups within B. pseudomallei: all strains contained either the BTFC gene cluster (group BTFC) or the YLF gene cluster (group YLF). These two groups have distinct geographic distributions: group BTFC is dominant in Australia, and group YLF is dominant in Thailand and elsewhere. In addition, clinical isolates are more likely to belong to group YLF, whereas environmental isolates are more likely to belong to group BTFC. These groups should be further characterized in an animal model.
Subject(s)
Burkholderia pseudomallei/classification , Burkholderia pseudomallei/genetics , Evolution, Molecular , Gene Transfer, Horizontal , Australia/epidemiology , Burkholderia pseudomallei/isolation & purification , Chromosomes, Bacterial/genetics , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Environmental Microbiology , Genotype , Humans , Melioidosis/epidemiology , Melioidosis/microbiology , Molecular Epidemiology , Molecular Sequence Data , Multigene Family , Sequence Analysis, DNA , Sequence Homology , Synteny , Thailand/epidemiology
ABSTRACT
In mammals, a cell's decision to divide is thought to be under the control of the Rb/E2F pathway. We previously found that inactivation of the Rb family of cell cycle inhibitors (Rb, p107, and p130) in quiescent liver progenitors leads to uncontrolled division and cancer initiation. Here, we show that, in contrast, deletion of the entire Rb gene family in mature hepatocytes is not sufficient for their long-term proliferation. The cell cycle block in Rb family mutant hepatocytes is independent of the Arf/p53/p21 checkpoint but can be abrogated upon decreasing liver size. At the molecular level, we identify YAP, a transcriptional regulator involved in organ size control, as a factor required for the sustained expression of cell cycle genes in hepatocytes. These experiments identify a higher level of regulation of the cell cycle in vivo in which signals regulating organ size are dominant regulators of the core cell cycle machinery.
Subject(s)
Cell Proliferation , Liver/growth & development , Retinoblastoma-Like Protein p107/metabolism , Retinoblastoma-Like Protein p130/metabolism , ADP-Ribosylation Factors/metabolism , Adaptor Proteins, Signal Transducing/genetics , Adaptor Proteins, Signal Transducing/metabolism , Animals , Cell Cycle Proteins , Cyclin-Dependent Kinase Inhibitor p21/metabolism , Genes, cdc , Hepatocytes/cytology , Hepatocytes/metabolism , Hepatocytes/physiology , Liver/metabolism , Mice , Organ Size , Phosphoproteins/genetics , Phosphoproteins/metabolism , Retinoblastoma-Like Protein p107/genetics , Retinoblastoma-Like Protein p130/genetics , Tumor Suppressor Protein p53/metabolism , YAP-Signaling Proteins
ABSTRACT
Advances in sequencing technology have led to a sharp decrease in the cost of 'data generation'. But is this sufficient to ensure cost-effective and efficient 'knowledge generation'?
Subject(s)
Genomics/economics , Genomics/methods , Sequence Analysis, DNA/economics , Sequence Analysis, DNA/methods , Costs and Cost Analysis , Database Management Systems , Genome, Human , Humans
ABSTRACT
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.