Results 1 - 14 of 14
1.
Cell ; 135(6): 1053-64, 2008 Dec 12.
Article in English | MEDLINE | ID: mdl-19070576

ABSTRACT

Vascular development begins when mesodermal cells differentiate into endothelial cells, which then form primitive vessels. It has been hypothesized that endothelial-specific gene expression may be regulated combinatorially, but the transcriptional mechanisms governing specificity in vascular gene expression remain incompletely understood. Here, we identify a 44 bp transcriptional enhancer that is sufficient to direct expression specifically and exclusively to the developing vascular endothelium. This enhancer is regulated by a composite cis-acting element, the FOX:ETS motif, which is bound and synergistically activated by Forkhead and Ets transcription factors. We demonstrate that coexpression of the Forkhead protein FoxC2 and the Ets protein Etv2 induces ectopic expression of vascular genes in Xenopus embryos, and that combinatorial knockdown of the orthologous genes in zebrafish embryos disrupts vascular development. Finally, we show that FOX:ETS motifs are present in many known endothelial-specific enhancers and that this motif is an efficient predictor of endothelial enhancers in the human genome.


Subject(s)
Enhancer Elements, Genetic , Forkhead Transcription Factors/metabolism , Gene Expression Regulation, Developmental , Proto-Oncogene Proteins c-ets/metabolism , Animals , Blood Vessels/embryology , Embryo, Mammalian/cytology , Embryo, Nonmammalian/metabolism , Endothelium/embryology , Fibroblasts/metabolism , Humans , Mice , Xenopus , Zebrafish
2.
Bioinformatics ; 29(16): 2059-61, 2013 Aug 15.
Article in English | MEDLINE | ID: mdl-23736530

ABSTRACT

SUMMARY: We have developed a web-based query tool, Whole-Genome rVISTA (WGRV), that determines enrichment of transcription factors (TFs) and associated target genes in sets of co-regulated genes. WGRV enables users to query databases containing pre-computed genome coordinates of evolutionarily conserved transcription factor binding sites in the proximal promoters (from 100 bp to 5 kb upstream) of human, mouse and Drosophila genomes. TF binding sites are based on position-weight matrices from the TRANSFAC Professional database. For a given set of co-regulated genes, WGRV returns statistically enriched and evolutionarily conserved binding sites, mapped by the regulatory VISTA (rVISTA) algorithm. Users can then retrieve a list of genes from the query set containing the enriched TF binding sites and their location in the query set promoters. Results are exported in a BED format for rapid visualization in the UCSC genome browser. Flat files of mapped conserved sites and their genomic coordinates are also available for analysis with stand-alone software. AVAILABILITY: http://genome.lbl.gov/cgi-bin/WGRVistaInputCommon.pl.


Subject(s)
Gene Expression Profiling , Promoter Regions, Genetic , Software , Transcription Factors/metabolism , Algorithms , Animals , Genomics , Humans , Internet , Mice
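The abstract in record 2 mentions that results are exported in BED format for viewing in the UCSC genome browser. As a rough illustration only (this is not WGRV's code, and the site names and coordinates below are made up), a minimal sketch of turning 1-based inclusive site coordinates into BED lines, which use a 0-based start and an exclusive end:

```python
def to_bed_line(chrom, start_1based, end_1based, name):
    """Format one site as a tab-separated BED line.

    BED intervals are 0-based and half-open, so a 1-based inclusive
    start is shifted down by one while the end stays the same.
    """
    return f"{chrom}\t{start_1based - 1}\t{end_1based}\t{name}"


# Hypothetical binding-site records: (chromosome, start, end, label).
sites = [
    ("chr1", 1001, 1010, "site_A"),
    ("chr2", 500, 512, "site_B"),
]

bed_lines = [to_bed_line(*site) for site in sites]
for line in bed_lines:
    print(line)
```

Keeping the coordinate conversion in a single function makes the off-by-one shift easy to check.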
3.
Nucleic Acids Res ; 40(Database issue): D26-32, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22110030

ABSTRACT

The Department of Energy (DOE) Joint Genome Institute (JGI) is a national user facility with massive-scale DNA sequencing and analysis capabilities dedicated to advancing genomics for bioenergy and environmental applications. Beyond generating tens of trillions of DNA bases annually, the Institute develops and maintains data management systems and specialized analytical capabilities to manage and interpret complex genomic data sets, and to enable an expanding community of users around the world to analyze these data in different contexts over the web. The JGI Genome Portal (http://genome.jgi.doe.gov) provides a unified access point to all JGI genomic databases and analytical tools. A user can find all DOE JGI sequencing projects and their status, search for and download assemblies and annotations of sequenced genomes, and interactively explore those genomes and compare them with other sequenced microbes, fungi, plants or metagenomes using specialized systems tailored to each particular class of organisms. We describe here the general organization of the Genome Portal and the most recent addition, MycoCosm (http://jgi.doe.gov/fungi), a new integrated fungal genomics resource.


Subject(s)
Databases, Genetic , Genomics , Sequence Analysis, DNA , Cluster Analysis , Genome, Fungal , Internet , Molecular Sequence Annotation , Software , Systems Integration
4.
Bioinformatics ; 27(18): 2595-7, 2011 Sep 15.
Article in English | MEDLINE | ID: mdl-21791533

ABSTRACT

SUMMARY: Current genome browsers are designed for linear browsing of individual genomic regions, but the high-throughput nature of experiments aiming to elucidate the genetic component of human disease makes it very important to develop user-friendly tools for comparing several genomic regions in parallel and prioritizing them based on their functional content. We introduce VISTA Region Viewer (RViewer), an interactive online tool that allows for efficient screening and prioritization of regions of the human genome for follow-up studies. The tool takes as input genetic variation data from different biomedical studies, determines a number of various functional parameters for both coding and non-coding sequences in each region and allows for sorting and searching the results of the analysis in multiple ways. AVAILABILITY AND IMPLEMENTATION: The tool is implemented as a web application and is freely accessible on the Web at http://rviewer.lbl.gov. CONTACT: rviewer@lbl.gov; ildubchak@lbl.gov.


Subject(s)
Data Mining/methods , Genome , Animals , Follow-Up Studies , Gene Dosage , Genome, Human , Humans , Infant , Internet , Mice , Models, Genetic , Polymorphism, Single Nucleotide , Software
5.
Nature ; 444(7118): 499-502, 2006 Nov 23.
Article in English | MEDLINE | ID: mdl-17086198

ABSTRACT

Identifying the sequences that direct the spatial and temporal expression of genes and defining their function in vivo remains a significant challenge in the annotation of vertebrate genomes. One major obstacle is the lack of experimentally validated training sets. In this study, we made use of extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements, and characterized the in vivo enhancer activity of a large group of non-coding elements in the human genome that are conserved in human-pufferfish, Takifugu (Fugu) rubripes, or ultraconserved in human-mouse-rat. We tested 167 of these extremely conserved sequences in a transgenic mouse enhancer assay. Here we report that 45% of these sequences functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While directing expression in a broad range of anatomical structures in the embryo, the majority of the 75 enhancers directed expression to various regions of the developing nervous system. We identified sequence signatures enriched in a subset of these elements that targeted forebrain expression, and used these features to rank all approximately 3,100 non-coding elements in the human genome that are conserved between human and Fugu. The testing of the top predictions in transgenic mice resulted in a threefold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of human gene enhancers that have been characterized in vivo, and illustrate the utility of such training sets for a variety of biological applications, including decoding the regulatory vocabulary of the human genome.


Subject(s)
Enhancer Elements, Genetic , Genome, Human , Animals , Base Sequence , Chromosomes, Human, Pair 16 , Conserved Sequence , Embryo, Mammalian/metabolism , Embryo, Nonmammalian , Gene Expression , Genomics/methods , Humans , Mice , Mice, Transgenic , Nervous System/embryology , Nervous System/metabolism , Prosencephalon/embryology , Prosencephalon/metabolism , Takifugu/genetics , Transcription Factors/genetics
6.
Nucleic Acids Res ; 35(Database issue): D88-92, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17130149

ABSTRACT

Despite the known existence of distant-acting cis-regulatory elements in the human genome, only a small fraction of these elements has been identified and experimentally characterized in vivo. This paucity of enhancer collections with defined activities has thus hindered computational approaches for the genome-wide prediction of enhancers and their functions. To fill this void, we utilize comparative genome analysis to identify candidate enhancer elements in the human genome coupled with the experimental determination of their in vivo enhancer activity in transgenic mice [L. A. Pennacchio et al. (2006) Nature, in press]. These data are available through the VISTA Enhancer Browser (http://enhancer.lbl.gov). This growing database currently contains over 250 experimentally tested DNA fragments, of which more than 100 have been validated as tissue-specific enhancers. For each positive enhancer, we provide digital images of whole-mount embryo staining at embryonic day 11.5 and an anatomical description of the reporter gene expression pattern. Users can retrieve elements near single genes of interest, search for enhancers that target reporter gene expression to a particular tissue, or download entire collections of enhancers with a defined tissue specificity or conservation depth. These experimentally validated training sets are expected to provide a basis for a wide range of downstream computational and functional studies of enhancer function.


Subject(s)
Databases, Nucleic Acid , Enhancer Elements, Genetic , Animals , Computational Biology , Embryo, Mammalian/metabolism , Gene Expression , Genome, Human , Genomics , Humans , Internet , Mice , User-Computer Interface
7.
Nucleic Acids Res ; 35(Web Server issue): W669-74, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17488840

ABSTRACT

The VISTA portal for comparative genomics is designed to give biomedical scientists a unified set of tools to lead them from the raw DNA sequences through the alignment and annotation to the visualization of the results. The VISTA portal also hosts the alignments of a number of genomes computed by our group, allowing users to study the regions of their interest without having to manually download the individual sequences. Here we describe various algorithmic and functional improvements implemented in the VISTA portal over the last 2 years. The VISTA Portal is accessible at http://genome.lbl.gov/vista.


Subject(s)
Computational Biology/methods , Genomics , Internet , Sequence Alignment , Software , Animals , Base Sequence , Chickens , Dogs , Genome, Human , Humans , Molecular Sequence Data , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Sequence Homology, Nucleic Acid
8.
Nucleic Acids Res ; 35(14): 4845-57, 2007.
Article in English | MEDLINE | ID: mdl-17626050

ABSTRACT

Correlation of motif occurrences with gene expression intensity is an effective strategy for elucidating transcriptional cis-regulatory logic. Here we demonstrate that this approach can also identify cis-regulatory elements for alternative pre-mRNA splicing. Using data from a human exon microarray, we identified 56 cassette exons that exhibited higher transcript-normalized expression in muscle than in other normal adult tissues. Intron sequences flanking these exons were then analyzed to identify candidate regulatory motifs for muscle-specific alternative splicing. Correlation of motif parameters with gene-normalized exon expression levels was examined using linear regression and linear splines on RNA words and degenerate weight matrices, respectively. Our unbiased analysis uncovered multiple candidate regulatory motifs for muscle-specific splicing, many of which are phylogenetically conserved among vertebrate genomes. The most prominent downstream motifs were binding sites for Fox1- and CELF-related splicing factors, and a branchpoint-like element acuaac; pyrimidine-rich elements resembling PTB-binding sites were most significant in upstream introns. Intriguingly, our systematic study indicates a paucity of novel muscle-specific elements that are dominant in short proximal intronic regions. We propose that Fox and CELF proteins play major roles in enforcing the muscle-specific alternative splicing program, facilitating expression of unique isoforms of cytoskeletal proteins critical to muscle cell function.


Subject(s)
Alternative Splicing , Computational Biology/methods , Introns , Regulatory Sequences, Ribonucleic Acid , Sequence Analysis, RNA/methods , Animals , Base Sequence , Binding Sites , Conserved Sequence , Cytoskeletal Proteins/genetics , Cytoskeletal Proteins/metabolism , Exons , Gene Expression Profiling , Humans , Muscle, Skeletal/metabolism , Myocardium/metabolism , RNA Precursors/chemistry , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , Transcription, Genetic
9.
Nucleic Acids Res ; 35(Database issue): D407-12, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17142223

ABSTRACT

RegTransBase is a manually curated database of regulatory interactions in prokaryotes that captures the knowledge in public scientific literature using a controlled vocabulary. Although several databases describing interactions between regulatory proteins and their binding sites are already being maintained, they either focus mostly on the model organisms Escherichia coli and Bacillus subtilis or are entirely computationally derived. RegTransBase describes a large number of regulatory interactions reported in many organisms and contains the following types of experimental data: the activation or repression of transcription by an identified direct regulator, determining the transcriptional regulatory function of a protein (or RNA) directly binding to DNA (RNA), mapping or prediction of a binding site for a regulatory protein and characterization of regulatory mutations. Currently, RegTransBase content is derived from about 3000 relevant articles describing over 7000 experiments in relation to 128 microbes. It contains data on the regulation of about 7500 genes and evidence for 6500 interactions with 650 regulators. RegTransBase also contains manually created position weight matrices (PWM) that can be used to identify candidate regulatory sites in over 60 species. RegTransBase is available at http://regtransbase.lbl.gov.


Subject(s)
Bacterial Proteins/metabolism , Databases, Nucleic Acid , Gene Expression Regulation, Bacterial , Genome, Bacterial , Regulatory Elements, Transcriptional , Transcription Factors/metabolism , Binding Sites , Internet , User-Computer Interface
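Record 9 mentions position weight matrices (PWMs) that can be used to identify candidate regulatory sites. The sketch below shows the general idea of log-odds PWM scanning; it is not RegTransBase's implementation, and the count matrix, pseudocount, uniform background and function names are all assumptions chosen for illustration. Input sequences are assumed to contain only A, C, G and T.

```python
import math


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from per-position base counts.

    `counts` is a list of dicts, one per motif position, mapping a base
    to how often it was observed there. A pseudocount avoids log(0).
    """
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(seq, pwm):
    """Score every window of the sequence; returns (offset, score) pairs."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][seq[i + j]] for j in range(width)))
        for i in range(len(seq) - width + 1)
    ]


def best_hit(seq, pwm):
    """Return the (offset, score) pair with the highest score."""
    return max(scan(seq, pwm), key=lambda hit: hit[1])


# Hypothetical 3-position motif that strongly prefers "TGA".
toy_counts = [{"T": 8}, {"G": 8}, {"A": 8}]
toy_pwm = pwm_from_counts(toy_counts)
print(best_hit("CCTGACC", toy_pwm))
```

In practice a score threshold would be applied to the scan output rather than taking only the single best window.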
10.
Bioinformatics ; 23(6): 764-6, 2007 Mar 15.
Article in English | MEDLINE | ID: mdl-17234642

ABSTRACT

UNLABELLED: We describe a general multiplatform exploratory tool called TreeQ-Vista, designed for presenting functional annotations in a phylogenetic context. Traits, such as phenotypic and genomic properties, are interactively queried from a user-provided relational database with a user-friendly interface which provides a set of tools for users with or without SQL knowledge. The query results are projected onto a phylogenetic tree and can be displayed in multiple color groups. A rich set of browsing, grouping and query tools are provided to facilitate trait exploration, comparison and analysis. AVAILABILITY: The program, detailed tutorial and examples are available online (http://genome.lbl.gov/vista/TreeQVista).


Subject(s)
Chromosome Mapping/methods , Databases, Genetic , Evolution, Molecular , Information Storage and Retrieval/methods , Models, Genetic , Software , User-Computer Interface , Computer Graphics , Computer Simulation , Database Management Systems , Phylogeny
11.
BMC Genomics ; 8: 378, 2007 Oct 18.
Article in English | MEDLINE | ID: mdl-17945028

ABSTRACT

BACKGROUND: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, approximately 5% of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. RESULTS: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs, are similar to binding sites of transcription factors from the FOX family. CONCLUSION: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in mutation, suggesting that selection which causes their conservation is not always very strong.


Subject(s)
Conserved Sequence , DNA/genetics , Humans
12.
Nucleic Acids Res ; 33(2): 714-24, 2005.
Article in English | MEDLINE | ID: mdl-15691898

ABSTRACT

Previous studies have identified UGCAUG as an intron splicing enhancer that is frequently located adjacent to tissue-specific alternative exons in the human genome. Here, we show that UGCAUG is phylogenetically and spatially conserved in introns that flank brain-enriched alternative exons from fish to man. Analysis of sequence from the mouse, rat, dog, chicken and pufferfish genomes revealed a strongly statistically significant association of UGCAUG with the proximal intron region downstream of brain-enriched alternative exons. The number, position and sequence context of intronic UGCAUG elements were highly conserved among mammals and in chicken, but more divergent in fish. Control datasets, including constitutive exons and non-tissue-specific alternative exons, exhibited a much lower incidence of closely linked UGCAUG elements. We propose that the high sequence specificity of the UGCAUG element, and its unique association with tissue-specific alternative exons, mark it as a critical component of splicing switch mechanism(s) designed to activate a limited repertoire of splicing events in cell type-specific patterns. We further speculate that highly conserved UGCAUG-binding protein(s) related to the recently described Fox-1 splicing factor play a critical role in mediating this specificity.


Subject(s)
Alternative Splicing , Introns , Phylogeny , Regulatory Sequences, Ribonucleic Acid , Animals , Base Sequence , Brain/metabolism , Chickens/genetics , Conserved Sequence , Dogs , Exons , Humans , Mice , Rats , Tetraodontiformes/genetics , Tissue Distribution
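Record 12 rests on counting UGCAUG elements in the proximal intron region downstream of alternative exons. A minimal sketch of that kind of count follows; it is not the authors' pipeline, and the 300-nucleotide window, the function names and the example sequence are assumptions made for illustration.

```python
def count_motif(seq, motif="UGCAUG"):
    """Count occurrences of an RNA motif, including overlapping ones.

    DNA input is accepted: the sequence is upper-cased and T is read as U.
    """
    rna = seq.upper().replace("T", "U")
    count = 0
    position = rna.find(motif)
    while position != -1:
        count += 1
        # Advance by one base so overlapping matches are not skipped.
        position = rna.find(motif, position + 1)
    return count


def proximal_downstream(intron_seq, window=300):
    """Return the first `window` nucleotides of a downstream intron.

    The window size is a hypothetical choice, not taken from the abstract.
    """
    return intron_seq[:window]


# Made-up downstream intron sequence written as DNA.
example_intron = "GGTGCATGAATGCATGCC"
print(count_motif(proximal_downstream(example_intron)))
```

Comparing such counts between tissue-specific exons and control sets (constitutive or non-tissue-specific exons) would then require a separate statistical test, which this sketch does not attempt.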
13.
BMC Bioinformatics ; 6: 292, 2005 Dec 08.
Article in English | MEDLINE | ID: mdl-16336665

ABSTRACT

BACKGROUND: Recent advances in sequencing technologies promise to provide a better understanding of the genetics of human disease as well as the evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it has become possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease in an attempt to identify causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples enables more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at http://genome.lbl.gov/vista/snpvista1. RESULTS: We have developed and present two modifications of an interactive visualization tool, SNP-VISTA, to aid in the analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein evolutionary conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. 
CONCLUSION: The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNP data by the user.


Subject(s)
Computational Biology/methods , Polymorphism, Single Nucleotide , Software , Algorithms , Alleles , Chromosome Mapping , Cluster Analysis , Computer Graphics , DNA Mutational Analysis , Databases, Nucleic Acid , Expressed Sequence Tags , Genome, Human , Haplotypes , Humans , Linkage Disequilibrium , Recombination, Genetic , Sequence Alignment , Sequence Analysis, DNA , User-Computer Interface
14.
Proc Natl Acad Sci U S A ; 102(24): 8561-6, 2005 Jun 14.
Article in English | MEDLINE | ID: mdl-15939874

ABSTRACT

Although a substantial number of hormones and drugs increase cellular cAMP levels, the global impact of cAMP and its major effector mechanism, protein kinase A (PKA), on gene expression is not known. Here we show that treatment of murine wild-type S49 lymphoma cells for 24 h with 8-(4-chlorophenylthio)-cAMP (8-CPT-cAMP), a PKA-selective cAMP analog, alters the expression of approximately 4,500 of approximately 13,600 unique genes. By contrast, gene expression was unaltered in Kin- S49 cells (that lack PKA) incubated with 8-CPT-cAMP. Changes in mRNA and protein expression of several cell-cycle regulators accompanied cAMP-induced G1-phase cell-cycle arrest of wild-type S49 cells. Within 2 h, 8-CPT-cAMP altered expression of 152 genes that contain evolutionarily conserved cAMP-response elements within 5 kb of transcriptional start sites, including the circadian clock gene Per1. Thus, cAMP through its activation of PKA produces extensive transcriptional regulation in eukaryotic cells. These transcriptional networks include a primary group of cAMP-response element-containing genes and secondary networks that include the circadian clock.


Subject(s)
Cell Cycle Proteins/metabolism , Cell Cycle/physiology , Cyclic AMP-Dependent Protein Kinases/metabolism , Cyclic AMP/analogs & derivatives , Cyclic AMP/pharmacology , Gene Expression Regulation/physiology , Thionucleotides/pharmacology , Animals , Cell Cycle/drug effects , Cell Cycle Proteins/genetics , Cluster Analysis , Gene Expression Regulation/drug effects , Immunoblotting , Mice , Microarray Analysis , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Period Circadian Proteins , Reverse Transcriptase Polymerase Chain Reaction , Tumor Cells, Cultured