1.
Nature ; 424(6950): 788-93, 2003 Aug 14.
Article in English | MEDLINE | ID: mdl-12917688

ABSTRACT

The systematic comparison of genomic sequences from different organisms represents a central focus of contemporary genome analysis. Comparative analyses of vertebrate sequences can identify coding and conserved non-coding regions, including regulatory elements, and provide insight into the forces that have rendered modern-day genomes. As a complement to whole-genome sequencing efforts, we are sequencing and comparing targeted genomic regions in multiple, evolutionarily diverse vertebrates. Here we report the generation and analysis of over 12 megabases (Mb) of sequence from 12 species, all derived from the genomic region orthologous to a segment of about 1.8 Mb on human chromosome 7 containing ten genes, including the gene mutated in cystic fibrosis. These sequences show conservation reflecting both functional constraints and the neutral mutational events that shaped this genomic region. In particular, we identify substantial numbers of conserved non-coding segments beyond those previously identified experimentally, most of which are not detectable by pair-wise sequence comparisons alone. Analysis of transposable element insertions highlights the variation in genome dynamics among these species and confirms the placement of rodents as a sister group to the primates.


Subject(s)
Conserved Sequence/genetics , Evolution, Molecular , Genomics , Vertebrates/genetics , Animals , Chromosomes, Human, Pair 7/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , DNA Transposable Elements/genetics , Genome , Humans , Mammals/genetics , Mutagenesis/genetics , Phylogeny , Sequence Alignment , Sequence Homology, Nucleic Acid , Species Specificity
2.
Nucleic Acids Res ; 35(Database issue): D668-73, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17142222

ABSTRACT

The University of California, Santa Cruz Genome Browser Database contains, as of September 2006, sequence and annotation data for the genomes of 13 vertebrate and 19 invertebrate species. The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up to a full chromosome and includes assembly data, genes and gene predictions, mRNA and EST alignments, and comparative genomics, regulation, expression and variation data. The database is optimized for fast interactive performance with web tools that provide powerful visualization and querying capabilities for mining the data. In the past year, 22 new assemblies and several new sets of human variation annotation have been released. New features include VisiGene, a fully integrated in situ hybridization image browser; phyloGif, for drawing evolutionary tree diagrams; a redesigned Custom Track feature; an expanded SNP annotation track; and many new display options. The Genome Browser, other tools, downloadable data files and links to documentation and other information can be found at http://genome.ucsc.edu/.


Subject(s)
Databases, Genetic , Genomics , Animals , Base Sequence , Cattle , Computer Graphics , Conserved Sequence , Genome, Human , Humans , Internet , Linkage Disequilibrium , Mice , Open Reading Frames , Polymorphism, Single Nucleotide , Rats , Regulatory Sequences, Nucleic Acid , User-Computer Interface
3.
Nucleic Acids Res ; 34(Database issue): D590-8, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381938

ABSTRACT

The University of California Santa Cruz Genome Browser Database (GBD) contains sequence and annotation data for the genomes of about a dozen vertebrate species and several major model organisms. Genome annotations typically include assembly data, sequence composition, genes and gene predictions, mRNA and expressed sequence tag evidence, comparative genomics, regulation, expression and variation data. The database is optimized to support fast interactive performance with web tools that provide powerful visualization and querying capabilities for mining the data. The Genome Browser displays a wide variety of annotations at all scales from single nucleotide level up to a full chromosome. The Table Browser provides direct access to the database tables and sequence data, enabling complex queries on genome-wide datasets. The Proteome Browser graphically displays protein properties. The Gene Sorter allows filtering and comparison of genes by several metrics including expression data and several gene properties. BLAT and In Silico PCR search for sequences in entire genomes in seconds. These tools are highly integrated and provide many hyperlinks to other databases and websites. The GBD, browsing tools, downloadable data files and links to documentation and other information can be found at http://genome.ucsc.edu/.


Subject(s)
Databases, Genetic , Genomics , Amino Acid Sequence , Animals , California , Computer Graphics , Dogs , Gene Expression , Genes , Humans , Internet , Mice , Polymorphism, Single Nucleotide , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Proteomics , Rats , Sequence Alignment , Software , User-Computer Interface
4.
Novartis Found Symp ; 236: 59-81; discussion 81-4, 2001.
Article in English | MEDLINE | ID: mdl-11387987

ABSTRACT

The distinguishing feature of the 'new biology' is that it is information intensive. Not only does it demand access to and assimilation of vast data sets accumulated by engineered laboratory processes, but it also demands a previously unimaginable level of data integration across data types and sources. There are various information resources available for rice. In addition, there are various information resources that are not focused on rice but that contain rice data. The challenge for rice researchers and breeders is to access this wealth of data meaningfully. This challenge will grow significantly as international efforts aimed at sequencing the entire rice genome come into full swing. Only through concerted efforts in bioinformatics will the power of these public data be brought to bear on the needs of rice researchers and breeders worldwide. These efforts will need to focus on two large but distinct areas: (1) development of an effective bioinformatics infrastructure (hardware systems, software systems, and software engineers and support staff) and (2) computational biology research in visualization and analysis of very large, complex data sets, such as those that will be developed using high-throughput expression technologies, large-scale insertional mutagenesis, and biochemical profiling of various types. In the midst of the large flow of high-throughput data that the international rice genome sequencing efforts will produce, it is also imperative that integration of those data with unique germplasm data held in trust by the CGIAR be a part of the informatics infrastructure. This paper will focus on the state of rice information resources, the needs of the rice community, and some proposed bioinformatics activities to support these needs.


Subject(s)
Computational Biology , Oryza/genetics , Algorithms , Computational Biology/organization & administration , Computational Biology/trends , Databases, Factual , Edible Grain/genetics , Gene Expression , Genome, Plant , Internet , Molecular Biology , Oryza/growth & development , Oryza/physiology , Phenotype , Systems Integration
6.
Bioinformatics ; 17(1): 83-94, 2001 Jan.
Article in English | MEDLINE | ID: mdl-11222265

ABSTRACT

MOTIVATION: Heterogeneity of databases and software resources continues to hamper the integration of biological information. Top-down solutions are not feasible for the full-scale problem of integration across biological species and data types. Bottom-up solutions so far have not integrated, in a maximally flexible way, dynamic and interactive graphical-user-interface components with data repositories and analysis tools. RESULTS: We present a component-based approach that relies on a generalized platform for component integration. The platform enables independently-developed components to synchronize their behavior and exchange services, without direct knowledge of one another. An interface-based data model allows the exchange of information with minimal component interdependency. From these interactions an integrated system results, which we call ISYS. By allowing services to be discovered dynamically based on selected objects, ISYS encourages a kind of exploratory navigation that we believe to be well-suited for applications in genomic research.


Subject(s)
Computational Biology , Software , Arabidopsis/genetics , Databases, Factual , Genome, Plant , Quantitative Trait, Heritable
7.
Nucleic Acids Res ; 25(1): 18-23, 1997 Jan 01.
Article in English | MEDLINE | ID: mdl-9016496

ABSTRACT

The Genome Sequence DataBase (GSDB) has completed its conversion to an improved relational database. The new database, GSDB 1.0, is fully operational and publicly available. Data contributions, including both original sequence submissions and community annotation, are being accomplished through the use of a graphical client-server interface tool, the GSDB Annotator, and via GIO (GSDB Input/Output) files. Data retrieval services are being provided through a new Web Query Tool and direct SQL. All methods of data contribution and data retrieval fully support the new data types that have been incorporated into GSDB, including discontiguous sequences, multiple sequence alignments, and community annotation.


Subject(s)
Base Sequence , Databases, Factual , Animals , Humans , Private Sector , Software
8.
Nucleic Acids Res ; 26(1): 21-6, 1998 Jan 01.
Article in English | MEDLINE | ID: mdl-9399793

ABSTRACT

In 1997 the primary focus of the Genome Sequence DataBase (GSDB; www.ncgr.org/gsdb) located at the National Center for Genome Resources was to improve data quality and accessibility. Efforts to increase the quality of data within the database included two major projects; one to identify and remove all vector contamination from sequences in the database and one to create premier sequence sets (including both alignments and discontiguous sequences). Data accessibility was improved during the course of the last year in several ways. First, a graphical database sequence viewer was made available to researchers. Second, an update process was implemented for the web-based query tool, Maestro. Third, a web-based tool, Excerpt, was developed to retrieve selected regions of any sequence in the database. And lastly, a GSDB flatfile that contains annotation unique to GSDB (e.g., sequence analysis and alignment data) was developed. Additionally, the GSDB web site provides a tool for the detection of matrix attachment regions (MARs), which can be used to identify regions of high coding potential. The ultimate goal of this work is to make GSDB a more useful resource for genomic comparison studies and gene level studies by improving data quality and by providing data access capabilities that are consistent with the needs of both types of studies.


Subject(s)
Databases, Factual , Genome , Base Sequence , Computer Communication Networks , Forecasting , Information Storage and Retrieval
9.
Nucleic Acids Res ; 27(1): 35-8, 1999 Jan 01.
Article in English | MEDLINE | ID: mdl-9847136

ABSTRACT

During 1998 the primary focus of the Genome Sequence DataBase (GSDB; http://www.ncgr.org/gsdb) located at the National Center for Genome Resources (NCGR) has been to improve data quality, improve data collections, and provide new methods and tools to access and analyze data. Data quality has been improved by extensive curation of certain data fields necessary for maintaining data collections and for using certain tools. Data quality has also been increased by improvements to the suite of programs that import data from the International Nucleotide Sequence Database Collaboration (IC). The Sequence Tag Alignment and Consensus Knowledgebase (STACK), a database of human expressed gene sequences developed by the South African National Bioinformatics Institute (SANBI), became available within the last year, allowing public access to this valuable resource of expressed sequences. Data access was improved by the addition of the Sequence Viewer, a platform-independent graphical viewer for GSDB sequence data. This tool has also been integrated with other searching and data retrieval tools. A BLAST homology search service was also made available, allowing researchers to search all of the data, including the unique data, that are available from GSDB. These improvements are designed to make GSDB more accessible to users, extend the rich searching capability already present in GSDB, and to facilitate the transition to an integrated system containing many different types of biological data.


Subject(s)
Base Sequence , Databases, Factual , Genome , Information Storage and Retrieval , Animals , Computational Biology , Consensus Sequence , Gene Expression , Genome, Human , Humans , Sequence Alignment