Results 1 - 14 of 14
1.
Nucleic Acids Res ; 43(Database issue): D662-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25352552

ABSTRACT

Ensembl (http://www.ensembl.org) is a genomic interpretation system providing the most up-to-date annotations, querying tools and access methods for chordates and key model organisms. This year we released updated annotation (gene models, comparative genomics, regulatory regions and variation) on the new human assembly, GRCh38, although we continue to support researchers using the GRCh37.p13 assembly through a dedicated site (http://grch37.ensembl.org). Our Regulatory Build has been revamped to identify regulatory regions of interest and to efficiently highlight their activity across disparate epigenetic data sets. A number of new interfaces allow users to perform large-scale comparisons of their data against our annotations. The REST server (http://rest.ensembl.org), which allows programs written in any language to query our databases, has moved to a full service alongside our upgraded website tools. Our online Variant Effect Predictor tool has been updated to process more variants and calculate summary statistics. Lastly, the WiggleTools package enables users to summarize large collections of data sets and view them as single tracks in Ensembl. The Ensembl code base itself is more accessible: it is now hosted on our GitHub organization page (https://github.com/Ensembl) under an Apache 2.0 open source license.


Subject(s)
Databases, Nucleic Acid , Genomics , Animals , Epigenesis, Genetic , Genetic Variation , Genome, Human , Humans , Internet , Mice , Molecular Sequence Annotation , Regulatory Sequences, Nucleic Acid , Software
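
The abstract above notes that the REST server lets programs written in any language query the Ensembl databases. A minimal Python sketch of such a query follows. The server URL is taken from the abstract; the `lookup/id` endpoint is an Ensembl REST endpoint, but its use here is an illustration rather than something the abstract describes, and the helper names `build_lookup_url` and `fetch_json` are hypothetical.

```python
import json
import urllib.request

# Server URL as given in the abstract.
REST_SERVER = "http://rest.ensembl.org"


def build_lookup_url(stable_id: str, server: str = REST_SERVER) -> str:
    """Return the URL that looks up one Ensembl stable ID as JSON.

    The 'lookup/id' path is chosen for illustration; the abstract does
    not name any specific endpoint.
    """
    return f"{server}/lookup/id/{stable_id}?content-type=application/json"


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Send a GET request and decode the JSON response (needs network access)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With network access, `fetch_json(build_lookup_url("ENSG00000157764"))` should return a dictionary describing that gene; the identifier is only an example. Keeping URL construction separate from the network call makes the former easy to check offline.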
2.
Nucleic Acids Res ; 42(Database issue): D749-55, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24316576

ABSTRACT

Ensembl (http://www.ensembl.org) creates tools and data resources to facilitate genomic analysis in chordate species with an emphasis on human, major vertebrate model organisms and farm animals. Over the past year we have increased the number of species that we support to 77 and expanded our genome browser with a new scrollable overview and improved variation and phenotype views. We also report updates to our core datasets and improvements to our gene homology relationships from the addition of new species. Our REST service has been extended with additional support for comparative genomics and ontology information. Finally, we provide updated information about our methods for data access and resources for user training.


Subject(s)
Databases, Genetic , Genomics , Animals , Chordata/genetics , Genetic Variation , Humans , Internet , Mice , Molecular Sequence Annotation , Phenotype , Rats
3.
Nucleic Acids Res ; 41(Database issue): D48-55, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203987

ABSTRACT

The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidence-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools and programmatic interfaces.


Subject(s)
Databases, Genetic , Genomics , Animals , Gene Expression Regulation , Genetic Variation , Humans , Internet , Mice , Molecular Sequence Annotation , Rats , Software , Zebrafish/genetics
4.
Nucleic Acids Res ; 40(Database issue): D84-90, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22086963

ABSTRACT

The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org) with preliminary support. The past year has also seen improvements across the project.


Subject(s)
Databases, Genetic , Genomics , Animals , Gene Expression Regulation , Genetic Variation , Humans , Mice , Molecular Sequence Annotation , Rats
5.
Nucleic Acids Res ; 39(Database issue): D800-6, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21045057

ABSTRACT

The Ensembl project (http://www.ensembl.org) seeks to enable genomic science by providing high quality, integrated annotation on chordate and selected eukaryotic genomes within a consistent and accessible infrastructure. All supported species include comprehensive, evidence-based gene annotations and a selected set of genomes includes additional data focused on variation, comparative, evolutionary, functional and regulatory annotation. The most advanced resources are provided for key species including human, mouse, rat and zebrafish reflecting the popularity and importance of these species in biomedical research. As of Ensembl release 59 (August 2010), 56 species are supported of which 5 have been added in the past year. Since our previous report, we have substantially improved the presentation and integration of both data of disease relevance and the regulatory state of different cell types.


Subject(s)
Databases, Genetic , Genomics , Animals , Genetic Variation , Humans , Mice , Molecular Sequence Annotation , Rats , Regulatory Sequences, Nucleic Acid , Software , Zebrafish/genetics
6.
Nucleic Acids Res ; 38(Database issue): D557-62, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19906699

ABSTRACT

Ensembl (http://www.ensembl.org) integrates genomic information for a comprehensive set of chordate genomes with a particular focus on resources for human, mouse, rat, zebrafish and other high-value sequenced genomes. We provide complete gene annotations for all supported species in addition to specific resources that target genome variation, function and evolution. Ensembl data is accessible in a variety of formats including via our genome browser, API and BioMart. This year marks the tenth anniversary of Ensembl and in that time the project has grown with advances in genome technology. As of release 56 (September 2009), Ensembl supports 51 species including marmoset, pig, zebra finch, lizard, gorilla and wallaby, which were added in the past year. Major additions and improvements to Ensembl since our previous report include the incorporation of the human GRCh37 assembly, enhanced visualisation and data-mining options for the Ensembl regulatory features and continued development of our software infrastructure.


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Access to Information , Animals , Computational Biology/trends , Databases, Protein , Genetic Variation , Genomics/methods , Humans , Information Storage and Retrieval/methods , Internet , Protein Structure, Tertiary , Software , Species Specificity
7.
BMC Bioinformatics ; 9 Suppl 8: S3, 2008 Jul 22.
Article in English | MEDLINE | ID: mdl-18673527

ABSTRACT

BACKGROUND: The Distributed Annotation System (DAS) is a widely adopted protocol for dynamically integrating a wide range of biological data from geographically diverse sources. DAS continues to expand its applicability and evolve in response to new challenges facing integrative bioinformatics. RESULTS: Here we describe the various infrastructure components of DAS and present a new extended version of the DAS specification. Version 1.53E incorporates several recent developments, including its extension to serve new data types and an ontology for protein features. CONCLUSION: Our extensions to the DAS protocol have facilitated the integration of new data types, and our improvements to the existing DAS infrastructure have addressed recent challenges. The steadily increasing number of available data sources demonstrates further adoption of the DAS protocol.


Subject(s)
Database Management Systems , Databases, Genetic , Information Storage and Retrieval/methods , Computational Biology/methods , Systems Integration
8.
BMC Bioinformatics ; 8: 333, 2007 Sep 12.
Article in English | MEDLINE | ID: mdl-17850653

ABSTRACT

BACKGROUND: The Distributed Annotation System (DAS) is a network protocol for exchanging biological data. It is frequently used to share annotations of genomes and protein sequences. RESULTS: Here we present several extensions to the current DAS 1.5 protocol. These provide new commands to share alignments and three-dimensional molecular structure data, add the possibility of registering and discovering DAS servers, and provide a convention for serving different types of data plots. We present examples of web sites and applications that use the new extensions. We operate a public registry of DAS sources, which now includes entries for more than 250 distinct sources. CONCLUSION: Our DAS extensions are essential for the management of the growing number of services and exchange of diverse biological data sets. In addition, the extensions allow new types of applications to be developed and scientific questions to be addressed. The registry of DAS sources is available at http://www.dasregistry.org.


Subject(s)
Computational Biology/methods , Database Management Systems , Databases, Genetic , Information Storage and Retrieval/methods , Internet , Sequence Analysis/methods , User-Computer Interface , Algorithms , Chromosome Mapping/methods , Systems Integration
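
DAS is described above as a network protocol built around commands sent to a server. As a rough sketch of what a request looks like, the Python function below assembles a URL for the `features` command of the base DAS 1.5 protocol, which addresses a region as `segment=REF:start,stop`. The host `das.example.org`, the source name and the function name `das_features_url` are hypothetical; none of the extension commands from the abstract are shown.

```python
from urllib.parse import quote


def das_features_url(server: str, source: str, reference: str,
                     start: int, stop: int) -> str:
    """Build a DAS 'features' request URL for one segment.

    Layout assumed from the base DAS 1.5 protocol:
        <server>/das/<source>/features?segment=<reference>:<start>,<stop>
    Coordinates are 1-based and inclusive.
    """
    if start < 1 or stop < start:
        raise ValueError("expected 1 <= start <= stop")
    base = server.rstrip("/")
    return (f"{base}/das/{quote(source)}/features"
            f"?segment={quote(str(reference))}:{start},{stop}")
```

For example, `das_features_url("http://das.example.org", "my_source", "1", 1000, 2000)` yields a request for features on reference sequence `1` between positions 1000 and 2000. A DAS server answers such a request with an XML document, which a client would then parse.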
9.
Database (Oxford) ; 2017(1), 2017 01 01.
Article in English | MEDLINE | ID: mdl-28365736

ABSTRACT

The Ensembl software resources are a stable infrastructure to store, access and manipulate genome assemblies and their functional annotations. The Ensembl 'Core' database and Application Programming Interface (API) was our first major piece of software infrastructure and remains at the centre of all of our genome resources. Since its initial design more than fifteen years ago, the number of publicly available genomic, transcriptomic and proteomic datasets has grown enormously, accelerated by continuous advances in DNA-sequencing technology. Although the framework was initially intended to provide annotation for the reference human genome, we have extended it to support the genomes of all species as well as richer assembly models. Cross-referenced links to other informatics resources facilitate searching our database with a variety of popular identifiers such as UniProt and RefSeq. Our comprehensive and robust framework, which stores a large diversity of genome annotations in one location, serves as a platform for other groups to generate and maintain their own tailored annotation. We welcome reuse and contributions: our databases and APIs are publicly available, all of our source code is released with a permissive Apache v2.0 licence at http://github.com/Ensembl and we have an active developer mailing list (http://www.ensembl.org/info/about/contact/index.html). Database URL: http://www.ensembl.org.


Subject(s)
Databases, Nucleic Acid , Genome, Human , Molecular Sequence Annotation/methods , Sequence Analysis, DNA/methods , User-Computer Interface , Humans
10.
Eur J Hum Genet ; 25(11): 1253-1260, 2017 11.
Article in English | MEDLINE | ID: mdl-28832569

ABSTRACT

Here we describe the SweGen data set, a comprehensive map of genetic variation in the Swedish population. These data represent a basic resource for clinical genetics laboratories as well as for sequencing-based association studies by providing information on genetic variant frequencies in a cohort that is well matched to national patient cohorts. To select samples for this study, we first examined the genetic structure of the Swedish population using high-density SNP-array data from a nation-wide cohort of over 10 000 Swedish-born individuals included in the Swedish Twin Registry. A total of 1000 individuals, reflecting a cross-section of the population and capturing the main genetic structure, were selected for whole-genome sequencing. Analysis pipelines were developed for automated alignment, variant calling and quality control of the sequencing data. This resulted in a genome-wide collection of aggregated variant frequencies in the Swedish population that we have made available to the scientific community through the website https://swefreq.nbis.se. A total of 29.2 million single-nucleotide variants and 3.8 million indels were detected in the 1000 samples, with 9.9 million of these variants not present in current databases. Each sample contributed an average of 7199 individual-specific variants. In addition, an average of 8645 larger structural variants (SVs) were detected per individual, and we demonstrate that the population frequencies of these SVs can be used for efficient filtering analyses. Finally, our results show that the genetic diversity within Sweden is substantial compared with the diversity among continental European populations, underscoring the relevance of establishing a local reference data set.


Subject(s)
Genome, Human , Polymorphism, Single Nucleotide , Registries , Datasets as Topic , Genome-Wide Association Study , Humans , Sweden , Twins/genetics
11.
Article in English | MEDLINE | ID: mdl-27337980

ABSTRACT

The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail. Database URL: http://www.ensembl.org/index.html.


Subject(s)
Databases, Nucleic Acid , Databases, Protein , Internet , Molecular Sequence Annotation/methods , Animals , Humans , Mice
12.
Database (Oxford) ; 2011: bar030, 2011.
Article in English | MEDLINE | ID: mdl-21785142

ABSTRACT

For a number of years the BioMart data warehousing system has proven to be a valuable resource for scientists seeking a fast and versatile means of accessing the growing volume of genomic data provided by the Ensembl project. The launch of the Ensembl Genomes project in 2009 complemented the Ensembl project by utilizing the same visualization, interactive and programming tools to provide users with a means for accessing genome data from a further five domains: protists, bacteria, metazoa, plants and fungi. The Ensembl and Ensembl Genomes BioMarts provide a point of access to the high-quality gene annotation, variation data, functional and regulatory annotation and evolutionary relationships from genomes spanning the taxonomic space. This article aims to give a comprehensive overview of the Ensembl and Ensembl Genomes BioMarts as well as some useful examples and a description of current data content and future objectives. Database URLs: http://www.ensembl.org/biomart/martview/; http://metazoa.ensembl.org/biomart/martview/; http://plants.ensembl.org/biomart/martview/; http://protists.ensembl.org/biomart/martview/; http://fungi.ensembl.org/biomart/martview/; http://bacteria.ensembl.org/biomart/martview/.


Subject(s)
Classification/methods , Databases, Genetic , Information Storage and Retrieval/methods , Animals , Anopheles/genetics , Computational Biology , Genome/genetics , Humans , Open Reading Frames/genetics , Polymorphism, Single Nucleotide/genetics , Search Engine
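
The abstract above presents BioMart as a programmatic point of access to Ensembl data. One common way to use it is to send an XML query document to a BioMart web service, and the Python sketch below builds such a document. Several details are assumptions not stated in the abstract: the `martservice` endpoint (the abstract lists only `martview` URLs), the dataset, filter and attribute names in the usage note, and the helper names `build_query` and `build_request_url`.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed service endpoint; the abstract itself lists only the martview pages.
MARTSERVICE = "http://www.ensembl.org/biomart/martservice"


def build_query(dataset, attributes, filters=None):
    """Return a BioMart XML query string for one dataset.

    dataset    -- dataset name (e.g. a species gene dataset)
    attributes -- list of column names to return
    filters    -- optional dict mapping filter names to values
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",        # tab-separated output
        "header": "0",             # no header row
        "uniqueRows": "1",         # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    for name, value in (filters or {}).items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


def build_request_url(xml, service=MARTSERVICE):
    """Attach the XML query to the service URL as the 'query' parameter."""
    return service + "?" + urlencode({"query": xml})
```

As an illustration, `build_query("hsapiens_gene_ensembl", ["ensembl_gene_id", "external_gene_name"], {"chromosome_name": "21"})` describes a request for gene identifiers and names on human chromosome 21. Passing the result to `build_request_url` gives a URL that could be fetched with any HTTP client.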
13.
Bioinformatics ; 21(14): 3198-9, 2005 Jul 15.
Article in English | MEDLINE | ID: mdl-15905273

ABSTRACT

In this study, we present two freely available and complementary Distributed Annotation System (DAS) resources: a DAS reference server that provides up-to-date sequence and annotation from UniProt, with additional feature links and database cross-references from InterPro and a DAS client implemented using Java and Macromedia Flash that is optimized for the display of protein features.


Subject(s)
Models, Chemical , Models, Molecular , Proteins/chemistry , Sequence Analysis, Protein/methods , Software , User-Computer Interface , Algorithms , Computer Graphics , Computer Simulation , Database Management Systems , Internet , Proteins/analysis , Sequence Alignment/methods , Sequence Homology, Amino Acid , Systems Integration
14.
Genome Res ; 14(5): 925-8, 2004 May.
Article in English | MEDLINE | ID: mdl-15078858

ABSTRACT

Ensembl (http://www.ensembl.org/) is a bioinformatics project to organize biological information around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of individual genomes, and of the synteny and orthology relationships between them. It is also a framework for integration of any biological data that can be mapped onto features derived from the genomic sequence. Ensembl is available as an interactive Web site, a set of flat files, and as a complete, portable open source software system for handling genomes. All data are provided without restriction, and code is freely available. Ensembl's aims are to continue to "widen" this biological integration to include other model organisms relevant to understanding human biology as they become available; to "deepen" this integration to provide an ever more seamless linkage between equivalent components in different species; and to provide further classification of functional elements in the genome that have been previously elusive.


Subject(s)
Computational Biology/trends