1.
Nucleic Acids Res ; 48(D1): D376-D382, 2020 Jan 08.
Article in English | MEDLINE | ID: mdl-31724711

ABSTRACT

The Structural Classification of Proteins (SCOP) database is a classification of protein domains organised according to their evolutionary and structural relationships. We report a major effort to increase the coverage of structural data, aiming to provide classification of almost all domain superfamilies with representatives in the PDB. We have also improved the database schema, provided a new API and modernised the web interface. This is by far the most significant update in coverage since SCOP 1.75 and builds on the advances in schema from the SCOP 2 prototype. The database is accessible from http://scop.mrc-lmb.cam.ac.uk.


Subjects
Databases, Protein; Protein Domains; Proteins/chemistry; Evolution, Molecular; Internet; Proteins/metabolism; Software; User-Computer Interface
2.
Nucleic Acids Res ; 44(D1): D688-93, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26476449

ABSTRACT

PhytoPath (www.phytopathdb.org) is a resource for genomic and phenotypic data from plant pathogen species that integrates phenotypic data for genes from PHI-base, an expertly curated catalog of genes with experimentally verified pathogenicity, with the Ensembl tools for data visualization and analysis. The resource is focused on fungal, protist (oomycete) and bacterial plant pathogens whose genomes have been sequenced and annotated. Genes with associated PHI-base data can be easily identified across all plant pathogen species using a BioMart-based query tool and visualized in their genomic context on the Ensembl genome browser. The PhytoPath resource contains data for 135 genomic sequences from 87 plant pathogen species, and 1364 genes curated for their role in pathogenicity and as targets for chemical intervention. Support for community annotation of gene models is provided using the WebApollo online gene editor, and we are working with interested communities to improve reference annotation for selected species.


Subjects
Databases, Genetic; Genomics; Host-Pathogen Interactions/genetics; Plant Diseases/microbiology; Genes, Bacterial; Genes, Fungal; Genome, Bacterial; Genome, Fungal; Oomycetes/genetics; Phenotype; Sequence Alignment
3.
Nucleic Acids Res ; 44(D1): D574-80, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26578574

ABSTRACT

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces.


Subjects
Databases, Genetic; Genome, Bacterial; Genome, Fungal; Genome, Plant; Invertebrates/genetics; Animals; Diploidy; Eukaryota/genetics; Genetic Variation; Genome; Polyploidy; Sequence Alignment
4.
Nucleic Acids Res ; 43(Database issue): D123-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25352543

ABSTRACT

The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.


Subjects
Databases, Nucleic Acid; RNA, Untranslated/chemistry; Chromosome Mapping; Humans; Internet; RNA, Untranslated/genetics; Sequence Analysis, RNA
5.
Nucleic Acids Res ; 42(Database issue): D310-4, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24293656

ABSTRACT

We present a prototype of a new structural classification of proteins, SCOP2 (http://scop2.mrc-lmb.cam.ac.uk/), that we have developed recently. SCOP2 is a successor to the Structural Classification of Proteins (SCOP, http://scop.mrc-lmb.cam.ac.uk/scop/) database. Similarly to SCOP, the main focus of SCOP2 is to organize structurally characterized proteins according to their structural and evolutionary relationships. SCOP2 was designed to provide a more advanced framework for protein structure annotation and classification. It defines a new approach to the classification of proteins that is essentially different from SCOP, but retains its best features. The SCOP2 classification is described in terms of a directed acyclic graph in which nodes form a complex network of many-to-many relationships and are represented by a region of protein structure and sequence. The new classification project is expected to ensure new advances in the field and open new areas of research.


Subjects
Databases, Protein; Protein Structure, Tertiary; Data Mining; Internet; Proteins/classification
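
The SCOP2 abstract in record 5 describes the classification as a directed acyclic graph whose nodes take part in many-to-many relationships and are each represented by a region of protein structure and sequence. A minimal sketch of that data structure follows. It is an illustration only and is not the SCOP2 schema or API: the class name, node identifiers and region tuples are all hypothetical.

```python
from collections import defaultdict


class ClassificationGraph:
    """Toy directed acyclic graph with many-to-many parent/child links.

    Hypothetical illustration; not the actual SCOP2 schema.
    """

    def __init__(self):
        self.children = defaultdict(set)  # node id -> ids of child nodes
        self.parents = defaultdict(set)   # node id -> ids of parent nodes
        self.regions = {}                 # node id -> (chain, start, end)

    def add_node(self, node_id, region):
        """Register a node together with the region that represents it."""
        self.regions[node_id] = region

    def _reaches(self, start, target):
        """Return True if target can be reached from start via child links."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.children[node])
        return False

    def add_edge(self, parent, child):
        """Link parent to child, refusing any edge that would form a cycle."""
        if parent not in self.regions or child not in self.regions:
            raise KeyError("both nodes must be added before linking them")
        if self._reaches(child, parent):
            raise ValueError("edge would create a cycle")
        self.children[parent].add(child)
        self.parents[child].add(parent)

    def ancestors(self, node_id):
        """Collect every node reachable by following parent links upward."""
        found, stack = set(), list(self.parents[node_id])
        while stack:
            node = stack.pop()
            if node not in found:
                found.add(node)
                stack.extend(self.parents[node])
        return found


# Hypothetical nodes: one child with two parents, which a strict tree forbids.
graph = ClassificationGraph()
graph.add_node("fold_A", ("chain_X", 1, 120))
graph.add_node("superfamily_B", ("chain_X", 1, 120))
graph.add_node("superfamily_C", ("chain_X", 40, 120))
graph.add_node("family_D", ("chain_X", 40, 110))
graph.add_edge("fold_A", "superfamily_B")
graph.add_edge("fold_A", "superfamily_C")
graph.add_edge("superfamily_B", "family_D")
graph.add_edge("superfamily_C", "family_D")
```

Here `family_D` has two parents, which a tree-shaped hierarchy could not express, and `graph.ancestors("family_D")` walks both paths up to `fold_A`.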
6.
Nucleic Acids Res ; 42(Database issue): D546-52, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24163254

ABSTRACT

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.


Subjects
Databases, Genetic; Genome; Animals; Edible Grain/genetics; Genome, Bacterial; Genome, Fungal; Genome, Plant; Genomics; Internet; Molecular Sequence Annotation; Software
7.
Nucleic Acids Res ; 42(Database issue): D749-55, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24316576

ABSTRACT

Ensembl (http://www.ensembl.org) creates tools and data resources to facilitate genomic analysis in chordate species with an emphasis on human, major vertebrate model organisms and farm animals. Over the past year we have increased the number of species that we support to 77 and expanded our genome browser with a new scrollable overview and improved variation and phenotype views. We also report updates to our core datasets and improvements to our gene homology relationships from the addition of new species. Our REST service has been extended with additional support for comparative genomics and ontology information. Finally, we provide updated information about our methods for data access and resources for user training.


Subjects
Databases, Genetic; Genomics; Animals; Chordata/genetics; Genetic Variation; Humans; Internet; Mice; Molecular Sequence Annotation; Phenotype; Rats
8.
Nat Methods ; 9(5): 459-62, 2012 Apr 27.
Article in English | MEDLINE | ID: mdl-22543379

ABSTRACT

The 1000 Genomes Project was launched as one of the largest distributed data collection and analysis projects ever undertaken in biology. In addition to the primary scientific goals of creating both a deep catalog of human genetic variation and extensive methods to accurately discover and characterize variation using new sequencing technologies, the project makes all of its data publicly available. Members of the project data coordination center have developed and deployed several tools to enable widespread data access.


Subjects
Databases, Genetic; Genome, Human; Genomics/methods; Sequence Analysis, DNA/methods; Computational Biology/methods; Genetic Variation; Humans
9.
Nucleic Acids Res ; 41(Database issue): D48-55, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203987

ABSTRACT

The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidence-based gene sets for all supported species; large-scale whole-genome multiple-species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species; and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools and programmatic interfaces.


Subjects
Databases, Genetic; Genomics; Animals; Gene Expression Regulation; Genetic Variation; Humans; Internet; Mice; Molecular Sequence Annotation; Rats; Software; Zebrafish/genetics
10.
Nucleic Acids Res ; 40(Database issue): D91-7, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22067447

ABSTRACT

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.


Subjects
Databases, Genetic; Genomics; Animals; Genome; Genome, Bacterial; Genome, Fungal; Genome, Plant; Invertebrates/genetics; Molecular Sequence Annotation; Systems Integration
11.
Nucleic Acids Res ; 40(Database issue): D84-90, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22086963

ABSTRACT

The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org) with preliminary support. The past year has also seen improvements across the project.


Subjects
Databases, Genetic; Genomics; Animals; Gene Expression Regulation; Genetic Variation; Humans; Mice; Molecular Sequence Annotation; Rats
12.
Nucleic Acids Res ; 39(Database issue): D800-6, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21045057

ABSTRACT

The Ensembl project (http://www.ensembl.org) seeks to enable genomic science by providing high quality, integrated annotation on chordate and selected eukaryotic genomes within a consistent and accessible infrastructure. All supported species include comprehensive, evidence-based gene annotations and a selected set of genomes includes additional data focused on variation, comparative, evolutionary, functional and regulatory annotation. The most advanced resources are provided for key species including human, mouse, rat and zebrafish reflecting the popularity and importance of these species in biomedical research. As of Ensembl release 59 (August 2010), 56 species are supported of which 5 have been added in the past year. Since our previous report, we have substantially improved the presentation and integration of both data of disease relevance and the regulatory state of different cell types.


Subjects
Databases, Genetic; Genomics; Animals; Genetic Variation; Humans; Mice; Molecular Sequence Annotation; Rats; Regulatory Sequences, Nucleic Acid; Software; Zebrafish/genetics
13.
Nucleic Acids Res ; 38(Database issue): D557-62, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19906699

ABSTRACT

Ensembl (http://www.ensembl.org) integrates genomic information for a comprehensive set of chordate genomes with a particular focus on resources for human, mouse, rat, zebrafish and other high-value sequenced genomes. We provide complete gene annotations for all supported species in addition to specific resources that target genome variation, function and evolution. Ensembl data is accessible in a variety of formats including via our genome browser, API and BioMart. This year marks the tenth anniversary of Ensembl and in that time the project has grown with advances in genome technology. As of release 56 (September 2009), Ensembl supports 51 species including marmoset, pig, zebra finch, lizard, gorilla and wallaby, which were added in the past year. Major additions and improvements to Ensembl since our previous report include the incorporation of the human GRCh37 assembly, enhanced visualisation and data-mining options for the Ensembl regulatory features and continued development of our software infrastructure.


Subjects
Computational Biology/methods; Databases, Genetic; Databases, Nucleic Acid; Access to Information; Animals; Computational Biology/trends; Databases, Protein; Genetic Variation; Genomics/methods; Humans; Information Storage and Retrieval/methods; Internet; Protein Structure, Tertiary; Software; Species Specificity
14.
BMC Genomics ; 11: 293, 2010 May 11.
Article in English | MEDLINE | ID: mdl-20459805

ABSTRACT

BACKGROUND: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. DESCRIPTION: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. CONCLUSIONS: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org.


Subjects
Databases, Genetic; Genetic Variation; Genomics/methods; Algorithms; Animals; Base Sequence; Cattle; Genotype; Humans; Internet; Linkage Disequilibrium; Mice; Phenotype; Phylogeny; Polymorphism, Single Nucleotide; Rats; Sequence Analysis, DNA; User-Computer Interface
15.
BMC Bioinformatics ; 9 Suppl 8: S3, 2008 Jul 22.
Article in English | MEDLINE | ID: mdl-18673527

ABSTRACT

BACKGROUND: The Distributed Annotation System (DAS) is a widely adopted protocol for dynamically integrating a wide range of biological data from geographically diverse sources. DAS continues to expand its applicability and evolve in response to new challenges facing integrative bioinformatics. RESULTS: Here we describe the various infrastructure components of DAS and present a new extended version of the DAS specification. Version 1.53E incorporates several recent developments, including its extension to serve new data types and an ontology for protein features. CONCLUSION: Our extensions to the DAS protocol have facilitated the integration of new data types, and our improvements to the existing DAS infrastructure have addressed recent challenges. The steadily increasing number of available data sources demonstrates further adoption of the DAS protocol.


Subjects
Database Management Systems; Databases, Genetic; Information Storage and Retrieval/methods; Computational Biology/methods; Systems Integration
16.
Bioinformatics ; 23(12): 1568-70, 2007 Jun 15.
Article in English | MEDLINE | ID: mdl-17237073

ABSTRACT

SUMMARY: The increasing size and complexity of biological databases have led to a growing trend to federate rather than duplicate them. In order to share data between federated databases, protocols for the exchange mechanism must be developed. One such data exchange protocol that is widely used is the Distributed Annotation System (DAS). For example, DAS has enabled small experimental groups to integrate their data into the Ensembl genome browser. We have developed ProServer, a simple, lightweight, Perl-based DAS server that does not depend on a separate HTTP server. The ProServer package is easily extensible, allowing data to be served from almost any underlying data model. Recent additions to the DAS protocol have enabled both structure and alignment (sequence and structural) data to be exchanged. ProServer allows both of these data types to be served. AVAILABILITY: ProServer can be downloaded from http://www.sanger.ac.uk/proserver/ or CPAN http://search.cpan.org/~rpettett/. Details on the system requirements and installation of ProServer can be found at http://www.sanger.ac.uk/proserver/.


Subjects
Computational Biology/methods; Computer Communication Networks; Databases, Genetic; Genome, Human; Humans; Internet; Programming Languages; Sequence Analysis, Protein; Software; Structure-Activity Relationship
17.
BMC Bioinformatics ; 8: 333, 2007 Sep 12.
Article in English | MEDLINE | ID: mdl-17850653

ABSTRACT

BACKGROUND: The Distributed Annotation System (DAS) is a network protocol for exchanging biological data. It is frequently used to share annotations of genomes and protein sequences. RESULTS: Here we present several extensions to the current DAS 1.5 protocol. These provide new commands to share alignments and three-dimensional molecular structure data, add the possibility of registering and discovering DAS servers, and define a convention for providing different types of data plots. We present examples of web sites and applications that use the new extensions. We operate a public registry of DAS sources, which now includes entries for more than 250 distinct sources. CONCLUSION: Our DAS extensions are essential for the management of the growing number of services and exchange of diverse biological data sets. In addition, the extensions allow new types of applications to be developed and scientific questions to be addressed. The registry of DAS sources is available at http://www.dasregistry.org.


Subjects
Computational Biology/methods; Database Management Systems; Databases, Genetic; Information Storage and Retrieval/methods; Internet; Sequence Analysis/methods; User-Computer Interface; Algorithms; Chromosome Mapping/methods; Systems Integration
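
The DAS abstracts in records 15 to 17 describe a network protocol in which clients send commands to annotation servers. As a rough illustration, the sketch below builds a request URL for annotations on a genomic segment. It assumes the `features` command and `segment=reference:start,stop` query form of the DAS 1.5 specification; the server host, source name and helper function are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def das_features_url(server, source, reference, start, stop):
    """Build a DAS-style 'features' request URL for one segment.

    Assumes the DAS 1.5 layout <server>/das/<source>/features?segment=...
    The function name and arguments are illustrative only.
    """
    if start < 1 or stop < start:
        raise ValueError("expected 1 <= start <= stop")
    segment = f"{reference}:{start},{stop}"
    # Keep ':' and ',' unescaped so the segment stays readable in the URL.
    query = urlencode({"segment": segment}, safe=":,")
    return f"{server.rstrip('/')}/das/{source}/features?{query}"


# Hypothetical server and data source names.
url = das_features_url("http://das.example.org", "example_source", "1", 1000, 2000)
print(url)
```

A client would then issue an HTTP GET for this URL and parse the XML response; that step is omitted because it depends on a live server.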
18.
Article in English | MEDLINE | ID: mdl-26896847

ABSTRACT

Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org.


Subjects
Computational Biology/methods; Genome; Genomics; Algorithms; Animals; DNA, Complementary/genetics; Databases, Genetic; Evolution, Molecular; Expressed Sequence Tags; Humans; Phylogeny; Quality Control; RNA, Untranslated/genetics; Sequence Alignment; Sequence Analysis, RNA; Software
19.
Curr Protoc Bioinformatics ; 49: 1.26.1-1.26.21, 2015 Mar 09.
Article in English | MEDLINE | ID: mdl-25754991

ABSTRACT

SCOP2 is a successor to the Structural Classification of Proteins (SCOP) database that organizes proteins of known structure according to their structural and evolutionary relationships. It was designed to provide a more advanced framework for the classification of proteins. The SCOP2 classification is described in terms of a directed acyclic graph in which each node defines a relationship of particular type that is represented by a region of protein structure and sequence. The SCOP2 data are accessible via SCOP2-Browser and SCOP2-Graph. This protocol unit describes different ways to explore and investigate the SCOP2 evolutionary and structural groupings.


Subjects
Databases, Protein; Evolution, Molecular; Proteins/chemistry; Amino Acid Sequence; Internet; Protein Structure, Secondary; Protein Structure, Tertiary