1.
Nature ; 596(7873): 590-596, 2021 08.
Article in English | MEDLINE | ID: mdl-34293799

ABSTRACT

Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure (ref. 1). Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold2, at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective.


Subjects
Computational Biology/standards; Deep Learning/standards; Models, Molecular; Protein Conformation; Proteome/chemistry; Datasets as Topic/standards; Diacylglycerol O-Acyltransferase/chemistry; Glucose-6-Phosphatase/chemistry; Humans; Membrane Proteins/chemistry; Protein Folding; Reproducibility of Results
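
The coverage figures quoted in the abstract above (58% of residues with a confident prediction, 36% with very high confidence) are fractions of residues whose per-residue confidence exceeds a cut-off. A minimal sketch of that arithmetic follows. The cut-offs of 70 and 90 on a 0-100 scale are the commonly used pLDDT bands and are an assumption here, since the abstract does not state them; the function name and the sample scores are hypothetical.

```python
def confidence_coverage(scores, confident=70.0, very_high=90.0):
    """Return (fraction above `confident`, fraction above `very_high`).

    `scores` is a sequence of per-residue confidence values on a 0-100 scale.
    The default cut-offs are assumptions, not values taken from the abstract.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    total = len(scores)
    n_confident = sum(1 for s in scores if s > confident)
    n_very_high = sum(1 for s in scores if s > very_high)
    return n_confident / total, n_very_high / total


# Hypothetical per-residue scores for a ten-residue protein.
sample_scores = [95.2, 91.0, 88.4, 72.1, 65.0, 40.3, 93.7, 55.5, 71.0, 30.2]
coverage = confidence_coverage(sample_scores)
print(coverage)
```

Applied across every residue of every predicted protein, the same two fractions would give proteome-wide coverage figures of the kind the abstract reports.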
2.
Nucleic Acids Res ; 51(D1): D1503-D1511, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36440762

ABSTRACT

Public archiving in structural biology is well established with the Protein Data Bank (PDB; wwPDB.org) catering for atomic models and the Electron Microscopy Data Bank (EMDB; emdb-empiar.org) for 3D reconstructions from cryo-EM experiments. Even before the recent rapid growth in cryo-EM, there was an expressed community need for a public archive of image data from cryo-EM experiments for validation, software development, testing and training. Concomitantly, the proliferation of 3D imaging techniques for cells, tissues and organisms using volume EM (vEM) and X-ray tomography (XT) led to calls from these communities to publicly archive such data as well. EMPIAR (empiar.org) was developed as a public archive for raw cryo-EM image data and for 3D reconstructions from vEM and XT experiments and now comprises over a thousand entries totalling over 2 petabytes of data. EMPIAR resources include a deposition system, entry pages, facilities to search, visualize and download datasets, and a REST API for programmatic access to entry metadata. The success of EMPIAR also poses significant challenges for the future in dealing with the very fast growth in the volume of data and in enhancing its reusability.


Subjects
Databases, Factual; Microscopy, Electron; Software; Imaging, Three-Dimensional
3.
Nucleic Acids Res ; 51(D1): D9-D17, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36477213

ABSTRACT

The European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) is one of the world's leading sources of public biomolecular data. Based at the Wellcome Genome Campus in Hinxton, UK, EMBL-EBI is one of six sites of the European Molecular Biology Laboratory (EMBL), Europe's only intergovernmental life sciences organisation. This overview summarises the status of services that EMBL-EBI data resources provide to scientific communities globally. The scale, openness, rich metadata and extensive curation of EMBL-EBI added-value databases make them particularly well-suited as training sets for deep learning, machine learning and artificial intelligence applications, a selection of which are described here. The data resources at EMBL-EBI can catalyse such developments because they offer sustainable, high-quality data, collected in some cases over decades and made openly available to any researcher, globally. Our aim is for EMBL-EBI data resources to keep providing the foundations for tools and research insights that transform fields across the life sciences.


Subjects
Artificial Intelligence; Computational Biology; Data Management; Databases, Factual; Genome; Internet
4.
Nucleic Acids Res ; 50(D1): D439-D444, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34791371

ABSTRACT

The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unprecedented expansion of the structural coverage of the known protein-sequence space. AlphaFold DB provides programmatic access to and interactive visualization of predicted atomic coordinates, per-residue and pairwise model-confidence estimates and predicted aligned errors. The initial release of AlphaFold DB contains over 360,000 predicted structures across 21 model-organism proteomes, which will soon be expanded to cover most of the (over 100 million) representative sequences from the UniRef90 data set.


Subjects
Databases, Protein; Protein Folding; Proteins/chemistry; Software; Amino Acid Sequence; Animals; Bacteria/genetics; Bacteria/metabolism; Datasets as Topic; Dictyostelium/genetics; Dictyostelium/metabolism; Fungi/genetics; Fungi/metabolism; Humans; Internet; Models, Molecular; Plants/genetics; Plants/metabolism; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Proteins/genetics; Proteins/metabolism; Trypanosoma cruzi/genetics; Trypanosoma cruzi/metabolism
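
The abstract above mentions programmatic access to predicted coordinates and per-residue confidence estimates. As a rough sketch of what a client might do with a downloaded coordinate file, the code below reads the per-residue confidence from the B-factor column of fixed-column ATOM records. It assumes the file is in legacy PDB format with the confidence stored in the B-factor field, which is how AlphaFold DB coordinate files are usually described. No download is performed, and the helper `atom_record` and the sample values are made up for the demonstration.

```python
def per_residue_confidence(pdb_text):
    """Map (chain, residue number) -> B-factor value of that residue's CA atom.

    Assumes legacy PDB fixed columns: atom name 13-16, chain 22,
    residue number 23-26, B-factor 61-66 (1-based).
    """
    confidence = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":
            continue  # one value per residue: keep only the alpha carbon
        chain = line[21]
        res_num = int(line[22:26])
        confidence[(chain, res_num)] = float(line[60:66])
    return confidence


def atom_record(serial, atom, res_name, chain, res_num, b_factor):
    """Build a fixed-column ATOM record (hypothetical helper for the demo)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, atom, res_name, chain, res_num, 0.0, 0.0, 0.0, 1.0, b_factor)


# Made-up records: two residues, confidence 45.5 and 92.25.
sample_pdb = "\n".join([
    atom_record(1, "N", "MET", "A", 1, 45.5),
    atom_record(2, "CA", "MET", "A", 1, 45.5),
    atom_record(3, "CA", "GLY", "A", 2, 92.25),
])
confidence = per_residue_confidence(sample_pdb)
print(confidence)
```

Keeping only the CA atom gives one value per residue; for files in mmCIF format a dedicated parser would be needed instead of column slicing.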
5.
Histochem Cell Biol ; 160(3): 211-221, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37537341

ABSTRACT

Biological imaging is one of the primary tools by which we understand living systems across scales from atoms to organisms. Rapid advances in imaging technology have increased both the spatial and temporal resolutions at which we examine those systems, as well as enabling visualisation of larger tissue volumes. These advances have huge potential but also generate ever increasing amounts of imaging data that must be stored and analysed. Public image repositories provide a critical scientific service through open data provision, supporting reproducibility of scientific results, access to reference imaging datasets and reuse of data for new scientific discovery and acceleration of image analysis methods development. The scale and scope of imaging data provide both challenges and opportunities for open sharing of image data. In this article, we provide a perspective influenced by decades of provision of open data resources for biological information, suggesting areas to focus on and a path towards global interoperability.


Subjects
Image Processing, Computer-Assisted; Reproducibility of Results
6.
Nucleic Acids Res ; 48(D1): D335-D343, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31691821

ABSTRACT

The Protein Data Bank in Europe (PDBe), a founding member of the Worldwide Protein Data Bank (wwPDB), actively participates in the deposition, curation, validation, archiving and dissemination of macromolecular structure data. PDBe supports diverse research communities in their use of macromolecular structures by enriching the PDB data and by providing advanced tools and services for effective data access, visualization and analysis. This paper details the enrichment of data at PDBe, including mapping of RNA structures to Rfam, and identification of molecules that act as cofactors. PDBe has developed an advanced search facility with ∼100 data categories and sequence searches. New features have been included in the LiteMol viewer at PDBe, with updated visualization of carbohydrates and nucleic acids. Small molecules are now mapped more extensively to external databases and their visual representation has been enhanced. These advances help users to more easily find and interpret macromolecular structure data in order to solve scientific problems.


Subjects
Databases, Protein; Software; Cluster Analysis; Data Accuracy; Europe; Protein Conformation; User-Computer Interface
9.
Nucleic Acids Res ; 46(D1): D486-D492, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29126160

ABSTRACT

The Protein Data Bank in Europe (PDBe, pdbe.org) is actively engaged in the deposition, annotation, remediation, enrichment and dissemination of macromolecular structure data. This paper describes new developments and improvements at PDBe addressing three challenging areas: data enrichment, data dissemination and functional reusability. New features of the PDBe Web site are discussed, including a context-dependent menu providing links to raw experimental data and improved presentation of structures solved by hybrid methods. The paper also summarizes the features of the LiteMol suite, which is a set of services enabling fast and interactive 3D visualization of structures, with associated experimental maps, annotations and quality assessment information. We introduce a library of Web components which can be easily reused to port data and functionality available at PDBe to other services. We also introduce updates to the SIFTS resource, which maps PDB data to other bioinformatics resources, and the PDBe REST API.


Subjects
Computational Biology/methods; Databases, Protein; Proteins/chemistry; Sequence Analysis, Protein/methods; User-Computer Interface; Amino Acid Sequence; Computer Graphics; Databases as Topic; Europe; Humans; Information Dissemination; Internet; Models, Molecular; Molecular Sequence Annotation; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Proteins/genetics; Proteins/metabolism
11.
Nucleic Acids Res ; 44(D1): D396-403, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26578576

ABSTRACT

Three-dimensional Electron Microscopy (3DEM) has become a key experimental method in structural biology for a broad spectrum of biological specimens from molecules to cells. The EMDataBank project provides a unified portal for deposition, retrieval and analysis of 3DEM density maps, atomic models and associated metadata (emdatabank.org). We provide here an overview of the rapidly growing 3DEM structural data archives, which include maps in EM Data Bank and map-derived models in the Protein Data Bank. In addition, we describe progress and approaches toward development of validation protocols and methods, working with the scientific community, in order to create a validation pipeline for 3DEM data.


Subjects
Databases, Factual; Imaging, Three-Dimensional; Macromolecular Substances/chemistry; Microscopy, Electron; Databases, Protein; Models, Molecular; Proteins/chemistry
12.
Nucleic Acids Res ; 44(D1): D385-95, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26476444

ABSTRACT

The Protein Data Bank in Europe (http://pdbe.org) accepts and annotates depositions of macromolecular structure data in the PDB and EMDB archives and enriches, integrates and disseminates structural information in a variety of ways. The PDBe website has been redesigned based on an analysis of user requirements, and now offers intuitive access to improved and value-added macromolecular structure information. Unique value-added information includes lists of reviews and research articles that cite or mention PDB entries as well as access to figures and legends from full-text open-access publications that describe PDB entries. A powerful new query system not only shows all the PDB entries that match a given query, but also shows the 'best structures' for a given macromolecule, ligand complex or sequence family using data-quality information from the wwPDB validation reports. A PDBe RESTful API has been developed to provide unified access to macromolecular structure data available in the PDB and EMDB archives as well as value-added annotations, e.g. regarding structure quality and up-to-date cross-reference information from the SIFTS resource. Taken together, these new developments facilitate unified access to macromolecular structure data in an intuitive way for non-expert users and support expert users in analysing macromolecular structure data.


Subjects
Databases, Protein; Protein Conformation; Internet; Microscopy, Electron; Models, Molecular; User-Computer Interface
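
The abstract above describes a RESTful API giving unified access to structure data. A hedged sketch of how a client might call such an API follows. The base URL, the `pdb/entry/summary` path and the shape of the JSON payload (a dictionary keyed by lower-case PDB ID holding a list of records with a `title` field) are assumptions that should be checked against the current PDBe API documentation. The sample payload is invented, and `fetch_summary` is defined but not called, so the example runs without network access.

```python
import json
import urllib.request

PDBE_API_BASE = "https://www.ebi.ac.uk/pdbe/api"  # assumed base URL


def summary_url(pdb_id):
    """Build the (assumed) entry-summary URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().lower()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("expected a four-character PDB ID, got %r" % pdb_id)
    return f"{PDBE_API_BASE}/pdb/entry/summary/{pdb_id}"


def fetch_summary(pdb_id, timeout=30):
    """Download and decode the summary JSON (requires network access)."""
    with urllib.request.urlopen(summary_url(pdb_id), timeout=timeout) as response:
        return json.load(response)


def entry_title(payload, pdb_id):
    """Return the title from a summary payload, or None if the entry is absent."""
    records = payload.get(pdb_id.strip().lower(), [])
    return records[0].get("title") if records else None


# Invented payload in the assumed shape, used instead of a live request.
sample_payload = {"1cbs": [{"title": "EXAMPLE TITLE"}]}
print(summary_url("1CBS"))
print(entry_title(sample_payload, "1CBS"))
```

Separating URL construction and payload parsing from the network call keeps the two pure functions easy to check offline.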
13.
Nucleic Acids Res ; 43(Database issue): D382-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25348407

ABSTRACT

Genome3D (http://www.genome3d.eu) is a collaborative resource that provides predicted domain annotations and structural models for key sequences. Since introducing Genome3D in a previous NAR paper, we have substantially extended and improved the resource. We have annotated representatives from Pfam families to improve coverage of diverse sequences and added a fast sequence search to the website to allow users to find Genome3D-annotated sequences similar to their own. We have improved and extended the Genome3D data, enlarging the source data set from three model organisms to 10, and adding VIVACE, a resource new to Genome3D. We have analysed and updated Genome3D's SCOP/CATH mapping. Finally, we have improved the superposition tools, which now give users a more powerful interface for investigating similarities and differences between structural models.


Subjects
Databases, Protein; Molecular Sequence Annotation; Protein Structure, Tertiary; Algorithms; Genomics; Internet; Models, Molecular; Protein Structure, Tertiary/genetics; Sequence Analysis, Protein
14.
J Struct Biol ; 194(2): 164-70, 2016 May.
Article in English | MEDLINE | ID: mdl-26876163

ABSTRACT

We describe the functionality and design of the Volume slicer - a web-based slice viewer for EMDB entries. This tool uniquely provides the facility to view slices from 3D EM reconstructions along the three orthogonal axes and to rapidly switch between them and navigate through the volume. We have employed multiple rounds of user-experience testing with members of the EM community to ensure that the interface is easy and intuitive to use and the information provided is relevant. The impetus to develop the Volume slicer has been calls from the EM community to provide web-based interactive visualisation of 2D slice data. This would be useful for quick initial checks of the quality of a reconstruction. Again in response to calls from the community, we plan to further develop the Volume slicer into a fully-fledged Volume browser that provides integrated visualisation of EMDB and PDB entries from the molecular to the cellular scale.


Subjects
Image Processing, Computer-Assisted/statistics & numerical data; Imaging, Three-Dimensional/statistics & numerical data; Microscopy, Electron; Software; Databases, Protein; Humans; Internet
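
Viewing slices of a 3D reconstruction along the three orthogonal axes, as the abstract above describes, amounts to fixing one index of a three-dimensional array at a time. The sketch below illustrates the idea with nested Python lists. It is not the Volume slicer's implementation; the function name, the `volume[z][y][x]` indexing convention and the toy volume are all assumptions.

```python
def orthogonal_slices(volume, z, y, x):
    """Return the XY, XZ and YZ planes through voxel (z, y, x).

    `volume` is a nested list indexed as volume[z][y][x].
    """
    xy_plane = [row[:] for row in volume[z]]                    # fix z
    xz_plane = [plane[y][:] for plane in volume]                # fix y
    yz_plane = [[row[x] for row in plane] for plane in volume]  # fix x
    return xy_plane, xz_plane, yz_plane


# Toy 2 x 3 x 4 volume whose voxel value encodes its own coordinates
# (100*z + 10*y + x), so each slice is easy to check by eye.
volume = [[[100 * z + 10 * y + x for x in range(4)]
           for y in range(3)]
          for z in range(2)]
xy, xz, yz = orthogonal_slices(volume, z=1, y=2, x=3)
print(xy)
print(xz)
print(yz)
```

Switching viewing axis then means choosing which of the three planes to display, and navigating through the volume means changing the fixed index.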
15.
Nucleic Acids Res ; 42(Database issue): D285-91, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24288376

ABSTRACT

The Protein Data Bank in Europe (pdbe.org) is a founding member of the Worldwide PDB consortium (wwPDB; wwpdb.org) and as such is actively engaged in the deposition, annotation, remediation and dissemination of macromolecular structure data through the single global archive for such data, the PDB. Similarly, PDBe is a member of the EMDataBank organisation (emdatabank.org), which manages the EMDB archive for electron microscopy data. PDBe also develops tools that help the biomedical science community to make effective use of the data in the PDB and EMDB for their research. Here we describe new or improved services, including updated SIFTS mappings to other bioinformatics resources, a new browser for the PDB archive based on Gene Ontology (GO) annotation, updates to the analysis of Nuclear Magnetic Resonance-derived structures, redesigned search and browse interfaces, and new or updated visualisation and validation tools for EMDB entries.


Subjects
Databases, Protein; Protein Conformation; Computer Graphics; Europe; Gene Ontology; Internet; Nuclear Magnetic Resonance, Biomolecular; Sequence Analysis, Protein; Software
16.
Nat Methods ; 9(3): 245-53, 2012 Feb 28.
Article in English | MEDLINE | ID: mdl-22373911

ABSTRACT

Data-intensive research depends on tools that manage multidimensional, heterogeneous datasets. We built OME Remote Objects (OMERO), a software platform that enables access to and use of a wide range of biological data. OMERO uses a server-based middleware application to provide a unified interface for images, matrices and tables. OMERO's design and flexibility have enabled its use for light-microscopy, high-content-screening, electron-microscopy and even non-image-genotype data. OMERO is open-source software, available at http://openmicroscopy.org/.


Subjects
Database Management Systems; Databases, Factual; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Models, Biological; Software; User-Computer Interface; Animals; Biology/methods; Computer Simulation; Humans
17.
Nucleic Acids Res ; 41(Database issue): D483-9, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203869

ABSTRACT

The Structure Integration with Function, Taxonomy and Sequences resource (SIFTS; http://pdbe.org/sifts) is a close collaboration between the Protein Data Bank in Europe (PDBe) and UniProt. The two teams have developed a semi-automated process for maintaining up-to-date cross-reference information to UniProt entries, for all protein chains in the PDB entries present in the UniProt database. This process is carried out for every weekly PDB release and the information is stored in the SIFTS database. The SIFTS process includes cross-references to other biological resources such as Pfam, SCOP, CATH, GO, InterPro and the NCBI taxonomy database. The information is exported in XML format, one file for each PDB entry, and is made available by FTP. Many bioinformatics resources use SIFTS data to obtain cross-references between the PDB and other biological databases so as to provide their users with up-to-date information.


Subjects
Databases, Protein; Proteins/chemistry; Internet; Molecular Sequence Annotation; Protein Conformation; Proteins/classification; Proteins/physiology; Sequence Analysis, Protein; Systems Integration
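
According to the abstract above, SIFTS cross-references are exported as XML, one file per PDB entry. The sketch below shows how residue-level cross-references might be pulled from such a file with Python's standard library. The XML fragment is a simplified, invented stand-in: its element and attribute names (`residue`, `crossRefDb`, `dbSource`, `dbAccessionId`, `dbResNum`) are assumptions that should be verified against the actual SIFTS schema, and the identifiers in it are placeholders.

```python
import xml.etree.ElementTree as ET

# Simplified, invented fragment; not a real SIFTS file.
SAMPLE_XML = """<entry dbAccessionId="1abc">
  <residue dbResNum="10" dbResName="MET">
    <crossRefDb dbSource="UniProt" dbAccessionId="P00000" dbResNum="1"/>
    <crossRefDb dbSource="Pfam" dbAccessionId="PF00000"/>
  </residue>
  <residue dbResNum="11" dbResName="GLY">
    <crossRefDb dbSource="UniProt" dbAccessionId="P00000" dbResNum="2"/>
  </residue>
</entry>"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def uniprot_mapping(xml_text):
    """Map PDB residue number -> (UniProt accession, UniProt residue number)."""
    mapping = {}
    root = ET.fromstring(xml_text)
    for residue in root.iter():
        if local_name(residue.tag) != "residue":
            continue
        for ref in residue:
            if (local_name(ref.tag) == "crossRefDb"
                    and ref.get("dbSource") == "UniProt"):
                mapping[residue.get("dbResNum")] = (
                    ref.get("dbAccessionId"), ref.get("dbResNum"))
    return mapping


mapping = uniprot_mapping(SAMPLE_XML)
print(mapping)
```

Matching on the local tag name lets the same code work whether or not the real files declare an XML namespace; changing the `dbSource` filter would select cross-references to another resource.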
18.
Nucleic Acids Res ; 41(Database issue): D773-80, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23175605

ABSTRACT

The availability of comprehensive information about enzymes plays an important role in answering questions relevant to interdisciplinary fields such as biochemistry, enzymology, biofuels, bioengineering and drug discovery. At the EMBL European Bioinformatics Institute, we have developed an enzyme portal (http://www.ebi.ac.uk/enzymeportal) to provide this wealth of information on enzymes from multiple in-house resources addressing particular data classes: protein sequence and structure, reactions, pathways and small molecules. The fact that these data reside in separate databases makes information discovery cumbersome. The main goal of the portal is to simplify this process for end users.


Subjects
Databases, Protein; Enzymes/chemistry; Enzymes/metabolism; Disease; Enzymes/genetics; Internet; Protein Conformation; User-Computer Interface
19.
Nucleic Acids Res ; 41(Database issue): D499-507, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203986

ABSTRACT

Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).


Subjects
Databases, Protein; Protein Structure, Tertiary; Genomics; Humans; Internet; Molecular Sequence Annotation; Proteins/chemistry; Proteins/classification; Proteins/genetics; Software
20.
Acta Crystallogr D Biol Crystallogr ; 70(Pt 10): 2780, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25286863

ABSTRACT

The wwPDB responds to the article 'On the prompt update of literature references in the Protein Data Bank' [Wlodawer (2014), Acta Cryst. D70, 2779].


Subjects
Crystallography, X-Ray; Databases, Protein; Publications