Search | BVS Integralidade em Saúde

1.

PDBImages: a command-line tool for automated macromolecular structure visualization.

Midlik, Adam; Nair, Sreenath; Anyango, Stephen; Deshpande, Mandar; Sehnal, David; Varadi, Mihaly; Velankar, Sameer.

Bioinformatics ; 39(12), 2023 12 01.

Article in English | MEDLINE | ID: mdl-38085238

ABSTRACT

SUMMARY: PDBImages is an innovative, open-source Node.js package that harnesses the power of the popular macromolecule structure visualization software Mol*. Designed for use by the scientific community, PDBImages provides a means to generate high-quality images for PDB and AlphaFold DB models. Its unique ability to render and save images directly to files in a browserless mode sets it apart, offering users a streamlined, automated process for macromolecular structure visualization. Here, we detail the implementation of PDBImages, enumerating its diverse image types, and elaborating on its user-friendly setup. This powerful tool opens a new gateway for researchers to visualize, analyse, and share their work, fostering a deeper understanding of bioinformatics. AVAILABILITY AND IMPLEMENTATION: PDBImages is available as an npm package from https://www.npmjs.com/package/pdb-images. The source code is available from https://github.com/PDBeurope/pdb-images.

Subjects

Computational Biology; Software; Molecular Structure; Computational Biology/methods

2.

AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models.

Varadi, Mihaly; Anyango, Stephen; Deshpande, Mandar; Nair, Sreenath; Natassia, Cindy; Yordanova, Galabina; Yuan, David; Stroe, Oana; Wood, Gemma; Laydon, Agata; Zídek, Augustin; Green, Tim; Tunyasuvunakool, Kathryn; Petersen, Stig; Jumper, John; Clancy, Ellen; Green, Richard; Vora, Ankur; Lutfi, Mira; Figurnov, Michael; Cowie, Andrew; Hobbs, Nicole; Kohli, Pushmeet; Kleywegt, Gerard; Birney, Ewan; Hassabis, Demis; Velankar, Sameer.

Nucleic Acids Res ; 50(D1): D439-D444, 2022 01 07.

Article in English | MEDLINE | ID: mdl-34791371

ABSTRACT

The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unprecedented expansion of the structural coverage of the known protein-sequence space. AlphaFold DB provides programmatic access to and interactive visualization of predicted atomic coordinates, per-residue and pairwise model-confidence estimates and predicted aligned errors. The initial release of AlphaFold DB contains over 360,000 predicted structures across 21 model-organism proteomes, which will soon be expanded to cover most of the (over 100 million) representative sequences from the UniRef90 data set.

Subjects

Databases, Protein; Protein Folding; Proteins/chemistry; Software; Amino Acid Sequence; Animals; Bacteria/genetics; Bacteria/metabolism; Datasets as Topic; Dictyostelium/genetics; Dictyostelium/metabolism; Fungi/genetics; Fungi/metabolism; Humans; Internet; Models, Molecular; Plants/genetics; Plants/metabolism; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Proteins/genetics; Proteins/metabolism; Trypanosoma cruzi/genetics; Trypanosoma cruzi/metabolism

3.

PDBe aggregated API: programmatic access to an integrative knowledge graph of molecular structure data.

Nair, Sreenath; Váradi, Mihály; Nadzirin, Nurul; Pravda, Lukás; Anyango, Stephen; Mir, Saqib; Berrisford, John; Armstrong, David; Gutmanas, Aleksandras; Velankar, Sameer.

Bioinformatics ; 37(21): 3950-3952, 2021 11 05.

Article in English | MEDLINE | ID: mdl-34081107

ABSTRACT

SUMMARY: The PDBe aggregated API is an open-access and open-source RESTful API that provides programmatic access to a wealth of macromolecular structural data and their functional and biophysical annotations through 80+ API endpoints. The API is powered by the PDBe graph database (https://pdbe.org/graph-schema), an open-access integrative knowledge graph that can be used as a discovery tool to answer complex biological questions. AVAILABILITY AND IMPLEMENTATION: The PDBe aggregated API provides up-to-date access to the PDBe graph database, which has weekly releases with the latest data from the Protein Data Bank, integrated with updated annotations from UniProt, Pfam, CATH, SCOP and the PDBe-KB partner resources. The complete list of all the available API endpoints and their descriptions are available at https://pdbe.org/graph-api. The source code of the Python 3.6+ API application is publicly available at https://gitlab.ebi.ac.uk/pdbe-kb/services/pdbe-graph-api. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Pattern Recognition, Automated; Software; Molecular Structure; Databases, Protein; Protein Conformation

4.

PDBe: improved findability of macromolecular structure data in the PDB.

Armstrong, David R; Berrisford, John M; Conroy, Matthew J; Gutmanas, Aleksandras; Anyango, Stephen; Choudhary, Preeti; Clark, Alice R; Dana, Jose M; Deshpande, Mandar; Dunlop, Roisin; Gane, Paul; Gáborová, Romana; Gupta, Deepti; Haslam, Pauline; Koca, Jaroslav; Mak, Lora; Mir, Saqib; Mukhopadhyay, Abhik; Nadzirin, Nurul; Nair, Sreenath; Paysan-Lafosse, Typhaine; Pravda, Lukas; Sehnal, David; Salih, Osman; Smart, Oliver; Tolchard, James; Varadi, Mihaly; Svobodova-Vareková, Radka; Zaki, Hossam; Kleywegt, Gerard J; Velankar, Sameer.

Nucleic Acids Res ; 48(D1): D335-D343, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31691821

ABSTRACT

The Protein Data Bank in Europe (PDBe), a founding member of the Worldwide Protein Data Bank (wwPDB), actively participates in the deposition, curation, validation, archiving and dissemination of macromolecular structure data. PDBe supports diverse research communities in their use of macromolecular structures by enriching the PDB data and by providing advanced tools and services for effective data access, visualization and analysis. This paper details the enrichment of data at PDBe, including mapping of RNA structures to Rfam, and identification of molecules that act as cofactors. PDBe has developed an advanced search facility with ~100 data categories and sequence searches. New features have been included in the LiteMol viewer at PDBe, with updated visualization of carbohydrates and nucleic acids. Small molecules are now mapped more extensively to external databases and their visual representation has been enhanced. These advances help users to more easily find and interpret macromolecular structure data in order to solve scientific problems.

Subjects

Databases, Protein; Software; Cluster Analysis; Data Accuracy; Europe; Protein Conformation; User-Computer Interface

5.

PDBeCIF: an open-source mmCIF/CIF parsing and processing package.

van Ginkel, Glen; Pravda, Lukás; Dana, José M; Varadi, Mihaly; Keller, Peter; Anyango, Stephen; Velankar, Sameer.

BMC Bioinformatics ; 22(1): 383, 2021 Jul 23.

Article in English | MEDLINE | ID: mdl-34301175

ABSTRACT

BACKGROUND: Biomacromolecular structural data outgrew the legacy Protein Data Bank (PDB) format which the scientific community relied on for decades, yet the use of its successor PDBx/Macromolecular Crystallographic Information File format (PDBx/mmCIF) is still not widespread. Perhaps one of the reasons is the availability of easy to use tools that only support the legacy format, but also the inherent difficulties of processing mmCIF files correctly, given the number of edge cases that make efficient parsing problematic. Nevertheless, to fully exploit macromolecular structure data and their associated annotations such as multiscale structures from integrative/hybrid methods or large macromolecular complexes determined using traditional methods, it is necessary to fully adopt the new format as soon as possible. RESULTS: To this end, we developed PDBeCIF, an open-source Python project for manipulating mmCIF and CIF files. It is part of the official list of mmCIF parsers recorded by the wwPDB and is heavily employed in the processes of the Protein Data Bank in Europe. The package is freely available both from the PyPI repository ( http://pypi.org/project/pdbecif ) and from GitHub ( https://github.com/pdbeurope/pdbecif ) along with rich documentation and many ready-to-use examples. CONCLUSIONS: PDBeCIF is an efficient and lightweight Python 2.6+/3+ package with no external dependencies. It can be readily integrated with 3rd party libraries as well as adopted for broad scientific analyses.

Subjects

Software; Databases, Protein; Europe; Macromolecular Substances; Molecular Structure

6.

PDBe: towards reusable data delivery infrastructure at protein data bank in Europe.

Mir, Saqib; Alhroub, Younes; Anyango, Stephen; Armstrong, David R; Berrisford, John M; Clark, Alice R; Conroy, Matthew J; Dana, Jose M; Deshpande, Mandar; Gupta, Deepti; Gutmanas, Aleksandras; Haslam, Pauline; Mak, Lora; Mukhopadhyay, Abhik; Nadzirin, Nurul; Paysan-Lafosse, Typhaine; Sehnal, David; Sen, Sanchayita; Smart, Oliver S; Varadi, Mihaly; Kleywegt, Gerard J; Velankar, Sameer.

Nucleic Acids Res ; 46(D1): D486-D492, 2018 01 04.

Article in English | MEDLINE | ID: mdl-29126160

ABSTRACT

The Protein Data Bank in Europe (PDBe, pdbe.org) is actively engaged in the deposition, annotation, remediation, enrichment and dissemination of macromolecular structure data. This paper describes new developments and improvements at PDBe addressing three challenging areas: data enrichment, data dissemination and functional reusability. New features of the PDBe Web site are discussed, including a context dependent menu providing links to raw experimental data and improved presentation of structures solved by hybrid methods. The paper also summarizes the features of the LiteMol suite, which is a set of services enabling fast and interactive 3D visualization of structures, with associated experimental maps, annotations and quality assessment information. We introduce a library of Web components which can be easily reused to port data and functionality available at PDBe to other services. We also introduce updates to the SIFTS resource which maps PDB data to other bioinformatics resources, and the PDBe REST API.

Subjects

Computational Biology/methods; Databases, Protein; Proteins/chemistry; Sequence Analysis, Protein/methods; User-Computer Interface; Amino Acid Sequence; Computer Graphics; Databases as Topic; Europe; Humans; Information Dissemination; Internet; Models, Molecular; Molecular Sequence Annotation; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Proteins/genetics; Proteins/metabolism

7.

Identifying protein conformational states in the Protein Data Bank: Toward unlocking the potential of integrative dynamics studies.

Ellaway, Joseph I J; Anyango, Stephen; Nair, Sreenath; Zaki, Hossam A; Nadzirin, Nurul; Powell, Harold R; Gutmanas, Aleksandras; Varadi, Mihaly; Velankar, Sameer.

Struct Dyn ; 11(3): 034701, 2024 May.

Article in English | MEDLINE | ID: mdl-38774441

ABSTRACT

Studying protein dynamics and conformational heterogeneity is crucial for understanding biomolecular systems and treating disease. Despite the deposition of over 215 000 macromolecular structures in the Protein Data Bank and the advent of AI-based structure prediction tools such as AlphaFold2, RoseTTAFold, and ESMFold, static representations are typically produced, which fail to fully capture macromolecular motion. Here, we discuss the importance of integrating experimental structures with computational clustering to explore the conformational landscapes that manifest protein function. We describe the method developed by the Protein Data Bank in Europe - Knowledge Base to identify distinct conformational states, demonstrate the resource's primary use cases, through examples, and discuss the need for further efforts to annotate protein conformations with functional information. Such initiatives will be crucial in unlocking the potential of protein dynamics data, expediting drug discovery research, and deepening our understanding of macromolecular mechanisms.

8.

Fine particulate matter air pollution and health implications for Nairobi, Kenya.

Oguge, Otienoh; Nyamondo, Joshua; Adera, Noah; Okolla, Lydia; Okoth, Beldine; Anyango, Stephen; Afulo, Augustine; Kumie, Abera; Samet, Jonathan; Berhane, Kiros.

Environ Epidemiol ; 8(3): e307, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38799266

ABSTRACT

Background: Continuous ambient air quality monitoring in Kenya has been limited, resulting in a sparse database on the health impacts of air pollution for the country. We have operated a centrally located monitor in Nairobi for measuring fine particulate matter (PM2.5), the pollutant that has demonstrated impact on health. Here, we describe the temporal levels and trends in PM2.5 data for Nairobi and evaluate associated health implications. Methods: We used a centrally located reference sensor, the beta attenuation monitor (BAM-1022), to measure hourly PM2.5 concentrations over a 3-year period (21 August 2019 to 20 August 2022). We used, at minimum, 75% of the daily hourly concentration to represent the 24-hour concentrations for a given calendar day. To estimate the deaths attributable to air pollution, we used the World Health Organization (WHO) AirQ+ tool with input as PM2.5 concentration data, local mortality statistics, and population sizes. Results: The daily (24-hour) mean (±SEM) PM2.5 concentration was 19.2 ± 0.6 (µg/m3). Pollutant levels were lowest at 03:00 and peaked at 20:00. Sundays had the lowest daily concentrations, which increased on Mondays and remained high through Saturdays. By season, the pollutant concentrations were lowest in April and highest in August. The mean annual concentration was 18.4 ± 7.1 (µg/m3), which was estimated to lead to between 400 and 1,400 premature deaths of the city's population in 2021, hence contributing 5%-8% of the 17,432 adult deaths excluding accidents when referenced to WHO recommended 2021 air quality guideline for annual thresholds of 5 µg/m3. Conclusion: Fine particulate matter air pollution in Nairobi showed daily, day-of-week, and seasonal fluctuations consistent with the anthropogenic source mix, particularly from motor vehicles. The long-term population exposure to PM2.5 was 3.7 times higher than the WHO annual guideline of 5 µg/m3 and estimated to lead to a substantial burden of attributable deaths. An updated regulation targeting measures to reduce vehicular emissions is recommended.
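
The 75% completeness rule and the guideline comparison in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the use of `None` to mark a missing hourly reading, and the reading of "75%" as at least 18 of 24 valid hours are all assumptions made for illustration; only the 75% threshold, the 18.4 µg/m3 annual mean, and the 5 µg/m3 WHO guideline come from the abstract.

```python
from statistics import mean
from typing import Optional, Sequence

MIN_COMPLETENESS = 0.75      # from the abstract: at least 75% of hourly values
HOURS_PER_DAY = 24
WHO_ANNUAL_GUIDELINE = 5.0   # µg/m3, as quoted in the abstract


def daily_mean(hourly: Sequence[Optional[float]]) -> Optional[float]:
    """Return the 24-hour mean PM2.5, or None if the day is too incomplete.

    `hourly` holds the readings for one calendar day; None marks a
    missing hour (a hypothetical encoding, not taken from the paper).
    """
    valid = [v for v in hourly if v is not None]
    # Assumed reading of the rule: fewer than 18 valid hours discards the day.
    if len(valid) < MIN_COMPLETENESS * HOURS_PER_DAY:
        return None
    return mean(valid)


def guideline_ratio(annual_mean: float) -> float:
    """Ratio of an annual mean concentration to the WHO annual guideline."""
    return annual_mean / WHO_ANNUAL_GUIDELINE


if __name__ == "__main__":
    complete_day = [20.0] * 18 + [None] * 6   # 18 valid hours: accepted
    sparse_day = [20.0] * 17 + [None] * 7     # 17 valid hours: rejected
    print(daily_mean(complete_day))
    print(daily_mean(sparse_day))
    print(round(guideline_ratio(18.4), 1))
```

Under these assumptions, the reported annual mean of 18.4 µg/m3 divided by the 5 µg/m3 guideline gives 3.68, which rounds to the 3.7-fold exceedance stated in the conclusion.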

9.

Unified access to up-to-date residue-level annotations from UniProtKB and other biological databases for PDB data.

Choudhary, Preeti; Anyango, Stephen; Berrisford, John; Tolchard, James; Varadi, Mihaly; Velankar, Sameer.

Sci Data ; 10(1): 204, 2023 04 12.

Article in English | MEDLINE | ID: mdl-37045837

ABSTRACT

More than 61,000 proteins have up-to-date correspondence between their amino acid sequence (UniProtKB) and their 3D structures (PDB), enabled by the Structure Integration with Function, Taxonomy and Sequences (SIFTS) resource. SIFTS incorporates residue-level annotations from many other biological resources. SIFTS data is available in various formats like XML, CSV and TSV format or also accessible via the PDBe REST API but always maintained separately from the structure data (PDBx/mmCIF file) in the PDB archive. Here, we extended the wwPDB PDBx/mmCIF data dictionary with additional categories to accommodate SIFTS data and added the UniProtKB, Pfam, SCOP2, and CATH residue-level annotations directly into the PDBx/mmCIF files from the PDB archive. With the integrated UniProtKB annotations, these files now provide consistent numbering of residues in different PDB entries allowing easy comparison of structure models. The extended dictionary yields a more consistent, standardised metadata description without altering the core PDB information. This development enables up-to-date cross-reference information at the residue level resulting in better data interoperability, supporting improved data analysis and visualisation.

10.

PDBe CCDUtils: an RDKit-based toolkit for handling and analysing small molecules in the Protein Data Bank.

Kunnakkattu, Ibrahim Roshan; Choudhary, Preeti; Pravda, Lukas; Nadzirin, Nurul; Smart, Oliver S; Yuan, Qi; Anyango, Stephen; Nair, Sreenath; Varadi, Mihaly; Velankar, Sameer.

J Cheminform ; 15(1): 117, 2023 Dec 02.

Article in English | MEDLINE | ID: mdl-38042830

ABSTRACT

While the Protein Data Bank (PDB) contains a wealth of structural information on ligands bound to macromolecules, their analysis can be challenging due to the large amount and diversity of data. Here, we present PDBe CCDUtils, a versatile toolkit for processing and analysing small molecules from the PDB in PDBx/mmCIF format. PDBe CCDUtils provides streamlined access to all the metadata for small molecules in the PDB and offers a set of convenient methods to compute various properties using RDKit, such as 2D depictions, 3D conformers, physicochemical properties, scaffolds, common fragments, and cross-references to small molecule databases using UniChem. The toolkit also provides methods for identifying all the covalently attached chemical components in a macromolecular structure and calculating similarity among small molecules. By providing a broad range of functionality, PDBe CCDUtils caters to the needs of researchers in cheminformatics, structural biology, bioinformatics and computational chemistry.

11.

Annotating Macromolecular Complexes in the Protein Data Bank: Improving the FAIRness of Structure Data.

Appasamy, Sri Devan; Berrisford, John; Gaborova, Romana; Nair, Sreenath; Anyango, Stephen; Grudinin, Sergei; Deshpande, Mandar; Armstrong, David; Pidruchna, Ivanna; Ellaway, Joseph I J; Leines, Grisell Díaz; Gupta, Deepti; Harrus, Deborah; Varadi, Mihaly; Velankar, Sameer.

Sci Data ; 10(1): 853, 2023 12 01.

Article in English | MEDLINE | ID: mdl-38040737

ABSTRACT

Macromolecular complexes are essential functional units in nearly all cellular processes, and their atomic-level understanding is critical for elucidating and modulating molecular mechanisms. The Protein Data Bank (PDB) serves as the global repository for experimentally determined structures of macromolecules. Structural data in the PDB offer valuable insights into the dynamics, conformation, and functional states of biological assemblies. However, the current annotation practices lack standardised naming conventions for assemblies in the PDB, complicating the identification of instances representing the same assembly. In this study, we introduce a method leveraging resources external to PDB, such as the Complex Portal, UniProt and Gene Ontology, to describe assemblies and contextualise them within their biological settings accurately. Employing the proposed approach, we assigned standard names to over 90% of unique assemblies in the PDB and provided persistent identifiers for each assembly. This standardisation of assembly data enhances the PDB, facilitating a deeper understanding of macromolecular complexes. Furthermore, the data standardisation improves the PDB's FAIR attributes, fostering more effective basic and translational research and scientific education.

Subjects

Translational Research, Biomedical; Molecular Conformation; Databases, Protein; Macromolecular Substances; Protein Conformation

12.

PDBe and PDBe-KB: Providing high-quality, up-to-date and integrated resources of macromolecular structures to support basic and applied research and education.

Varadi, Mihaly; Anyango, Stephen; Appasamy, Sri Devan; Armstrong, David; Bage, Marcus; Berrisford, John; Choudhary, Preeti; Bertoni, Damian; Deshpande, Mandar; Leines, Grisell Diaz; Ellaway, Joseph; Evans, Genevieve; Gaborova, Romana; Gupta, Deepti; Gutmanas, Aleksandras; Harrus, Deborah; Kleywegt, Gerard J; Bueno, Weslley Morellato; Nadzirin, Nurul; Nair, Sreenath; Pravda, Lukas; Afonso, Marcelo Querino Lima; Sehnal, David; Tanweer, Ahsan; Tolchard, James; Abrams, Charlotte; Dunlop, Roisin; Velankar, Sameer.

Protein Sci ; 31(10): e4439, 2022 10.

Article in English | MEDLINE | ID: mdl-36173162

ABSTRACT

The archiving and dissemination of protein and nucleic acid structures as well as their structural, functional and biophysical annotations is an essential task that enables the broader scientific community to conduct impactful research in multiple fields of the life sciences. The Protein Data Bank in Europe (PDBe; pdbe.org) team develops and maintains several databases and web services to address this fundamental need. From data archiving as a member of the Worldwide PDB consortium (wwPDB; wwpdb.org), to the PDBe Knowledge Base (PDBe-KB; pdbekb.org), we provide data, data-access mechanisms, and visualizations that facilitate basic and applied research and education across the life sciences. Here, we provide an overview of the structural data and annotations that we integrate and make freely available. We describe the web services and data visualization tools we offer, and provide information on how to effectively use or even further develop them. Finally, we discuss the direction of our data services, and how we aim to tackle new challenges that arise from the recent, unprecedented advances in the field of structure determination and protein structure modeling.

Subjects

Nucleic Acids; Proteins; Databases, Protein; Europe; Protein Conformation; Proteins/chemistry

13.

3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources.

Varadi, Mihaly; Nair, Sreenath; Sillitoe, Ian; Tauriello, Gerardo; Anyango, Stephen; Bienert, Stefan; Borges, Clemente; Deshpande, Mandar; Green, Tim; Hassabis, Demis; Hatos, Andras; Hegedus, Tamas; Hekkelman, Maarten L; Joosten, Robbie; Jumper, John; Laydon, Agata; Molodenskiy, Dmitry; Piovesan, Damiano; Salladini, Edoardo; Salzberg, Steven L; Sommer, Markus J; Steinegger, Martin; Suhajda, Erzsebet; Svergun, Dmitri; Tenorio-Ku, Luiggi; Tosatto, Silvio; Tunyasuvunakool, Kathryn; Waterhouse, Andrew Mark; Zídek, Augustin; Schwede, Torsten; Orengo, Christine; Velankar, Sameer.

Gigascience ; 11, 2022 11 30.

Article in English | MEDLINE | ID: mdl-36448847

ABSTRACT

While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.

Subjects

Metadata; Records; Amino Acid Sequence; Databases, Protein; Computer Simulation
