1.
IUCrJ ; 11(Pt 2): 140-151, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38358351

ABSTRACT

In January 2020, a workshop was held at EMBL-EBI (Hinxton, UK) to discuss data requirements for the deposition and validation of cryoEM structures, with a focus on single-particle analysis. The meeting was attended by 47 experts in data processing, model building and refinement, validation, and archiving of such structures. This report describes the workshop's motivation and history, the topics discussed, and the resulting consensus recommendations. Some challenges for future methods-development efforts in this area are also highlighted, as is the implementation to date of some of the recommendations.


Subject(s)
Data Curation , Cryoelectron Microscopy/methods
2.
ArXiv ; 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38076521

ABSTRACT

In January 2020, a workshop was held at EMBL-EBI (Hinxton, UK) to discuss data requirements for deposition and validation of cryoEM structures, with a focus on single-particle analysis. The meeting was attended by 47 experts in data processing, model building and refinement, validation, and archiving of such structures. This report describes the workshop's motivation and history, the topics discussed, and consensus recommendations resulting from the workshop. Some challenges for future methods-development efforts in this area are also highlighted, as is the implementation to date of some of the recommendations.

3.
Histochem Cell Biol ; 160(3): 211-221, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37537341

ABSTRACT

Biological imaging is one of the primary tools by which we understand living systems across scales from atoms to organisms. Rapid advances in imaging technology have increased both the spatial and temporal resolutions at which we examine those systems, as well as enabling visualisation of larger tissue volumes. These advances have huge potential but also generate ever increasing amounts of imaging data that must be stored and analysed. Public image repositories provide a critical scientific service through open data provision, supporting reproducibility of scientific results, access to reference imaging datasets and reuse of data for new scientific discovery and acceleration of image analysis methods development. The scale and scope of imaging data provide both challenges and opportunities for open sharing of image data. In this article, we provide a perspective influenced by decades of provision of open data resources for biological information, suggesting areas to focus on and a path towards global interoperability.


Subject(s)
Image Processing, Computer-Assisted , Reproducibility of Results
4.
Methods Cell Biol ; 177: 389-399, 2023.
Article in English | MEDLINE | ID: mdl-37451775

ABSTRACT

Volume electron microscopy (vEM) techniques produce scientifically important datasets which are time and resource intensive to generate (Peddie et al., 2022). Public archival of such datasets, usually described in the literature, provides many benefits to the data depositors, to those making use of research results based on the datasets, and to the vEM community at large, both now and in the future. In this chapter we discuss these benefits, explain how EMBL-EBI's image data services support archival of both vEM and correlative imaging data, and discuss how future developments will unlock more value from these vEM datasets.


Subject(s)
Data Curation , Volume Electron Microscopy
6.
Nucleic Acids Res ; 51(D1): D9-D17, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36477213

ABSTRACT

The European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) is one of the world's leading sources of public biomolecular data. Based at the Wellcome Genome Campus in Hinxton, UK, EMBL-EBI is one of six sites of the European Molecular Biology Laboratory (EMBL), Europe's only intergovernmental life sciences organisation. This overview summarises the status of services that EMBL-EBI data resources provide to scientific communities globally. The scale, openness, rich metadata and extensive curation of EMBL-EBI added-value databases make them particularly well-suited as training sets for deep learning, machine learning and artificial intelligence applications, a selection of which are described here. The data resources at EMBL-EBI can catalyse such developments because they offer sustainable, high-quality data, collected in some cases over decades and made openly available to any researcher, globally. Our aim is for EMBL-EBI data resources to keep providing the foundations for tools and research insights that transform fields across the life sciences.


Subject(s)
Artificial Intelligence , Computational Biology , Data Management , Databases, Factual , Genome , Internet
7.
Nucleic Acids Res ; 51(D1): D1503-D1511, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36440762

ABSTRACT

Public archiving in structural biology is well established with the Protein Data Bank (PDB; wwPDB.org) catering for atomic models and the Electron Microscopy Data Bank (EMDB; emdb-empiar.org) for 3D reconstructions from cryo-EM experiments. Even before the recent rapid growth in cryo-EM, there was an expressed community need for a public archive of image data from cryo-EM experiments for validation, software development, testing and training. Concomitantly, the proliferation of 3D imaging techniques for cells, tissues and organisms using volume EM (vEM) and X-ray tomography (XT) led to calls from these communities to publicly archive such data as well. EMPIAR (empiar.org) was developed as a public archive for raw cryo-EM image data and for 3D reconstructions from vEM and XT experiments and now comprises over a thousand entries totalling over 2 petabytes of data. EMPIAR resources include a deposition system, entry pages, facilities to search, visualize and download datasets, and a REST API for programmatic access to entry metadata. The success of EMPIAR also poses significant challenges for the future in dealing with the very fast growth in the volume of data and in enhancing its reusability.


Subject(s)
Databases, Factual , Microscopy, Electron , Software , Imaging, Three-Dimensional
8.
F1000Res ; 12, 2023.
Article in English | MEDLINE | ID: mdl-38486614

ABSTRACT

Organised data is easy to use, but the rapid developments in the field of bioimaging, with improvements in instrumentation, detectors, software and experimental techniques, have resulted in an explosion of the volumes of data being generated, making well-organised data an elusive goal. This guide offers a handful of recommendations for bioimage depositors, analysts and microscope and software developers, whose implementation would contribute towards better organised data in preparation for archival. Based on our experience archiving large image datasets in EMPIAR, the BioImage Archive and BioStudies, we propose a number of strategies that we believe would improve the usability (clarity, orderliness, learnability, navigability, self-documentation, coherence and consistency of identifiers, accessibility, succinctness) of future data depositions, making them more useful to the bioimaging community (data authors and analysts, researchers, clinicians, funders, collaborators, industry partners, hardware/software producers, journals, archive developers as well as interested but non-specialist users of bioimaging data). The recommendations may also find use in other data-intensive disciplines. To facilitate the process of analysing data organisation, we present bandbox, a Python package that provides users with an assessment of their data by flagging potential issues, such as redundant directories or invalid characters in file or folder names, that should be addressed before archival. We offer these recommendations as a starting point and hope to engender more substantial conversations across and between the various data-rich communities.
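The two kinds of issue named in the abstract (invalid characters in names, redundant directories) can be illustrated with a minimal sketch. This is not bandbox and does not use its API; the function names, the allow-list of characters, and the definition of "redundant" (a directory whose only content is a single subdirectory) are all assumptions made for illustration.

```python
import re
from pathlib import PurePosixPath

# Assumption: a conservative allow-list (letters, digits, dot, underscore, hyphen).
SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def flag_invalid_names(paths):
    """Return the sorted file or folder names that contain characters
    outside the allow-list. `paths` are relative POSIX-style file paths."""
    flagged = set()
    for p in paths:
        for part in PurePosixPath(p).parts:
            if part != "/" and not SAFE_NAME.match(part):
                flagged.add(part)
    return sorted(flagged)


def flag_redundant_dirs(paths):
    """Return the sorted directories whose only content is exactly one
    subdirectory (hypothetical definition of 'redundant')."""
    children = {}  # directory path -> set of immediate child names
    for p in paths:
        parts = PurePosixPath(p).parts
        for i in range(1, len(parts)):
            parent = "/".join(parts[:i])
            children.setdefault(parent, set()).add(parts[i])
    redundant = []
    for parent, kids in children.items():
        if len(kids) == 1:
            only_child = f"{parent}/{next(iter(kids))}"
            # The single child counts only if it is itself a directory.
            if only_child in children:
                redundant.append(parent)
    return sorted(redundant)


example_paths = [
    "data/raw/run 1/img001.tif",
    "data/raw/run 1/img#2.tif",
    "data/raw/run_2/img003.tif",
]
invalid = flag_invalid_names(example_paths)
redundant = flag_redundant_dirs(example_paths)
```

For the example listing, the sketch flags the names containing a space or a `#`, and flags `data` because it holds nothing but the `raw` subdirectory.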


Subject(s)
Communication , Industry , Humans , Research Design , Research Personnel , Software
9.
Protein Sci ; 31(10): e4439, 2022 10.
Article in English | MEDLINE | ID: mdl-36173162

ABSTRACT

The archiving and dissemination of protein and nucleic acid structures as well as their structural, functional and biophysical annotations is an essential task that enables the broader scientific community to conduct impactful research in multiple fields of the life sciences. The Protein Data Bank in Europe (PDBe; pdbe.org) team develops and maintains several databases and web services to address this fundamental need. From data archiving as a member of the Worldwide PDB consortium (wwPDB; wwpdb.org), to the PDBe Knowledge Base (PDBe-KB; pdbekb.org), we provide data, data-access mechanisms, and visualizations that facilitate basic and applied research and education across the life sciences. Here, we provide an overview of the structural data and annotations that we integrate and make freely available. We describe the web services and data visualization tools we offer, and provide information on how to effectively use or even further develop them. Finally, we discuss the direction of our data services, and how we aim to tackle new challenges that arise from the recent, unprecedented advances in the field of structure determination and protein structure modeling.


Subject(s)
Nucleic Acids , Proteins , Databases, Protein , Europe , Protein Conformation , Proteins/chemistry
10.
IUCrJ ; 9(Pt 4): 399-400, 2022 Jul 01.
Article in English | MEDLINE | ID: mdl-35844485

ABSTRACT

The scientific impact of accurate protein-structure prediction methods is being felt already, but how might they affect the work and careers of structural biologists?

11.
Methods Mol Biol ; 2449: 43-91, 2022.
Article in English | MEDLINE | ID: mdl-35507259

ABSTRACT

Databases of three-dimensional structures of proteins (and their associated molecules) provide: (a) Curated repositories of coordinates of experimentally determined structures, including extensive metadata; for instance information about provenance, details about data collection and interpretation, and validation of results. (b) Information-retrieval tools to allow searching to identify entries of interest and provide access to them. (c) Links among databases, especially to databases of amino-acid and genetic sequences, and of protein function; and links to software for analysis of amino-acid sequence and protein structure, and for structure prediction. (d) Collections of predicted three-dimensional structures of proteins. These will become more and more important after the breakthrough in structure prediction achieved by AlphaFold2. The single global archive of experimentally determined biomacromolecular structures is the Protein Data Bank (PDB). It is managed by wwPDB, a consortium of five partner institutions: the Protein Data Bank in Europe (PDBe), the Research Collaboratory for Structural Bioinformatics (RCSB), the Protein Data Bank Japan (PDBj), the BioMagResBank (BMRB), and the Electron Microscopy Data Bank (EMDB). In addition to jointly managing the PDB repository, the individual wwPDB partners offer many tools for analysis of protein and nucleic acid structures and their complexes, including providing computer-graphic representations. Their collective and individual websites serve as hubs of the community of structural biologists, offering newsletters, reports from Task Forces, training courses, and "helpdesks," as well as links to external software. Many specialized projects are based on the information contained in the PDB. Especially important are SCOP, CATH, and ECOD, which present classifications of protein domains.


Subject(s)
Proteins , Software , Computational Biology , Databases, Protein , Protein Conformation , Proteins/chemistry
12.
Acta Crystallogr D Struct Biol ; 78(Pt 5): 542-552, 2022 May 01.
Article in English | MEDLINE | ID: mdl-35503203

ABSTRACT

The Electron Microscopy Data Bank (EMDB) is the central archive of the electron cryo-microscopy (cryo-EM) community for storing and disseminating volume maps and tomograms. With input from the community, EMDB has developed new resources for the validation of cryo-EM structures, focusing on the quality of the volume data alone and that of the fit of any models, themselves archived in the Protein Data Bank (PDB), to the volume data. Based on recommendations from community experts, the validation resources are developed in a three-tiered system. Tier 1 covers an extensive and evolving set of validation metrics, including tried and tested metrics as well as more experimental ones, which are calculated for all EMDB entries and presented in the Validation Analysis (VA) web resource. This system is particularly useful for cryo-EM experts, both to validate individual structures and to assess the utility of new validation metrics. Tier 2 comprises a subset of the validation metrics covered by the VA resource that have been subjected to extensive testing and are considered to be useful for specialists as well as nonspecialists. These metrics are presented on the entry-specific web pages for the entire archive on the EMDB website. As more experience is gained with the metrics included in the VA resource, it is expected that consensus will emerge in the community regarding a subset that is suitable for inclusion in the tier 2 system. Tier 3, finally, consists of the validation reports and servers that are produced by the Worldwide Protein Data Bank (wwPDB) Consortium. Successful metrics from tier 2 will be proposed for inclusion in the wwPDB validation pipeline and reports. The details of the new resource are described, with an emphasis on the tier 1 system. 
The output of all three tiers is publicly available, either through the EMDB website (tiers 1 and 2) or through the wwPDB ftp sites (tier 3), although the content of all three will evolve over time (fastest for tier 1 and slowest for tier 3). It is our hope that these validation resources will help the cryo-EM community to obtain a better understanding of the quality and of the best ways to assess the quality of cryo-EM structures in EMDB and PDB.


Subject(s)
Databases, Protein , Cryoelectron Microscopy , Microscopy, Electron , Protein Conformation
13.
J Mol Biol ; 434(11): 167505, 2022 06 15.
Article in English | MEDLINE | ID: mdl-35189131

ABSTRACT

Despite the huge impact of data resources in genomics and structural biology, until now there has been no central archive for biological data for all imaging modalities. The BioImage Archive is a new data resource at the European Bioinformatics Institute (EMBL-EBI) designed to fill this gap. In its initial development BioImage Archive accepts bioimaging data associated with publications, in any format, from any imaging modality from the molecular to the organism scale, excluding medical imaging. The BioImage Archive will ensure reproducibility of published studies that derive results from image data and reduce duplication of effort. Most importantly, the BioImage Archive will help scientists to generate new insights through reuse of existing data to answer new biological questions, and provision of training, testing and benchmarking data for development of tools for image analysis. The archive is available at https://www.ebi.ac.uk/bioimage-archive/.


Subject(s)
Archives , Internet Use , Microscopy , Databases, Factual , Reproducibility of Results
15.
Carbohydr Polym ; 277: 118771, 2022 Feb 01.
Article in English | MEDLINE | ID: mdl-34893216

ABSTRACT

The enzymatic hydrolysis of barley beta-glucan, konjac glucomannan and carboxymethyl cellulose by a β-1,4-D-endoglucanase MeCel45A from blue mussel, Mytilus edulis, which belongs to subfamily B of glycoside hydrolase family 45 (GH45), was compared with GH45 members of subfamilies A (Humicola insolens HiCel45A), B (Trichoderma reesei TrCel45A) and C (Phanerochaete chrysosporium PcCel45A). Furthermore, the crystal structure of MeCel45A is reported. Initial rates and hydrolysis yields were determined by reducing sugar assays and product formation was characterized using NMR spectroscopy. The subfamily B and C enzymes exhibited mannanase activity, whereas the subfamily A member was uniquely able to produce monomeric glucose. All enzymes were confirmed to be inverting glycoside hydrolases. MeCel45A appears to be cold adapted by evolution, as it maintained 70% activity on cellohexaose at 4 °C relative to 30 °C, compared to 35% for TrCel45A. Both enzymes produced cellobiose and cellotetraose from cellohexaose, but TrCel45A additionally produced cellotriose.


Subject(s)
Glycoside Hydrolases/metabolism , Mannans/metabolism , Mytilus edulis/enzymology , beta-Glucans/metabolism , Animals , Fungal Genus Humicola/enzymology , Glycoside Hydrolases/chemistry , Hypocreales/enzymology , Isoenzymes/chemistry , Isoenzymes/metabolism , Phanerochaete/enzymology
16.
Nucleic Acids Res ; 50(D1): D439-D444, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34791371

ABSTRACT

The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unprecedented expansion of the structural coverage of the known protein-sequence space. AlphaFold DB provides programmatic access to and interactive visualization of predicted atomic coordinates, per-residue and pairwise model-confidence estimates and predicted aligned errors. The initial release of AlphaFold DB contains over 360,000 predicted structures across 21 model-organism proteomes, which will soon be expanded to cover most of the (over 100 million) representative sequences from the UniRef90 data set.


Subject(s)
Databases, Protein , Protein Folding , Proteins/chemistry , Software , Amino Acid Sequence , Animals , Bacteria/genetics , Bacteria/metabolism , Datasets as Topic , Dictyostelium/genetics , Dictyostelium/metabolism , Fungi/genetics , Fungi/metabolism , Humans , Internet , Models, Molecular , Plants/genetics , Plants/metabolism , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Proteins/genetics , Proteins/metabolism , Trypanosoma cruzi/genetics , Trypanosoma cruzi/metabolism
17.
Nature ; 596(7873): 590-596, 2021 08.
Article in English | MEDLINE | ID: mdl-34293799

ABSTRACT

Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure. Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold2, at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective.
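The coverage figures quoted in the abstract are fractions of residues falling into confidence bands. The sketch below shows how such fractions could be computed from a list of per-residue confidence scores. The abstract does not state the cut-offs; the values used here (scores above 70 counted as confident, above 90 as very high confidence) are the pLDDT thresholds commonly associated with AlphaFold and are treated as an assumption, as are the function name and the example scores.

```python
def confidence_fractions(scores, confident=70.0, very_high=90.0):
    """Return (fraction confident, fraction very high confidence).

    `scores` is a list of per-residue confidence values on a 0-100 scale.
    The default cut-offs are assumptions, not taken from the abstract.
    Very-high-confidence residues are a subset of the confident ones.
    """
    n = len(scores)
    if n == 0:
        raise ValueError("no residues given")
    n_confident = sum(1 for s in scores if s > confident)
    n_very_high = sum(1 for s in scores if s > very_high)
    return n_confident / n, n_very_high / n


# Made-up scores for a ten-residue toy example (not real data).
toy_scores = [95, 92, 80, 75, 60, 40, 91, 30, 71, 50]
frac_confident, frac_very_high = confidence_fractions(toy_scores)
```

Both fractions are expressed relative to all residues, which matches the way the abstract reports the very-high-confidence subset as a share of the total.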


Subject(s)
Computational Biology/standards , Deep Learning/standards , Models, Molecular , Protein Conformation , Proteome/chemistry , Datasets as Topic/standards , Diacylglycerol O-Acyltransferase/chemistry , Glucose-6-Phosphatase/chemistry , Humans , Membrane Proteins/chemistry , Protein Folding , Reproducibility of Results
19.
Methods Cell Biol ; 162: 417-430, 2021.
Article in English | MEDLINE | ID: mdl-33707022

ABSTRACT

Few would have thought, when Porter and colleagues used light microscopy in the 1940s to target their cell of interest for analysis in the electron microscope, that Correlative Imaging would develop into the thriving field it is today. Even though the first use of Correlative Light Electron Microscopy (CLEM) was established in the 1940s, it is only since the year 2000 that there has been a real surge in the application of CLEM technology. The power of CLEM is recognized in the scientific community, as evidenced by the growing number of publications and dedicated sessions at scientific meetings. The field is also broadening, incorporating a multitude of other techniques including preclinical research and diagnostics, and slowly but surely the overarching field of Correlative Multimodality Imaging (CMI) is taking its place as an established technique and a research area in its own right. In this chapter, we will look at the initiatives that are being developed within the scientific world to build a coherent CMI community, with a particular emphasis on the developments in Europe. To achieve this aim, the community will need to design mechanisms for the interdisciplinary exchange of knowledge and benefits, set up training schemes, and develop standards for CMI technology and its data.


Subject(s)
Microscopy, Fluorescence , Microscopy, Electron
20.
STAR Protoc ; 2(1): 100253, 2021 03 19.
Article in English | MEDLINE | ID: mdl-33490973

ABSTRACT

This protocol illustrates the steps necessary to deposit correlated 3D cryo-imaging data from cryo-structured illumination microscopy and cryo-soft X-ray tomography with the BioStudies and EMPIAR deposition databases of the European Bioinformatics Institute. There is currently a real need for a robust method of data deposition to ensure unhindered access to and independent validation of correlative light and X-ray microscopy data to allow use in further comparative studies, educational activities, and data mining. For complete details on the use and execution of this protocol, please refer to Kounatidis et al. (2020).


Subject(s)
Databases, Factual , Imaging, Three-Dimensional , Tomography, X-Ray