ABSTRACT
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein-nucleic acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: Escherichia coli beta-galactosidase with inhibitor, SARS-CoV-2 virus RNA-dependent RNA polymerase with covalently bound nucleotide analog and SARS-CoV-2 virus ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. The quality of submitted ligand models and surrounding atoms was analyzed by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics and contact scores. A composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.
Subjects
Cryoelectron Microscopy; Models, Molecular; Cryoelectron Microscopy/methods; Ligands; SARS-CoV-2; COVID-19/virology; Escherichia coli; beta-Galactosidase/chemistry; beta-Galactosidase/metabolism; Protein Conformation; Reproducibility of Results
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), founding member of the Worldwide Protein Data Bank (wwPDB), is the US data center for the open-access PDB archive. As wwPDB-designated Archive Keeper, RCSB PDB is also responsible for PDB data security. Annually, RCSB PDB serves >10 000 depositors of three-dimensional (3D) biostructures working on all permanently inhabited continents. RCSB PDB delivers data from its research-focused RCSB.org web portal to many millions of PDB data consumers based in virtually every United Nations-recognized country, territory, etc. This Database Issue contribution describes upgrades to the research-focused RCSB.org web portal that created a one-stop-shop for open access to ∼200 000 experimentally determined PDB structures of biological macromolecules alongside >1 000 000 incorporated Computed Structure Models (CSMs) predicted using artificial intelligence/machine learning methods. RCSB.org is a 'living data resource.' Every PDB structure and CSM is integrated weekly with related functional annotations from external biodata resources, providing up-to-date information for the entire corpus of 3D biostructure data freely available from RCSB.org with no usage limitations. Within RCSB.org, PDB structures and the CSMs are clearly identified as to their provenance and reliability. Both are fully searchable, and can be analyzed and visualized using the full complement of RCSB.org web portal capabilities.
Subjects
Artificial Intelligence; Databases, Protein; Proteins; Machine Learning; Protein Conformation; Proteins/chemistry; Reproducibility of Results
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), the US data center for the global PDB archive and a founding member of the Worldwide Protein Data Bank partnership, serves tens of thousands of data depositors in the Americas and Oceania and makes 3D macromolecular structure data available at no charge and without restrictions to millions of RCSB.org users around the world, including >660 000 educators, students and members of the curious public using PDB101.RCSB.org. PDB data depositors include structural biologists using macromolecular crystallography, nuclear magnetic resonance spectroscopy, 3D electron microscopy and micro-electron diffraction. PDB data consumers accessing our web portals include researchers, educators and students studying fundamental biology, biomedicine, biotechnology, bioengineering and energy sciences. During the past 2 years, the research-focused RCSB PDB web portal (RCSB.org) has undergone a complete redesign, enabling improved searching with full Boolean operator logic and more facile access to PDB data integrated with >40 external biodata resources. New features and resources are described in detail using examples that showcase recently released structures of SARS-CoV-2 proteins and host cell proteins relevant to understanding and addressing the COVID-19 global pandemic.
Subjects
Computational Biology/methods; Databases, Protein; Macromolecular Substances/chemistry; Protein Conformation; Proteins/chemistry; Bioengineering/methods; Biomedical Research/methods; Biotechnology/methods; COVID-19/epidemiology; COVID-19/prevention & control; COVID-19/virology; Humans; Macromolecular Substances/metabolism; Pandemics; Proteins/genetics; Proteins/metabolism; SARS-CoV-2/genetics; SARS-CoV-2/metabolism; SARS-CoV-2/physiology; Software; Viral Proteins/chemistry; Viral Proteins/genetics; Viral Proteins/metabolism
ABSTRACT
Since 1971, the Protein Data Bank (PDB) has served as the single global archive for experimentally determined 3D structures of biological macromolecules made freely available to the global community according to the FAIR principles of Findability-Accessibility-Interoperability-Reusability. During the first 50 years of continuous PDB operations, standards for data representation have evolved to better represent rich and complex biological phenomena. Carbohydrate molecules present in more than 14,000 PDB structures have recently been reviewed and remediated to conform to a new standardized format. This machine-readable data representation for carbohydrates occurring in the PDB structures and the corresponding reference data improves the findability, accessibility, interoperability and reusability of structural information pertaining to these molecules. The PDB Exchange MacroMolecular Crystallographic Information File data dictionary now supports (i) standardized atom nomenclature that conforms to International Union of Pure and Applied Chemistry-International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) recommendations for carbohydrates, (ii) uniform representation of branched entities for oligosaccharides, (iii) commonly used linear descriptors of carbohydrates developed by the glycoscience community and (iv) annotation of glycosylation sites in proteins. For the first time, carbohydrates in PDB structures are consistently represented as collections of standardized monosaccharides, which precisely describe oligosaccharide structures and enable improved carbohydrate visualization, structure validation, robust quantitative and qualitative analyses, search for dendritic structures and classification. The uniform representation of carbohydrate molecules in the PDB described herein will facilitate broader usage of the resource by the glycoscience community and researchers studying glycoproteins.
Subjects
Carbohydrates; Proteins; Carbohydrates/chemistry; Databases, Protein; Proteins/chemistry
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, rcsb.org), the US data center for the global PDB archive, serves thousands of Data Depositors in the Americas and Oceania and makes 3D macromolecular structure data available at no charge and without usage restrictions to more than 1 million rcsb.org Users worldwide and 600 000 pdb101.rcsb.org education-focused Users around the globe. PDB Data Depositors include structural biologists using macromolecular crystallography, nuclear magnetic resonance spectroscopy and 3D electron microscopy. PDB Data Consumers include researchers, educators and students studying Fundamental Biology, Biomedicine, Biotechnology and Energy. Recent reorganization of RCSB PDB activities into four integrated, interdependent services is described in detail, together with tools and resources added over the past 2 years to RCSB PDB web portals in support of a 'Structural View of Biology.'
Subjects
Databases, Protein; Protein Conformation; Biomedical Research/education; Biotechnology/education; Data Curation; Software
ABSTRACT
The Drug Design Data Resource (D3R) aims to identify best-practice methods for computer-aided drug design through blinded ligand pose prediction and affinity challenges. Herein, we report on the results of Grand Challenge 4 (GC4). GC4 focused on the proteins beta secretase 1 (BACE1) and Cathepsin S, and was run in a manner analogous to prior challenges. In Stage 1, participants' ability to predict the poses and affinities of BACE1 ligands was assessed. Following the completion of Stage 1, all BACE1 co-crystal structures were released, and Stage 2 tested affinity rankings with co-crystal structures. We provide an analysis of the results and discuss insights into best-practice methods.
Subjects
Amyloid Precursor Protein Secretases/antagonists & inhibitors; Aspartic Acid Endopeptidases/antagonists & inhibitors; Drug Design; Enzyme Inhibitors/pharmacology; Small Molecule Libraries/pharmacology; Amyloid Precursor Protein Secretases/metabolism; Aspartic Acid Endopeptidases/metabolism; Enzyme Inhibitors/chemistry; Humans; Ligands; Machine Learning; Molecular Docking Simulation; Small Molecule Libraries/chemistry; Thermodynamics
ABSTRACT
The Drug Design Data Resource aims to test and advance the state of the art in protein-ligand modeling by holding community-wide, blinded prediction challenges. Here, we report on our third major round, Grand Challenge 3 (GC3). Held 2017-2018, GC3 centered on the protein Cathepsin S and the kinases VEGFR2, JAK2, p38-α, TIE2, and ABL1, and included both pose-prediction and affinity-ranking components. GC3 was structured much like the prior challenges GC2015 and GC2. First, Stage 1 tested pose prediction and affinity ranking methods; then all available crystal structures were released, and Stage 2 tested only affinity rankings, now in the context of the available structures. Unique to GC3 was the addition of a Stage 1b self-docking subchallenge, in which the protein coordinates from all of the cocrystal structures used in the cross-docking challenge were released, and participants were asked to predict the pose of CatS ligands using these newly released structures. We provide an overview of the outcomes and discuss insights into trends and best practices.
Subjects
Cathepsins/chemistry; Molecular Docking Simulation/methods; Protein Kinase Inhibitors/chemistry; Protein Kinases/chemistry; Binding Sites; Computer-Aided Design; Crystallography, X-Ray; Databases, Protein; Drug Design; Ligands; Protein Binding; Protein Conformation; Thermodynamics
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, http://rcsb.org), the US data center for the global PDB archive, makes PDB data freely available to all users, from structural biologists to computational biologists and beyond. New tools and resources have been added to the RCSB PDB web portal in support of a 'Structural View of Biology.' Recent developments have improved the User experience, including the high-speed NGL Viewer that provides 3D molecular visualization in any web browser, improved support for data file download and enhanced organization of website pages for query, reporting and individual structure exploration. Structure validation information is now visible for all archival entries. PDB data have been integrated with external biological resources, including chromosomal position within the human genome; protein modifications; and metabolic pathways. PDB-101 educational materials have been reorganized into a searchable website and expanded to include new features such as the Geis Digital Archive.
Subjects
Computational Biology/methods; Databases, Genetic; Proteins/chemistry; Proteins/genetics; Datasets as Topic; Metabolic Networks and Pathways; Models, Molecular; Protein Conformation; Proteins/metabolism; Software; Structure-Activity Relationship; User-Computer Interface; Web Browser
ABSTRACT
The Drug Design Data Resource (D3R) ran Grand Challenge 2 (GC2) from September 2016 through February 2017. This challenge was based on a dataset of structures and affinities for the nuclear receptor farnesoid X receptor (FXR), contributed by F. Hoffmann-La Roche. The dataset contained 102 IC50 values, spanning six orders of magnitude, and 36 high-resolution co-crystal structures with representatives of four major ligand classes. Strong global participation was evident, with 49 participants submitting 262 prediction submission packages in total. Procedurally, GC2 mimicked Grand Challenge 2015 (GC2015), with a Stage 1 subchallenge testing ligand pose prediction methods and ranking and scoring methods, and a Stage 2 subchallenge testing only ligand ranking and scoring methods after the release of all blinded co-crystal structures. Two smaller curated sets of 18 and 15 ligands were developed to test alchemical free energy methods. This overview summarizes all aspects of GC2, including the dataset details, challenge procedures, and participant results. We also consider implications for progress in the field, while highlighting methodological areas that merit continued development. Similar to GC2015, the outcome of GC2 underscores the pressing need for methods development in pose prediction, particularly for ligand scaffolds not currently represented in the Protein Data Bank ( http://www.pdb.org ), and in affinity ranking and scoring of bound ligands.
Subjects
Drug Design; Receptors, Cytoplasmic and Nuclear/metabolism; Computer-Aided Design; Databases, Protein; Humans; Inhibitory Concentration 50; Ligands; Molecular Docking Simulation; Protein Binding; Receptors, Cytoplasmic and Nuclear/agonists; Receptors, Cytoplasmic and Nuclear/antagonists & inhibitors; Receptors, Cytoplasmic and Nuclear/chemistry; Software; Thermodynamics
ABSTRACT
The Chemical Component Dictionary (CCD) is a chemical reference data resource that describes all residue and small molecule components found in Protein Data Bank (PDB) entries. The CCD contains detailed chemical descriptions for standard and modified amino acids/nucleotides, small molecule ligands and solvent molecules. Each chemical definition includes descriptions of chemical properties such as stereochemical assignments, chemical descriptors, systematic chemical names and idealized coordinates. The content, preparation, validation and distribution of this CCD chemical reference dataset are described. AVAILABILITY AND IMPLEMENTATION: The CCD is updated regularly in conjunction with the scheduled weekly release of new PDB structure data. The CCD and amino acid variant reference datasets are hosted in the public PDB ftp repository at ftp://ftp.wwpdb.org/pub/pdb/data/monomers/components.cif.gz, ftp://ftp.wwpdb.org/pub/pdb/data/monomers/aa-variants-v1.cif.gz, and its mirror sites, and can be accessed from http://wwpdb.org. CONTACT: jwest@rcsb.rutgers.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Databases, Chemical; Databases, Protein; Dictionaries, Chemical as Topic; Macromolecular Substances/chemistry; Molecular Sequence Annotation; Internet; Ligands; User-Computer Interface
ABSTRACT
With the accumulation of a large number and variety of molecules in the Protein Data Bank (PDB) comes the need on occasion to review and improve their representation. The Worldwide PDB (wwPDB) partners have periodically updated various aspects of structural data representation to improve the integrity and consistency of the archive. The remediation effort described here was focused on improving the representation of peptide-like inhibitor and antibiotic molecules so that they can be easily identified and analyzed. Peptide-like inhibitors or antibiotics were identified in over 1000 PDB entries, systematically reviewed and represented either as peptides with polymer sequence or as single components. For the majority of the single-component molecules, their peptide-like composition was captured in a new representation, called the subcomponent sequence. A novel concept called "group" was developed for representing complex peptide-like antibiotics and inhibitors that are composed of multiple polymer and nonpolymer components. In addition, a reference dictionary was developed with detailed information about these peptide-like molecules to aid in their annotation, identification and analysis. Based on the experience gained in this remediation, guidelines, procedures, and tools were developed to annotate new depositions containing peptide-like inhibitors and antibiotics accurately and consistently.
Subjects
Anti-Bacterial Agents/pharmacology; Databases, Protein; Peptides/pharmacology; Anti-Bacterial Agents/chemistry; Enzyme Inhibitors/chemistry; Enzyme Inhibitors/pharmacology; Gramicidin/chemistry; Gramicidin/pharmacology; Pancreatic Elastase/antagonists & inhibitors; Peptides/chemistry; Thiostrepton/chemistry; Thiostrepton/pharmacology; Vancomycin/chemistry; Vancomycin/pharmacology
ABSTRACT
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein/nucleic-acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: E. coli beta-galactosidase with inhibitor, SARS-CoV-2 RNA-dependent RNA polymerase with covalently bound nucleotide analog, and SARS-CoV-2 ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. We found that (1) the quality of submitted ligand models and surrounding atoms varied, as judged by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics, and contact scores, and (2) a composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.
ABSTRACT
Approximately 87% of the more than 190,000 atomic-level three-dimensional (3D) biostructures in the PDB were determined using macromolecular crystallography (MX). Agreement between 3D atomic coordinates and experimental data for >100 million individual amino acid residues occurring within ∼150,000 PDB MX structures was analyzed in detail. The real-space correlation coefficient (RSCC) calculated using the 3D atomic coordinates for each residue and experimental-data-derived electron density enables outlier detection of unreliable atomic coordinates (particularly important for poorly resolved side-chain atoms) and ready evaluation of local structure quality by PDB users. For human protein MX structures in PDB, comparisons of the per-residue RSCC metric with AlphaFold2-computed structure model confidence (pLDDT, predicted local distance difference test) document (1) that RSCC values and pLDDT scores are correlated (median correlation coefficient ∼0.41), and (2) that experimentally determined MX structures (3.5 Å resolution or better) are more reliable than AlphaFold2-computed structure models and should be used preferentially whenever possible.
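As a rough illustration of the per-residue metric named in this abstract, the sketch below computes a real-space correlation coefficient as the Pearson correlation between observed and model-calculated density values sampled at the same grid points around one residue. It is a minimal sketch under stated assumptions, not the procedure used in the study: the function names `rscc` and `flag_low_rscc`, the residue labels, and the 0.8 cutoff are all hypothetical choices made for the example.

```python
import math


def rscc(rho_obs, rho_calc):
    """Pearson correlation between observed and model-calculated density
    values sampled at the same grid points around one residue."""
    if len(rho_obs) != len(rho_calc) or len(rho_obs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(rho_obs)
    mean_obs = sum(rho_obs) / n
    mean_calc = sum(rho_calc) / n
    cov = sum((o - mean_obs) * (c - mean_calc) for o, c in zip(rho_obs, rho_calc))
    var_obs = sum((o - mean_obs) ** 2 for o in rho_obs)
    var_calc = sum((c - mean_calc) ** 2 for c in rho_calc)
    if var_obs == 0 or var_calc == 0:
        raise ValueError("density is constant; correlation is undefined")
    return cov / math.sqrt(var_obs * var_calc)


def flag_low_rscc(per_residue, cutoff=0.8):
    """Return residue labels whose RSCC falls below an illustrative cutoff.
    The 0.8 default is an assumption for this example, not a value from the study."""
    return sorted(label for label, value in per_residue.items() if value < cutoff)


# Hypothetical per-residue values keyed by "chain:residue number".
example = {"A:10": 0.95, "A:11": 0.62, "A:12": 0.88}
low = flag_low_rscc(example)
```

Here `low` lists the residues a user might inspect first; in practice the density samples would come from a map and a structure-factor calculation, which this sketch does not attempt.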
Subjects
Amino Acids; Databases, Protein; Humans; Macromolecular Substances; Myxovirus Resistance Proteins; Protein Conformation
ABSTRACT
More than 70% of the experimentally determined macromolecular structures in the Protein Data Bank (PDB) contain small-molecule ligands. Quality indicators of ∼643,000 ligands present in ∼106,000 PDB X-ray crystal structures have been analyzed. Ligand quality varies greatly with regard to goodness of fit between ligand structure and experimental data, deviations in bond lengths and angles from known chemical structures, and inappropriate interatomic clashes between the ligand and its surroundings. Based on principal component analysis, correlated quality indicators of ligand structure have been aggregated into two largely orthogonal composite indicators measuring goodness of fit to experimental data and deviation from ideal chemical structure. Ranking of the composite quality indicators across the PDB archive enabled construction of a uniformly distributed composite ranking score. This score is implemented at RCSB.org to compare chemically identical ligands in distinct PDB structures with easy-to-interpret two-dimensional ligand quality plots, allowing PDB users to quickly assess ligand structure quality and select the best exemplars.
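The two steps described in this abstract (aggregating correlated indicators into a composite, then ranking the composite across the archive) can be sketched on a toy case. The example below is an assumption-laden simplification, not the published method: it combines only two indicators, for which the first principal component of the standardized values has the closed form (z1 + z2)/√2 when they are positively correlated, and it converts the composite into a percentile rank, which is uniformly distributed by construction. The function names and the bond and angle deviation values are hypothetical.

```python
import bisect
import math
import statistics


def zscores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("indicator is constant; cannot standardize")
    return [(v - mean) / sd for v in values]


def first_component(indicator_a, indicator_b):
    """First principal component of two positively correlated indicators.
    For a 2x2 correlation matrix with r > 0 the leading eigenvector is
    (1, 1)/sqrt(2), so PC1 is the scaled sum of the z-scores."""
    za, zb = zscores(indicator_a), zscores(indicator_b)
    r = statistics.fmean([x * y for x, y in zip(za, zb)])
    if r <= 0:
        raise ValueError("closed form assumes positively correlated indicators")
    return [(x + y) / math.sqrt(2) for x, y in zip(za, zb)]


def percentile_rank(scores, lower_is_better=True):
    """Fraction of entries that each score is at least as good as (1.0 = best)."""
    ordered = sorted(scores)
    n = len(ordered)
    if lower_is_better:
        return [(n - bisect.bisect_left(ordered, s)) / n for s in scores]
    return [bisect.bisect_right(ordered, s) / n for s in scores]


# Hypothetical geometry deviations for four copies of the same ligand
# (lower means closer to ideal chemical structure).
bond_deviation = [0.5, 1.0, 2.0, 3.5]
angle_deviation = [0.6, 1.2, 1.8, 4.0]
geometry_rank = percentile_rank(first_component(bond_deviation, angle_deviation))
```

With more than two indicators a general eigen-decomposition would replace the closed form; the ranking step stays the same.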
Subjects
Proteins/chemistry; Proteins/metabolism; Small Molecule Libraries/pharmacology; Databases, Protein; Ligands; Models, Molecular; Protein Conformation
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the United States National Science Foundation, National Institutes of Health, and Department of Energy, supports structural biologists and Protein Data Bank (PDB) data users around the world. The RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, serves as the US data center for the global PDB archive housing experimentally-determined three-dimensional (3D) structure data for biological macromolecules. As the wwPDB-designated Archive Keeper, RCSB PDB is also responsible for the security of PDB data and weekly update of the archive. RCSB PDB serves tens of thousands of data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) annually working on all permanently inhabited continents. RCSB PDB makes PDB data available from its research-focused web portal at no charge and without usage restrictions to many millions of PDB data consumers around the globe. It also provides educators, students, and the general public with an introduction to the PDB and related training materials through its outreach and education-focused web portal. This review article describes growth of the PDB, examines evolution of experimental methods for structure determination viewed through the lens of the PDB archive, and provides a detailed accounting of PDB archival holdings and their utilization by researchers, educators, and students worldwide.
Subjects
Computational Biology; Proteins; Humans; Protein Conformation; Databases, Protein; Computational Biology/methods; Proteins/chemistry; Students
ABSTRACT
PDBx/mmCIF, Protein Data Bank Exchange (PDBx) macromolecular Crystallographic Information Framework (mmCIF), has become the data standard for structural biology. With its early roots in the domain of small-molecule crystallography, PDBx/mmCIF provides an extensible data representation that is used for deposition, archiving, remediation, and public dissemination of experimentally determined three-dimensional (3D) structures of biological macromolecules by the Worldwide Protein Data Bank (wwPDB, wwpdb.org). Extensions of PDBx/mmCIF are similarly used for computed structure models by ModelArchive (modelarchive.org), integrative/hybrid structures by PDB-Dev (pdb-dev.wwpdb.org), small angle scattering data by Small Angle Scattering Biological Data Bank SASBDB (sasbdb.org), and for models generated with the AlphaFold 2.0 deep learning software suite (alphafold.ebi.ac.uk). Community-driven development of PDBx/mmCIF spans three decades, involving contributions from researchers, software and methods developers in structural sciences, data repository providers, scientific publishers, and professional societies. Having a semantically rich and extensible data framework for representing a wide range of structural biology experimental and computational results, combined with expertly curated 3D biostructure data sets in public repositories, accelerates the pace of scientific discovery. Herein, we describe the architecture of the PDBx/mmCIF data standard, tools used to maintain representations of the data standard, governance, and processes by which data content standards are extended, plus community tools/software libraries available for processing and checking the integrity of PDBx/mmCIF data. Use cases exemplify how the members of the Worldwide Protein Data Bank have used PDBx/mmCIF as the foundation for their pipeline for delivering Findable, Accessible, Interoperable, and Reusable (FAIR) data to many millions of users worldwide.
Subjects
Computational Biology; Crystallography; Databases, Protein; Software; Macromolecular Substances/chemistry; Molecular Biology; Protein Conformation; Semantics
ABSTRACT
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB), funded by the US National Science Foundation, National Institutes of Health, and Department of Energy, has served structural biologists and Protein Data Bank (PDB) data consumers worldwide since 1999. RCSB PDB, a founding member of the Worldwide Protein Data Bank (wwPDB) partnership, is the US data center for the global PDB archive housing biomolecular structure data. RCSB PDB is also responsible for the security of PDB data, as the wwPDB-designated Archive Keeper. Annually, RCSB PDB serves tens of thousands of three-dimensional (3D) macromolecular structure data depositors (using macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction) from all inhabited continents. RCSB PDB makes PDB data available from its research-focused RCSB.org web portal at no charge and without usage restrictions to millions of PDB data consumers working in every nation and territory worldwide. In addition, RCSB PDB operates an outreach and education PDB101.RCSB.org web portal that was used by more than 800,000 educators, students, and members of the public during calendar year 2020. This invited Tools Issue contribution describes (i) how the archive is growing and evolving as new experimental methods generate ever larger and more complex biomolecular structures; (ii) the importance of data standards and data remediation in effective management of the archive and facile integration with more than 50 external data resources; and (iii) new tools and features for 3D structure analysis and visualization made available during the past year via the RCSB.org web portal.
Subjects
Computational Biology/history; Databases, Protein/history; User-Computer Interface; Anniversaries and Special Events; History, 20th Century; History, 21st Century
ABSTRACT
Now in its 52nd year of continuous operations, the Protein Data Bank (PDB) is the premier open-access global archive housing three-dimensional (3D) biomolecular structure data. It is jointly managed by the Worldwide Protein Data Bank (wwPDB) partnership. The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) is funded by the National Science Foundation, National Institutes of Health, and US Department of Energy and serves as the US data center for the wwPDB. RCSB PDB is also responsible for the security of PDB data in its role as wwPDB-designated Archive Keeper. Every year, RCSB PDB serves tens of thousands of depositors of 3D macromolecular structure data (coming from macromolecular crystallography, nuclear magnetic resonance spectroscopy, electron microscopy, and micro-electron diffraction). The RCSB PDB research-focused web portal (RCSB.org) makes PDB data available at no charge and without usage restrictions to many millions of PDB data consumers around the world. The RCSB PDB training, outreach, and education web portal (PDB101.RCSB.org) serves nearly 700 K educators, students, and members of the public worldwide. This invited Tools Issue contribution describes how RCSB PDB (i) is organized; (ii) works with wwPDB partners to process new depositions; (iii) serves as the wwPDB-designated Archive Keeper; (iv) enables exploration and 3D visualization of PDB data via RCSB.org; and (v) supports training, outreach, and education via PDB101.RCSB.org. New tools and features at RCSB.org are presented using examples drawn from high-resolution structural studies of proteins relevant to treatment of human cancers by targeting immune checkpoints.
Subjects
Computational Biology; Proteins; Humans; Protein Conformation; Databases, Protein; Proteins/chemistry; Computational Biology/methods; Macromolecular Substances/chemistry
ABSTRACT
Analyses of publicly available structural data reveal interesting insights into the impact of the three-dimensional (3D) structures of protein targets important for discovery of new drugs (e.g., G-protein-coupled receptors, voltage-gated ion channels, ligand-gated ion channels, transporters, and E3 ubiquitin ligases). The Protein Data Bank (PDB) archive currently holds >155,000 atomic-level 3D structures of biomolecules experimentally determined using crystallography, nuclear magnetic resonance spectroscopy, and electron microscopy. The PDB was established in 1971 as the first open-access, digital-data resource in biology, and is now managed by the Worldwide PDB partnership (wwPDB; wwPDB.org). US PDB operations are the responsibility of the Research Collaboratory for Structural Bioinformatics PDB (RCSB PDB). The RCSB PDB serves millions of RCSB.org users worldwide by delivering PDB data integrated with ∼40 external biodata resources, providing rich structural views of fundamental biology, biomedicine, and energy sciences. Recently published work showed that the PDB archival holdings facilitated discovery of ∼90% of the 210 new drugs approved by the US Food and Drug Administration 2010-2016. We review user-driven development of RCSB PDB services, examine growth of the PDB archive in terms of size and complexity, and present examples and opportunities for structure-guided drug discovery for challenging targets (e.g., integral membrane proteins).