Results 1 - 20 of 206
1.
bioRxiv ; 2024 Sep 28.
Article in English | MEDLINE | ID: mdl-39386449

ABSTRACT

Diffusion models have shown promise in addressing the protein docking problem. Traditionally, these models are used solely for sampling docked poses, with a separate confidence model for ranking. We introduce DFMDock (Denoising Force Matching Dock), a diffusion model that unifies sampling and ranking within a single framework. DFMDock features two output heads: one for predicting forces and the other for predicting energies. The forces are trained using a denoising force matching objective, while the energy gradients are trained to align with the forces. This design enables our model to sample using the predicted forces and rank poses using the predicted energies, thereby eliminating the need for an additional confidence model. Our approach outperforms the previous diffusion model for protein docking, DiffDock-PP, with a sampling success rate of 44% compared to its 8%, and a Top-1 ranking success rate of 16% compared to 0% on the Docking Benchmark 5.5 test set. In successful decoy cases, the DFMDock Energy forms a binding funnel similar to the physics-based Rosetta Energy, suggesting that DFMDock can capture the underlying energy landscape.
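
The relationship between the two output heads can be illustrated with a toy example. The sketch below is not the DFMDock network or its loss, which the abstract does not specify. It assumes a simple quadratic energy around a clean pose in plain Euclidean coordinates (ignoring rotations), and all function names are hypothetical. It shows that, for Gaussian noise, the standard denoising target equals the force, and that the force is the negative gradient of the energy.

```python
import random

def energy(x, x0, sigma):
    """Toy quadratic energy centred on the clean pose x0 (an assumption)."""
    return sum((a - b) ** 2 for a, b in zip(x, x0)) / (2.0 * sigma ** 2)

def force(x, x0, sigma):
    """Analytic force: the negative gradient of the toy energy."""
    return [-(a - b) / sigma ** 2 for a, b in zip(x, x0)]

def numerical_neg_gradient(x, x0, sigma, h=1e-5):
    """Central finite-difference estimate of -dE/dx, one coordinate at a time."""
    result = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        diff = energy(x_plus, x0, sigma) - energy(x_minus, x0, sigma)
        result.append(-diff / (2.0 * h))
    return result

def denoising_target(x0, sigma, rng):
    """Perturb x0 with Gaussian noise and return (noisy pose, target force).

    For x_noisy = x0 + sigma * eps, the denoising target is -eps / sigma.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_noisy = [a + sigma * e for a, e in zip(x0, eps)]
    target = [-e / sigma for e in eps]
    return x_noisy, target

if __name__ == "__main__":
    rng = random.Random(0)
    clean = [1.0, -2.0, 0.5]
    noisy, target = denoising_target(clean, 0.5, rng)
    print("target force:       ", target)
    print("analytic force:     ", force(noisy, clean, 0.5))
    print("-grad E (numerical):", numerical_neg_gradient(noisy, clean, 0.5))
```

In this toy setting the energy is lowest at the clean pose, which is the property that would let an energy head rank sampled poses.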

2.
J Am Pharm Assoc (2003) ; : 102214, 2024 Aug 26.
Article in English | MEDLINE | ID: mdl-39197588

ABSTRACT

BACKGROUND: Sustainable career advancement opportunities for pharmacy technicians will be a critical part of patient-centered community pharmacy environments as the role of the pharmacist provider expands. OBJECTIVES: (1) To determine the impact of Pharmacy Technician Certification Board (PTCB) certification on career advancement and professional growth metrics; (2) To assess technicians' role in advanced pharmacy services before and after certification; and (3) To identify changes in pharmacist services when a certified pharmacy technician (CPhT) was added to the provider team. METHODS: A 73-question web-based survey was distributed to all PTCB-certified pharmacy technicians (CPhTs) in the United States, Washington, D.C., Puerto Rico, Guam, and the US Virgin Islands. The survey was distributed by PTCB in April 2021 with a 28-day collection period. The survey included multiple choice, rating scale, and free text questions centered on five domains: Practice experience, Career aspirations, Compensation, Pharmacy practice motivations, and Impact of COVID-19 pandemic. RESULTS: 23,007 CPhTs completed the survey. Respondents were primarily female (85.5%), age 30-39 (32.8%), and had ≥ 10 years of CPhT experience (42.8%). The majority of respondents cited improvement of patient health (77.4%), career advancement opportunities (53.5%), the ability to expand their role during emergencies (e.g., COVID-19) (52.6%), and future career advancement opportunities (51.7%) as benefits of CPhT certification. Increases in job responsibility after certification included changes occurring in roles related to clinical pharmacy services, patient education, preventive health services, provider communication, and staff training. Respondents agreed that PTCB certification allowed for the expansion of pharmacists' services where they practiced, including clinical services (18.5%), patient education (18.3%), and preventive health services (18.1%). CONCLUSION: CPhTs value certification for its benefits on career advancement, personal growth, and salary enhancement. Affirmation of skill and training through certification is also recognized to positively influence patient care and the pharmacy's ability to provide advanced patient care and services.

3.
bioRxiv ; 2024 Jul 13.
Article in English | MEDLINE | ID: mdl-39026849

ABSTRACT

The oligomerization of protein macromolecules on cell membranes plays a fundamental role in regulating cellular function. From modulating signal transduction to directing immune response, membrane proteins (MPs) play a crucial role in biological processes and are often the target of many pharmaceutical drugs. Despite their biological relevance, the challenges in experimental determination have hampered the structural availability of membrane proteins and their complexes. Computational docking provides a promising alternative to model membrane protein complex structures. Here, we present Rosetta-MPDock, a flexible transmembrane (TM) protein docking protocol that captures binding-induced conformational changes. Rosetta-MPDock samples large conformational ensembles of flexible monomers and docks them within an implicit membrane environment. We benchmarked this method on 29 TM-protein complexes of variable backbone flexibility. These complexes are classified based on the root-mean-square deviation between the unbound and bound states (RMSD_UB) as: rigid (RMSD_UB < 1.2 Å), moderately flexible (RMSD_UB ∈ [1.2, 2.2) Å), and flexible targets (RMSD_UB > 2.2 Å). In a local docking scenario, i.e. with membrane protein partners starting ≈10 Å apart embedded in the membrane in their unbound conformations, Rosetta-MPDock successfully predicts the correct interface (success defined as achieving 3 near-native structures in the 5 top-ranked models) for 67% of the moderately flexible targets and 60% of the highly flexible targets, a substantial improvement from the existing membrane protein docking methods. Further, by integrating AlphaFold2-multimer for structure determination and using Rosetta-MPDock for docking and refinement, we demonstrate improved success rates over the benchmark targets from 64% to 73%. Rosetta-MPDock advances the capabilities for membrane protein complex structure prediction and modeling to tackle key biological questions and elucidate functional mechanisms in the membrane environment. The benchmark set and the code are available for public use at github.com/Graylab/MPDock.
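
The flexibility classes and the success criterion stated in this abstract can be written down as a short sketch. The function names are hypothetical and are not part of Rosetta-MPDock. The abstract leaves the value of exactly 2.2 Å unassigned, so treating it as flexible here is an assumption.

```python
def classify_flexibility(rmsd_ub):
    """Assign a flexibility class from the unbound-to-bound RMSD in angstroms.

    Thresholds follow the abstract; the boundary value 2.2 is placed in the
    flexible class by assumption.
    """
    if rmsd_ub < 1.2:
        return "rigid"
    if rmsd_ub < 2.2:
        return "moderately flexible"
    return "flexible"

def is_docking_success(near_native_flags, top_n=5, min_hits=3):
    """Return True if at least min_hits of the top_n ranked models are near-native.

    near_native_flags is a list of booleans ordered from best-ranked to worst.
    """
    hits = sum(1 for flag in near_native_flags[:top_n] if flag)
    return hits >= min_hits

if __name__ == "__main__":
    for value in (0.8, 1.2, 2.5):
        print(value, classify_flexibility(value))
    print(is_docking_success([True, False, True, True, False]))
```

Only the five top-ranked models count toward success, so near-native models ranked sixth or lower do not change the outcome.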

4.
bioRxiv ; 2024 May 28.
Article in English | MEDLINE | ID: mdl-38854075

ABSTRACT

Animal venoms, distinguished by their unique structural features and potent bioactivities, represent a vast and relatively untapped reservoir of therapeutic molecules. However, limitations associated with extracting or expressing large numbers of individual venoms and venom-like molecules have precluded their therapeutic evaluation via high throughput screening. Here, we developed an innovative computational approach to design a highly diverse library of animal venoms and "metavenoms". We employed programmable M13 hyperphage display to preserve critical disulfide-bonded structures for highly parallelized single-round biopanning with quantitation via high-throughput DNA sequencing. Our approach led to the discovery of Kunitz type domain containing proteins that target the human itch receptor Mas-related G protein-coupled receptor X4 (MRGPRX4), which plays a crucial role in itch perception. Deep learning-based structural homology mining identified two endogenous human homologs, tissue factor pathway inhibitor (TFPI) and serine peptidase inhibitor, Kunitz type 2 (SPINT2), which exhibit agonist-dependent potentiation of MRGPRX4. Highly multiplexed screening of animal venoms and metavenoms is therefore a promising approach to uncover new drug candidates.

5.
J Am Chem Soc ; 146(26): 17801-17816, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38887845

ABSTRACT

Gangliosides, sialic acid bearing glycosphingolipids, are components of the outer leaflet of plasma membranes of all vertebrate cells. They contribute to cell regulation by interacting with proteins in their own membranes (cis) or their extracellular milieu (trans). As amphipathic membrane constituents, gangliosides present challenges for identifying their ganglioside protein interactome. To meet these challenges, we synthesized bifunctional clickable photoaffinity gangliosides, delivered them to plasma membranes of cultured cells, then captured and identified their interactomes using proteomic mass spectrometry. Installing probes on ganglioside lipid and glycan moieties, we captured cis and trans ganglioside-protein interactions. Ganglioside interactomes varied with the ganglioside structure, cell type, and site of the probe (lipid or glycan). Gene ontology revealed that gangliosides engage with transmembrane transporters and cell adhesion proteins including integrins, cadherins, and laminins. The approach developed is applicable to other gangliosides and cell types, promising to provide insights into molecular and cellular regulation by gangliosides.


Subjects
Click Chemistry , Gangliosides , Gangliosides/chemistry , Gangliosides/metabolism , Humans , Photoaffinity Labels/chemistry , Photoaffinity Labels/chemical synthesis , Molecular Probes/chemistry , Molecular Probes/chemical synthesis , Cell Membrane/metabolism , Cell Membrane/chemistry
6.
PLoS Comput Biol ; 20(6): e1011895, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38913746

ABSTRACT

Carbohydrates and glycoproteins modulate key biological functions. However, experimental structure determination of sugar polymers is notoriously difficult. Computational approaches can aid in carbohydrate structure prediction, structure determination, and design. In this work, we developed a glycan-modeling algorithm, GlycanTreeModeler, that computationally builds glycans layer-by-layer, using adaptive kernel density estimates (KDE) of common glycan conformations derived from data in the Protein Data Bank (PDB) and from quantum mechanics (QM) calculations. GlycanTreeModeler was benchmarked on a test set of glycan structures of varying lengths, or "trees". Structures predicted by GlycanTreeModeler agreed with native structures at high accuracy for both de novo modeling and experimental density-guided building. We employed these tools to design de novo glycan trees into a protein nanoparticle vaccine to shield regions of the scaffold from antibody recognition, and experimentally verified shielding. This work will inform glycoprotein model prediction, glycan masking, and further aid computational methods in experimental structure determination and refinement.


Subjects
Algorithms , Computational Biology , Glycoproteins , Molecular Models , Polysaccharides , Polysaccharides/chemistry , Computational Biology/methods , Glycoproteins/chemistry , Protein Databases , Software , Carbohydrate Conformation
7.
MAbs ; 16(1): 2362775, 2024.
Article in English | MEDLINE | ID: mdl-38899735

ABSTRACT

Over the past two decades, therapeutic antibodies have emerged as a rapidly expanding domain within the field of biologics. In silico tools that can streamline the process of antibody discovery and optimization are critical to support a pipeline that is growing more numerous and complex every year. High-quality structural information remains critical for the antibody optimization process, but antibody-antigen complex structures are often unavailable and in silico antibody docking methods are still unreliable. In this study, DeepAb, a deep learning model for predicting antibody Fv structure directly from sequence, was used in conjunction with single-point experimental deep mutational scanning (DMS) enrichment data to design 200 potentially optimized variants of an anti-hen egg lysozyme (HEL) antibody. We sought to determine whether DeepAb-designed variants containing combinations of beneficial mutations from the DMS exhibit enhanced thermostability and whether this optimization affected their developability profile. The 200 variants were produced through a robust high-throughput method and tested for thermal and colloidal stability (Tonset, Tm, Tagg), affinity (KD) relative to the parental antibody, and for developability parameters (nonspecific binding, aggregation propensity, self-association). Of the designed clones, 91% and 94% exhibited increased thermal and colloidal stability and affinity, respectively. Of these, 10% showed a significantly increased affinity for HEL (5- to 21-fold increase) and thermostability (>2.5 °C increase in Tm1), with most clones retaining the favorable developability profile of the parental antibody. Additional in silico tests suggest that these methods would enrich for binding affinity even without first collecting experimental DMS measurements. These data open the possibility of in silico antibody optimization without the need to predict the antibody-antigen interface, which is notoriously difficult in the absence of crystal structures.


Subjects
Antibody Affinity , Muramidase , Muramidase/chemistry , Muramidase/immunology , Muramidase/genetics , Protein Stability , Humans , Antigens/immunology , Antigens/chemistry , Animals , Computer Simulation
8.
EClinicalMedicine ; 68: 102383, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38545090

ABSTRACT

Background: SARS-CoV-2 binding to ACE2 is potentially associated with severe pneumonia due to COVID-19. The aim of the study was to test whether Mas-receptor activation by 20-hydroxyecdysone (BIO101) could restore the Renin-Angiotensin System equilibrium and limit the frequency of respiratory failure and mortality in adults hospitalized with severe COVID-19. Methods: Double-blind, randomized, placebo-controlled phase 2/3 trial. Randomization: 1:1 oral BIO101 (350 mg BID) or placebo, up to 28 days or until an endpoint was reached. Primary endpoint: mortality or respiratory failure requiring high-flow oxygen, mechanical ventilation, or extra-corporeal membrane oxygenation. Key secondary endpoint: hospital discharge following recovery (ClinicalTrials.gov Number, NCT04472728). Findings: Due to low recruitment the planned sample size of 310 was not reached and 238 patients were randomized between August 26, 2020 and March 8, 2022. In the modified ITT population (233 patients; 126 BIO101 and 107 placebo), respiratory failure or early death by day 28 was 11.4% lower in the BIO101 (13.5%) than in the placebo (24.3%) group, (p = 0.0426). At day 28, proportions of patients discharged following recovery were 80.1%, and 70.9% in the BIO101 and placebo group respectively, (adjusted difference 11.0%, 95% CI [-0.4%, 22.4%], p = 0.0586). Hazard Ratio for time to death over 90 days: 0.554 (95% CI [0.285, 1.077]), a 44.6% mortality reduction in the BIO101 group (not statistically significant). Treatment emergent adverse events of respiratory failure were more frequent in the placebo group. Interpretation: BIO101 significantly reduced the risk of death or respiratory failure supporting its use in adults hospitalized with severe respiratory symptoms due to COVID-19. Funding: Biophytis.
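
The arithmetic behind two of the reported figures can be checked with a short sketch. The function names are hypothetical. The raw difference between the reported event proportions is 10.8 percentage points, whereas the abstract states 11.4%; the abstract does not say how 11.4% was derived, so the code reproduces only the plain subtraction and the hazard-ratio conversion.

```python
def percent_reduction_from_hazard_ratio(hazard_ratio):
    """Convert a hazard ratio to a relative reduction in percent: (1 - HR) * 100."""
    return round((1.0 - hazard_ratio) * 100.0, 1)

def absolute_difference(percent_a, percent_b):
    """Unadjusted difference between two proportions, in percentage points."""
    return round(percent_a - percent_b, 1)

def ci_excludes_one(lower, upper):
    """A ratio's confidence interval indicates significance only if it excludes 1."""
    return not (lower <= 1.0 <= upper)

if __name__ == "__main__":
    # Hazard ratio for time to death over 90 days, as reported in the abstract.
    print(percent_reduction_from_hazard_ratio(0.554))
    # Placebo (24.3%) minus BIO101 (13.5%) event proportions at day 28.
    print(absolute_difference(24.3, 13.5))
    # Reported 95% CI for the hazard ratio.
    print(ci_excludes_one(0.285, 1.077))
```

The interval [0.285, 1.077] contains 1, which is consistent with the abstract describing the mortality reduction as not statistically significant.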

9.
J Agric Food Chem ; 72(8): 4225-4236, 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38354215

ABSTRACT

GH 62 arabinofuranosidases are known for their excellent specificity for arabinoxylan of agroindustrial residues and their synergism with endoxylanases and other hemicellulases. However, the low thermostability of some GH enzymes hampers potential industrial applications. Mutations that enhance thermostability are therefore highly sought after in protein engineering. We employed directed evolution using one round of error-prone PCR and site-saturation mutagenesis for thermostability enhancement of GH 62 arabinofuranosidase from Aspergillus fumigatus. Single mutants with enhanced thermostability showed significant ΔΔG changes (<-2.5 kcal/mol) and improvements in perplexity scores from evolutionary scale modeling inverse folding. The best mutant, G205K, increased the melting temperature by 5 °C and the energy of denaturation by 41.3%. We discuss the functional mechanisms for improved stability. By analyzing the adjustments in α-helices, β-sheets, and loops resulting from point mutations, we gained insight into their potential impacts on protein stability, folding, and overall structural integrity.


Subjects
Glycoside Hydrolases , Protein Engineering , Enzyme Stability , Temperature , Mutagenesis
10.
PLoS Comput Biol ; 20(1): e1011296, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38252688

ABSTRACT

Membrane protein structure prediction and design are challenging due to the complexity of capturing the interactions in the lipid layer, such as those arising from electrostatics. Accurately capturing electrostatic energies in the low-dielectric membrane often requires expensive Poisson-Boltzmann calculations that are not scalable for membrane protein structure prediction and design. In this work, we have developed a fast-to-compute implicit energy function that considers the realistic characteristics of different lipid bilayers, making design calculations tractable. This method captures the impact of the lipid head group using a mean-field-based approach and uses a depth-dependent dielectric constant to characterize the membrane environment. This energy function Franklin2023 (F23) is built upon Franklin2019 (F19), which is based on experimentally derived hydrophobicity scales in the membrane bilayer. We evaluated the performance of F23 on five different tests probing (1) protein orientation in the bilayer, (2) stability, and (3) sequence recovery. Relative to F19, F23 has improved the calculation of the tilt angle of membrane proteins for 90% of WALP peptides, 15% of TM-peptides, and 25% of the adsorbed peptides. The performances for stability and design tests were equivalent for F19 and F23. The speed and calibration of the implicit model will help F23 access biophysical phenomena at long time and length scales and accelerate the membrane protein design pipeline.


Subjects
Lipid Bilayers , Membrane Proteins , Static Electricity , Lipid Bilayers/chemistry , Membrane Proteins/chemistry , Biophysical Phenomena , Peptides
11.
Protein Sci ; 33(2): e4862, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38148272

ABSTRACT

Conventional protein-protein docking algorithms usually rely on heavy candidate sampling and reranking, but these steps are time-consuming and hinder applications that require high-throughput complex structure prediction, for example, structure-based virtual screening. Existing deep learning methods for protein-protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem to assume no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications when binding-induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multitrack iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments, GeoDock inputs just the sequences and structures of the docking partners, which suits the tasks when the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. On the Database of Interacting Protein Structures (DIPS) test set, GeoDock achieves a 43% top-1 success rate, outperforming all other tested methods. However, in the standard DIPS train/test splits, we discovered contamination of close homologs in the training set. After decontaminating the training set, the success rate is 31%. On the DB5.5 test set and a benchmark dataset of antibody-antigen complexes, GeoDock outperforms the deep learning models trained using the same dataset but falls behind most of the conventional methods and AlphaFold-Multimer. GeoDock attains an average inference speed of under 1 s on a single GPU, enabling its application in large-scale structure screening. 
Although binding-induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture sets up the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.


Subjects
Algorithms , Proteins , Salicylates , Protein Conformation , Protein Binding , Proteins/chemistry , Molecular Docking Simulation
12.
Cell Syst ; 14(11): 979-989.e4, 2023 Nov 15.
Article in English | MEDLINE | ID: mdl-37909045

ABSTRACT

Discovery and optimization of monoclonal antibodies for therapeutic applications relies on large sequence libraries but is hindered by developability issues such as low solubility, high aggregation, and high immunogenicity. Generative language models, trained on millions of protein sequences, are a powerful tool for the on-demand generation of realistic, diverse sequences. We present the Immunoglobulin Language Model (IgLM), a deep generative language model for creating synthetic antibody libraries. Compared with prior methods that leverage unidirectional context for sequence generation, IgLM formulates antibody design based on text-infilling in natural language, allowing it to re-design variable-length spans within antibody sequences using bidirectional context. We trained IgLM on 558 million (M) antibody heavy- and light-chain variable sequences, conditioning on each sequence's chain type and species of origin. We demonstrate that IgLM can generate full-length antibody sequences from a variety of species and its infilling formulation allows it to generate infilled complementarity-determining region (CDR) loop libraries with improved in silico developability profiles. A record of this paper's transparent peer review process is included in the supplemental information.


Subjects
Complementarity Determining Regions , Peptide Library , Amino Acid Sequence , Complementarity Determining Regions/genetics , Monoclonal Antibodies
13.
Proteins ; 91(12): 1658-1683, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37905971

ABSTRACT

We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homotrimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21,941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integration of advanced modeling tools, enabled top performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.


Subjects
Algorithms , Protein Interaction Mapping , Protein Interaction Mapping/methods , Protein Conformation , Protein Binding , Molecular Docking Simulation , Computational Biology/methods , Software
15.
bioRxiv ; 2023 Nov 25.
Article in English | MEDLINE | ID: mdl-37546760

ABSTRACT

Despite the recent breakthrough of AlphaFold (AF) in the field of protein sequence-to-structure prediction, modeling protein interfaces and predicting protein complex structures remains challenging, especially when there is a significant conformational change in one or both binding partners. Prior studies have demonstrated that AF-multimer (AFm) can predict accurate protein complexes in only up to 43% of cases [1]. In this work, we combine AlphaFold as a structural template generator with a physics-based replica exchange docking algorithm. Using a curated collection of 254 available protein targets with both unbound and bound structures, we first demonstrate that AlphaFold confidence measures (pLDDT) can be repurposed for estimating protein flexibility and docking accuracy for multimers. We incorporate these metrics within our ReplicaDock 2.0 protocol [2] to complete a robust in-silico pipeline for accurate protein complex structure prediction. AlphaRED (AlphaFold-initiated Replica Exchange Docking) successfully docks failed AF predictions including 97 failure cases in Docking Benchmark Set 5.5. AlphaRED generates CAPRI acceptable-quality or better predictions for 66% of benchmark targets. Further, on a subset of antigen-antibody targets, which is challenging for AFm (19% success rate), AlphaRED demonstrates a success rate of 51%. This new strategy demonstrates the success possible by integrating deep-learning based architectures trained on evolutionary information with physics-based enhanced sampling. The pipeline is available at github.com/Graylab/AlphaRED.

16.
bioRxiv ; 2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37425754

ABSTRACT

Conventional protein-protein docking algorithms usually rely on heavy candidate sampling and re-ranking, but these steps are time-consuming and hinder applications that require high-throughput complex structure prediction, e.g., structure-based virtual screening. Existing deep learning methods for protein-protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem to assume no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications when binding-induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multi-track iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments (MSAs), GeoDock inputs just the sequences and structures of the docking partners, which suits the tasks when the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. For a benchmark set of rigid targets, GeoDock obtains a 41% success rate, outperforming all the other tested methods. For a more challenging benchmark set of flexible targets, GeoDock achieves a similar number of top-model successes as the traditional method ClusPro [1], but fewer than ReplicaDock2 [2]. GeoDock attains an average inference speed of under one second on a single GPU, enabling its application in large-scale structure screening. Although binding-induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture sets up the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.

17.
bioRxiv ; 2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37425950

ABSTRACT

Membrane protein structure prediction and design are challenging due to the complexity of capturing the interactions in the lipid layer, such as those arising from electrostatics. Accurately capturing electrostatic energies in the low-dielectric membrane often requires expensive Poisson-Boltzmann calculations that are not scalable for membrane protein structure prediction and design. In this work, we have developed a fast-to-compute implicit energy function that considers the realistic characteristics of different lipid bilayers, making design calculations tractable. This method captures the impact of the lipid head group using a mean-field-based approach and uses a depth-dependent dielectric constant to characterize the membrane environment. This energy function Franklin2023 (F23) is built upon Franklin2019 (F19), which is based on experimentally derived hydrophobicity scales in the membrane bilayer. We evaluated the performance of F23 on five different tests probing (1) protein orientation in the bilayer, (2) stability, and (3) sequence recovery. Relative to F19, F23 has improved the calculation of the tilt angle of membrane proteins for 90% of WALP peptides, 15% of TM-peptides, and 25% of the adsorbed peptides. The performances for stability and design tests were equivalent for F19 and F23. The speed and calibration of the implicit model will help F23 access biophysical phenomena at long time and length scales and accelerate the membrane protein design pipeline.

18.
Front Bioinform ; 3: 1186531, 2023.
Article in English | MEDLINE | ID: mdl-37409346

ABSTRACT

Carbohydrates dynamically and transiently interact with proteins for cell-cell recognition, cellular differentiation, immune response, and many other cellular processes. Despite the molecular importance of these interactions, there are currently few reliable computational tools to predict potential carbohydrate-binding sites on any given protein. Here, we present two deep learning (DL) models named CArbohydrate-Protein interaction Site IdentiFier (CAPSIF) that predict non-covalent carbohydrate-binding sites on proteins: (1) a 3D-UNet voxel-based neural network model (CAPSIF:V) and (2) an equivariant graph neural network model (CAPSIF:G). While both models outperform previous surrogate methods used for carbohydrate-binding site prediction, CAPSIF:V performs better than CAPSIF:G, achieving test Dice scores of 0.597 and 0.543 and test set Matthews correlation coefficients (MCCs) of 0.599 and 0.538, respectively. We further tested CAPSIF:V on AlphaFold2-predicted protein structures. CAPSIF:V performed equivalently on both experimentally determined structures and AlphaFold2-predicted structures. Finally, we demonstrate how CAPSIF models can be used in conjunction with local glycan-docking protocols, such as GlycanDock, to predict bound protein-carbohydrate structures.
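
The two metrics reported in this abstract have standard definitions for binary per-residue labels, which the sketch below computes. It is not the CAPSIF evaluation code; the function names and the example labels are hypothetical, and how the authors aggregate scores across proteins is not stated in the abstract.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def dice_score(y_true, y_pred):
    """Dice coefficient: 2*TP / (2*TP + FP + FN); 0.0 if there are no positives."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

def matthews_corrcoef(y_true, y_pred):
    """MCC: (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 if undefined."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator if denominator else 0.0

if __name__ == "__main__":
    # 1 marks a carbohydrate-binding residue, 0 a non-binding residue.
    true_sites = [1, 1, 0, 0, 1, 0]
    predicted_sites = [1, 0, 0, 0, 1, 1]
    print(dice_score(true_sites, predicted_sites))
    print(matthews_corrcoef(true_sites, predicted_sites))
```

Dice ignores true negatives, while MCC uses all four counts, which is one reason both are commonly reported when binding residues are a small minority of the protein.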

19.
bioRxiv ; 2023 Jul 29.
Article in English | MEDLINE | ID: mdl-37503113

ABSTRACT

The optimal residue identity at each position in a protein is determined by its structural, evolutionary, and functional context. We seek to learn the representation space of the optimal amino-acid residue in different structural contexts in proteins. Inspired by masked language modeling (MLM), our training aims to transduce learning of amino-acid labels from non-masked residues to masked residues in their structural environments and from general (e.g., a residue in a protein) to specific contexts (e.g., a residue at the interface of a protein or antibody complex). Our results on native sequence recovery and forward folding with AlphaFold2 suggest that the amino acid label for a protein residue may be determined from its structural context alone (i.e., without knowledge of the sequence labels of surrounding residues). We further find that the sequence space sampled from our masked models recapitulates the evolutionary sequence neighborhood of the wildtype sequence. Remarkably, the sequences conditioned on highly plastic structures recapitulate the conformational flexibility encoded in the structures. Furthermore, maximum-likelihood interfaces designed with masked models recapitulate wildtype binding energies for a wide range of protein interfaces and binding strengths. We also propose and compare fine-tuning strategies to train models for designing CDR loops of antibodies in the structural context of the antibody-antigen interface by leveraging structural databases for proteins, antibodies (synthetic and experimental) and protein-protein complexes. We show that pretraining on more general contexts improves native sequence recovery for antibody CDR loops, especially for the hypervariable CDR H3, while fine-tuning helps to preserve patterns observed in special contexts.

20.
Article in English | MEDLINE | ID: mdl-37484815

ABSTRACT

Therapeutic antibody engineering seeks to identify antibody sequences with specific binding to a target and optimized drug-like properties. When guided by deep learning, antibody generation methods can draw on prior knowledge and experimental efforts to improve this process. By leveraging the increasing quantity and quality of predicted structures of antibodies and target antigens, powerful structure-based generative models are emerging. In this review, we tie the advancements in deep learning-based protein structure prediction and design to the study of antibody therapeutics.
