Results 1 - 20 of 199
1.
EClinicalMedicine ; 68: 102383, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38545090

ABSTRACT

Background: SARS-CoV-2 binding to ACE2 is potentially associated with severe pneumonia due to COVID-19. The aim of the study was to test whether Mas-receptor activation by 20-hydroxyecdysone (BIO101) could restore the Renin-Angiotensin System equilibrium and limit the frequency of respiratory failure and mortality in adults hospitalized with severe COVID-19. Methods: Double-blind, randomized, placebo-controlled phase 2/3 trial. Randomization: 1:1 oral BIO101 (350 mg BID) or placebo, up to 28 days or until an endpoint was reached. Primary endpoint: mortality or respiratory failure requiring high-flow oxygen, mechanical ventilation, or extra-corporeal membrane oxygenation. Key secondary endpoint: hospital discharge following recovery (ClinicalTrials.gov Number, NCT04472728). Findings: Due to low recruitment, the planned sample size of 310 was not reached, and 238 patients were randomized between August 26, 2020 and March 8, 2022. In the modified ITT population (233 patients; 126 BIO101 and 107 placebo), respiratory failure or early death by day 28 was 11.4% lower in the BIO101 (13.5%) than in the placebo (24.3%) group (p = 0.0426). At day 28, the proportions of patients discharged following recovery were 80.1% and 70.9% in the BIO101 and placebo groups, respectively (adjusted difference 11.0%, 95% CI [-0.4%, 22.4%], p = 0.0586). Hazard Ratio for time to death over 90 days: 0.554 (95% CI [0.285, 1.077]), a 44.6% mortality reduction in the BIO101 group (not statistically significant). Treatment-emergent adverse events of respiratory failure were more frequent in the placebo group. Interpretation: BIO101 significantly reduced the risk of death or respiratory failure, supporting its use in adults hospitalized with severe respiratory symptoms due to COVID-19. Funding: Biophytis.
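
The 44.6% mortality reduction quoted above is consistent with reading the hazard ratio as (1 - HR) x 100, and the "not statistically significant" label is consistent with a 95% CI that includes 1. A minimal Python sketch of that arithmetic follows; the function names are illustrative and are not taken from the trial report.

```python
def relative_reduction_percent(hazard_ratio: float) -> float:
    """Relative risk reduction implied by a hazard ratio, in percent."""
    return (1.0 - hazard_ratio) * 100.0


def ci_includes_no_effect(ci_low: float, ci_high: float) -> bool:
    """True when the confidence interval for a ratio contains 1 (no effect)."""
    return ci_low <= 1.0 <= ci_high


# Values quoted in the abstract: HR 0.554, 95% CI [0.285, 1.077].
print(round(relative_reduction_percent(0.554), 1))  # 44.6
print(ci_includes_no_effect(0.285, 1.077))          # True
```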

2.
J Agric Food Chem ; 72(8): 4225-4236, 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-38354215

ABSTRACT

GH 62 arabinofuranosidases are known for their excellent specificity for arabinoxylan of agroindustrial residues and their synergism with endoxylanases and other hemicellulases. However, the low thermostability of some GH enzymes hampers potential industrial applications. Protein engineering research highly desires mutations that can enhance thermostability. Therefore, we employed directed evolution using one round of error-prone PCR and site-saturation mutagenesis for thermostability enhancement of GH 62 arabinofuranosidase from Aspergillus fumigatus. Single mutants with enhanced thermostability showed significant ΔΔG changes (<-2.5 kcal/mol) and improvements in perplexity scores from evolutionary scale modeling inverse folding. The best mutant, G205K, increased the melting temperature by 5 °C and the energy of denaturation by 41.3%. We discussed the functional mechanisms for improved stability. Analyzing the adjustments in α-helices, β-sheets, and loops resulting from point mutations, we have obtained significant knowledge regarding the potential impacts on protein stability, folding, and overall structural integrity.


Subject(s)
Glycoside Hydrolases , Protein Engineering , Enzyme Stability , Temperature , Mutagenesis
3.
PLoS Comput Biol ; 20(1): e1011296, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38252688

ABSTRACT

Membrane protein structure prediction and design are challenging due to the complexity of capturing the interactions in the lipid layer, such as those arising from electrostatics. Accurately capturing electrostatic energies in the low-dielectric membrane often requires expensive Poisson-Boltzmann calculations that are not scalable for membrane protein structure prediction and design. In this work, we have developed a fast-to-compute implicit energy function that considers the realistic characteristics of different lipid bilayers, making design calculations tractable. This method captures the impact of the lipid head group using a mean-field-based approach and uses a depth-dependent dielectric constant to characterize the membrane environment. This energy function Franklin2023 (F23) is built upon Franklin2019 (F19), which is based on experimentally derived hydrophobicity scales in the membrane bilayer. We evaluated the performance of F23 on five different tests probing (1) protein orientation in the bilayer, (2) stability, and (3) sequence recovery. Relative to F19, F23 has improved the calculation of the tilt angle of membrane proteins for 90% of WALP peptides, 15% of TM-peptides, and 25% of the adsorbed peptides. The performances for stability and design tests were equivalent for F19 and F23. The speed and calibration of the implicit model will help F23 access biophysical phenomena at long time and length scales and accelerate the membrane protein design pipeline.


Subject(s)
Lipid Bilayers , Membrane Proteins , Static Electricity , Lipid Bilayers/chemistry , Membrane Proteins/chemistry , Biophysical Phenomena , Peptides
4.
Protein Sci ; 33(2): e4862, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38148272

ABSTRACT

Conventional protein-protein docking algorithms usually rely on heavy candidate sampling and reranking, but these steps are time-consuming and hinder applications that require high-throughput complex structure prediction, for example, structure-based virtual screening. Existing deep learning methods for protein-protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem to assume no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications when binding-induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multitrack iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments, GeoDock inputs just the sequences and structures of the docking partners, which suits the tasks when the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. On the Database of Interacting Protein Structures (DIPS) test set, GeoDock achieves a 43% top-1 success rate, outperforming all other tested methods. However, in the standard DIPS train/test splits, we discovered contamination of close homologs in the training set. After decontaminating the training set, the success rate is 31%. On the DB5.5 test set and a benchmark dataset of antibody-antigen complexes, GeoDock outperforms the deep learning models trained using the same dataset but falls behind most of the conventional methods and AlphaFold-Multimer. GeoDock attains an average inference speed of under 1 s on a single GPU, enabling its application in large-scale structure screening. 
Although binding-induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture sets up the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.


Subject(s)
Algorithms , Proteins , Salicylates , Protein Conformation , Protein Binding , Proteins/chemistry , Molecular Docking Simulation
5.
Cell Syst ; 14(11): 979-989.e4, 2023 Nov 15.
Article in English | MEDLINE | ID: mdl-37909045

ABSTRACT

Discovery and optimization of monoclonal antibodies for therapeutic applications relies on large sequence libraries but is hindered by developability issues such as low solubility, high aggregation, and high immunogenicity. Generative language models, trained on millions of protein sequences, are a powerful tool for the on-demand generation of realistic, diverse sequences. We present the Immunoglobulin Language Model (IgLM), a deep generative language model for creating synthetic antibody libraries. Compared with prior methods that leverage unidirectional context for sequence generation, IgLM formulates antibody design based on text-infilling in natural language, allowing it to re-design variable-length spans within antibody sequences using bidirectional context. We trained IgLM on 558 million (M) antibody heavy- and light-chain variable sequences, conditioning on each sequence's chain type and species of origin. We demonstrate that IgLM can generate full-length antibody sequences from a variety of species and its infilling formulation allows it to generate infilled complementarity-determining region (CDR) loop libraries with improved in silico developability profiles. A record of this paper's transparent peer review process is included in the supplemental information.


Subject(s)
Complementarity Determining Regions , Peptide Library , Amino Acid Sequence , Complementarity Determining Regions/genetics , Antibodies, Monoclonal
6.
Proteins ; 91(12): 1658-1683, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37905971

ABSTRACT

We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homodimers, 3 homo-trimers, 13 heterodimers including 3 antibody-antigen complexes, and 7 large assemblies. On average ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21 941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their five best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier. This remarkable improvement is due to the wide use of the AlphaFold2 and AlphaFold2-Multimer software and the confidence metrics they provide. Notably, expanded sampling of candidate solutions by manipulating these deep learning inference engines, enriching multiple sequence alignments, or integration of advanced modeling tools, enabled top performing groups to exceed the performance of a standard AlphaFold2-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.


Subject(s)
Algorithms , Protein Interaction Mapping , Protein Interaction Mapping/methods , Protein Conformation , Protein Binding , Molecular Docking Simulation , Computational Biology/methods , Software
8.
bioRxiv ; 2023 Nov 25.
Article in English | MEDLINE | ID: mdl-37546760

ABSTRACT

Despite the recent breakthrough of AlphaFold (AF) in the field of protein sequence-to-structure prediction, modeling protein interfaces and predicting protein complex structures remains challenging, especially when there is a significant conformational change in one or both binding partners. Prior studies have demonstrated that AF-multimer (AFm) can predict accurate protein complexes in only up to 43% of cases [1]. In this work, we combine AlphaFold as a structural template generator with a physics-based replica exchange docking algorithm. Using a curated collection of 254 available protein targets with both unbound and bound structures, we first demonstrate that AlphaFold confidence measures (pLDDT) can be repurposed for estimating protein flexibility and docking accuracy for multimers. We incorporate these metrics within our ReplicaDock 2.0 protocol [2] to complete a robust in-silico pipeline for accurate protein complex structure prediction. AlphaRED (AlphaFold-initiated Replica Exchange Docking) successfully docks failed AF predictions including 97 failure cases in Docking Benchmark Set 5.5. AlphaRED generates CAPRI acceptable-quality or better predictions for 66% of benchmark targets. Further, on a subset of antigen-antibody targets, which is challenging for AFm (19% success rate), AlphaRED demonstrates a success rate of 51%. This new strategy demonstrates the success possible by integrating deep-learning based architectures trained on evolutionary information with physics-based enhanced sampling. The pipeline is available at github.com/Graylab/AlphaRED.

9.
Front Bioinform ; 3: 1186531, 2023.
Article in English | MEDLINE | ID: mdl-37409346

ABSTRACT

Carbohydrates dynamically and transiently interact with proteins for cell-cell recognition, cellular differentiation, immune response, and many other cellular processes. Despite the molecular importance of these interactions, there are currently few reliable computational tools to predict potential carbohydrate-binding sites on any given protein. Here, we present two deep learning (DL) models named CArbohydrate-Protein interaction Site IdentiFier (CAPSIF) that predicts non-covalent carbohydrate-binding sites on proteins: (1) a 3D-UNet voxel-based neural network model (CAPSIF:V) and (2) an equivariant graph neural network model (CAPSIF:G). While both models outperform previous surrogate methods used for carbohydrate-binding site prediction, CAPSIF:V performs better than CAPSIF:G, achieving test Dice scores of 0.597 and 0.543 and test set Matthews correlation coefficients (MCCs) of 0.599 and 0.538, respectively. We further tested CAPSIF:V on AlphaFold2-predicted protein structures. CAPSIF:V performed equivalently on both experimentally determined structures and AlphaFold2-predicted structures. Finally, we demonstrate how CAPSIF models can be used in conjunction with local glycan-docking protocols, such as GlycanDock, to predict bound protein-carbohydrate structures.
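
The Dice score and Matthews correlation coefficient (MCC) reported above are standard measures for binary per-residue predictions. The sketch below shows their textbook definitions in plain Python; the toy labels are made up for illustration, and this is not the CAPSIF evaluation code.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def dice_score(y_true, y_pred):
    """Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 when both sets are empty."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def mcc(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 if undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels: 1 = residue is part of a carbohydrate-binding site.
true_sites = [1, 1, 0, 0, 1, 0]
pred_sites = [1, 0, 0, 0, 1, 1]
print(round(dice_score(true_sites, pred_sites), 3))  # 0.667
print(round(mcc(true_sites, pred_sites), 3))         # 0.333
```

Dice ignores true negatives, so it is not inflated by the many non-binding residues in a protein, while MCC uses all four confusion-matrix cells; reporting both, as the abstract does, gives complementary views of the same predictions.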

10.
bioRxiv ; 2023 Jul 29.
Article in English | MEDLINE | ID: mdl-37503113

ABSTRACT

The optimal residue identity at each position in a protein is determined by its structural, evolutionary, and functional context. We seek to learn the representation space of the optimal amino-acid residue in different structural contexts in proteins. Inspired by masked language modeling (MLM), our training aims to transduce learning of amino-acid labels from non-masked residues to masked residues in their structural environments and from general (e.g., a residue in a protein) to specific contexts (e.g., a residue at the interface of a protein or antibody complex). Our results on native sequence recovery and forward folding with AlphaFold2 suggest that the amino acid label for a protein residue may be determined from its structural context alone (i.e., without knowledge of the sequence labels of surrounding residues). We further find that the sequence space sampled from our masked models recapitulate the evolutionary sequence neighborhood of the wildtype sequence. Remarkably, the sequences conditioned on highly plastic structures recapitulate the conformational flexibility encoded in the structures. Furthermore, maximum-likelihood interfaces designed with masked models recapitulate wildtype binding energies for a wide range of protein interfaces and binding strengths. We also propose and compare fine-tuning strategies to train models for designing CDR loops of antibodies in the structural context of the antibody-antigen interface by leveraging structural databases for proteins, antibodies (synthetic and experimental) and protein-protein complexes. We show that pretraining on more general contexts improves native sequence recovery for antibody CDR loops, especially for the hypervariable CDR H3, while fine-tuning helps to preserve patterns observed in special contexts.

11.
bioRxiv ; 2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37425754

ABSTRACT

Conventional protein-protein docking algorithms usually rely on heavy candidate sampling and re-ranking, but these steps are time-consuming and hinder applications that require high-throughput complex structure prediction, e.g., structure-based virtual screening. Existing deep learning methods for protein-protein docking, despite being much faster, suffer from low docking success rates. In addition, they simplify the problem to assume no conformational changes within any protein upon binding (rigid docking). This assumption precludes applications when binding-induced conformational changes play a role, such as allosteric inhibition or docking from uncertain unbound model structures. To address these limitations, we present GeoDock, a multi-track iterative transformer network to predict a docked structure from separate docking partners. Unlike deep learning models for protein structure prediction that input multiple sequence alignments (MSAs), GeoDock inputs just the sequences and structures of the docking partners, which suits the tasks when the individual structures are given. GeoDock is flexible at the protein residue level, allowing the prediction of conformational changes upon binding. For a benchmark set of rigid targets, GeoDock obtains a 41% success rate, outperforming all the other tested methods. For a more challenging benchmark set of flexible targets, GeoDock achieves a similar number of top-model successes as the traditional method ClusPro [1], but fewer than ReplicaDock2 [2]. GeoDock attains an average inference speed of under one second on a single GPU, enabling its application in large-scale structure screening. Although binding-induced conformational changes are still a challenge owing to limited training and evaluation data, our architecture sets up the foundation to capture this backbone flexibility. Code and a demonstration Jupyter notebook are available at https://github.com/Graylab/GeoDock.

12.
bioRxiv ; 2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37425950

ABSTRACT

Membrane protein structure prediction and design are challenging due to the complexity of capturing the interactions in the lipid layer, such as those arising from electrostatics. Accurately capturing electrostatic energies in the low-dielectric membrane often requires expensive Poisson-Boltzmann calculations that are not scalable for membrane protein structure prediction and design. In this work, we have developed a fast-to-compute implicit energy function that considers the realistic characteristics of different lipid bilayers, making design calculations tractable. This method captures the impact of the lipid head group using a mean-field-based approach and uses a depth-dependent dielectric constant to characterize the membrane environment. This energy function Franklin2023 (F23) is built upon Franklin2019 (F19), which is based on experimentally derived hydrophobicity scales in the membrane bilayer. We evaluated the performance of F23 on five different tests probing (1) protein orientation in the bilayer, (2) stability, and (3) sequence recovery. Relative to F19, F23 has improved the calculation of the tilt angle of membrane proteins for 90% of WALP peptides, 15% of TM-peptides, and 25% of the adsorbed peptides. The performances for stability and design tests were equivalent for F19 and F23. The speed and calibration of the implicit model will help F23 access biophysical phenomena at long time and length scales and accelerate the membrane protein design pipeline.

13.
Article in English | MEDLINE | ID: mdl-37484815

ABSTRACT

Therapeutic antibody engineering seeks to identify antibody sequences with specific binding to a target and optimized drug-like properties. When guided by deep learning, antibody generation methods can draw on prior knowledge and experimental efforts to improve this process. By leveraging the increasing quantity and quality of predicted structures of antibodies and target antigens, powerful structure-based generative models are emerging. In this review, we tie the advancements in deep learning-based protein structure prediction and design to the study of antibody therapeutics.

14.
Stem Cells Transl Med ; 12(7): 444-458, 2023 Jul 14.
Article in English | MEDLINE | ID: mdl-37311043

ABSTRACT

Primary and metastatic lung cancer is a leading cause of cancer-related death and novel therapies are urgently needed. Epidermal growth factor receptor (EGFR) and death receptor (DR) 4/5 are both highly expressed in primary and metastatic non-small cell lung cancer (NSCLC); however, targeting these receptors individually has demonstrated limited therapeutic benefit in patients. In this study, we created and characterized diagnostic and therapeutic stem cells (SC), expressing EGFR-targeted nanobody (EV) fused to the extracellular domain of death DR4/5 ligand (DRL) (EVDRL) that simultaneously targets EGFR and DR4/5, in primary and metastatic NSCLC tumor models. We show that EVDRL targets both cell surface receptors, and induces caspase-mediated apoptosis in a broad spectrum of NSCLC cell lines. Utilizing real-time dual imaging and correlative immunohistochemistry, we show that allogeneic SCs home to tumors and when engineered to express EVDRL, alleviate tumor burden and significantly increase survival in primary and brain metastatic NSCLC. This study reports mechanistic insights into simultaneous targeting of EGFR- and DR4/5 in lung tumors and presents a promising approach for translation into the clinical setting.


Subject(s)
Brain Neoplasms , Carcinoma, Non-Small-Cell Lung , Hematopoietic Stem Cell Transplantation , Lung Neoplasms , Humans , Lung Neoplasms/drug therapy , Lung Neoplasms/metabolism , Carcinoma, Non-Small-Cell Lung/drug therapy , Carcinoma, Non-Small-Cell Lung/metabolism , ErbB Receptors/genetics , ErbB Receptors/metabolism , ErbB Receptors/therapeutic use , Cell Death , Brain Neoplasms/therapy , Cell Proliferation , Brain/pathology
15.
Nat Commun ; 14(1): 2389, 2023 Apr 25.
Article in English | MEDLINE | ID: mdl-37185622

ABSTRACT

Antibodies have the capacity to bind a diverse set of antigens, and they have become critical therapeutics and diagnostic molecules. The binding of antibodies is facilitated by a set of six hypervariable loops that are diversified through genetic recombination and mutation. Even with recent advances, accurate structural prediction of these loops remains a challenge. Here, we present IgFold, a fast deep learning method for antibody structure prediction. IgFold consists of a pre-trained language model trained on 558 million natural antibody sequences followed by graph networks that directly predict backbone atom coordinates. IgFold predicts structures of similar or better quality than alternative methods (including AlphaFold) in significantly less time (under 25 s). Accurate structure prediction on this timescale makes possible avenues of investigation that were previously infeasible. As a demonstration of IgFold's capabilities, we predicted structures for 1.4 million paired antibody sequences, providing structural insights to 500-fold more antibodies than have experimentally determined structures.


Subject(s)
Deep Learning , Protein Conformation , Antibodies/chemistry , Complementarity Determining Regions/chemistry , Antigens
16.
Heliyon ; 9(4): e15032, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37035348

ABSTRACT

The human infectious disease COVID-19 caused by the SARS-CoV-2 virus has become a major threat to global public health. Developing a vaccine is the preferred prophylactic response to epidemics and pandemics. However, for individuals who have contracted the disease, the rapid design of antibodies that can target the SARS-CoV-2 virus fulfils a critical need. Further, discovering antibodies that bind multiple variants of SARS-CoV-2 can aid in the development of rapid antigen tests (RATs), which are critical for the identification and isolation of individuals currently carrying COVID-19. Here we provide a proof-of-concept study for the computational design of high-affinity antibodies that bind to multiple variants of the SARS-CoV-2 spike protein using RosettaAntibodyDesign (RAbD). Well-characterized antibodies that bind with high affinity to the SARS-CoV-1 (but not SARS-CoV-2) spike protein were used as templates and re-designed to bind the SARS-CoV-2 spike protein with high affinity, resulting in a specificity switch. A panel of designed antibodies was experimentally validated. One design bound to a broad range of variants of concern including the Omicron, Delta, Wuhan, and South African spike protein variants.

17.
CBE Life Sci Educ ; 22(2): ar25, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37058442

ABSTRACT

In-person undergraduate research experiences (UREs) promote students' integration into careers in life science research. In 2020, the COVID-19 pandemic prompted institutions hosting summer URE programs to offer them remotely, raising questions about whether undergraduates who participate in remote research can experience scientific integration and whether they might perceive doing research less favorably (i.e., not beneficial or too costly). To address these questions, we examined indicators of scientific integration and perceptions of the benefits and costs of doing research among students who participated in remote life science URE programs in Summer 2020. We found that students experienced gains in scientific self-efficacy pre- to post-URE, similar to results reported for in-person UREs. We also found that students experienced gains in scientific identity, graduate and career intentions, and perceptions of the benefits of doing research only if they started their remote UREs at lower levels on these variables. Collectively, students did not change in their perceptions of the costs of doing research despite the challenges of working remotely. Yet students who started with low cost perceptions increased in these perceptions. These findings indicate that remote UREs can support students' self-efficacy development, but may otherwise be limited in their potential to promote scientific integration.


Subject(s)
COVID-19 , Students , Humans , Pandemics
18.
bioRxiv ; 2023 Mar 15.
Article in English | MEDLINE | ID: mdl-36993750

ABSTRACT

Carbohydrates dynamically and transiently interact with proteins for cell-cell recognition, cellular differentiation, immune response, and many other cellular processes. Despite the molecular importance of these interactions, there are currently few reliable computational tools to predict potential carbohydrate binding sites on any given protein. Here, we present two deep learning models named CArbohydrate-Protein interaction Site IdentiFier (CAPSIF) that predict carbohydrate binding sites on proteins: (1) a 3D-UNet voxel-based neural network model (CAPSIF:V) and (2) an equivariant graph neural network model (CAPSIF:G). While both models outperform previous surrogate methods used for carbohydrate binding site prediction, CAPSIF:V performs better than CAPSIF:G, achieving test Dice scores of 0.597 and 0.543 and test set Matthews correlation coefficients (MCCs) of 0.599 and 0.538, respectively. We further tested CAPSIF:V on AlphaFold2-predicted protein structures. CAPSIF:V performed equivalently on both experimentally determined structures and AlphaFold2 predicted structures. Finally, we demonstrate how CAPSIF models can be used in conjunction with local glycan-docking protocols, such as GlycanDock, to predict bound protein-carbohydrate structures.

19.
MAbs ; 15(1): 2163584, 2023.
Article in English | MEDLINE | ID: mdl-36683173

ABSTRACT

Over the last three decades, the appeal of monoclonal antibodies (mAbs) as therapeutics has been steadily increasing, as evidenced by the FDA's recent landmark approval of the 100th mAb. Unlike mAbs that bind to single targets, multispecific biologics (msAbs) have garnered particular interest owing to the advantage of engaging distinct targets. One important modular component of msAbs is the single-chain variable fragment (scFv). Despite the exquisite specificity and affinity of these scFv modules, their relatively poor thermostability often hampers their development as a potential therapeutic drug. In recent years, engineering antibody sequences to enhance their stability by mutations has gained considerable momentum. As experimental methods for antibody engineering are time-intensive, laborious and expensive, computational methods serve as a fast and inexpensive alternative to conventional routes. In this work, we show two machine learning approaches to better classify thermostable scFv variants from sequence: one using pre-trained language models (PTLM) that capture functional effects of sequence variation, and the other a supervised convolutional neural network (CNN) trained with Rosetta energetic features. Both of these models are trained over temperature-specific data (TS50 measurements) derived from multiple libraries of scFv sequences. On out-of-distribution sequences (that is, sequences to which the algorithm is blind), we show that a sufficiently simple CNN model performs better than general pre-trained language models trained on diverse protein sequences (average Spearman correlation coefficient, ρ, of 0.4 as opposed to 0.15). On the other hand, an antibody-specific language model performs comparatively better than the CNN model on the same task (ρ = 0.52).
Further, we demonstrate that for an independent mAb with available thermal melting temperatures for 20 experimentally characterized thermostable mutations, these models trained on TS50 data could identify 18 residue positions and 5 identical amino-acid mutations showing remarkable generalizability. Our results suggest that such models can be broadly applicable for improving the biological characteristics of antibodies. Further, transferring such models for alternative physicochemical properties of scFvs can have potential applications in optimizing large-scale production and delivery of mAbs or bsAbs.
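
The models above are compared by Spearman correlation coefficient (ρ), which is the Pearson correlation computed on ranks. A minimal plain-Python sketch of that definition follows, using average ranks for ties; the thermostability values are invented for illustration and are not data from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN when either input has zero variance."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom else float("nan")


def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical measured TS50 values and model scores for five variants.
measured = [55.0, 58.5, 61.0, 63.2, 66.1]
predicted = [0.10, 0.35, 0.30, 0.70, 0.90]
print(spearman(measured, predicted))
```

Because only ranks matter, ρ rewards a model for ordering variants correctly even when its scores are on a different scale from the measured temperatures.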


Subject(s)
Antibodies, Monoclonal , Single-Chain Antibodies , Amino Acid Sequence , Machine Learning , Algorithms
20.
Proteins ; 91(2): 196-208, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36111441

ABSTRACT

The continued emergence of new SARS-CoV-2 variants has accentuated the growing need for fast and reliable methods for the design of potentially neutralizing antibodies (Abs) to counter immune evasion by the virus. Here, we report on the de novo computational design of high-affinity Ab variable regions (Fv) through the recombination of VDJ genes targeting the most solvent-exposed hACE2-binding residues of the SARS-CoV-2 spike receptor binding domain (RBD) protein using the software tool OptMAVEn-2.0. Subsequently, we carried out computational affinity maturation of the designed variable regions through amino acid substitutions for improved binding with the target epitope. Immunogenicity of designs was restricted by preferring designs that match sequences from a 9-mer library of "human Abs" based on a human string content score. We generated 106 different antibody designs and reported in detail on the top five that trade-off the greatest computational binding affinity for the RBD with human string content scores. We further describe computational evaluation of the top five designs produced by OptMAVEn-2.0 using a Rosetta-based approach. We used Rosetta SnugDock for local docking of the designs to evaluate their potential to bind the spike RBD and performed "forward folding" with DeepAb to assess their potential to fold into the designed structures. Ultimately, our results identified one designed Ab variable region, P1.D1, as a particularly promising candidate for experimental testing. This effort puts forth a computational workflow for the de novo design and evaluation of Abs that can quickly be adapted to target spike epitopes of emerging SARS-CoV-2 variants or other antigenic targets.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/metabolism , Angiotensin-Converting Enzyme 2/metabolism , Antibodies, Neutralizing , Epitopes/chemistry , Immunoglobulin Variable Region , Spike Glycoprotein, Coronavirus/metabolism , Antibodies, Viral/metabolism