Results 1 - 4 of 4
1.
Bioinformatics; 37(20): 3421-3427, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33974039

ABSTRACT

MOTIVATION: Antibodies play an important role in clinical research and biotechnology, with their specificity determined by the interaction with the antigen's epitope region, a special type of protein-protein interaction (PPI) interface. The ubiquitous availability of sequence data allows us to predict epitopes from sequence, in order to focus time-consuming wet-lab experiments on the most promising epitope regions. Here, we extend our previously developed sequence-based predictors for homodimer and heterodimer PPI interfaces to predict epitope residues that have the potential to bind an antibody. RESULTS: We collected and curated a high-quality epitope dataset from the SAbDab database. Our generic PPI heterodimer predictor obtained an AUC-ROC of 0.666 when evaluated on the epitope test set. We then trained a random forest model specifically on the epitope dataset, reaching AUC 0.694. Further training on the combined heterodimer and epitope datasets improves our final predictor to AUC 0.703 on the epitope test set. This is better than BepiPred-2.0, the best state-of-the-art sequence-based epitope predictor. On one solved antibody-antigen structure of the COVID-19 virus spike receptor-binding domain, our predictor reaches AUC 0.778. We added the SeRenDIP-CE Conformational Epitope predictors to our webserver, which is simple to use and requires only a single antigen sequence as input; this makes the method immediately applicable in a wide range of biomedical and biomolecular research. AVAILABILITY AND IMPLEMENTATION: Webserver, source code and datasets at www.ibi.vu.nl/programs/serendipwww/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
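The AUC-ROC figures quoted above (0.666, 0.694, 0.703, 0.778) score how well per-residue predictions rank true epitope residues above non-epitope ones. As a minimal, self-contained sketch of that evaluation metric (not the paper's own code; the toy labels and scores below are invented), the area can be computed from the rank-sum (Mann-Whitney U) statistic:

```python
def auc_roc(labels, scores):
    """AUC-ROC via the rank-sum (Mann-Whitney U) statistic.

    labels: 0/1 per-residue epitope annotations; scores: predictor outputs.
    Equals the probability that a random epitope residue is scored above a
    random non-epitope residue (ties counted as half).
    """
    pairs = sorted(zip(scores, labels))          # sort residues by score
    n = len(pairs)
    rank_of = [0.0] * n
    i = 0
    while i < n:                                 # average ranks over tied scores
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1               # 1-based average rank
        for k in range(i, j + 1):
            rank_of[k] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, (_, lab) in zip(rank_of, pairs) if lab == 1)
    n_pos = sum(lab for _, lab in pairs)
    n_neg = n - n_pos
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

The rank-based form avoids enumerating thresholds and handles tied scores, which are common in tree-ensemble outputs such as random forests.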

2.
Front Immunol; 14: 1231623, 2023.
Article in English | MEDLINE | ID: mdl-37533864

ABSTRACT

Antibodies are the largest class of biotherapeutics. However, in recent years, single-domain antibodies have gained traction due to their smaller size and comparable binding affinity. Antibodies (Abs) and single-domain antibodies (sdAbs) differ in the structures of their binding sites: most significantly, single-domain antibodies lack a light chain and so have just three CDR loops. Given this inherent structural difference, it is important to understand whether Abs and sdAbs are distinguishable in how they engage a binding partner, and thus whether they are suited to different types of epitopes. In this study, we use non-redundant sequence and structural datasets to compare the paratopes, epitopes and antigen interactions of Abs and sdAbs. We demonstrate that even though sdAbs have smaller paratopes, they target epitopes of equal size to those targeted by Abs. To achieve this, the paratopes of sdAbs contribute more interactions per residue than the paratopes of Abs. Additionally, we find that conserved framework residues are of increased importance in the paratopes of sdAbs, suggesting that they include non-specific interactions to achieve comparable affinity. Furthermore, the epitopes of sdAbs are only marginally less accessible than those of Abs: we posit that this may be explained by differences in the orientation and compaction of sdAb and Ab CDR-H3 loops. Overall, our results have important implications for the engineering and humanization of sdAbs, as well as the selection of the best modality for targeting a particular epitope.
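The paratope-size, epitope-size, and interactions-per-residue comparisons above reduce to simple counts over interface contact pairs. A minimal sketch, assuming contacts are given as (paratope residue, epitope residue) pairs extracted from a structure; the residue identifiers and helper name are illustrative, not from the study:

```python
from collections import Counter

def interface_stats(contacts):
    """Summarize an antibody-antigen interface from contact pairs.

    contacts: iterable of (paratope_residue, epitope_residue) pairs.
    Returns (paratope size, epitope size, mean contacts per paratope
    residue), i.e. the quantities compared between Abs and sdAbs above.
    """
    contacts = list(contacts)
    per_residue = Counter(p for p, _ in contacts)   # contacts per paratope residue
    epitope = {e for _, e in contacts}              # distinct antigen residues hit
    mean_contacts = sum(per_residue.values()) / len(per_residue)
    return len(per_residue), len(epitope), mean_contacts

# A small sdAb-like toy interface: few paratope residues, many contacts each.
stats = interface_stats([("H1", "A5"), ("H1", "A6"), ("H2", "A6"),
                         ("H2", "A7"), ("H2", "A8")])
print(stats)  # → (2, 4, 2.5)
```

Here two paratope residues cover four epitope residues at 2.5 contacts each, the pattern the study reports for sdAbs: a smaller paratope compensating with denser per-residue interactions.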


Subjects
Single-Domain Antibodies, Antibodies, Binding Sites, Epitopes, Antigens
3.
Sci Rep; 12(1): 10487, 2022 Jun 21.
Article in English | MEDLINE | ID: mdl-35729253

ABSTRACT

Protein-protein interactions (PPIs) are crucial for protein function; nevertheless, predicting the residues in PPI interfaces from the protein sequence remains a challenging problem. In addition, structure-based functional annotations, such as PPI interface annotations, are scarce: residue-based PPI interface annotations are available for only about one-third of all protein structures. If we want to use a deep learning strategy, we have to overcome this problem of limited data availability. Here we use a multi-task learning strategy that can handle missing data. We start from a multi-task model architecture and adapt it to carefully handle missing data in the cost function. As related learning tasks we include prediction of secondary structure, solvent accessibility, and buried residues. Our results show that the multi-task learning strategy significantly outperforms single-task approaches. Moreover, only the multi-task strategy is able to effectively learn over a dataset extended with structural feature data, without additional PPI annotations. The multi-task setup becomes even more important if the fraction of PPI annotations becomes very small: the multi-task learner trained on only one-eighth of the PPI annotations, with data extension, reaches the same performance as the single-task learner trained on all PPI annotations. Thus, we show that a multi-task learning strategy can be beneficial for a small training dataset in which the protein's functional properties of interest are only partially annotated.
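The key adaptation described above is a cost function that simply skips residues whose annotation for a given task is missing, so partially annotated proteins still contribute gradient through the other tasks. A minimal sketch in plain Python (binary cross-entropy per task, `None` marking a missing label; the task names and weights are illustrative, not the paper's):

```python
import math

def masked_multitask_loss(predictions, labels, weights):
    """Weighted sum of per-task binary cross-entropies, ignoring missing labels.

    predictions/labels: dicts of task name -> per-residue lists. A label of
    None means that residue has no annotation for that task and contributes
    nothing to the cost, so the shared layers still learn from the other
    tasks on that protein.
    """
    total = 0.0
    for task, weight in weights.items():
        terms = [-(y * math.log(p) + (1 - y) * math.log(1 - p))
                 for p, y in zip(predictions[task], labels[task])
                 if y is not None]              # drop unannotated residues
        if terms:                               # a task may be entirely unannotated
            total += weight * sum(terms) / len(terms)
    return total

preds = {"ppi": [0.9, 0.5], "ss": [0.8, 0.2]}
labels = {"ppi": [1, None], "ss": [1, 0]}       # residue 2 has no PPI annotation
loss = masked_multitask_loss(preds, labels, {"ppi": 1.0, "ss": 1.0})
```

In a real network the same masking is applied to the gradient, which is what lets the model train on structures that carry secondary-structure and accessibility labels but no PPI interface labels.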


Subjects
Algorithms, Proteins, Proteins/metabolism
4.
Sci Rep; 12(1): 16047, 2022 Sep 26.
Article in English | MEDLINE | ID: mdl-36163232

ABSTRACT

Self-supervised language modeling is a rapidly developing approach for the analysis of protein sequence data. However, work in this area is heterogeneous and diverse, making comparison of models and methods difficult. Moreover, models are often evaluated on only one or two downstream tasks, making it unclear whether they capture generally useful properties. We introduce the ProteinGLUE benchmark for the evaluation of protein representations: a set of seven per-amino-acid tasks for evaluating learned protein representations. We also offer reference code and provide two baseline models, with hyperparameters specifically trained for these benchmarks. Pre-training was done on two tasks: masked symbol prediction and next sentence prediction. We show that pre-training yields higher performance on a variety of downstream tasks, such as secondary structure and protein interaction interface prediction, compared to no pre-training. However, the larger base model does not outperform the smaller medium model. We expect the ProteinGLUE benchmark dataset introduced here, together with the two baseline pre-trained models and their performance evaluations, to be of great value to the field of protein sequence-based property prediction. Availability: code and datasets at https://github.com/ibivu/protein-glue.
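Masked symbol prediction, one of the two pre-training tasks named above, corrupts a fraction of the input residues and trains the model to recover them, the protein analogue of BERT's masked-token objective. A minimal sketch of the corruption step (the 15% rate and `<mask>` token follow common BERT-style practice; the paper's exact scheme may differ):

```python
import random

MASK = "<mask>"

def mask_sequence(seq, rate=0.15, rng=None):
    """Corrupt a protein sequence for masked symbol prediction.

    Returns (token list with masked positions, dict of position -> original
    residue). The model is trained to predict the dict values from the
    corrupted tokens; a real pipeline would batch and encode these.
    """
    rng = rng or random.Random(0)           # fixed seed for reproducibility
    tokens = list(seq)
    n_mask = max(1, round(rate * len(tokens)))
    positions = rng.sample(range(len(tokens)), n_mask)
    targets = {i: tokens[i] for i in positions}
    for i in positions:
        tokens[i] = MASK
    return tokens, targets

tokens, targets = mask_sequence("MKTAYIAKQR")   # toy 10-residue sequence
```

Because the targets are the original residues themselves, the objective needs no manual labels, which is what makes pre-training on large unannotated sequence collections possible.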


Subjects
Benchmarking, Proteins, Amino Acid Sequence, Amino Acids/chemistry, Natural Language Processing