Improving antibody language models with native pairing.
Patterns (N Y); 5(5): 100967, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38800360
ABSTRACT
Existing antibody language models are limited by their use of unpaired antibody sequence data. A recently published dataset of ~1.6 × 10⁶ natively paired human antibody sequences offers a unique opportunity to evaluate how antibody language models are improved by training with native pairs. We trained three baseline antibody language models (BALM), using natively paired (BALM-paired), randomly paired (BALM-shuffled), or unpaired (BALM-unpaired) sequences from this dataset. To address the paucity of paired sequences, we additionally fine-tuned ESM (evolutionary scale modeling)-2 with natively paired antibody sequences (ft-ESM). We provide evidence that training with native pairs allows the model to learn immunologically relevant features that span the light and heavy chains, which cannot be simulated by training with random pairs. We additionally show that training with native pairs improves model performance on a variety of metrics, including the ability of the model to classify antibodies by pathogen specificity.
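The three training regimes described above (native pairs, random pairs, unpaired chains) can be sketched as a small data-preparation step. This is a minimal illustration, not the paper's actual preprocessing: the separator token `<sep>` and the function name `build_corpora` are assumptions for the example, and the published models were trained on far larger corpora with their own tokenization.

```python
import random

def build_corpora(pairs, seed=0):
    """Build illustrative corpora for the three training regimes.

    `pairs` is a list of (heavy_chain, light_chain) amino-acid strings.
    The `<sep>` joining token is a hypothetical choice for this sketch.
    """
    rng = random.Random(seed)

    # Natively paired: heavy and light chains from the same B cell,
    # presented together so the model can learn cross-chain features.
    paired = [f"{h}<sep>{l}" for h, l in pairs]

    # Randomly paired: identical sequence content, but light chains are
    # shuffled across cells, destroying the native cross-chain signal.
    lights = [l for _, l in pairs]
    rng.shuffle(lights)
    shuffled = [f"{h}<sep>{l}" for (h, _), l in zip(pairs, lights)]

    # Unpaired: each chain is an independent training example.
    unpaired = [h for h, _ in pairs] + [l for _, l in pairs]

    return paired, shuffled, unpaired
```

The random-pair corpus is the key control: it matches the paired corpus in length and composition, so any performance gap between the two can be attributed to native pairing rather than to seeing both chain types.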
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Language: English
Journal: Patterns (N Y)
Year: 2024
Document type: Article
Country of affiliation: United States