Results 1 - 14 of 14
1.
Brief Bioinform ; 25(6), 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39438077

ABSTRACT

Adaptive immune receptors, such as antibodies and T-cell receptors, recognize foreign threats with exquisite specificity. A major challenge in adaptive immunology is discovering the rules governing immune receptor-antigen binding in order to predict the antigen binding status of previously unseen immune receptors. Many studies assume that the antigen binding status of an immune receptor may be determined by the presence of a short motif in the complementarity determining region 3 (CDR3), disregarding other amino acids. To test this assumption, we present a method to discover short motifs which show high precision in predicting antigen binding and generalize well to unseen simulated and experimental data. Our analysis of a mutagenesis-based antibody dataset reveals 11 336 position-specific, mostly gapped motifs of 3-5 amino acids that retain high precision on independently generated experimental data. Using a subset of only 178 motifs, a simple classifier was made that on the independently generated dataset outperformed a deep learning model proposed specifically for such datasets. In conclusion, our findings support the notion that for some antibodies, antigen binding may be largely determined by a short CDR3 motif. As more experimental data emerge, our methodology could serve as a foundation for in-depth investigations into antigen binding signals.


Subjects
Amino Acid Motifs, Antigens, Complementarity Determining Regions, Complementarity Determining Regions/chemistry, Complementarity Determining Regions/immunology, Complementarity Determining Regions/genetics, Antigens/immunology, Antigens/chemistry, Antigens/metabolism, Humans, Antibodies/immunology, Antibodies/chemistry, Antibodies/metabolism, Deep Learning, Protein Binding, Computational Biology/methods
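
The idea of a position-specific, gapped motif in the abstract above can be illustrated with a minimal sketch. The representation of a motif as a position-to-residue mapping, the function names, and the toy sequences are all assumptions made for illustration; they are not taken from the paper's code or data.

```python
from typing import Dict, List, Tuple

# Assumed representation: a motif maps 0-based CDR3 positions to required
# amino acids. Positions that are not listed are gaps (any residue allowed).
Motif = Dict[int, str]


def matches(cdr3: str, motif: Motif) -> bool:
    """True if every motif position exists in the sequence and carries the required residue."""
    return all(pos < len(cdr3) and cdr3[pos] == aa for pos, aa in motif.items())


def precision(motif: Motif, labeled: List[Tuple[str, bool]]) -> float:
    """Fraction of motif-carrying sequences that are labeled as binders."""
    hits = [is_binder for seq, is_binder in labeled if matches(seq, motif)]
    return sum(hits) / len(hits) if hits else 0.0


# Toy data invented for illustration: (CDR3 sequence, binds antigen?)
toy = [
    ("CARWGGDGFYAMDY", True),
    ("CARWSGDGFYAMDY", True),
    ("CARWAGDGFYAMDY", False),
    ("CARYGGDGFYSMDY", False),
]

# Gapped motif: W at position 3 and G at position 7; positions 4-6 are free.
motif = {3: "W", 7: "G"}
print(precision(motif, toy))
```

In this toy set the motif matches three sequences, two of which are binders, so a precision threshold would decide whether such a motif is kept.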
2.
Genome Res ; 31(12): 2209-2224, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34815307

ABSTRACT

The process of recombination between variable (V), diversity (D), and joining (J) immunoglobulin (Ig) gene segments determines an individual's naive Ig repertoire and, consequently, (auto)antigen recognition. VDJ recombination follows probabilistic rules that can be modeled statistically. So far, it remains unknown whether VDJ recombination rules differ between individuals. If these rules differed, identical (auto)antigen-specific Ig sequences would be generated with individual-specific probabilities, signifying that the available Ig sequence space is individual specific. We devised a sensitivity-tested distance measure that enables inter-individual comparison of VDJ recombination models. We discovered, accounting for several sources of noise as well as allelic variation in Ig sequencing data, that not only unrelated individuals but also human monozygotic twins and even inbred mice possess statistically distinguishable immunoglobulin recombination models. This suggests that, in addition to genetic, there is also nongenetic modulation of VDJ recombination. We demonstrate that population-wide individualized VDJ recombination can result in orders of magnitude of difference in the probability to generate (auto)antigen-specific Ig sequences. Our findings have implications for immune receptor-based individualized medicine approaches relevant to vaccination, infection, and autoimmunity.
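
To make the notion of a distance between two individuals' recombination models concrete, here is a minimal sketch that compares two discrete probability distributions (for example, V-gene usage). The abstract does not name the distance measure, so the use of the Jensen-Shannon divergence here is an assumption, and the gene-usage values are invented.

```python
import math
from typing import Dict


def jensen_shannon_divergence(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns 0 for identical distributions and 1 for distributions with
    disjoint support.
    """
    jsd = 0.0
    for key in sorted(set(p) | set(q)):
        pk, qk = p.get(key, 0.0), q.get(key, 0.0)
        mk = 0.5 * (pk + qk)  # mixture distribution
        if pk > 0:
            jsd += 0.5 * pk * math.log2(pk / mk)
        if qk > 0:
            jsd += 0.5 * qk * math.log2(qk / mk)
    return jsd


# Hypothetical V-gene usage of two individuals (values invented).
individual_a = {"IGHV1-69": 0.50, "IGHV3-23": 0.50}
individual_b = {"IGHV1-69": 0.25, "IGHV3-23": 0.75}
print(jensen_shannon_divergence(individual_a, individual_b))
```

A full recombination model has many more components (D and J usage, deletions, insertions); a per-component divergence like this one would have to be combined across them.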

4.
Clin Immunol ; 222: 108621, 2021 01.
Article in English | MEDLINE | ID: mdl-33197618

ABSTRACT

An individual's T cell repertoire is skewed towards some specificities as a result of past antigen exposure and subsequent clonal expansion. Identifying T cell receptor signatures associated with a disease is challenging due to the overall complexity of antigens and polymorphic HLA allotypes. In celiac disease, the antigen epitopes are well characterised and the specific HLA-DQ2-restricted T-cell repertoire associated with the disease has been explored in depth. By investigating T cell receptor repertoires of unsorted lamina propria T cells from 15 individuals, we provide the first proof-of-concept study showing that it could be possible to infer disease state by matching against a priori known disease-associated T cell receptor sequences.


Subjects
Celiac Disease/diagnosis, Celiac Disease/immunology, T-Lymphocyte Epitopes/immunology, T-Cell Antigen Receptors/immunology, Adolescent, Adult, Aged, Biomarkers, HLA-DQ Antigens/genetics, HLA-DQ Antigens/immunology, Humans, Lymphocyte Activation/immunology, Middle Aged, Mucous Membrane/cytology, Mucous Membrane/immunology, Young Adult
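
Matching a repertoire against a priori known disease-associated sequences, as described in the abstract above, can be sketched as a set lookup. Exact string matching on CDR3 amino acid sequences, the function name, and all sequences below are assumptions for illustration; the study's actual matching criteria are not given in the abstract.

```python
from typing import Iterable, Set


def fraction_disease_associated(repertoire: Iterable[str], known: Set[str]) -> float:
    """Fraction of unique sequences in a repertoire that exactly match a known disease-associated sequence."""
    unique = set(repertoire)
    if not unique:
        return 0.0
    return len(unique & known) / len(unique)


# Invented placeholder sequences, not real disease-associated receptors.
known_sequences = {"CASSLGQAYEQYF", "CASSFGGNTEAFF"}
patient = ["CASSLGQAYEQYF", "CASSPTGNEQFF", "CASSFGGNTEAFF", "CASSLGQAYEQYF"]
control = ["CASSPTGNEQFF", "CASSQDRGYEQYF"]

print(fraction_disease_associated(patient, known_sequences))
print(fraction_disease_associated(control, known_sequences))
```

Inferring disease state would additionally require a decision threshold on this fraction, which would have to be chosen on real data.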
5.
Bioinformatics ; 36(11): 3594-3596, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32154832

ABSTRACT

SUMMARY: B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we developed immuneSIM, an R package that allows the simulation of native-like and aberrant synthetic full-length variable region immune receptor sequences by tuning the following immune receptor features: (i) species and chain type (BCR, TCR, single and paired), (ii) germline gene usage, (iii) occurrence of insertions and deletions, (iv) clonal abundance, (v) somatic hypermutation and (vi) sequence motifs. Each simulated sequence is annotated by the complete set of simulation events that contributed to its in silico generation. immuneSIM permits the benchmarking of key computational tools for immune receptor analysis, such as germline gene annotation, diversity and overlap estimation, sequence similarity, network architecture, clustering analysis and machine learning methods for motif detection. AVAILABILITY AND IMPLEMENTATION: The package is available via https://github.com/GreiffLab/immuneSIM and on CRAN at https://cran.r-project.org/web/packages/immuneSIM. The documentation is hosted at https://immuneSIM.readthedocs.io. CONTACT: sai.reddy@ethz.ch or victor.greiff@medisin.uio.no. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Benchmarking, Software, Computer Simulation, T-Cell Antigen Receptors/genetics
6.
Biomed Chromatogr ; 33(2): e4384, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30215855

ABSTRACT

The separation and characterization of the unknown degradation product of the second-generation antipsychotic drug ziprasidone are essential for defining the genotoxic potential of the compound. The aim of this study was to develop a simple UHPLC method coupled with tandem mass spectrometry (MS/MS) for chemical characterization of an unknown degradant, and the separation and quantification of ziprasidone and its five main impurities (I-V) in the raw material and pharmaceuticals. Chromatographic conditions were optimized by experimental design. The MS/MS fragmentation conditions were optimized individually for each compound in order to obtain both specific fragments and high signal intensity. A rapid and sensitive UHPLC-MS/MS method was developed. All seven analytes were eluted within the 7 min run time. The best separation was obtained on the Acquity UPLC BEH C18 (50 × 2.1 mm × 1.7 µm) column in gradient mode with ammonium-formate buffer (10 mM; pH 4.7) and acetonitrile as mobile phase, with a flow rate of 0.3 mL min⁻¹ and a column temperature of 30°C. The new UHPLC-MS/MS method was fully validated and all validation parameters were confirmed. The fragmentation pathways and chemical characterization of an unknown degradant were proposed and it was confirmed that there are no structural alerts concerning genotoxicity.


Subjects
High Pressure Liquid Chromatography/methods, Piperazines/analysis, Piperazines/chemistry, Tandem Mass Spectrometry/methods, Thiazoles/analysis, Thiazoles/chemistry, Drug Contamination, Least-Squares Analysis, Limit of Detection, Reproducibility of Results
7.
Front Public Health ; 11: 1183725, 2023.
Article in English | MEDLINE | ID: mdl-37408750

ABSTRACT

Aim: To perform a systematic review on the use of Artificial Intelligence (AI) techniques for predicting COVID-19 hospitalization and mortality using primary and secondary data sources. Study eligibility criteria: Cohort, clinical trials, meta-analyses, and observational studies investigating COVID-19 hospitalization or mortality using artificial intelligence techniques were eligible. Articles without a full text available in the English language were excluded. Data sources: Articles recorded in Ovid MEDLINE from 01/01/2019 to 22/08/2022 were screened. Data extraction: We extracted information on data sources, AI models, and epidemiological aspects of retrieved studies. Bias assessment: A bias assessment of AI models was done using PROBAST. Participants: Patients who tested positive for COVID-19. Results: We included 39 studies related to AI-based prediction of hospitalization and death related to COVID-19. The articles were published in the period 2019-2022, and mostly used Random Forest as the model with the best performance. AI models were trained using cohorts of individuals sampled from populations of European and non-European countries, mostly with cohort sample size <5,000. Data collection generally included information on demographics, clinical records, laboratory results, and pharmacological treatments (i.e., high-dimensional datasets). In most studies, the models were internally validated with cross-validation, but the majority of studies lacked external validation and calibration. Covariates were not prioritized using ensemble approaches in most of the studies; however, models still showed moderately good performances with Area under the Receiver operating characteristic Curve (AUC) values >0.7. According to the assessment with PROBAST, all models had a high risk of bias and/or concern regarding applicability. Conclusions: A broad range of AI techniques have been used to predict COVID-19 hospitalization and mortality. The studies reported good prediction performance of AI models; however, a high risk of bias and/or concern regarding applicability was detected.


Subjects
Artificial Intelligence, COVID-19, Humans, COVID-19/epidemiology, Hospitalization, Language, ROC Curve
8.
Front Public Health ; 11: 1258840, 2023.
Article in English | MEDLINE | ID: mdl-38146473

ABSTRACT

Aims: To develop a disease risk score for COVID-19-related hospitalization and mortality in Sweden and externally validate it in Norway. Method: We employed linked data from the national health registries of Sweden and Norway to conduct our study. We focused on individuals in Sweden with confirmed SARS-CoV-2 infection through RT-PCR testing up to August 2022 as our study cohort. Within this group, we identified hospitalized cases as those who were admitted to the hospital within 14 days of testing positive for SARS-CoV-2 and matched them with five controls from the same cohort who were not hospitalized due to SARS-CoV-2. Additionally, we identified individuals who died within 30 days after being hospitalized for COVID-19. To develop our disease risk scores, we considered various factors, including demographics, infectious, somatic, and mental health conditions, recorded diagnoses, and pharmacological treatments. We also conducted age-specific analyses and assessed model performance through 5-fold cross-validation. Finally, we performed external validation using data from the Norwegian population with COVID-19 up to December 2021. Results: During the study period, a total of 124,560 individuals in Sweden were hospitalized, and 15,877 individuals died within 30 days following COVID-19 hospitalization. Disease risk scores for both hospitalization and mortality demonstrated predictive capabilities with ROC-AUC values of 0.70 and 0.72, respectively, across the entire study period. Notably, these scores exhibited a positive correlation with the likelihood of hospitalization or death. In the external validation using data from the Norwegian COVID-19 population (consisting of 53,744 individuals), the disease risk score predicted hospitalization with an AUC of 0.47 and death with an AUC of 0.74. Conclusion: The disease risk score showed moderately good performance to predict COVID-19-related mortality but performed poorly in predicting hospitalization when externally validated.


Subjects
COVID-19, Humans, COVID-19/epidemiology, SARS-CoV-2, Sweden/epidemiology, Risk Factors, Hospitalization, Machine Learning
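
For readers interpreting the AUC values reported above, the following minimal sketch computes ROC-AUC as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case. Under this definition 0.5 corresponds to chance-level ranking, which is why an external-validation AUC of 0.47 indicates poor discrimination. The function name and the toy scores are assumptions for illustration, not data from the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC-AUC via pairwise comparison: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy risk scores (invented): label 1 = hospitalized, 0 = not hospitalized.
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
print(roc_auc(toy_scores, toy_labels))
```

The pairwise form is quadratic in the number of samples; for registry-scale data a rank-based computation would be the practical choice.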
9.
Gigascience ; 11, 2022 05 25.
Article in English | MEDLINE | ID: mdl-35639633

ABSTRACT

BACKGROUND: Machine learning (ML) methodology development for the classification of immune states in adaptive immune receptor repertoires (AIRRs) has seen a recent surge of interest. However, so far, there does not exist a systematic evaluation of scenarios where classical ML methods (such as penalized logistic regression) already perform adequately for AIRR classification. This hinders investigative reorientation to those scenarios where method development of more sophisticated ML approaches may be required. RESULTS: To identify those scenarios where a baseline ML method is able to perform well for AIRR classification, we generated a collection of synthetic AIRR benchmark data sets encompassing a wide range of data set architecture-associated and immune state-associated sequence patterns (signal) complexity. We trained ≈1,700 ML models with varying assumptions regarding immune signal on ≈1,000 data sets with a total of ≈250,000 AIRRs containing ≈46 billion TCRβ CDR3 amino acid sequences, thereby surpassing the sample sizes of current state-of-the-art AIRR-ML setups by two orders of magnitude. We found that L1-penalized logistic regression achieved high prediction accuracy even when the immune signal occurs only in 1 out of 50,000 AIR sequences. CONCLUSIONS: We provide a reference benchmark to guide new AIRR-ML classification methodology by (i) identifying those scenarios characterized by immune signal and data set complexity, where baseline methods already achieve high prediction accuracy, and (ii) facilitating realistic expectations of the performance of AIRR-ML models given training data set properties and assumptions. Our study serves as a template for defining specialized AIRR benchmark data sets for comprehensive benchmarking of AIRR-ML methods.


Subjects
Machine Learning, Immunologic Receptors
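
To give a sense of how sparse a signal present in 1 out of 50,000 sequences is, the sketch below builds a synthetic repertoire of random sequences and implants a short motif at that rate. Uniformly random amino acid sequences, the fixed sequence length, the 4-mer motif, and the function name are all assumptions for illustration; the benchmark in the abstract was generated with the authors' own, more realistic pipeline.

```python
import random
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def simulate_repertoire(n_seqs: int, witness_rate: float, motif: str,
                        seq_len: int = 14, seed: int = 0) -> List[str]:
    """Random sequences with `motif` implanted into round(n_seqs * witness_rate) of them."""
    rng = random.Random(seed)
    seqs = ["".join(rng.choice(AMINO_ACIDS) for _ in range(seq_len))
            for _ in range(n_seqs)]
    n_signal = round(n_seqs * witness_rate)
    for i in rng.sample(range(n_seqs), n_signal):
        # Overwrite a random window of the sequence with the motif.
        start = rng.randrange(seq_len - len(motif) + 1)
        seqs[i] = seqs[i][:start] + motif + seqs[i][start + len(motif):]
    return seqs


repertoire = simulate_repertoire(n_seqs=50_000, witness_rate=1 / 50_000, motif="WGDG")
# Background sequences can also contain a short motif by chance, so the
# carrier count is at least the implanted count, not necessarily equal to it.
carriers = sum("WGDG" in s for s in repertoire)
print(carriers)
```

Chance occurrences of the motif in the background are one reason such a sparse signal is hard to detect, and a classifier has to separate the implanted enrichment from that baseline.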
10.
Gigascience ; 12, 2022 12 28.
Article in English | MEDLINE | ID: mdl-37848619

ABSTRACT

BACKGROUND: Machine learning (ML) has gained significant attention for classifying immune states in adaptive immune receptor repertoires (AIRRs) to support the advancement of immunodiagnostics and therapeutics. Simulated data are crucial for the rigorous benchmarking of AIRR-ML methods. Existing approaches to generating synthetic benchmarking datasets result in the generation of naive repertoires missing the key feature of many shared receptor sequences (selected for common antigens) found in antigen-experienced repertoires. RESULTS: We demonstrate that a common approach to generating simulated AIRR benchmark datasets can introduce biases, which may be exploited for undesired shortcut learning by certain ML methods. To mitigate undesirable access to true signals in simulated AIRR datasets, we devised a simulation strategy (simAIRR) that constructs antigen-experienced-like repertoires with a realistic overlap of receptor sequences. simAIRR can be used for constructing AIRR-level benchmarks based on a range of assumptions (or experimental data sources) for what constitutes receptor-level immune signals. This includes the possibility of making or not making any prior assumptions regarding the similarity or commonality of immune state-associated sequences that will be used as true signals. We demonstrate the real-world realism of our proposed simulation approach by showing that basic ML strategies perform similarly on simAIRR-generated and real-world experimental AIRR datasets. CONCLUSIONS: This study sheds light on the potential shortcut learning opportunities for ML methods that can arise with the state-of-the-art way of simulating AIRR datasets. simAIRR is available as a Python package: https://github.com/KanduriC/simAIRR.


Subjects
Benchmarking, Computer Simulation
11.
MAbs ; 14(1): 2031482, 2022.
Article in English | MEDLINE | ID: mdl-35377271

ABSTRACT

Generative machine learning (ML) has been postulated to become a major driver in the computational design of antigen-specific monoclonal antibodies (mAb). However, efforts to confirm this hypothesis have been hindered by the infeasibility of testing arbitrarily large numbers of antibody sequences for their most critical design parameters: paratope, epitope, affinity, and developability. To address this challenge, we leveraged a lattice-based antibody-antigen binding simulation framework, which incorporates a wide range of physiological antibody-binding parameters. The simulation framework enables the computation of synthetic antibody-antigen 3D-structures, and it functions as an oracle for unrestricted prospective evaluation and benchmarking of antibody design parameters of ML-generated antibody sequences. We found that a deep generative model, trained exclusively on antibody sequence (one dimensional: 1D) data can be used to design conformational (three dimensional: 3D) epitope-specific antibodies, matching, or exceeding the training dataset in affinity and developability parameter value variety. Furthermore, we established a lower threshold of sequence diversity necessary for high-accuracy generative antibody ML and demonstrated that this lower threshold also holds on experimental real-world data. Finally, we show that transfer learning enables the generation of high-affinity antibody sequences from low-N training data. Our work establishes a priori feasibility and the theoretical foundation of high-throughput ML-based mAb design.


Subjects
Antigen-Antibody Reactions, Machine Learning, Monoclonal Antibodies/chemistry, Antibody Binding Sites, Epitopes
12.
Nat Comput Sci ; 2(12): 845-865, 2022 Dec.
Article in English | MEDLINE | ID: mdl-38177393

ABSTRACT

Machine learning (ML) is a key technology for accurate prediction of antibody-antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and the benchmarking thereof: the lack of a unified ML formalization of immunological antibody-specificity prediction problems and the unavailability of large-scale synthetic datasets to benchmark real-world relevant ML methods and dataset design. Here we developed the Absolut! software suite that enables parameter-based unconstrained generation of synthetic lattice-based three-dimensional antibody-antigen-binding structures with ground-truth access to conformational paratope, epitope and affinity. We formalized common immunological antibody-specificity prediction problems as ML tasks and confirmed that for both sequence- and structure-based tasks, accuracy-based rankings of ML methods trained on experimental data hold for ML methods trained on Absolut!-generated data. The Absolut! framework has the potential to enable real-world relevant development and benchmarking of ML strategies for biotherapeutics design.


Subjects
Antibodies, Antigen-Antibody Reactions, Antibody Specificity, Epitopes/chemistry, Machine Learning
13.
Cell Rep ; 34(11): 108856, 2021 03 16.
Article in English | MEDLINE | ID: mdl-33730590

ABSTRACT

Antibody-antigen binding relies on the specific interaction of amino acids at the paratope-epitope interface. The predictability of antibody-antigen binding is a prerequisite for de novo antibody and (neo-)epitope design. A fundamental premise for the predictability of antibody-antigen binding is the existence of paratope-epitope interaction motifs that are universally shared among antibody-antigen structures. In a dataset of non-redundant antibody-antigen structures, we identify structural interaction motifs, which together compose a commonly shared structure-based vocabulary of paratope-epitope interactions. We show that this vocabulary enables the machine learnability of antibody-antigen binding on the paratope-epitope level using generative machine learning. The vocabulary (1) is compact, fewer than 10⁴ motifs; (2) is distinct from non-immune protein-protein interactions; and (3) mediates specific oligo- and polyreactive interactions between paratope-epitope pairs. Our work leverages combined structure- and sequence-based learning to demonstrate that machine-learning-driven predictive paratope and epitope engineering is feasible.


Subjects
Antigen-Antibody Reactions/immunology, Antibody Binding Sites/immunology, Epitopes/immunology, Amino Acid Motifs, Amino Acid Sequence, Antibodies/chemistry, Antibodies/immunology, Complementarity Determining Regions/chemistry, Epitopes/chemistry, Machine Learning, Protein Binding
14.
Nat Mach Intell ; 3(11): 936-944, 2021 Nov.
Article in English | MEDLINE | ID: mdl-37396030

ABSTRACT

Adaptive immune receptor repertoires (AIRR) are key targets for biomedical research as they record past and ongoing adaptive immune responses. The capacity of machine learning (ML) to identify complex discriminative sequence patterns renders it an ideal approach for AIRR-based diagnostic and therapeutic discovery. To date, widespread adoption of AIRR ML has been inhibited by a lack of reproducibility, transparency, and interoperability. immuneML (immuneml.uio.no) addresses these concerns by implementing each step of the AIRR ML process in an extensible, open-source software ecosystem that is based on fully specified and shareable workflows. To facilitate widespread user adoption, immuneML is available as a command-line tool and through an intuitive Galaxy web interface, and extensive documentation of workflows is provided. We demonstrate the broad applicability of immuneML by (i) reproducing a large-scale study on immune state prediction, (ii) developing, integrating, and applying a novel deep learning method for antigen specificity prediction, and (iii) showcasing streamlined interpretability-focused benchmarking of AIRR ML.
