Identification of Subject-Specific Immunoglobulin Alleles From Expressed Repertoire Sequencing Data.
Gadala-Maria, Daniel; Gidoni, Moriah; Marquez, Susanna; Vander Heiden, Jason A; Kos, Justin T; Watson, Corey T; O'Connor, Kevin C; Yaari, Gur; Kleinstein, Steven H.
Affiliation
  • Gadala-Maria D; Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States.
  • Gidoni M; Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel.
  • Marquez S; Department of Pathology, Yale School of Medicine, Yale University, New Haven, CT, United States.
  • Vander Heiden JA; Department of Neurology, Yale School of Medicine, Yale University, New Haven, CT, United States.
  • Kos JT; Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States.
  • Watson CT; Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States.
  • O'Connor KC; Department of Neurology, Yale School of Medicine, Yale University, New Haven, CT, United States.
  • Yaari G; Bioengineering Program, Faculty of Engineering, Bar-Ilan University, Ramat Gan, Israel.
  • Kleinstein SH; Department of Immunobiology, Yale School of Medicine, Yale University, New Haven, CT, United States.
Front Immunol ; 10: 129, 2019.
Article in En | MEDLINE | ID: mdl-30814994
ABSTRACT
The adaptive immune receptor repertoire (AIRR) contains information on an individual's immune past, present and potential in the form of the evolving sequences that encode the B cell receptor (BCR) repertoire. AIRR sequencing (AIRR-seq) studies rely on databases of known BCR germline variable (V), diversity (D), and joining (J) genes to detect somatic mutations in AIRR-seq data via comparison to the best-aligning database alleles. However, it has been shown that these databases are far from complete, leading to systematic misidentification of mutated positions in subsets of sample sequences. We previously presented TIgGER, a computational method to identify subject-specific V gene genotypes, including the presence of novel V gene alleles, directly from AIRR-seq data. However, the original algorithm was unable to detect alleles that differed by more than 5 single nucleotide polymorphisms (SNPs) from a database allele. Here we present and apply an improved version of the TIgGER algorithm which can detect alleles that differ by any number of SNPs from the nearest database allele, and can construct subject-specific genotypes with minimal prior information. TIgGER predictions are validated both computationally (using a leave-one-out strategy) and experimentally (using genomic sequencing), resulting in the addition of three new immunoglobulin heavy chain V (IGHV) gene alleles to the IMGT repertoire. Finally, we develop a Bayesian strategy to provide a confidence estimate associated with genotype calls. Altogether, these methods allow for much higher accuracy in germline allele assignment, an essential step in AIRR-seq studies.

Full text: 1 Database: MEDLINE Main subject: Immunoglobulins Type of study: Diagnostic_studies / Prognostic_studies Limits: Humans Language: En Year of publication: 2019 Document type: Article