ABSTRACT
RAG endonuclease initiates antibody heavy chain variable region exon assembly from V, D, and J segments within a chromosomal V(D)J recombination center (RC) by cleaving between paired gene segments and flanking recombination signal sequences (RSSs). The IGCR1 control region promotes DJH intermediate formation by isolating Ds, JHs, and RCs from upstream VHs in a chromatin loop anchored by CTCF-binding elements (CBEs). How VHs access the DJHRC for VH to DJH rearrangement was unknown. We report that CBEs immediately downstream of frequently rearranged VH-RSSs increase recombination potential of their associated VH far beyond that provided by RSSs alone. This CBE activity becomes particularly striking upon IGCR1 inactivation, which allows RAG, likely via loop extrusion, to linearly scan chromatin far upstream. VH-associated CBEs stabilize interactions of D-proximal VHs first encountered by the DJHRC during linear RAG scanning and thereby promote dominant rearrangement of these VHs by an unanticipated chromatin accessibility-enhancing CBE function.
Subjects
CCCTC-Binding Factor/metabolism; Chromatin/metabolism; Homeodomain Proteins/metabolism; V(D)J Recombination; Animals; Cell Line; DNA, Intergenic/genetics; DNA, Intergenic/metabolism; DNA-Binding Proteins/deficiency; DNA-Binding Proteins/genetics; Immunoglobulin Heavy Chains/genetics; Immunoglobulin Heavy Chains/metabolism; Immunoglobulin Variable Region/genetics; Immunoglobulin Variable Region/metabolism; Mice; Mice, Inbred C57BL; Models, Molecular; Mutagenesis; Protein Sorting Signals; RNA, Guide, Kinetoplastida/metabolism; Receptors, Antigen, T-Cell/genetics; Receptors, Antigen, T-Cell/metabolism
ABSTRACT
The bacterial diversity and load on equipment in food processing facilities are constantly influenced by raw material, water, air, and staff. Despite regular cleaning and disinfection, some bacteria may persist and thereby potentially compromise food quality and safety. Little is known about how bacterial communities in a new food processing facility gradually establish themselves. Here, the development of bacterial communities in a newly opened salmon processing plant was studied from the first day and during the first year of operation. To focus on the persisting bacterial communities, surface sampling was done at strategic sampling points after cleaning and disinfection. To study the diversity dynamics, isolates from selected sampling and time points were classified by Oxford Nanopore Technology-based rep-PCR amplicon sequencing (ON-rep-seq) supplemented by 16S rRNA gene or rpoD gene sequencing (for Pseudomonas). An overall increase in bacterial numbers was only observed for food-contact surfaces in the slaughter department, but not in the filleting department, on non-food contact surfaces or on the fish. Changes in temporal and spatial diversity and community composition were observed and our approach revealed highly point-specific bacterial communities.
Subjects
Food Microbiology; Salmon; Animals; Bacteria; Food Handling; RNA, Ribosomal, 16S/genetics; Microbiota
ABSTRACT
Developing B lymphocytes undergo V(D)J recombination to assemble germ-line V, D, and J gene segments into exons that encode the antigen-binding variable region of Ig heavy (H) and light (L) chains. IgH and IgL chains associate to form the B-cell receptor (BCR), which, upon antigen binding, activates B cells to secrete BCR as an antibody. Each of the huge number of clonally independent B cells expresses a unique set of IgH and IgL variable regions. The ability of V(D)J recombination to generate vast primary B-cell repertoires results from a combinatorial assortment of large numbers of different V, D, and J segments, coupled with diversification of the junctions between them to generate the complementarity-determining region 3 (CDR3) for antigen contact. Approaches to evaluate in depth the content of primary antibody repertoires and, ultimately, to study how they are further molded by secondary mutation and affinity maturation processes are of great importance to the B-cell development, vaccine, and antibody fields. We now describe an unbiased, sensitive, and readily accessible assay, referred to as high-throughput genome-wide translocation sequencing-adapted repertoire sequencing (HTGTS-Rep-seq), to quantify antibody repertoires. HTGTS-Rep-seq quantitatively identifies the vast majority of IgH and IgL V(D)J exons, including their unique CDR3 sequences, from progenitor and mature mouse B lineage cells via the use of specific J primers. HTGTS-Rep-seq also accurately quantifies DJH intermediates and V(D)J exons in either productive or nonproductive configurations. HTGTS-Rep-seq should be useful for studies of human samples, including clonal B-cell expansions, and also for following antibody affinity maturation processes.
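The distinction between productive and nonproductive V(D)J exons mentioned above can be illustrated with a minimal Python sketch. It is not the HTGTS-Rep-seq pipeline; it assumes a simplified rule (a junction is productive when its nucleotide length is a multiple of three and it contains no stop codon, with the reading frame starting at the first base), and the function names `is_productive` and `count_configurations` are hypothetical.

```python
# Simplified assumption: the junction is read in frame from its first base.
# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_productive(junction_nt: str) -> bool:
    """Return True if the junction is in frame and contains no stop codon."""
    seq = junction_nt.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # empty or out-of-frame junction
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return not any(codon in STOP_CODONS for codon in codons)


def count_configurations(junctions):
    """Tally productive and nonproductive junctions in a list of sequences."""
    counts = {"productive": 0, "nonproductive": 0}
    for junction in junctions:
        key = "productive" if is_productive(junction) else "nonproductive"
        counts[key] += 1
    return counts
```

Real annotation tools decide productivity from the full V(D)J alignment rather than from the junction alone, so this sketch only conveys the idea of the classification.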
Subjects
Antibodies/analysis; Genetic Techniques; V(D)J Recombination; Animals; Mice
ABSTRACT
BACKGROUND: Pre-processing of high-throughput sequencing data for immune repertoire profiling is essential to ensure high-quality input for downstream analysis. VDJPipe is a flexible, high-performance tool that can perform multiple pre-processing tasks with just a single pass over the data files. RESULTS: Processing tasks provided by VDJPipe include base composition statistics calculation, read quality statistics calculation, quality filtering, homopolymer filtering, length and nucleotide filtering, paired-read merging, barcode demultiplexing, 5' and 3' PCR primer matching, and duplicate read collapsing. VDJPipe utilizes a pipeline approach whereby multiple processing steps are performed in a sequential workflow, with the output of each step passed as input to the next step automatically. The workflow is flexible enough to handle the complex barcoding schemes used in many immunosequencing experiments. Because VDJPipe is designed for computational efficiency, we evaluated this by comparing execution times with those of pRESTO, a widely used pre-processing tool for immune repertoire sequencing data. We found that VDJPipe requires <10% of the run time required by pRESTO. CONCLUSIONS: VDJPipe is a high-performance tool that is optimized for pre-processing large immune repertoire sequencing data sets.
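The single-pass, step-chaining idea described above can be sketched in Python as follows. This is an illustration only and does not reflect VDJPipe's actual implementation or configuration format; the `Read` type, the step factories `min_mean_quality` and `max_homopolymer`, and `run_pipeline` are hypothetical names, and only three of the listed tasks (quality filtering, homopolymer filtering, duplicate collapsing) are shown.

```python
from typing import Callable, Iterable, Iterator, List, NamedTuple, Optional, Tuple


class Read(NamedTuple):
    name: str
    seq: str
    quals: Tuple[int, ...]  # per-base Phred quality scores


# A step returns the (possibly modified) read, or None to discard it.
Step = Callable[[Read], Optional[Read]]


def min_mean_quality(threshold: float) -> Step:
    """Quality filter: keep reads whose mean Phred score meets the threshold."""
    def step(read: Read) -> Optional[Read]:
        mean_q = sum(read.quals) / len(read.quals) if read.quals else 0.0
        return read if mean_q >= threshold else None
    return step


def max_homopolymer(limit: int) -> Step:
    """Homopolymer filter: drop reads with a single-base run longer than limit."""
    def step(read: Read) -> Optional[Read]:
        longest, run, prev = 0, 0, None
        for base in read.seq:
            run = run + 1 if base == prev else 1
            longest = max(longest, run)
            prev = base
        return read if longest <= limit else None
    return step


def run_pipeline(reads: Iterable[Read], steps: List[Step]) -> Iterator[Read]:
    """Apply every step to each read in one pass, then collapse duplicates."""
    seen = set()
    for read in reads:
        current: Optional[Read] = read
        for step in steps:
            current = step(current)
            if current is None:
                break  # read was filtered out; skip the remaining steps
        if current is not None and current.seq not in seen:
            seen.add(current.seq)  # first occurrence of this sequence wins
            yield current
```

Because `run_pipeline` is a generator, each read flows through all steps before the next one is read, so the input is traversed only once regardless of how many steps are configured.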
Subjects
B-Lymphocytes/metabolism; High-Throughput Nucleotide Sequencing/methods; Immunoglobulin G/genetics; Software; Animals; DNA Primers; Humans; Mice; Time Factors
ABSTRACT
High-throughput sequencing (HTS) is considered a technical revolution that has improved our knowledge of lymphoid and autoimmune diseases, changing our approach to leukaemia both at diagnosis and during follow-up. As part of an immunoglobulin/T cell receptor-based minimal residual disease (MRD) assessment of acute lymphoblastic leukaemia patients, we assessed the performance and feasibility of the replacement of the first steps of the approach based on DNA isolation and Sanger sequencing, using a HTS protocol combined with bioinformatics analysis and visualization using the Vidjil software. We prospectively analysed the diagnostic and relapse samples of 34 paediatric patients, thus identifying 125 leukaemic clones with recombinations on multiple loci (TRG, TRD, IGH and IGK), including Dd2/Dd3 and Intron/KDE rearrangements. Sequencing failures were halved (14% vs. 34%, P = 0.0007), enabling more patients to be monitored. Furthermore, more markers per patient could be monitored, reducing the probability of false negative MRD results. The whole analysis, from sample receipt to clinical validation, was shorter than our current diagnostic protocol, with equal resources. V(D)J recombination was successfully assigned by the software, even for unusual recombinations. This study emphasizes the progress that HTS with adapted bioinformatics tools can bring to the diagnosis of leukaemia patients.
Subjects
Computational Biology/methods; High-Throughput Nucleotide Sequencing/methods; Precursor Cell Lymphoblastic Leukemia-Lymphoma/diagnosis; Adolescent; Adult; Child; Child, Preschool; Clone Cells; Diagnostic Errors/prevention & control; Gene Rearrangement, T-Lymphocyte; High-Throughput Nucleotide Sequencing/standards; Humans; Infant; Infant, Newborn; Neoplasm, Residual/diagnosis; Prospective Studies; Software; V(D)J Recombination/genetics; Young Adult
ABSTRACT
Analysis of an individual's immunoglobulin or T cell receptor gene repertoire can provide important insights into immune function. High-quality analysis of adaptive immune receptor repertoire sequencing data depends upon accurate and relatively complete germline sets, but current sets are known to be incomplete. Established processes for the review and systematic naming of receptor germline genes and alleles require specific evidence and data types, but the discovery landscape is rapidly changing. To exploit the potential of emerging data, and to provide the field with improved state-of-the-art germline sets, an intermediate approach is needed that will allow the rapid publication of consolidated sets derived from these emerging sources. These sets must use a consistent naming scheme and allow refinement and consolidation into genes as new information emerges. Name changes should be minimised, but, where changes occur, the naming history of a sequence must be traceable. Here we outline the current issues and opportunities for the curation of germline IG/TR genes and present a forward-looking data model for building out more robust germline sets that can dovetail with current established processes. We describe interoperability standards for germline sets, and an approach to transparency based on principles of findability, accessibility, interoperability, and reusability.
ABSTRACT
B1 B cells reactive to phosphatidyl choline (PtC) exhibit restricted immunoglobulin heavy chain (HC) and light chain (LC) combinations, exemplified by VH12/Vκ4/5H. Two checkpoints are thought to focus PtC+ B cell maturation in VH12-transgenic mice (VH12 mice): V-J rearrangements encoding a "permissive" LC capable of VH12 HC pairing are selected first, followed by positive selection based on PtC binding, often requiring LC receptor editing to salvage PtC- B cells and acquire PtC reactivity. However, evidence obtained from breeding VH12 mice to editing-defective dnRAG1 mice and analyzing LC sequences from PtC+ and PtC- B cell subsets instead suggests that receptor editing functions after initial positive selection to remove PtC+ B cells in VH12 mice. This offers a mechanism to constrain natural, polyreactive B cells to limit their frequency. Sequencing also reveals occasional in-frame hybrid LC genes, reminiscent of type 2 gene replacement, that, testing suggests, arise via a recombination-activating gene (RAG)-independent mechanism.
Subjects
Immunoglobulin Variable Region; Phosphatidylcholines; Animals; B-Lymphocytes; Immunoglobulin Variable Region/genetics; Mice; Mice, Transgenic; Spleen
ABSTRACT
The antibody repertoire is a critical component of the adaptive immune system and is believed to reflect an individual's immune history and current immune status. Delineating the antibody repertoire has advanced our understanding of humoral immunity, facilitated antibody discovery, and shown great potential for improving the diagnosis and treatment of disease. However, no tool to date has effectively integrated big Rep-seq data and prior knowledge of functional antibodies to elucidate the remarkably diverse antibody repertoire. We developed a Rep-seq dataset Analysis Platform with an Integrated antibody Database (RAPID; https://rapid.zzhlab.org/), a free and web-based tool that allows researchers to process and analyse Rep-seq datasets. RAPID consolidates 521 WHO-recognized therapeutic antibodies, 88,059 antigen- or disease-specific antibodies, and 306 million clones extracted from 2,449 human IGH Rep-seq datasets generated from individuals with 29 different health conditions. RAPID also integrates a standardized Rep-seq dataset analysis pipeline to enable users to upload and analyse their datasets. In the process, users can also select a set of existing repertoires for comparison. RAPID automatically annotates clones based on integrated therapeutic and known antibodies, and users can easily query antibodies or repertoires based on sequence or optional keywords. With its powerful analysis functions and rich set of antibody and antibody repertoire information, RAPID will benefit researchers in adaptive immune studies.
Subjects
Antibodies/genetics; Computational Biology/methods; Databases, Genetic; Humans; Software; Web Browser
ABSTRACT
Antibody repertoire sequencing (Rep-seq) has been widely used to reveal repertoire dynamics and to interrogate antibodies of interest at single nucleotide-level resolution. However, polymerase chain reaction (PCR) amplification introduces extensive artifacts including chimeras and nucleotide errors, leading to false discovery of antibodies and incorrect assessment of somatic hypermutations (SHMs) which subsequently mislead downstream investigations. Here, a novel approach named DUMPArts, which improves the accuracy of antibody repertoires by labeling each sample with dual barcodes and each molecule with dual unique molecular identifiers (UMIs) via minimal PCR amplification to remove artifacts, is developed. Tested by ultra-deep Rep-seq data, DUMPArts removed inter-sample chimeras, which cause artifactual shared clones and constitute approximately 15% of reads in the library, as well as intra-sample chimeras with erroneous SHMs and constituting approximately 20% of the reads, and corrected base errors and amplification biases by consensus building. The removal of these artifacts will provide an accurate assessment of antibody repertoires and benefit related studies, especially mAb discovery and antibody-guided vaccine design.
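The consensus-building step described above can be sketched in Python. This is a minimal illustration, not the DUMPArts implementation: it assumes each read is already reduced to a tuple of (5' UMI, 3' UMI, sequence), that reads sharing a UMI pair have equal length, and that a per-position majority vote is an acceptable consensus rule. The function names and the `min_reads` parameter are hypothetical, and chimera detection is not shown.

```python
from collections import Counter, defaultdict


def consensus(seqs):
    """Per-position majority vote over equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences in a UMI group must have the same length")
    # zip(*seqs) yields one column of bases per position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


def collapse_by_umi(reads, min_reads=2):
    """Group reads by their (5' UMI, 3' UMI) pair and build one consensus each.

    Groups with fewer than min_reads members are dropped, since a single
    read cannot be error-corrected by voting.
    """
    groups = defaultdict(list)
    for umi5, umi3, seq in reads:
        groups[(umi5, umi3)].append(seq)
    return {
        umis: consensus(seqs)
        for umis, seqs in groups.items()
        if len(seqs) >= min_reads
    }
```

Keying on the UMI pair rather than a single UMI means that a read whose two UMIs do not co-occur with enough other reads ends up in a small group and is dropped, which is one simple way dual labels can help flag inconsistent molecules.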
Subjects
Antibodies/analysis; High-Throughput Nucleotide Sequencing/methods; Polymerase Chain Reaction; Antibodies/genetics; Artifacts; Cells, Cultured; Gene Library; Healthy Volunteers; Humans; Leukocytes, Mononuclear; Primary Cell Culture; Vaccine Development/methods
ABSTRACT
Identification, source tracking, and surveillance of food pathogens are crucial factors for the food-producing industry. Over the last decade, the techniques used for this have moved from conventional enrichment methods, through species-specific detection by PCR to sequencing-based methods, whole-genome sequencing (WGS) being the ultimate method. However, using WGS requires the right infrastructure, high computational power, and bioinformatics expertise. Therefore, there is a need for faster, more cost-effective, and more user-friendly methods. A newly developed method, ON-rep-seq, combines the classical rep-PCR method with nanopore sequencing, resulting in a highly discriminating set of sequences that can be used for species identification and also strain discrimination. This study is essentially a real industry case from a salmon processing plant. Twenty Listeria monocytogenes isolates were analyzed both by ON-rep-seq and WGS to identify and differentiate putative L. monocytogenes from a routine sampling of processing equipment and products, and finally, compare the strain-level discriminatory power of ON-rep-seq to different analyzing levels delivered from the WGS data. The analyses revealed that among the isolates tested there were three different strains. The isolates of the most frequently detected strain (n = 15) were all detected in the problematic area in the processing plant. The strain level discrimination done by ON-rep-seq was in full accordance with the interpretation of WGS data. Our findings also demonstrate that ON-rep-seq may serve as a primary screening method alternative to WGS for identification and strain-level differentiation for surveillance of potential pathogens in a food-producing environment.
Subjects
Food Microbiology; Food-Processing Industry; Listeria monocytogenes/classification; Nanopore Sequencing; Polymerase Chain Reaction; Salmon/microbiology; Animals; Cost-Benefit Analysis; Genome, Bacterial; Listeria monocytogenes/genetics; Listeria monocytogenes/isolation & purification; Phylogeny; Sequence Analysis, DNA; Whole Genome Sequencing
ABSTRACT
Immunoglobulin genes are rarely considered as disease susceptibility genes despite their obvious and central contributions to immune function. This appears to be a consequence of historical views on antibody repertoire formation that no longer stand, and of difficulties that until recently surrounded the documentation of the suite of antibody genes in any individual. If these important genes are to be accessible to GWAS studies, allelic variation within the human population needs to be better documented, and a curated set of genomic variations associated with antibody genes needs to be formulated. Repertoire studies arising from the COVID-19 pandemic provide an opportunity to meet these needs, and may provide insights into the profound variability that is seen in outcomes to this infection.
ABSTRACT
The Adaptive Immune Receptor Repertoire (AIRR) Community is a research-driven group that is establishing a clear set of community-accepted data and metadata standards; standards-based reference implementation tools; and policies and practices for infrastructure to support the deposit, curation, storage, and use of high-throughput sequencing data from B-cell and T-cell receptor repertoires (AIRR-seq data). The AIRR Data Commons is a distributed system of data repositories that utilizes a common data model, a common query language, and common interoperability formats for storage, query, and downloading of AIRR-seq data. Here we describe the principal technical standards for the AIRR Data Commons, consisting of the AIRR Data Model for repertoires and rearrangements, the AIRR Data Commons (ADC) API for programmatic query of data repositories, a reference implementation for ADC API services, and tools for querying and validating data repositories that support the ADC API. AIRR-seq data repositories can become part of the AIRR Data Commons by implementing the data model and API. The AIRR Data Commons allows AIRR-seq data to be reused for novel analyses and empowers researchers to discover new biological insights about the adaptive immune system.
ABSTRACT
The human antibody repertoire is generated by the recombination of different gene segments as well as by processes of somatic mutation. Together these mechanisms result in a tremendous diversity of antibodies that are able to combat various pathogens including viruses and bacteria, or malignant cells. In this review, we summarize the opportunities and challenges that are associated with the analyses of the B cell receptor repertoire and the antigen-specific B cell response. We will discuss how recent advances have increased our understanding of the antibody response and how repertoire analyses can be exploited to inform on vaccine strategies, particularly against HIV-1.
ABSTRACT
The adaptive immune system generates an incredible diversity of antigen receptors for B and T cells to keep dangerous pathogens at bay. The DNA sequences coding for these receptors arise by a complex recombination process followed by a series of productivity-based filters, as well as affinity maturation for B cells, giving considerable diversity to the circulating pool of receptor sequences. Although these datasets hold considerable promise for medical and public health applications, the complex structure of the resulting adaptive immune receptor repertoire sequencing (AIRR-seq) datasets makes analysis difficult. In this paper we introduce sumrep, an R package that efficiently performs a wide variety of repertoire summaries and comparisons, and show how sumrep can be used to perform model validation. We find that summaries vary in their ability to differentiate between datasets, although many are able to distinguish between covariates such as donor, timepoint, and cell type for BCR and TCR repertoires. We show that deletion and insertion lengths resulting from V(D)J recombination tend to be more discriminative characterizations of a repertoire than summaries that describe the amino acid composition of the CDR3 region. We also find that state-of-the-art generative models excel at recapitulating gene usage and recombination statistics in a given experimental repertoire, but struggle to capture many physicochemical properties of real repertoires.
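A repertoire summary and a comparison between two repertoires can be sketched as follows. This is a Python illustration rather than sumrep's R interface, and it assumes one particular choice of summary (the CDR3 length distribution) and one particular choice of comparison (the Jensen-Shannon divergence); the abstract does not state which divergences sumrep uses, and the function names here are hypothetical.

```python
import math
from collections import Counter


def length_distribution(cdr3s):
    """Summarize a repertoire as the empirical distribution of CDR3 lengths."""
    counts = Counter(len(s) for s in cdr3s)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions.

    p and q map outcomes to probabilities; the result lies between 0
    (identical distributions) and 1 (disjoint support).
    """
    divergence = 0.0
    for outcome in set(p) | set(q):
        pk = p.get(outcome, 0.0)
        qk = q.get(outcome, 0.0)
        mk = 0.5 * (pk + qk)  # mixture distribution
        if pk > 0:
            divergence += 0.5 * pk * math.log2(pk / mk)
        if qk > 0:
            divergence += 0.5 * qk * math.log2(qk / mk)
    return divergence
```

Comparing an observed repertoire with one simulated from a generative model using the same summary is one way to read the "model validation" use mentioned above: a small divergence means the model reproduces that particular statistic.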
Subjects
Models, Statistical; Receptors, Immunologic; Software; Data Interpretation, Statistical; Humans
ABSTRACT
Background: Recent technological advances in immune repertoire sequencing have created tremendous potential for advancing our understanding of adaptive immune response dynamics in various states of health and disease. Immune repertoire sequencing produces large, highly complex data sets, however, which require specialized methods and software tools for their effective analysis and interpretation. Results: VDJServer is a cloud-based analysis portal for immune repertoire sequence data that provides access to a suite of tools for a complete analysis workflow, including modules for preprocessing and quality control of sequence reads, V(D)J gene segment assignment, repertoire characterization, and repertoire comparison. VDJServer also provides sophisticated visualizations for exploratory analysis. It is accessible through a standard web browser via a graphical user interface designed for use by immunologists, clinicians, and bioinformatics researchers. VDJServer provides a data commons for public sharing of repertoire sequencing data, as well as private sharing of data between users. We describe the main functionality and architecture of VDJServer and demonstrate its capabilities with use cases from cancer immunology and autoimmunity. Conclusion: VDJServer provides a complete analysis suite for human and mouse T-cell and B-cell receptor repertoire sequencing data. The combination of its user-friendly interface and high-performance computing allows large immune repertoire sequencing projects to be analyzed with no programming or software installation required. VDJServer is a web-accessible cloud platform that provides access through a graphical user interface to a data management infrastructure, a collection of analysis tools covering all steps in an analysis, and an infrastructure for sharing data along with workflows, results, and computational provenance. VDJServer is a free, publicly available, and open-source licensed resource.
Subjects
Cloud Computing; Computational Biology/methods; Genomics/methods; VDJ Exons/immunology; Animals; Computing Methodologies; Humans; Information Dissemination; Mice; Software; User-Computer Interface; Web Browser; Workflow
ABSTRACT
Human aging is associated with a profound loss of thymus productivity, yet naïve T lymphocytes still maintain their numbers by division in the periphery for many years. The extent of such proliferation may depend on the cytokine environment, including IL-7 and T-cell receptor (TCR) "tonic" signaling mediated by self-pMHC recognition. Additionally, intrinsic properties of distinct subpopulations of naïve T cells could influence the overall dynamics of aging-related changes within the naïve T cell compartment. Here, we investigated the differences in the architecture of TCR beta repertoires for naïve CD4, naïve CD8, naïve CD4+CD25-CD31+ (enriched with recent thymic emigrants, RTE), and mature naïve CD4+CD25-CD31- peripheral blood subsets between young and middle-age/old healthy individuals. In addition to observing the accumulation of clonal expansions (as was shown previously), we reveal several notable changes in the characteristics of the T cell repertoire. We observed a significant decrease in CDR3 length, NDN insert, and number of non-template added N nucleotides within TCR beta CDR3 with aging, together with a prominent change in the physicochemical properties of the central part of the CDR3 loop. These changes were similar across CD4, CD8, RTE-enriched, and mature CD4 subsets of naïve T cells, with minimal or no difference observed between the latter two subsets for individuals of the same age group. We also observed an increase in "publicity" (fraction of shared clonotypes) of CD4, but not CD8, naïve T cell repertoires. We propose several explanations for these phenomena built upon previous studies of naïve T-cell homeostasis, and call for further studies of the mechanisms causing the observed changes and of the consequences of these changes with respect to the possible holes formed in the landscape of the naïve T cell TCR repertoire.
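The "publicity" measure (fraction of shared clonotypes) can be made concrete with a short Python sketch. The study's exact definition is not given in the abstract, so this assumes one plausible reading: for a given repertoire, the fraction of its distinct clonotypes that also occur in at least one other repertoire. The function name `publicity` and the representation of a clonotype as any hashable value (for example a CDR3 amino acid string) are assumptions.

```python
def publicity(repertoire, others):
    """Fraction of distinct clonotypes in `repertoire` found in any of `others`.

    repertoire: iterable of hashable clonotype identifiers.
    others: list of iterables, one per comparison repertoire.
    """
    own = set(repertoire)
    if not own:
        return 0.0  # an empty repertoire shares nothing
    pooled = set().union(*others)  # all clonotypes seen in the other repertoires
    return len(own & pooled) / len(own)
```

Because the measure depends on how many repertoires are pooled and how deeply each was sequenced, comparisons between groups are only meaningful when those factors are matched.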
ABSTRACT
Increased interest in the immune system's involvement in pathophysiological phenomena coupled with decreased DNA sequencing costs have led to an explosion of antibody and T cell receptor sequencing data collectively termed "adaptive immune receptor repertoire sequencing" (AIRR-seq or Rep-Seq). The AIRR Community has been actively working to standardize protocols, metadata, formats, APIs, and other guidelines to promote open and reproducible studies of the immune repertoire. In this paper, we describe the work of the AIRR Community's Data Representation Working Group to develop standardized data representations for storing and sharing annotated antibody and T cell receptor data. Our file format emphasizes ease-of-use, accessibility, scalability to large data sets, and a commitment to open and transparent science. It is composed of a tab-delimited format with a specific schema. Several popular repertoire analysis tools and data repositories already utilize this AIRR-seq data format. We hope that others will follow suit in the interest of promoting interoperable standards.
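Reading a tab-delimited file with a fixed schema, as described above, can be sketched with Python's standard `csv` module. The column names used here (`sequence_id`, `v_call`, `j_call`, `junction_aa`, `productive`) and the `T`/`F` encoding of booleans follow the AIRR Rearrangement schema as I understand it and should be checked against the current specification; the rows in `EXAMPLE_TSV` are invented toy data, and the function names are hypothetical.

```python
import csv
import io

# Invented toy rows for illustration only; not real repertoire data.
EXAMPLE_TSV = (
    "sequence_id\tv_call\tj_call\tjunction_aa\tproductive\n"
    "seq1\tIGHV1-69*01\tIGHJ4*02\tCARDYW\tT\n"
    "seq2\tIGHV3-23*01\tIGHJ6*02\tCAK*GW\tF\n"
)


def read_rearrangements(handle):
    """Parse a tab-delimited table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def productive_junctions(rows):
    """Return junction amino acid sequences of rows flagged as productive."""
    return [row["junction_aa"] for row in rows if row["productive"] == "T"]


rows = read_rearrangements(io.StringIO(EXAMPLE_TSV))
print(productive_junctions(rows))
```

A plain tab-delimited layout like this can be read row by row with generic tools, which is one reason such a format scales to large data sets without requiring a dedicated parser.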
Subjects
Antibodies/genetics; Base Sequence; Database Management Systems; Information Dissemination/methods; Receptors, Antigen, T-Cell/genetics; Adaptive Immunity/genetics; Databases, Genetic; Datasets as Topic; High-Throughput Nucleotide Sequencing/economics; Humans; Receptors, Immunologic/genetics; Research Design
ABSTRACT
The adaptation of high-throughput sequencing to the B cell receptor and T cell receptor has made it possible to characterize the adaptive immune receptor repertoire (AIRR) at unprecedented depth. These AIRR sequencing (AIRR-seq) studies offer tremendous potential to increase the understanding of adaptive immune responses in vaccinology, infectious disease, autoimmunity, and cancer. The increasingly wide application of AIRR-seq is leading to a critical mass of studies being deposited in the public domain, offering the possibility of novel scientific insights through secondary analyses and meta-analyses. However, effective sharing of these large-scale data remains a challenge. The AIRR community has proposed minimal information about adaptive immune receptor repertoire (MiAIRR), a standard for reporting AIRR-seq studies. The MiAIRR standard has been operationalized using the National Center for Biotechnology Information (NCBI) repositories. Submissions of AIRR-seq data to the NCBI repositories typically use a combination of web-based and flat-file templates and include only a minimal amount of terminology validation. As a result, AIRR-seq studies at the NCBI are often described using inconsistent terminologies, limiting scientists' ability to access, find, interoperate, and reuse the data sets. In order to improve metadata quality and ease submission of AIRR-seq studies to the NCBI, we have leveraged the software framework developed by the Center for Expanded Data Annotation and Retrieval (CEDAR), which develops technologies involving the use of data standards and ontologies to improve metadata quality. The resulting CEDAR-AIRR (CAIRR) pipeline enables data submitters to: (i) create web-based templates whose entries are controlled by ontology terms, (ii) generate and validate metadata, and (iii) submit the ontology-linked metadata and sequence files (FASTQ) to the NCBI BioProject, BioSample, and Sequence Read Archive databases. 
Overall, CAIRR provides a web-based metadata submission interface that supports compliance with the MiAIRR standard. This pipeline is available at http://cairr.miairr.org, and will facilitate the NCBI submission process and improve the metadata quality of AIRR-seq studies.
Subjects
Computational Biology/methods; Databases, Nucleic Acid; Receptors, Antigen, B-Cell/genetics; Receptors, Antigen, T-Cell/genetics; Software; Computational Biology/organization & administration; Data Mining; Gene Ontology; Humans; Metadata; Reproducibility of Results; User-Computer Interface; Workflow
ABSTRACT
Next-generation sequencing is making it possible to study the antibody repertoire of an organism in unprecedented detail, and, by so doing, to characterize its behavior in the response to infection and in pathological conditions such as autoimmunity and cancer. The polymorphic nature of the repertoire poses unique challenges that rule out the use of many commonly used NGS methods and require tradeoffs to be made when considering experimental design. We outline the main contexts in which antibody repertoire analysis has been used, and summarize the key tools that are available. The humoral immune response to vaccination has been a particular focus of repertoire analyses, and we review the key conclusions and methods used in these studies.
Subjects
Antibodies/genetics; High-Throughput Nucleotide Sequencing/methods; Animals; Humans; Immunity, Humoral/genetics
ABSTRACT
The B cell antigen receptor repertoire is highly diverse and constantly modified by clonal selection. High-throughput DNA sequencing (HTS) of the lymphocyte repertoire (Rep-Seq) represents a promising technology to explore such diversity ex vivo and assist in the identification of antigen-specific antibodies based on molecular signatures of clonal selection. Therefore, integrative tools for repertoire reconstruction and analysis from antibody sequences are needed. We developed ImmunediveRsity, a stand-alone pipeline based primarily on R for the integral analysis of B cell repertoire data generated by HTS. The pipeline integrates GNU software and in-house scripts to perform quality filtering, sequencing noise correction and repertoire reconstruction based on V, D and J segment assignment, clonal origin and unique heavy chain identification. Post-analysis scripts generate a wealth of repertoire metrics that, in conjunction with a rich graphical output, facilitate sample comparison and repertoire mining. Its performance was tested with raw and curated human and mouse 454-Roche sequencing benchmarks, providing good approximations of repertoire structure. Furthermore, ImmunediveRsity was used to mine the B cell repertoire of mice immunized with a model antigen, allowing the identification of previously validated antigen-specific antibodies, and revealing different and unexpected clonal diversity patterns in the post-immunization IgM and IgG compartments. Although ImmunediveRsity is similar to other recently developed tools, it offers significant advantages that facilitate repertoire analysis and repertoire mining. ImmunediveRsity is open source and free for academic purposes and it runs on 64-bit GNU/Linux and MacOS. Available at: https://bitbucket.org/ImmunediveRsity/immunediversity/.