Results 1 - 20 of 40
1.
Nature ; 602(7895): 142-147, 2022 02.
Article in English | MEDLINE | ID: mdl-35082445

ABSTRACT

Public databases contain a planetary collection of nucleic acid sequences, but their systematic exploration has been inhibited by a lack of efficient methods for searching this corpus, which (at the time of writing) exceeds 20 petabases and is growing exponentially (ref. 1). Here we developed a cloud computing infrastructure, Serratus, to enable ultra-high-throughput sequence alignment at the petabase scale. We searched 5.7 million biologically diverse samples (10.2 petabases) for the hallmark gene RNA-dependent RNA polymerase and identified well over 10^5 novel RNA viruses, thereby expanding the number of known species by roughly an order of magnitude. We characterized novel viruses related to coronaviruses, hepatitis delta virus and huge phages, respectively, and analysed their environmental reservoirs. To catalyse the ongoing revolution of viral discovery, we established a free and comprehensive database of these data and tools. Expanding the known sequence diversity of viruses can reveal the evolutionary origins of emerging pathogens and improve pathogen surveillance for the anticipation and mitigation of future pandemics.


Subject(s)
Cloud Computing , Genetic Databases , RNA Viruses/genetics , RNA Viruses/isolation & purification , Sequence Alignment/methods , Virology/methods , Virome/genetics , Animals , Archives , Bacteriophages/enzymology , Bacteriophages/genetics , Biodiversity , Coronavirus/classification , Coronavirus/enzymology , Coronavirus/genetics , Molecular Evolution , Hepatitis Delta Virus/enzymology , Hepatitis Delta Virus/genetics , Humans , Molecular Models , RNA Viruses/classification , RNA Viruses/enzymology , RNA-Dependent RNA Polymerase/chemistry , RNA-Dependent RNA Polymerase/genetics , Software
2.
Bioinformatics ; 34(14): 2371-2375, 2018 07 15.
Article in English | MEDLINE | ID: mdl-29506021

ABSTRACT

Motivation: The 16S ribosomal RNA (rRNA) gene is widely used to survey microbial communities. Sequences are often clustered into Operational Taxonomic Units (OTUs) as proxies for species. The canonical clustering threshold is 97% identity, which was proposed in 1994 when few 16S rRNA sequences were available, motivating a reassessment on current data. Results: Using a large set of high-quality 16S rRNA sequences from finished genomes, I assessed the correspondence of OTUs to species for five representative clustering algorithms using four accuracy metrics. All algorithms had comparable accuracy when tuned to a given metric. Optimal identity thresholds were ∼99% for full-length sequences and ∼100% for the V4 hypervariable region. Availability and implementation: Reference sequences and source code are provided in the Supplementary Material. Supplementary information: Supplementary data are available at Bioinformatics online.
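To make the role of the identity threshold concrete, here is a minimal sketch of greedy centroid clustering. It is an illustration only and is not one of the five algorithms assessed in the paper: the function names `identity` and `greedy_otus` are hypothetical, and identity is computed as the fraction of matching positions between pre-aligned, equal-length, non-empty sequences, which is a simplifying assumption.

```python
from typing import List, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned sequences.

    Assumption: both sequences are non-empty and have the same length.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otus(seqs: List[str], threshold: float = 0.97) -> Tuple[List[str], List[int]]:
    """Assign each sequence to the first centroid it matches at >= threshold.

    A sequence that matches no existing centroid becomes a new centroid.
    Returns the centroids and, for each input sequence, its cluster index.
    """
    centroids: List[str] = []
    assignment: List[int] = []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                assignment.append(i)
                break
        else:
            # No centroid was close enough, so this sequence founds a new OTU.
            centroids.append(s)
            assignment.append(len(centroids) - 1)
    return centroids, assignment
```

Raising `threshold` can only split clusters further in this sketch, which is why the choice between 97%, 99% and 100% changes the number of OTUs reported.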


Subject(s)
rRNA Genes , Microbiota/genetics , 16S Ribosomal RNA/genetics , DNA Sequence Analysis/methods , Software , Algorithms , Cluster Analysis
3.
Nature ; 488(7409): 86-90, 2012 Aug 02.
Article in English | MEDLINE | ID: mdl-22859206

ABSTRACT

Land plants associate with a root microbiota distinct from the complex microbial community present in surrounding soil. The microbiota colonizing the rhizosphere (immediately surrounding the root) and the endophytic compartment (within the root) contribute to plant growth, productivity, carbon sequestration and phytoremediation. Colonization of the root occurs despite a sophisticated plant immune system, suggesting finely tuned discrimination of mutualists and commensals from pathogens. Genetic principles governing the derivation of host-specific endophyte communities from soil communities are poorly understood. Here we report the pyrosequencing of the bacterial 16S ribosomal RNA gene of more than 600 Arabidopsis thaliana plants to test the hypotheses that the root rhizosphere and endophytic compartment microbiota of plants grown under controlled conditions in natural soils are sufficiently dependent on the host to remain consistent across different soil types and developmental stages, and sufficiently dependent on host genotype to vary between inbred Arabidopsis accessions. We describe different bacterial communities in two geochemically distinct bulk soils and in rhizosphere and endophytic compartments prepared from roots grown in these soils. The communities in each compartment are strongly influenced by soil type. Endophytic compartments from both soils feature overlapping, low-complexity communities that are markedly enriched in Actinobacteria and specific families from other phyla, notably Proteobacteria. Some bacteria vary quantitatively between plants of different developmental stage and genotype. Our rigorous definition of an endophytic compartment microbiome should facilitate controlled dissection of plant-microbe interactions derived from complex soil communities.


Subject(s)
Arabidopsis/microbiology , Endophytes/classification , Endophytes/isolation & purification , Metagenome , Plant Roots/microbiology , Soil Microbiology , Actinobacteria/genetics , Actinobacteria/isolation & purification , Arabidopsis/classification , Arabidopsis/growth & development , Endophytes/genetics , Genotype , Fluorescence In Situ Hybridization , Plant Roots/classification , Plant Roots/growth & development , Proteobacteria/genetics , Proteobacteria/isolation & purification , 16S Ribosomal RNA/genetics , 16S Ribosomal RNA/isolation & purification , Rhizosphere , Ribotyping , DNA Sequence Analysis , Symbiosis
4.
Nat Methods ; 10(10): 996-8, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23955772

ABSTRACT

Amplified marker-gene sequences can be used to understand microbial community structure, but they suffer from a high level of sequencing and amplification artifacts. The UPARSE pipeline reports operational taxonomic unit (OTU) sequences with ≤1% incorrect bases in artificial microbial community tests, compared with >3% incorrect bases commonly reported by other methods. The improved accuracy results in far fewer OTUs, consistently closer to the expected number of species in a community.


Subject(s)
Microbiota/genetics , Phylogeny , 16S Ribosomal RNA/genetics , Algorithms , Genetic Databases , Humans , Metagenomics , Research Design , Sensitivity and Specificity , Software
5.
Bioinformatics ; 31(21): 3476-82, 2015 Nov 01.
Article in English | MEDLINE | ID: mdl-26139637

ABSTRACT

MOTIVATION: Next-generation sequencing produces vast amounts of data with errors that are difficult to distinguish from true biological variation when coverage is low. RESULTS: We demonstrate large reductions in error frequencies, especially for high-error-rate reads, by three independent means: (i) filtering reads according to their expected number of errors, (ii) assembling overlapping read pairs and (iii) for amplicon reads, by exploiting unique sequence abundances to perform error correction. We also show that most published paired read assemblers calculate incorrect posterior quality scores. AVAILABILITY AND IMPLEMENTATION: These methods are implemented in the USEARCH package. Binaries are freely available at http://drive5.com/usearch. CONTACT: robert@drive5.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
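Step (i) can be illustrated with a short sketch. It assumes the standard Phred relation, in which a quality score Q corresponds to an error probability of 10^(-Q/10), so that the expected number of errors in a read is the sum of these probabilities over its bases. The function names, the ASCII offset of 33 and the `max_ee` cutoff of 1.0 are illustrative assumptions, not the USEARCH implementation or its defaults.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities for a FASTQ quality string.

    Assumption: Phred+33 encoding, so Q = ord(char) - 33 and
    P(error) = 10 ** (-Q / 10) for each base.
    """
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def passes_filter(quality: str, max_ee: float = 1.0) -> bool:
    """Keep a read only if its expected number of errors is at most max_ee.

    max_ee is a hypothetical cutoff chosen for illustration.
    """
    return expected_errors(quality) <= max_ee
```

Because the filter sums probabilities instead of averaging quality scores, a single very low-quality base contributes close to one full expected error, whereas a mean-Q filter would dilute it across the read.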


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Algorithms , Software
6.
bioRxiv ; 2024 Jan 21.
Article in English | MEDLINE | ID: mdl-38293115

ABSTRACT

Here, we describe the "Obelisks," a previously unrecognised class of viroid-like elements that we first identified in human gut metatranscriptomic data. "Obelisks" share several properties: (i) apparently circular RNA ~1 kb genome assemblies, (ii) predicted rod-like secondary structures encompassing the entire genome, and (iii) open reading frames coding for a novel protein superfamily, which we call the "Oblins". We find that Obelisks form their own distinct phylogenetic group with no detectable sequence or structural similarity to known biological agents. Further, Obelisks are prevalent in tested human microbiome metatranscriptomes with representatives detected in ~7% of analysed stool metatranscriptomes (29/440) and in ~50% of analysed oral metatranscriptomes (17/32). Obelisk compositions appear to differ between the anatomic sites and are capable of persisting in individuals, with continued presence over >300 days observed in one case. Large scale searches identified 29,959 Obelisks (clustered at 90% nucleotide identity), with examples from all seven continents and in diverse ecological niches. From this search, a subset of Obelisks are identified to code for Obelisk-specific variants of the hammerhead type-III self-cleaving ribozyme. Lastly, we identified one case of a bacterial species (Streptococcus sanguinis) in which a subset of defined laboratory strains harboured a specific Obelisk RNA population. As such, Obelisks comprise a class of diverse RNAs that have colonised, and gone unnoticed in, human and global microbiomes.

7.
Nat Commun ; 14(1): 2591, 2023 05 05.
Article in English | MEDLINE | ID: mdl-37147358

ABSTRACT

Earth's life may have originated as self-replicating RNA, and it has been argued that RNA viruses and viroid-like elements are remnants of such a pre-cellular RNA world. RNA viruses are defined by linear RNA genomes encoding an RNA-dependent RNA polymerase (RdRp), whereas viroid-like elements consist of small, single-stranded, circular RNA genomes that, in some cases, encode paired self-cleaving ribozymes. Here we show that the number of candidate viroid-like elements occurring in geographically and ecologically diverse niches is much higher than previously thought. We report that, amongst these circular genomes, fungal ambiviruses are viroid-like elements that undergo rolling circle replication and encode their own viral RdRp. Thus, ambiviruses are distinct infectious RNAs showing hybrid features of viroid-like RNAs and viruses. We also detected similar circular RNAs, containing active ribozymes and encoding RdRps, related to mitochondrial-like fungal viruses, highlighting fungi as an evolutionary hub for RNA viruses and viroid-like elements. Our findings point to a deep co-evolutionary history between RNA viruses and subviral elements and offer new perspectives on the origin and evolution of primordial infectious agents, and RNA life.


Subject(s)
RNA Viruses , Catalytic RNA , Viroids , Viroids/genetics , Catalytic RNA/genetics , Viral RNA/genetics , Virus Replication/genetics , RNA/genetics , RNA Viruses/genetics , RNA-Dependent RNA Polymerase/genetics , Fungi/genetics
8.
Bioinformatics ; 27(16): 2194-200, 2011 Aug 15.
Article in English | MEDLINE | ID: mdl-21700674

ABSTRACT

MOTIVATION: Chimeric DNA sequences often form during polymerase chain reaction amplification, especially when sequencing single regions (e.g. 16S rRNA or fungal Internal Transcribed Spacer) to assess diversity or compare populations. Undetected chimeras may be misinterpreted as novel species, causing inflated estimates of diversity and spurious inferences of differences between populations. Detection and removal of chimeras is therefore of critical importance in such experiments. RESULTS: We describe UCHIME, a new program that detects chimeric sequences with two or more segments. UCHIME either uses a database of chimera-free sequences or detects chimeras de novo by exploiting abundance data. UCHIME has better sensitivity than ChimeraSlayer (previously the most sensitive database method), especially with short, noisy sequences. In testing on artificial bacterial communities with known composition, UCHIME de novo sensitivity is shown to be comparable to Perseus. UCHIME is >100× faster than Perseus and >1000× faster than ChimeraSlayer. CONTACT: robert@drive5.com AVAILABILITY: Source, binaries and data: http://drive5.com/uchime. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Artifacts , DNA Sequence Analysis/methods , Software , Algorithms , Computational Biology , Polymerase Chain Reaction
9.
Nucleic Acids Res ; 38(7): 2145-53, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20047958

ABSTRACT

Multiple protein sequence alignment methods are central to many applications in molecular biology. These methods are typically assessed on benchmark datasets including BALIBASE, OXBENCH, PREFAB and SABMARK, which are important to biologists in making informed choices between programs. In this article, annotations of domain homology and secondary structure are used to define new measures of alignment quality and are used to make the first systematic, independent evaluation of these benchmarks. These measures indicate sensitivity and specificity while avoiding the ambiguous residue correspondences and arbitrary distance cutoffs inherent to structural superpositions. Alignments by selected methods that indicate high-confidence columns (ALIGN-M, DIALIGN-T, FSA and MUSCLE) are also assessed. Fold space coverage and effective benchmark database sizes are estimated by reference to domain annotations, and significant redundancy is found in all benchmarks except SABMARK. Questionable alignments are found in all benchmarks, especially in BALIBASE where 87% of sequences have unknown structure, 20% of columns contain different folds according to SUPERFAMILY and 30% of 'core block' columns have conflicting secondary structure according to DSSP. A careful analysis of current protein multiple alignment benchmarks calls into question their ability to determine reliable algorithm rankings.


Subject(s)
Sequence Alignment/standards , Protein Sequence Analysis , Amino Acid Sequence , Benchmarking , Protein Databases , Molecular Sequence Data , Protein Folding , Secondary Protein Structure , Tertiary Protein Structure , Reference Standards , Sequence Alignment/methods , Software
10.
Proc Natl Acad Sci U S A ; 106(31): 12855-60, 2009 Aug 04.
Article in English | MEDLINE | ID: mdl-19625614

ABSTRACT

Interspersed repeat composition and distribution in mammals have been best characterized in the human and mouse genomes. The bovine genome contains typical eutherian mammal repeats, but also has a significant number of long interspersed nuclear element RTE (BovB) elements proposed to have been horizontally transferred from squamata. Our analysis of the BovB repeats has indicated that only a few of them are currently likely to retrotranspose in cattle. However, bovine L1 repeats (L1 BT) have many likely active copies. Comparison of substitution rates for BovB and L1 BT indicates that L1 BT is a younger repeat family than BovB. In contrast to mouse and human, L1 occurrence is not negatively correlated with G+C content. However, BovB, Bov A2, ART2A, and Bov-tA are negatively correlated with G+C, although Bov-tA's correlation is weaker. Also, by performing genome-wide correlation analysis of interspersed and simple sequence repeats, we have identified genome territories by repeat content that appear to define ancestral vs. ruminant-specific genomic regions. These ancestral regions, enriched with L2 and MIR repeats, are largely conserved between bovine and human.


Subject(s)
Cattle/genetics , Genome , Nucleic Acid Repetitive Sequences , Retroelements , Animals , Base Composition , DNA Methylation , Long Interspersed Nucleotide Elements
11.
Nat Commun ; 13(1): 6968, 2022 11 15.
Article in English | MEDLINE | ID: mdl-36379955

ABSTRACT

Multiple sequence alignments are widely used to infer evolutionary relationships, enabling inferences of structure, function, and phylogeny. Standard practice is to construct one alignment by some preferred method and use it in further analysis; however, undetected alignment bias can be problematic. I describe Muscle5, a novel algorithm which constructs an ensemble of high-accuracy alignments with diverse biases by perturbing a hidden Markov model and permuting its guide tree. Confidence in an inference is assessed as the fraction of the ensemble which supports it. Applied to phylogenetic tree estimation, I show that ensembles can confidently resolve topologies with low bootstrap according to standard methods, and conversely that some topologies with high bootstraps are incorrect. Applied to the phylogeny of RNA viruses, ensemble analysis shows that recently adopted taxonomic phyla are probably polyphyletic. Ensemble analysis can improve confidence assessment in any inference from an alignment.
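The confidence measure described in the abstract, the fraction of the ensemble that supports an inference, can be sketched as follows. This is an assumed, simplified representation and not Muscle5 code: each ensemble replicate is reduced to the set of clades (as frozensets of taxon labels) found in the tree estimated from that replicate's alignment, and `ensemble_confidence` is a hypothetical helper name.

```python
from typing import FrozenSet, Iterable, List, Set


def ensemble_confidence(
    replicates: List[Set[FrozenSet[str]]],
    clade: Iterable[str],
) -> float:
    """Fraction of ensemble replicates whose tree contains the given clade.

    Assumption: each element of `replicates` is the set of clades recovered
    from one alignment in the ensemble.
    """
    if not replicates:
        raise ValueError("ensemble must contain at least one replicate")
    target = frozenset(clade)
    support = sum(1 for clades in replicates if target in clades)
    return support / len(replicates)


# Three hypothetical replicates: two recover the clade {A, B}, one does not.
ensemble = [
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "C"}), frozenset({"B", "D"})},
]
confidence_ab = ensemble_confidence(ensemble, ["A", "B"])
```

The same counting applies to any yes/no inference drawn from an alignment; only the test applied to each replicate changes.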


Subject(s)
Algorithms , Biological Evolution , Phylogeny , Sequence Alignment , Sequence Homology
12.
Pathogens ; 11(7)2022 Jul 19.
Article in English | MEDLINE | ID: mdl-35890050

ABSTRACT

Conventionally, hyperimmune globulin drugs manufactured from pooled immunoglobulins from vaccinated or convalescent donors have been used in treating infections where no treatment is available. This is especially important where multi-epitope neutralization is required to prevent the development of immune-evading viral mutants that can emerge upon treatment with monoclonal antibodies. Using microfluidics, flow sorting, and a targeted integration cell line, a first-in-class recombinant hyperimmune globulin therapeutic against SARS-CoV-2 (GIGA-2050) was generated. Using processes similar to conventional monoclonal antibody manufacturing, GIGA-2050, comprising 12,500 antibodies, was scaled-up for clinical manufacturing and multiple development/tox lots were assessed for consistency. Antibody sequence diversity, cell growth, productivity, and product quality were assessed across different manufacturing sites and production scales. GIGA-2050 was purified and tested for good laboratory practice (GLP) toxicology, pharmacokinetics, and in vivo efficacy against natural SARS-CoV-2 infection in mice. The GIGA-2050 master cell bank was highly stable, producing material at consistent yield and product quality up to >70 generations. Good manufacturing practices (GMP) and development batches of GIGA-2050 showed consistent product quality, impurity clearance, potency, and protection in an in vivo efficacy model. Nonhuman primate toxicology and pharmacokinetics studies suggest that GIGA-2050 is safe and has a half-life similar to other recombinant human IgG1 antibodies. These results supported a successful investigational new drug application for GIGA-2050. This study demonstrates that a new class of drugs, recombinant hyperimmune globulins, can be manufactured consistently at the clinical scale and presents a new approach to treating infectious diseases that targets multiple epitopes of a virus.

13.
Bioinformatics ; 26(19): 2460-1, 2010 Oct 01.
Article in English | MEDLINE | ID: mdl-20709691

ABSTRACT

MOTIVATION: Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification. RESULTS: UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets. AVAILABILITY: Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch.


Subject(s)
Algorithms , Computational Biology/methods , Sequence Alignment/methods , Protein Sequence Analysis/methods , Cluster Analysis , Protein Databases , Proteins/chemistry
14.
Nat Biotechnol ; 39(8): 989-999, 2021 08.
Article in English | MEDLINE | ID: mdl-33859400

ABSTRACT

Plasma-derived polyclonal antibody therapeutics, such as intravenous immunoglobulin, have multiple drawbacks, including low potency, impurities, insufficient supply and batch-to-batch variation. Here we describe a microfluidics and molecular genomics strategy for capturing diverse mammalian antibody repertoires to create recombinant multivalent hyperimmune globulins. Our method generates diverse mixtures of thousands of recombinant antibodies, enriched for specificity and activity against therapeutic targets. Each hyperimmune globulin product comprised thousands to tens of thousands of antibodies derived from convalescent or vaccinated human donors or from immunized mice. Using this approach, we generated hyperimmune globulins with potent neutralizing activity against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) in under 3 months, Fc-engineered hyperimmune globulins specific for Zika virus that lacked antibody-dependent enhancement of disease, and hyperimmune globulins specific for lung pathogens present in patients with primary immune deficiency. To address the limitations of rabbit-derived anti-thymocyte globulin, we generated a recombinant human version and demonstrated its efficacy in mice against graft-versus-host disease.


Subject(s)
B-Lymphocytes/immunology , COVID-19/therapy , Globulins/biosynthesis , SARS-CoV-2/immunology , Animals , Antiviral Antibodies/immunology , CHO Cells , Cricetulus , Enzyme-Linked Immunosorbent Assay , Globulins/immunology , Humans , Passive Immunization , Mice , Recombinant Proteins/biosynthesis , Recombinant Proteins/immunology , Zika Virus/immunology , COVID-19 Serotherapy
15.
Nat Biotechnol ; 38(5): 609-619, 2020 05.
Article in English | MEDLINE | ID: mdl-32393905

ABSTRACT

T cells engineered to express antigen-specific T cell receptors (TCRs) are potent therapies for viral infections and cancer. However, efficient identification of clinical candidate TCRs is complicated by the size and complexity of T cell repertoires and the challenges of working with primary T cells. Here we present a high-throughput method to identify TCRs with high functional avidity from diverse human T cell repertoires. The approach used massively parallel microfluidics to generate libraries of natively paired, full-length TCRαβ clones, from millions of primary T cells, which were then expressed in Jurkat cells. The TCRαβ-Jurkat libraries enabled repeated screening and panning for antigen-reactive TCRs using peptide major histocompatibility complex binding and cellular activation. We captured more than 2.9 million natively paired TCRαβ clonotypes from six healthy human donors and identified rare (<0.001% frequency) viral-antigen-reactive TCRs. We also mined a tumor-infiltrating lymphocyte sample from a patient with melanoma and identified several tumor-specific TCRs, which, after expression in primary T cells, led to tumor cell killing.


Subject(s)
Antigens/analysis , Alpha-Beta T-Cell Antigen Receptors/immunology , T-Lymphocytes/cytology , Cell Engineering , Gene Library , Humans , Jurkat Cells , Tumor-Infiltrating Lymphocytes/immunology , Melanoma/immunology , T-Lymphocytes/immunology , Viruses/immunology
16.
BMC Bioinformatics ; 10: 396, 2009 Dec 02.
Article in English | MEDLINE | ID: mdl-19954534

ABSTRACT

BACKGROUND: While substitution matrices can readily be computed from reference alignments, it is challenging to compute optimal or approximately optimal gap penalties. It is also not well understood which substitution matrices are the most effective when alignment accuracy is the goal rather than homolog recognition. Here a new parameter optimization procedure, POP, is described and applied to the problems of optimizing gap penalties and selecting substitution matrices for pair-wise global protein alignments. RESULTS: POP is compared to a recent method due to Kim and Kececioglu and found to achieve from 0.2% to 1.3% higher accuracies on pair-wise benchmarks extracted from BALIBASE. The VTML matrix series is shown to be the most accurate on several global pair-wise alignment benchmarks, with VTML200 giving best or close to the best performance in all tests. BLOSUM matrices are found to be slightly inferior, even with the marginal improvements in the bug-fixed RBLOSUM series. The PAM series is significantly worse, giving accuracies typically 2% less than VTML. Integer rounding is found to cause slight degradations in accuracy. No evidence is found that selecting a matrix based on sequence divergence improves accuracy, suggesting that the use of this heuristic in CLUSTALW may be ineffective. Using VTML200 is found to improve the accuracy of CLUSTALW by 8% on BALIBASE and 5% on PREFAB. CONCLUSION: The hypothesis that more accurate alignments of distantly related sequences may be achieved using low-identity matrices is shown to be false for commonly used matrix types. Source code and test data are freely available from the author's web site at http://www.drive5.com/pop.


Subject(s)
Computational Biology/methods , Proteins/chemistry , Sequence Alignment/methods , Algorithms , Protein Databases , Protein Sequence Analysis
17.
Curr Opin Struct Biol ; 16(3): 368-73, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16679011

ABSTRACT

Multiple sequence alignments are an essential tool for protein structure and function prediction, phylogeny inference and other common tasks in sequence analysis. Recently developed systems have advanced the state of the art with respect to accuracy, ability to scale to thousands of proteins and flexibility in comparing proteins that do not share the same domain architecture. New multiple alignment benchmark databases include PREFAB, SABMARK, OXBENCH and IRMBASE. Although CLUSTALW is still the most popular alignment tool to date, recent methods offer significantly better alignment quality and, in some cases, reduced computational cost.


Subject(s)
Sequence Alignment/methods , Software , Algorithms , Computational Biology/methods , Protein Sequence Analysis , Amino Acid Sequence Homology
18.
MAbs ; 11(5): 870-883, 2019 07.
Article in English | MEDLINE | ID: mdl-30898066

ABSTRACT

Immunization of mice followed by hybridoma or B-cell screening is one of the most common antibody discovery methods used to generate therapeutic monoclonal antibody (mAb) candidates. There are a multitude of different immunization protocols that can generate an immune response in animals. However, an extensive analysis of the antibody repertoires that these alternative immunization protocols can generate has not been performed. In this study, we immunized mice that transgenically express human antibodies with either programmed cell death 1 protein or cytotoxic T-lymphocyte associated protein 4 using four different immunization protocols, and then utilized a single cell microfluidic platform to generate tissue-specific, natively paired immunoglobulin (Ig) repertoires from each method and enriched for target-specific binders using yeast single-chain variable fragment (scFv) display. We deep sequenced the scFv repertoires from both the pre-sort and post-sort libraries. All methods and both targets yielded similar oligoclonality, variable (V) and joining (J) gene usage, and divergence from germline of enriched libraries. However, there were differences between targets and/or immunization protocols for overall clonal counts, complementarity-determining region 3 (CDR3) length, and antibody/CDR3 sequence diversity. Our data suggest that, although different immunization protocols may generate a response to an antigen, performing multiple immunization protocols in parallel can yield greater Ig diversity. We conclude that modern microfluidic methods, followed by an extensive molecular genomic analysis of antibody repertoires, can be used to quickly analyze new immunization protocols or mouse platforms.


Subject(s)
Humanized Monoclonal Antibodies/genetics , Antibody Diversity , Immunization/methods , Microfluidics/methods , Animals , Humanized Monoclonal Antibodies/immunology , B-Lymphocytes/immunology , CTLA-4 Antigen/immunology , Complementarity Determining Regions/genetics , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Hybridomas , Mice , Transgenic Mice , Peptide Library , Programmed Cell Death 1 Receptor/immunology , Single-Chain Antibodies/genetics , Single-Chain Antibodies/immunology
19.
Antibodies (Basel) ; 8(1)2019 Feb 19.
Article in English | MEDLINE | ID: mdl-31544823

ABSTRACT

To discover therapeutically relevant antibody candidates, many groups use mouse immunization followed by hybridoma generation or B cell screening. One modern approach is to screen B cells by generating natively paired single chain variable fragment (scFv) display libraries in yeast. Such methods typically rely on soluble antigens for scFv library screening. However, many therapeutically relevant cell-surface targets are difficult to express in a soluble protein format, complicating discovery. In this study, we developed methods to screen humanized mouse-derived yeast scFv libraries using recombinant OX40 protein in cell lysate. We used deep sequencing to compare screening with cell lysate to screening with soluble OX40 protein, in the context of mouse immunizations using either soluble OX40 or OX40-expressing cells and OX40-encoding DNA vector. We found that all tested methods produce a unique diversity of scFv binders. However, when we reformatted forty-one of these scFv as full-length monoclonal antibodies (mAbs), we observed that mAbs identified using soluble antigen immunization with cell lysate sorting always bound cell surface OX40, whereas other methods had significant false positive rates. Antibodies identified using soluble antigen immunization and cell lysate sorting were also significantly more likely to activate OX40 in a cellular assay. Our data suggest that sorting with OX40 protein in cell lysate is more likely than other methods to retain the epitopes required for antibody-mediated OX40 agonism.

20.
Nucleic Acids Res ; 34(20): 5932-42, 2006.
Article in English | MEDLINE | ID: mdl-17068081

ABSTRACT

Multiple sequence alignments are the usual starting point for analyses of protein structure and evolution. For proteins with repeated, shuffled and missing domains, however, traditional multiple sequence alignment algorithms fail to provide an accurate view of homology between related proteins, because they either assume that the input sequences are globally alignable or require locally alignable regions to appear in the same order in all sequences. In this paper, we present ProDA, a novel system for automated detection and alignment of homologous regions in collections of proteins with arbitrary domain architectures. Given an input set of unaligned sequences, ProDA identifies all homologous regions appearing in one or more sequences, and returns a collection of local multiple alignments for these regions. On a subset of the BAliBASE benchmarking suite containing curated alignments of proteins with complicated domain architectures, ProDA performs well in detecting conserved domain boundaries and clustering domain segments, achieving the highest accuracy to date for this task. We conclude that ProDA is a practical tool for automated alignment of protein sequences with repeats and rearrangements in their domain architecture.


Subject(s)
Tertiary Protein Structure , Sequence Alignment/methods , Protein Sequence Analysis , Software , Algorithms , Computational Biology , Internet , Amino Acid Repetitive Sequences , Reproducibility of Results