Results 1 - 20 of 31
1.
PLoS One ; 18(9): e0291169, 2023.
Article in English | MEDLINE | ID: mdl-37729186

ABSTRACT

Campaign contributions are a staple of congressional life. Yet, the search for tangible effects of congressional donations often focuses on the association between contributions and votes on congressional bills. We present an alternative approach by considering the relationship between money and legislators' speech. Floor speeches are an important component of congressional behavior, and reflect a legislator's policy priorities and positions in a way that voting cannot. Our research provides the first comprehensive analysis of the association between a legislator's campaign donors and the policy issues they prioritize with congressional speech. Ultimately, we find a robust relationship between donors and speech, indicating a more pervasive role of money in politics than previously assumed. We use a machine learning framework on a new dataset that brings together legislator metadata for all representatives in the US House between 1995 and 2018, including committee assignments, legislative speech, donation records, and information about Political Action Committees. We compare information about donations against other potential explanatory variables, such as party affiliation, home state, and committee assignments, and find that donors consistently have the strongest association with legislators' issue-attention. We further contribute a procedure for identifying speech and donation events that occur in close proximity to one another and share meaningful connections, identifying the proverbial needles in the haystack of speech and donation activity in Congress which may be cases of interest for investigative journalism. Taken together, our framework, data, and findings can help increase the transparency of the role of money in politics.


Subject(s)
Machine Learning , Tissue Donors , Humans , Metadata , Policy , Politics
2.
Cogsci ; 2021: 1767-1773, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34617074

ABSTRACT

A longstanding question in cognitive science concerns the learning mechanisms underlying compositionality in human cognition. Humans can infer the structured relationships (e.g., grammatical rules) implicit in their sensory observations (e.g., auditory speech), and use this knowledge to guide the composition of simpler meanings into complex wholes. Recent progress in artificial neural networks has shown that when large models are trained on enough linguistic data, grammatical structure emerges in their representations. We extend this work to the domain of mathematical reasoning, where it is possible to formulate precise hypotheses about how meanings (e.g., the quantities corresponding to numerals) should be composed according to structured rules (e.g., order of operations). Our work shows that neural networks are not only able to infer something about the structured relationships implicit in their training data, but can also deploy this knowledge to guide the composition of individual meanings into composite wholes.

3.
Cell ; 177(1): 32-37, 2019 03 21.
Article in English | MEDLINE | ID: mdl-30901545

ABSTRACT

The introduction of exome sequencing in the clinic has sparked tremendous optimism for the future of rare disease diagnosis, and there is exciting opportunity to further leverage these advances. To provide diagnostic clarity to all of these patients, however, there is a critical need for the field to develop and implement strategies to understand the mechanisms underlying all rare diseases and translate these to clinical care.


Subject(s)
Exome Sequencing/trends , Rare Diseases/diagnosis , Translational Research, Biomedical/methods , Exome , Genetic Testing , Genome, Human/genetics , High-Throughput Nucleotide Sequencing/trends , Humans , Rare Diseases/genetics , Sequence Analysis, DNA/methods , Exome Sequencing/methods
4.
IEEE Trans Pattern Anal Mach Intell ; 41(12): 3086-3099, 2019 Dec.
Article in English | MEDLINE | ID: mdl-30130178

ABSTRACT

A new unified video analytics framework (ER3) is proposed for complex event retrieval, recognition and recounting, based on the proposed video imprint representation, which exploits temporal correlations among image features across video frames. With the video imprint representation, it is convenient to reverse map back to both temporal and spatial locations in video frames, allowing for both key frame identification and key areas localization within each frame. In the proposed framework, a dedicated feature alignment module is incorporated for redundancy removal across frames to produce the tensor representation, i.e., the video imprint. Subsequently, the video imprint is individually fed into both a reasoning network and a feature aggregation module, for event recognition/recounting and event retrieval tasks, respectively. Thanks to its attention mechanism inspired by the memory networks used in language modeling, the proposed reasoning network is capable of simultaneous event category recognition and localization of the key pieces of evidence for event recounting. In addition, the latent structure in our reasoning network highlights the areas of the video imprint, which can be directly used for event recounting. With the event retrieval task, the compact video representation aggregated from the video imprint contributes to better retrieval results than existing state-of-the-art methods.

5.
Genome Res ; 27(12): 2015-2024, 2017 12.
Article in English | MEDLINE | ID: mdl-29097404

ABSTRACT

Our ability to predict protein expression from DNA sequence alone remains poor, reflecting our limited understanding of cis-regulatory grammar and hampering the design of engineered genes for synthetic biology applications. Here, we generate a model that predicts the protein expression of the 5' untranslated region (UTR) of mRNAs in the yeast Saccharomyces cerevisiae. We constructed a library of half a million 50-nucleotide-long random 5' UTRs and assayed their activity in a massively parallel growth selection experiment. The resulting data allow us to quantify the impact on protein expression of Kozak sequence composition, upstream open reading frames (uORFs), and secondary structure. We trained a convolutional neural network (CNN) on the random library and showed that it performs well at predicting the protein expression of both a held-out set of the random 5' UTRs as well as native S. cerevisiae 5' UTRs. The model additionally was used to computationally evolve highly active 5' UTRs. We confirmed experimentally that the great majority of the evolved sequences led to higher protein expression rates than the starting sequences, demonstrating the predictive power of this model.
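
A convolutional network of the kind described above needs each 5' UTR as a numeric array. The abstract does not state how the sequences were encoded, so the following is only a minimal sketch of one common choice, one-hot encoding over an assumed A/C/G/T alphabet; the function name `one_hot_encode` is hypothetical.

```python
from typing import List

# Assumed DNA alphabet and column order; the paper's actual encoding
# is not given in the abstract.
ALPHABET = "ACGT"


def one_hot_encode(seq: str) -> List[List[int]]:
    """Return a len(seq) x 4 one-hot matrix.

    Bases outside ALPHABET (for example N) become all-zero rows.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows
```

A 50-nucleotide UTR therefore becomes a 50 x 4 matrix that can be fed to a one-dimensional convolution layer.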


Subject(s)
Models, Genetic , Saccharomyces cerevisiae/genetics , 5' Untranslated Regions , Alternative Splicing , Computer Simulation , Gene Library , Machine Learning , Neural Networks, Computer , RNA, Fungal , RNA, Messenger
6.
Artif Intell Med ; 70: 1-11, 2016 06.
Article in English | MEDLINE | ID: mdl-27431033

ABSTRACT

OBJECTIVE: High-throughput technologies have generated an unprecedented amount of high-dimensional gene expression data. Algorithmic approaches could be extremely useful to distill information and derive compact interpretable representations of the statistical patterns present in the data. This paper proposes a mining approach to extract an informative representation of gene expression profiles based on a generative model called the Counting Grid (CG). METHOD: Using the CG model, gene expression values are arranged on a discrete grid, learned in a way that "similar" co-expression patterns are arranged in close proximity, thus resulting in an intuitive visualization of the dataset. Moreover, the model makes it possible to identify the genes that distinguish between classes (e.g. different types of cancer). Finally, each sample can be characterized with a discriminative signature - extracted from the model - that can be effectively employed for classification. RESULTS: A thorough evaluation on several gene expression datasets demonstrates the suitability of the proposed approach from a twofold perspective: numerically, we reached state-of-the-art classification accuracies on 5 datasets out of 7, and similar results when the approach is tested in a gene selection setting (with a stability always above 0.87); clinically, by confirming that many of the genes highlighted by the model as significant also play a key role in cancer biology. CONCLUSION: The proposed framework can be successfully exploited to meaningfully visualize the samples, detect medically relevant genes, and properly classify samples.


Subject(s)
Algorithms , Data Mining , Gene Expression Profiling , Cluster Analysis , Genes, Neoplasm , Humans , Neoplasms/genetics
7.
AIDS ; 30(5): 701-11, 2016 Mar 13.
Article in English | MEDLINE | ID: mdl-26730570

ABSTRACT

OBJECTIVES: AIDS is caused by CD4 T-cell depletion. Although combination antiretroviral therapy can restore blood T-cell numbers, the clonal diversity of the reconstituting cells, critical for immunocompetence, is not well defined. METHODS: We performed an extensive analysis of parameters of thymic function in perinatally HIV-1-infected (n = 39) and control (n = 28) participants ranging from 13 to 23 years of age. CD4 T cells, including naive (CD27 CD45RA) and recent thymic emigrant (RTE) (CD31/CD45RA) cells, were quantified by flow cytometry. Deep sequencing was used to examine T-cell receptor (TCR) sequence diversity in sorted RTE CD4 T cells. RESULTS: Infected participants had reduced CD4 T-cell levels with predominant depletion of the memory subset and preservation of naive cells. RTE CD4 T-cell levels were normal in most infected individuals, and enhanced thymopoiesis was indicated by higher proportions of CD4 T cells containing TCR recombination excision circles. Memory CD4 T-cell depletion was highly associated with CD8 T-cell activation in HIV-1-infected persons, and plasma interleukin-7 levels were correlated with naive CD4 T cells, suggesting activation-driven loss and compensatory enhancement of thymopoiesis. Deep sequencing of CD4 T-cell receptor sequences in well-compensated infected persons demonstrated supranormal diversity, providing additional evidence of enhanced thymic output. CONCLUSION: Despite up to two decades of infection, many individuals have remarkable thymic reserve to compensate for ongoing CD4 T-cell loss, although there is ongoing viral replication and immune activation despite combination antiretroviral therapy. The longer-term sustainability of this physiology remains to be determined.


Subject(s)
CD4-Positive T-Lymphocytes/immunology , HIV Infections/immunology , HIV-1/growth & development , T-Lymphocyte Subsets/immunology , Thymus Gland/physiology , Adolescent , CD4-Positive T-Lymphocytes/chemistry , CD4-Positive T-Lymphocytes/classification , Female , Flow Cytometry , Genetic Variation , HIV Infections/virology , High-Throughput Nucleotide Sequencing , Humans , Leukocyte Common Antigens/analysis , Male , Platelet Endothelial Cell Adhesion Molecule-1/analysis , Receptors, Antigen, T-Cell/genetics , Sequence Analysis, DNA , T-Lymphocyte Subsets/chemistry , T-Lymphocyte Subsets/classification , Tumor Necrosis Factor Receptor Superfamily, Member 7/analysis , Young Adult
8.
J Infect Dis ; 213(8): 1248-52, 2016 Apr 15.
Article in English | MEDLINE | ID: mdl-26655301

ABSTRACT

Outcomes of chronic infection with hepatitis B virus (HBV) are varied, with increased morbidity reported in the context of human immunodeficiency virus (HIV) coinfection. The factors driving different outcomes are not well understood, but there is increasing interest in an HLA class I effect. We therefore studied the influence of HLA class I on HBV in an African HIV-positive cohort. We demonstrated that virologic markers of HBV disease activity (hepatitis B e antigen status or HBV DNA level) are associated with HLA-A genotype. This finding supports the role of the CD8(+) T-cell response in HBV control, and potentially informs future therapeutic T-cell vaccine strategies.


Subject(s)
Coinfection , HIV Infections , HLA Antigens/genetics , Hepatitis B e Antigens/blood , Hepatitis B , Adult , Cohort Studies , Coinfection/complications , Coinfection/epidemiology , Coinfection/genetics , Coinfection/virology , Female , HIV Infections/complications , HIV Infections/epidemiology , HIV Infections/virology , Hepatitis B/complications , Hepatitis B/epidemiology , Hepatitis B/genetics , Hepatitis B/virology , Humans , Male , Prevalence , ROC Curve
9.
IEEE Trans Pattern Anal Mach Intell ; 37(12): 2374-87, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26539844

ABSTRACT

In recent scene recognition research, images or large image regions are often represented as disorganized "bags" of features, which can then be analyzed using models originally developed to capture co-variation of word counts in text. However, image feature counts are likely to be constrained in different ways than word counts in text. For example, as a camera pans upwards from a building entrance over its first few floors and then further up into the sky (Fig. 1), feature counts change slightly as the field of view moves: the abundance of the "car" features is reduced, but the counts of the features found on building facades are increased. The counting grid model accounts for such changes naturally, and it can also account for images of different scenes.

10.
Science ; 347(6218): 1254806, 2015 Jan 09.
Article in English | MEDLINE | ID: mdl-25525159

ABSTRACT

To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.


Subject(s)
Artificial Intelligence , Child Development Disorders, Pervasive/genetics , Colorectal Neoplasms, Hereditary Nonpolyposis/genetics , Genome-Wide Association Study/methods , Molecular Sequence Annotation/methods , Muscular Atrophy, Spinal/genetics , RNA Splicing/genetics , Adaptor Proteins, Signal Transducing/genetics , Computer Simulation , DNA/genetics , Exons/genetics , Genetic Code , Genetic Markers , Genetic Variation , Humans , Introns/genetics , Models, Genetic , MutL Protein Homolog 1 , Mutation, Missense , Nuclear Proteins/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci , RNA Splice Sites/genetics , RNA-Binding Proteins/genetics
11.
Med Image Comput Comput Assist Interv ; 17(Pt 2): 805-12, 2014.
Article in English | MEDLINE | ID: mdl-25485454

ABSTRACT

This paper exploits the embedding provided by the counting grid model and proposes a framework for the classification and analysis of brain MRI images. Each brain, encoded by a count of local features, is mapped into a window on a grid of feature distributions. Similar samples are mapped in close proximity on the grid, and the commonalities in their feature distributions are reflected in the overlap of windows on the grid. Here we exploited these properties to design a novel kernel and a visualization strategy, which we applied to the analysis of schizophrenic patients. Experiments report a clear improvement in classification accuracy as compared with similar methods. Moreover, our visualizations are able to highlight brain clusters and to obtain a visual interpretation of the features related to the disease.


Subject(s)
Algorithms , Brain Mapping/methods , Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Pattern Recognition, Automated/methods , Schizophrenia/pathology , Female , Humans , Image Enhancement/methods , Male , Reproducibility of Results , Sensitivity and Specificity
12.
Acta Trop ; 135: 104-21, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24681218

ABSTRACT

Malaria remains a public health hazard in tropical countries as a consequence of the rise and spread of drug and insecticide resistance; hence the need for a vaccine with widespread application. Protective immunity to malaria is known to be mediated by both antibody and cellular immune responses, though characterization of the latter has been less extensive. The aim of the present investigation was to identify novel T-cell epitopes that may contribute to naturally acquired immune responses against malaria. Using the Microsoft software Epitome™, T-cell peptide epitopes on 19 Plasmodium falciparum proteins in the Plasmodium Database (www.plasmodb.org; PlasmoDB 9.0) were predicted in silico. The peptides were synthesized and used to stimulate peripheral blood mononuclear cells (PBMCs) in 14 semi-immune and 21 malaria-susceptible subjects for interferon-gamma (IFN-γ) production ex vivo. The level of IFN-γ production, a marker of T-cell responses, was measured by ELISPOT assay in semi-immune subjects (SIS) and frequently sick subjects (FSS) from an endemic zone with perennial malaria transmission. Of the 19 proteins studied, 17 yielded 27 pools (189 peptides), which were reactive with the subjects' PBMCs when tested for IFN-γ production, taking a stimulation index (SI) of ≥2 as a cutoff point for a positive response. There were 10 reactive peptide pools (constituting eight protein loci) with an SI of 10 or greater. Of the 19 proteins studied, two were known vaccine candidates (MSP-8 and SSP2/TRAP), which reacted with both SIS and FSS. Similarly, the hypothetical proteins (PFF1030w, PFE0795c, PFD0880w, PFC0065c and PF10_0052) also reacted strongly with both SIS and FSS, making them attractive for further characterization as mediators of protective immunity and/or pathogenesis.
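
The positivity rule in this abstract (SI ≥ 2) can be illustrated with a short sketch. The abstract does not define how the stimulation index was computed, so the ratio below (spot count in peptide-stimulated wells divided by spot count in unstimulated control wells) is an assumption based on common ELISPOT practice; the function names are hypothetical.

```python
def stimulation_index(stimulated_spots: float, background_spots: float) -> float:
    """Ratio of stimulated to unstimulated spot counts (assumed SI definition)."""
    if background_spots <= 0:
        raise ValueError("background spot count must be positive")
    return stimulated_spots / background_spots


def is_positive(si: float, cutoff: float = 2.0) -> bool:
    """Apply the SI >= 2 cutoff quoted in the abstract."""
    return si >= cutoff
```

Under this assumed definition, a pool giving 40 spots against a background of 10 has an SI of 4 and counts as reactive.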


Subject(s)
Antigens, Protozoan/immunology , Interferon-gamma/metabolism , Malaria, Falciparum/immunology , Plasmodium falciparum/immunology , Protozoan Proteins/immunology , T-Lymphocytes/immunology , Adult , Computer Simulation , Enzyme-Linked Immunospot Assay , Female , Humans , Male , Middle Aged , Peptides/genetics , Peptides/immunology , Young Adult
13.
Pac Symp Biocomput ; : 288-99, 2014.
Article in English | MEDLINE | ID: mdl-24297555

ABSTRACT

The immune system gathers evidence of the execution of various molecular processes, both foreign and the cells' own, as time- and space-varying sets of epitopes, small linear or conformational segments of the proteins involved in these processes. Epitopes do not have any obvious ordering in this scheme: The immune system simply sees these epitope sets as disordered "bags" of simple signatures based on whose contents the actions need to be decided. The immense landscape of possible bags of epitopes is shaped by the cellular pathways in various cells, as well as the characteristics of the internal sampling process that chooses and brings epitopes to cellular surface. As a consequence, upon the infection by the same pathogen, different individuals' cells present very different epitope sets. Modeling this landscape should thus be a key step in computational immunology. We show that among possible bag-of-words models, the counting grid is most fit for modeling cellular presentation. We describe each patient by a bag-of-peptides they are likely to present on the cellular surface. In regression tests, we found that compared to the state-of-the-art, counting grids explain more than twice as much of the log viral load variance in these patients. This is potentially a significant advancement in the field, given that a large part of the log viral load variance also depends on the infecting HIV strain, and that HIV polymorphisms themselves are known to strongly associate with HLA types, both effects beyond what is modeled here.


Subject(s)
HIV/genetics , HIV/immunology , Models, Immunological , Viral Load/statistics & numerical data , Computational Biology , Epitopes/genetics , HIV Antigens/genetics , HIV Infections/immunology , HIV Infections/virology , HLA Antigens/genetics , HLA Antigens/metabolism , Histocompatibility Testing , Host-Pathogen Interactions/genetics , Host-Pathogen Interactions/immunology , Humans , Precision Medicine , Regression Analysis
14.
Proc Natl Acad Sci U S A ; 110(33): 13492-7, 2013 Aug 13.
Article in English | MEDLINE | ID: mdl-23878211

ABSTRACT

Experimental and computational evidence suggests that HLAs preferentially bind conserved regions of viral proteins, a concept we term "targeting efficiency," and that this preference may provide improved clearance of infection in several viral systems. To test this hypothesis, T-cell responses to A/H1N1 (2009) were measured from peripheral blood mononuclear cells obtained from a household cohort study performed during the 2009-2010 influenza season. We found that HLA targeting efficiency scores significantly correlated with IFN-γ enzyme-linked immunosorbent spot responses (P = 0.042, multiple regression). A further population-based analysis found that the carriage frequencies of the alleles with the lowest targeting efficiencies, A*24, were associated with pH1N1 mortality (r = 0.37, P = 0.031) and are common in certain indigenous populations in which increased pH1N1 morbidity has been reported. HLA efficiency scores and HLA use are associated with CD8 T-cell magnitude in humans after influenza infection. The computational tools used in this study may be useful predictors of potential morbidity and identify immunologic differences of new variant influenza strains more accurately than evolutionary sequence comparisons. Population-based studies of the relative frequency of these alleles in severe vs. mild influenza cases might advance clinical practices for severe H1N1 infections among genetically susceptible populations.


Subject(s)
CD4-Positive T-Lymphocytes/immunology , HLA Antigens/immunology , Influenza A Virus, H1N1 Subtype , Influenza, Human/epidemiology , Influenza, Human/immunology , Cohort Studies , Computational Biology/methods , Enzyme-Linked Immunospot Assay , Gene Frequency , HLA Antigens/metabolism , Humans , Interferon-gamma/immunology , Models, Statistical , Regression Analysis
15.
BMC Bioinformatics ; 13 Suppl 6: S11, 2012 Apr 19.
Article in English | MEDLINE | ID: mdl-22537040

ABSTRACT

Transcript quantification is a long-standing problem in genomics and estimating the relative abundance of alternatively-spliced isoforms from the same transcript is an important special case. Both problems have recently been illuminated by high-throughput RNA sequencing experiments which are quickly generating large amounts of data. However, much of the signal present in this data is corrupted or obscured by biases resulting in non-uniform and non-proportional representation of sequences from different transcripts. Many existing analyses attempt to deal with these and other biases with various task-specific approaches, which makes direct comparison between them difficult. However, two popular tools for isoform quantification, MISO and Cufflinks, have adopted a general probabilistic framework to model and mitigate these biases in a more general fashion. These advances motivate the need to investigate the effects of RNA-seq biases on the accuracy of different approaches for isoform quantification. We conduct the investigation by building models of increasing sophistication to account for noise introduced by the biases and compare their accuracy to the established approaches. We focus on methods that estimate the expression of alternatively-spliced isoforms with the percent-spliced-in (PSI) metric for each exon skipping event. To improve their estimates, many methods use evidence from RNA-seq reads that align to exon bodies. However, the methods we propose focus on reads that span only exon-exon junctions. As a result, our approaches are simpler and less sensitive to exon definitions than existing methods, which enables us to distinguish their strengths and weaknesses more easily. We present several probabilistic models of position-specific read counts with increasing complexity and compare them to each other and to the current state-of-the-art methods in isoform quantification, MISO and Cufflinks.
On a validation set with RT-PCR measurements for 26 cassette events, some of our methods are more accurate and some are significantly more consistent than these two popular tools. This comparison demonstrates the challenges in estimating the percent inclusion of alternatively spliced junctions and illuminates the tradeoffs between different approaches.
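
To make the PSI metric concrete, here is a minimal sketch of a naive junction-only estimate for a single cassette exon. It is not one of the probabilistic models the abstract refers to; it simply averages the two inclusion junctions and compares them with the skipping junction. The function and parameter names are hypothetical.

```python
def psi_from_junction_reads(inc_up: int, inc_down: int, exc: int) -> float:
    """Naive percent-spliced-in from junction-spanning read counts.

    inc_up:   reads spanning the upstream exon -> cassette exon junction
    inc_down: reads spanning the cassette exon -> downstream exon junction
    exc:      reads spanning the upstream -> downstream (skipping) junction
    """
    if min(inc_up, inc_down, exc) < 0:
        raise ValueError("read counts must be non-negative")
    # Inclusion is supported by two junctions, exclusion by one, so the
    # inclusion counts are averaged to keep the comparison per junction.
    inclusion = (inc_up + inc_down) / 2.0
    total = inclusion + exc
    if total == 0:
        raise ValueError("no junction reads for this event")
    return inclusion / total
```

This point estimate ignores the position-specific biases the abstract discusses, which is exactly the kind of noise the more sophisticated models are meant to absorb.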


Subject(s)
Alternative Splicing , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, RNA/methods , Exons , Gene Expression Profiling , HeLa Cells , Humans , Models, Statistical , Reverse Transcriptase Polymerase Chain Reaction
16.
IEEE Trans Pattern Anal Mach Intell ; 34(7): 1249-62, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22156097

ABSTRACT

A score function induced by a generative model of the data can provide a feature vector of a fixed dimension for each data sample. Data samples themselves may be of differing lengths (e.g., speech segments or other sequential data), but as a score function is based on the properties of the data generation process, it produces a fixed-length vector in a highly informative space, typically referred to as "score space." Discriminative classifiers have been shown to achieve higher performances in appropriately chosen score spaces with respect to what is achievable by either the corresponding generative likelihood-based classifiers or the discriminative classifiers using standard feature extractors. In this paper, we present a novel score space that exploits the free energy associated with a generative model. The resulting free energy score space (FESS) takes into account the latent structure of the data at various levels and can be shown to lead to classification performance that at least matches the performance of the free energy classifier based on the same generative model and the same factorization of the posterior. We also show that in several typical computer vision and computational biology applications the classifiers optimized in FESS outperform the corresponding pure generative approaches, as well as a number of previous approaches combining discriminating and generative models.

17.
J Comput Biol ; 18(11): 1649-60, 2011 Nov.
Article in English | MEDLINE | ID: mdl-22047543

ABSTRACT

This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.


Subject(s)
Sequence Analysis, Protein/methods , Statistics, Nonparametric , Algorithms , Amino Acid Sequence , Base Sequence , Bayes Theorem , Computer Simulation , Genes, T-Cell Receptor , Genetic Variation , Haplotypes , Hemagglutinin Glycoproteins, Influenza Virus/chemistry , Hemagglutinin Glycoproteins, Influenza Virus/genetics , Histocompatibility Antigens Class I/chemistry , Histocompatibility Antigens Class I/genetics , Humans , Immunoglobulins/genetics , Likelihood Functions , Models, Genetic , Orthomyxoviridae/genetics , Polymorphism, Single Nucleotide , Recombination, Genetic
19.
PLoS One ; 6(3): e17969, 2011 Mar 28.
Article in English | MEDLINE | ID: mdl-21464919

ABSTRACT

Different vaccine approaches cope with HIV-1 diversity, ranging from centralized(1-4) to variability-encompassing(5-7) antigens. For all these strategies, a concern remains: how does HIV-1 diversity impact epitope recognition by the immune system? We studied the relationship between HIV-1 diversity and CD8(+) T Lymphocytes (CTL) targeting of HIV-1 subtype B Nef using 944 peptides (10-mers overlapping by nine amino acids (AA)) that corresponded to consensus peptides and their most common variants in the HIV-1-B virus population. IFN-γ ELISpot assays were performed using freshly isolated PBMC from 26 HIV-1-infected persons. Three hundred and fifty peptides elicited a response in at least one individual. Individuals targeted a median of 7 discrete regions. Overall, 33% of responses were directed against viral variants but not elicited against consensus-based test peptides. However, there was no significant relationship between the frequency of a 10-mer in the viral population and either its frequency of recognition (Spearman's correlation coefficient ρ = 0.24) or the magnitude of the responses (ρ = 0.16). We found that peptides with a single mutation compared to the consensus were likely to be recognized (especially if the change was conservative) and to elicit responses of similar magnitude as the consensus peptide. Our results indicate that cross-reactivity between rare and frequent variants is likely to play a role in the expansion of CTL responses, and that maximizing antigenic diversity in a vaccine may increase the breadth and depth of CTL responses. However, since there are few obvious preferred pathways to virologic escape, the diversity that may be required to block all potential escape pathways may be too large for a realistic vaccine to accommodate. 
Furthermore, since peptides were not recognized based on their frequency in the population, it remains unclear by which mechanisms variability-inclusive antigens (i.e., constructs enriched with frequent variants) expand CTL recognition.
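
The peptide design mentioned in this abstract (10-mers overlapping by nine amino acids) amounts to a sliding window that advances one residue at a time. A minimal sketch follows; the function name is hypothetical and the example sequence in the usage note is arbitrary, not taken from Nef.

```python
from typing import List


def overlapping_peptides(protein: str, length: int = 10, overlap: int = 9) -> List[str]:
    """Tile a protein sequence into fixed-length peptides with a given overlap."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than peptide length")
    # Stop when a full-length window no longer fits in the sequence.
    return [protein[i:i + length] for i in range(0, len(protein) - length + 1, step)]
```

For a 12-residue toy sequence such as "ABCDEFGHIJKL", this yields three 10-mers, each shifted by one position from the previous one.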


Subject(s)
Epitopes/immunology , HIV-1/immunology , Mutation/genetics , T-Lymphocytes, Cytotoxic/immunology , nef Gene Products, Human Immunodeficiency Virus/immunology , Amino Acid Sequence , Base Sequence , Consensus Sequence , Cross Reactions/immunology , Enzyme-Linked Immunospot Assay , Humans , Interferon-gamma/immunology , Peptides/chemistry , Peptides/immunology
20.
J Immunol Methods ; 374(1-2): 35-42, 2011 Nov 30.
Article in English | MEDLINE | ID: mdl-20934429

ABSTRACT

There has been considerable interest in statistical approaches that leverage the large volumes of experimental data to predict the binding of Major Histocompatibility Complex class I (MHC-I) molecules to peptides. Here we present our method for averaging together multiple predictors for MHC-peptide binding, where given a particular MHC molecule, a set of predictors and a set of training peptides, our method will average multiple simple predictors for MHC binding to produce a final prediction of the binding affinity between a given MHC molecule and a test peptide. The averaging of predictors is done using a nonparametric method, whereby for any test peptide, we identify similar peptides in the training set and average the predictions on the training set, weighted by each predictor's average accuracy for similar peptides in the training set. We show that our method significantly improves on individual predictors based on held-out data and also produces a predictor whose accuracy is competitive with state-of-the-art techniques based on the results from the Machine Learning in Immunology competition in which 21 submitted techniques were assessed on their accuracy in predicting the binding of HLA-A*0101, HLA-A*0201 and HLA-B*0702 molecules to 9-mer and 10-mer peptides.
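
The averaging idea can be sketched as follows. This is a simplified reading of the abstract, not the authors' procedure: similarity is assumed to be the number of matching positions between equal-length peptides, the neighbourhood is the `k` most similar training peptides, and each predictor's weight is assumed to be `1 / (1 + mean absolute error)` on that neighbourhood. All names here are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Predictor = Callable[[str], float]


def matching_positions(a: str, b: str) -> int:
    """Number of positions at which two equal-length peptides agree."""
    return sum(x == y for x, y in zip(a, b))


def ensemble_predict(
    test_peptide: str,
    training: Sequence[Tuple[str, float]],  # (peptide, measured affinity)
    predictors: Sequence[Predictor],
    k: int = 3,
) -> float:
    """Average predictors, weighted by local accuracy near the test peptide."""
    same_len = [(p, y) for p, y in training if len(p) == len(test_peptide)]
    neighbours = sorted(
        same_len,
        key=lambda py: matching_positions(test_peptide, py[0]),
        reverse=True,
    )[:k]

    weights: List[float] = []
    for predict in predictors:
        if neighbours:
            mean_err = sum(abs(predict(p) - y) for p, y in neighbours) / len(neighbours)
        else:
            mean_err = 0.0  # no neighbours: fall back to equal weights
        weights.append(1.0 / (1.0 + mean_err))  # assumed accuracy-to-weight mapping

    total = sum(weights)
    return sum(w * predict(test_peptide) for w, predict in zip(weights, predictors)) / total
```

Because the weights are recomputed for every test peptide, a predictor that does well on one region of peptide space can dominate there while contributing little elsewhere, which is the nonparametric aspect the abstract emphasizes.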


Subject(s)
HLA Antigens/metabolism , Models, Immunological , Artificial Intelligence , Cytomegalovirus/genetics , Cytomegalovirus/immunology , Databases, Factual , Epitope Mapping , HLA-A1 Antigen/metabolism , HLA-A2 Antigen/metabolism , HLA-B7 Antigen/metabolism , Humans , Oligopeptides/genetics , Oligopeptides/metabolism , Statistics, Nonparametric , Viral Proteins/genetics , Viral Proteins/immunology , Viral Proteins/metabolism