Results 1 - 20 of 31
1.
Cell ; 177(1): 32-37, 2019 03 21.
Article in English | MEDLINE | ID: mdl-30901545

ABSTRACT

The introduction of exome sequencing in the clinic has sparked tremendous optimism for the future of rare disease diagnosis, and there is exciting opportunity to further leverage these advances. To provide diagnostic clarity to all of these patients, however, there is a critical need for the field to develop and implement strategies to understand the mechanisms underlying all rare diseases and translate these to clinical care.


Subject(s)
Exome Sequencing/trends, Rare Diseases/diagnosis, Translational Biomedical Research/methods, Exome, Genetic Testing, Human Genome/genetics, High-Throughput Nucleotide Sequencing/trends, Humans, Rare Diseases/genetics, DNA Sequence Analysis/methods, Exome Sequencing/methods
2.
Genome Res ; 27(12): 2015-2024, 2017 12.
Article in English | MEDLINE | ID: mdl-29097404

ABSTRACT

Our ability to predict protein expression from DNA sequence alone remains poor, reflecting our limited understanding of cis-regulatory grammar and hampering the design of engineered genes for synthetic biology applications. Here, we generate a model that predicts the protein expression of the 5' untranslated region (UTR) of mRNAs in the yeast Saccharomyces cerevisiae. We constructed a library of half a million 50-nucleotide-long random 5' UTRs and assayed their activity in a massively parallel growth selection experiment. The resulting data allow us to quantify the impact on protein expression of Kozak sequence composition, upstream open reading frames (uORFs), and secondary structure. We trained a convolutional neural network (CNN) on the random library and showed that it performs well at predicting the protein expression of both a held-out set of the random 5' UTRs and native S. cerevisiae 5' UTRs. The model was additionally used to computationally evolve highly active 5' UTRs. We confirmed experimentally that the great majority of the evolved sequences led to higher protein expression rates than the starting sequences, demonstrating the predictive power of this model.


Subject(s)
Genetic Models, Saccharomyces cerevisiae/genetics, 5' Untranslated Regions, Alternative Splicing, Computer Simulation, Gene Library, Machine Learning, Computer Neural Networks, Fungal RNA, Messenger RNA
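The abstract of entry 2 describes training a CNN on 50-nucleotide 5' UTR sequences and quantifying the effect of upstream open reading frames. A minimal sketch of two plausible preprocessing steps follows: one-hot encoding a sequence (a common input format for sequence CNNs) and locating upstream ATG start codons. The function names and the choice of encoding are assumptions made for illustration; the abstract does not specify how the authors encoded their inputs or called uORFs.

```python
from typing import List

NUCLEOTIDES = "ACGT"  # assumed column order for the one-hot matrix


def one_hot(seq: str) -> List[List[int]]:
    """Encode a DNA sequence as a len(seq) x 4 matrix of 0/1 values.

    Characters outside A/C/G/T (for example N) become an all-zero row.
    """
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq.upper()]


def upstream_atg_positions(utr: str) -> List[int]:
    """Return the 0-based start position of every ATG inside a 5' UTR."""
    utr = utr.upper()
    return [i for i in range(len(utr) - 2) if utr[i:i + 3] == "ATG"]


if __name__ == "__main__":
    toy_utr = "CCATGAATGC"  # toy sequence, not taken from the study
    print(one_hot(toy_utr[:2]))             # [[0, 1, 0, 0], [0, 1, 0, 0]]
    print(upstream_atg_positions(toy_utr))  # [2, 6]
```

Each upstream ATG found this way is only a candidate uORF start; deciding whether it opens a reading frame would need a further check for an in-frame stop codon.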
3.
J Infect Dis ; 213(8): 1248-52, 2016 Apr 15.
Article in English | MEDLINE | ID: mdl-26655301

ABSTRACT

Outcomes of chronic infection with hepatitis B virus (HBV) are varied, with increased morbidity reported in the context of human immunodeficiency virus (HIV) coinfection. The factors driving different outcomes are not well understood, but there is increasing interest in an HLA class I effect. We therefore studied the influence of HLA class I on HBV in an African HIV-positive cohort. We demonstrated that virologic markers of HBV disease activity (hepatitis B e antigen status or HBV DNA level) are associated with HLA-A genotype. This finding supports the role of the CD8(+) T-cell response in HBV control, and potentially informs future therapeutic T-cell vaccine strategies.


Subject(s)
Coinfection, HIV Infections, HLA Antigens/genetics, Hepatitis B e Antigens/blood, Hepatitis B, Adult, Cohort Studies, Coinfection/complications, Coinfection/epidemiology, Coinfection/genetics, Coinfection/virology, Female, HIV Infections/complications, HIV Infections/epidemiology, HIV Infections/virology, Hepatitis B/complications, Hepatitis B/epidemiology, Hepatitis B/genetics, Hepatitis B/virology, Humans, Male, Prevalence, ROC Curve
4.
Proc Natl Acad Sci U S A ; 110(33): 13492-7, 2013 Aug 13.
Article in English | MEDLINE | ID: mdl-23878211

ABSTRACT

Experimental and computational evidence suggests that HLAs preferentially bind conserved regions of viral proteins, a concept we term "targeting efficiency," and that this preference may provide improved clearance of infection in several viral systems. To test this hypothesis, T-cell responses to A/H1N1 (2009) were measured from peripheral blood mononuclear cells obtained from a household cohort study performed during the 2009-2010 influenza season. We found that HLA targeting efficiency scores significantly correlated with IFN-γ enzyme-linked immunosorbent spot responses (P = 0.042, multiple regression). A further population-based analysis found that the carriage frequencies of the alleles with the lowest targeting efficiencies, A*24, were associated with pH1N1 mortality (r = 0.37, P = 0.031) and are common in certain indigenous populations in which increased pH1N1 morbidity has been reported. HLA efficiency scores and HLA use are associated with CD8 T-cell magnitude in humans after influenza infection. The computational tools used in this study may be useful predictors of potential morbidity and identify immunologic differences of new variant influenza strains more accurately than evolutionary sequence comparisons. Population-based studies of the relative frequency of these alleles in severe vs. mild influenza cases might advance clinical practices for severe H1N1 infections among genetically susceptible populations.


Subject(s)
CD4-Positive T-Lymphocytes/immunology, HLA Antigens/immunology, Influenza A Virus H1N1 Subtype, Human Influenza/epidemiology, Human Influenza/immunology, Cohort Studies, Computational Biology/methods, Enzyme-Linked Immunospot Assay, Gene Frequency, HLA Antigens/metabolism, Humans, Interferon-gamma/immunology, Statistical Models, Regression Analysis
5.
PLoS One ; 18(9): e0291169, 2023.
Article in English | MEDLINE | ID: mdl-37729186

ABSTRACT

Campaign contributions are a staple of congressional life. Yet, the search for tangible effects of congressional donations often focuses on the association between contributions and votes on congressional bills. We present an alternative approach by considering the relationship between money and legislators' speech. Floor speeches are an important component of congressional behavior, and reflect a legislator's policy priorities and positions in a way that voting cannot. Our research provides the first comprehensive analysis of the association between a legislator's campaign donors and the policy issues they prioritize with congressional speech. Ultimately, we find a robust relationship between donors and speech, indicating a more pervasive role of money in politics than previously assumed. We use a machine learning framework on a new dataset that brings together legislator metadata for all representatives in the US House between 1995 and 2018, including committee assignments, legislative speech, donation records, and information about Political Action Committees. We compare information about donations against other potential explanatory variables, such as party affiliation, home state, and committee assignments, and find that donors consistently have the strongest association with legislators' issue-attention. We further contribute a procedure for identifying speech and donation events that occur in close proximity to one another and share meaningful connections, identifying the proverbial needles in the haystack of speech and donation activity in Congress which may be cases of interest for investigative journalism. Taken together, our framework, data, and findings can help increase the transparency of the role of money in politics.


Subject(s)
Machine Learning, Tissue Donors, Humans, Metadata, Policy, Politics
6.
BMC Bioinformatics ; 13 Suppl 6: S11, 2012 Apr 19.
Article in English | MEDLINE | ID: mdl-22537040

ABSTRACT

Transcript quantification is a long-standing problem in genomics and estimating the relative abundance of alternatively-spliced isoforms from the same transcript is an important special case. Both problems have recently been illuminated by high-throughput RNA sequencing experiments which are quickly generating large amounts of data. However, much of the signal present in this data is corrupted or obscured by biases resulting in non-uniform and non-proportional representation of sequences from different transcripts. Many existing analyses attempt to deal with these and other biases with various task-specific approaches, which makes direct comparison between them difficult. However, two popular tools for isoform quantification, MISO and Cufflinks, have adopted a general probabilistic framework to model and mitigate these biases in a more general fashion. These advances motivate the need to investigate the effects of RNA-seq biases on the accuracy of different approaches for isoform quantification. We conduct the investigation by building models of increasing sophistication to account for noise introduced by the biases and compare their accuracy to the established approaches. We focus on methods that estimate the expression of alternatively-spliced isoforms with the percent-spliced-in (PSI) metric for each exon skipping event. To improve their estimates, many methods use evidence from RNA-seq reads that align to exon bodies. However, the methods we propose focus on reads that span only exon-exon junctions. As a result, our approaches are simpler and less sensitive to exon definitions than existing methods, which enables us to distinguish their strengths and weaknesses more easily. We present several probabilistic models of position-specific read counts with increasing complexity and compare them to each other and to the current state-of-the-art methods in isoform quantification, MISO and Cufflinks. On a validation set with RT-PCR measurements for 26 cassette events, some of our methods are more accurate and some are significantly more consistent than these two popular tools. This comparison demonstrates the challenges in estimating the percent inclusion of alternatively spliced junctions and illuminates the tradeoffs between different approaches.


Subject(s)
Alternative Splicing, High-Throughput Nucleotide Sequencing/methods, RNA Sequence Analysis/methods, Exons, Gene Expression Profiling, HeLa Cells, Humans, Statistical Models, Reverse Transcriptase Polymerase Chain Reaction
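The abstract of entry 6 estimates percent-spliced-in (PSI) for exon skipping events from reads that span exon-exon junctions. The sketch below shows the simplest count-based form of that idea, assuming a cassette exon with two inclusion junctions (upstream and downstream) and one skipping junction. It is a naive baseline for illustration only, not one of the probabilistic models proposed in the paper; the function and parameter names are hypothetical.

```python
def psi_from_junctions(upstream_inclusion: int,
                       downstream_inclusion: int,
                       skipping: int) -> float:
    """Naive PSI estimate for a cassette exon from junction-spanning read counts.

    The inclusion isoform is supported by two junctions, so their counts are
    averaged to make them comparable with the single skipping junction.
    """
    inclusion = (upstream_inclusion + downstream_inclusion) / 2.0
    total = inclusion + skipping
    if total == 0:
        raise ValueError("no junction reads support either isoform")
    return inclusion / total


if __name__ == "__main__":
    # Toy counts, not data from the study.
    print(psi_from_junctions(30, 50, 10))  # 0.8
```

A plain ratio like this ignores the position-specific biases the abstract discusses, which is the motivation for modelling read counts probabilistically instead.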
7.
J Virol ; 85(3): 1310-21, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21084470

ABSTRACT

The high diversity of HLA binding preferences has been driven by the sequence diversity of short segments of relevant pathogenic proteins presented by HLA molecules to the immune system. To identify possible commonalities in HLA binding preferences, we quantify these using a novel measure termed "targeting efficiency," which captures the correlation between HLA-peptide binding affinities and the conservation of the targeted proteomic regions. Analysis of targeting efficiencies for 95 HLA class I alleles over thousands of human proteins and 52 human viruses indicates that HLA molecules preferentially target conserved regions in these proteomes, although the arboviral Flaviviridae are a notable exception where nonconserved regions are preferentially targeted by most alleles. HLA-A alleles and several HLA-B alleles that have maintained close sequence identity with chimpanzee homologues target conserved human proteins and DNA viruses such as Herpesviridae and Adenoviridae most efficiently, while all HLA-B alleles studied efficiently target RNA viruses. These patterns of host and pathogen specialization are both consistent with coevolutionary selection and functionally relevant in specific cases; for example, preferential HLA targeting of conserved proteomic regions is associated with improved outcomes in HIV infection and with protection against dengue hemorrhagic fever. Efficiency analysis provides a novel perspective on the coevolutionary relationship between HLA class I molecular diversity, self-derived peptides that shape T-cell immunity through ontogeny, and the broad range of viruses that subsequently engage with the adaptive immune response.


Subject(s)
Molecular Evolution, Histocompatibility Antigens Class I/genetics, Histocompatibility Antigens Class I/immunology, Host-Pathogen Interactions, Proteins/genetics, Proteins/immunology, Viruses/immunology, Conserved Sequence, Humans, Protein Binding
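The abstract of entry 7 defines targeting efficiency as the correlation between HLA-peptide binding affinities and the conservation of the targeted proteomic regions. The sketch below computes such a score for one allele, assuming Pearson correlation and per-peptide scores where larger means stronger binding and higher conservation. The abstract does not state which correlation measure or affinity scale the authors used, so both are assumptions, and the function names are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def targeting_efficiency(binding_scores: Sequence[float],
                         conservation_scores: Sequence[float]) -> float:
    """Correlation between per-peptide binding strength and conservation.

    Positive values mean the allele tends to bind conserved regions.
    """
    return pearson(binding_scores, conservation_scores)


if __name__ == "__main__":
    # Toy scores for four peptides, not data from the study.
    binding = [0.9, 0.7, 0.4, 0.1]
    conservation = [0.95, 0.80, 0.50, 0.20]
    print(round(targeting_efficiency(binding, conservation), 3))
```

If the affinity data were IC50 values, where smaller means stronger binding, the sign convention would need to be flipped before interpreting the score.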
8.
Cogsci ; 2021: 1767-1773, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34617074

ABSTRACT

A longstanding question in cognitive science concerns the learning mechanisms underlying compositionality in human cognition. Humans can infer the structured relationships (e.g., grammatical rules) implicit in their sensory observations (e.g., auditory speech), and use this knowledge to guide the composition of simpler meanings into complex wholes. Recent progress in artificial neural networks has shown that when large models are trained on enough linguistic data, grammatical structure emerges in their representations. We extend this work to the domain of mathematical reasoning, where it is possible to formulate precise hypotheses about how meanings (e.g., the quantities corresponding to numerals) should be composed according to structured rules (e.g., order of operations). Our work shows that neural networks are not only able to infer something about the structured relationships implicit in their training data, but can also deploy this knowledge to guide the composition of individual meanings into composite wholes.

9.
J Immunol ; 181(9): 6361-70, 2008 Nov 01.
Article in English | MEDLINE | ID: mdl-18941227

ABSTRACT

Hepatitis C virus (HCV) vaccine efficacy may crucially depend on immunogen length and coverage of viral sequence diversity. However, covering a considerable proportion of the circulating viral sequence variants would likely require long immunogens, which for the conserved portions of the viral genome, would contain unnecessarily redundant sequence information. In this study, we present the design and in vitro performance analysis of a novel "epitome" approach that compresses frequent immune targets of the cellular immune response against HCV into a shorter immunogen sequence. Compression of immunological information is achieved by partially overlapping shared sequence motifs between individual epitopes. At the same time, sequence diversity coverage is provided by taking advantage of emerging cross-reactivity patterns among epitope variants so that epitope variants associated with the broadest variant cross-recognition are preferentially included. The processing and presentation analysis of specific epitopes included in such a compressed, in vitro-expressed HCV epitome indicated effective processing of a majority of tested epitopes, although re-presentation of some epitopes may require refined sequence design. Together, the present study establishes the epitome approach as a potentially powerful tool for vaccine immunogen design, especially suitable for the induction of cellular immune responses against highly variable pathogens.


Subject(s)
Antigen Presentation/immunology, T-Lymphocyte Epitopes/biosynthesis, T-Lymphocyte Epitopes/chemistry, Gene Expression Regulation/immunology, Hepacivirus/immunology, Cytotoxic T-Lymphocytes/immunology, Cytotoxic T-Lymphocytes/metabolism, Amino Acid Sequence, Cell Line, T-Lymphocyte Epitopes/immunology, T-Lymphocyte Epitopes/metabolism, HLA-B35 Antigen/biosynthesis, HLA-B35 Antigen/chemistry, HLA-B35 Antigen/immunology, HLA-B35 Antigen/metabolism, Hepacivirus/genetics, Chronic Hepatitis C/immunology, Chronic Hepatitis C/metabolism, Chronic Hepatitis C/virology, Humans, Immunodominant Epitopes/biosynthesis, Immunodominant Epitopes/chemistry, Immunodominant Epitopes/immunology, Immunodominant Epitopes/metabolism, Molecular Sequence Data, Proteome/biosynthesis, Proteome/chemical synthesis, Proteome/immunology, Proteome/metabolism, Cytotoxic T-Lymphocytes/virology, Viral Nonstructural Proteins/biosynthesis, Viral Nonstructural Proteins/chemical synthesis, Viral Nonstructural Proteins/immunology, Viral Nonstructural Proteins/metabolism
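The abstract of entry 9 describes compressing many epitopes into a shorter immunogen by overlapping the sequence motifs they share. The sketch below illustrates the compression idea with a greedy overlap merge, a standard heuristic for the shortest-common-superstring problem. It is a swapped-in technique for illustration, not the authors' epitome algorithm, and it ignores the cross-reactivity weighting the abstract mentions. The function names and the toy peptide strings are hypothetical.

```python
from typing import List


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_epitome(epitopes: List[str]) -> str:
    """Merge epitopes into one string that contains each of them as a substring."""
    unique = list(dict.fromkeys(epitopes))  # drop duplicates, keep order
    # Drop epitopes already contained inside another epitope.
    seqs = [s for s in unique if not any(s != t and s in t for t in unique)]
    while len(seqs) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Join the best pair, writing the shared motif only once.
        merged = seqs[best_i] + seqs[best_j][best_k:]
        seqs = [s for idx, s in enumerate(seqs) if idx not in (best_i, best_j)]
        seqs.append(merged)
    return seqs[0] if seqs else ""


if __name__ == "__main__":
    toy_epitopes = ["ABCD", "CDEF", "EFGH"]  # toy strings, not real HCV epitopes
    print(greedy_epitome(toy_epitopes))      # ABCDEFGH (8 letters instead of 12)
```

The greedy merge guarantees that every input string survives intact in the output, but it does not guarantee the shortest possible result.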
10.
IEEE Trans Pattern Anal Mach Intell ; 41(12): 3086-3099, 2019 Dec.
Article in English | MEDLINE | ID: mdl-30130178

ABSTRACT

A new unified video analytics framework (ER3) is proposed for complex event retrieval, recognition and recounting, based on the proposed video imprint representation, which exploits temporal correlations among image features across video frames. With the video imprint representation, it is convenient to reverse map back to both temporal and spatial locations in video frames, allowing for both key frame identification and key areas localization within each frame. In the proposed framework, a dedicated feature alignment module is incorporated for redundancy removal across frames to produce the tensor representation, i.e., the video imprint. Subsequently, the video imprint is individually fed into both a reasoning network and a feature aggregation module, for event recognition/recounting and event retrieval tasks, respectively. Thanks to its attention mechanism inspired by the memory networks used in language modeling, the proposed reasoning network is capable of simultaneous event category recognition and localization of the key pieces of evidence for event recounting. In addition, the latent structure in our reasoning network highlights the areas of the video imprint, which can be directly used for event recounting. With the event retrieval task, the compact video representation aggregated from the video imprint contributes to better retrieval results than existing state-of-the-art methods.

11.
PLoS Comput Biol ; 3(4): e75, 2007 Apr 27.
Article in English | MEDLINE | ID: mdl-17465674

ABSTRACT

The ability of human immunodeficiency virus type 1 (HIV-1) to develop high levels of genetic diversity, and thereby acquire mutations to escape immune pressures, contributes to the difficulties in producing a vaccine. Possibly no single HIV-1 sequence can induce sufficiently broad immunity to protect against a wide variety of infectious strains, or block mutational escape pathways available to the virus after infection. The authors describe the generation of HIV-1 immunogens that minimize the phylogenetic distance of viral strains throughout the known viral population (the center of tree [COT]) and then extend the COT immunogen by addition of a composite sequence that includes high-frequency variable sites preserved in their native contexts. The resulting COT(+) antigens compress the variation found in many independent HIV-1 isolates into lengths suitable for vaccine immunogens. It is possible to capture 62% of the variation found in the Nef protein and 82% of the variation in the Gag protein into immunogens of three gene lengths. The authors put forward immunogen designs that maximize representation of the diverse antigenic features present in a spectrum of HIV-1 strains. These immunogens should elicit immune responses against high-frequency viral strains as well as against most mutant forms of the virus.


Subject(s)
AIDS Vaccines/genetics, AIDS Vaccines/immunology, Antigenic Variation/genetics, Epitope Mapping/methods, nef Gene Products/genetics, nef Gene Products/immunology, Genetic Variation/genetics, Drug Design, Human Immunodeficiency Virus nef Gene Products
12.
Mol Biochem Parasitol ; 155(2): 103-12, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17669514

ABSTRACT

VAR2CSA is the main candidate for a pregnancy malaria vaccine, but vaccine development may be complicated by sequence polymorphism. Here, we obtained partial or full-length var2CSA sequences from 106 parasites and applied novel computational methods and three-dimensional modeling to investigate VAR2CSA geographic variation and selection pressure. Our analysis reveals structural patterns of VAR2CSA sequence variation in which polymorphic sites group into segments of limited diversity. Within these segments, two or three basic types characterize a substantial majority of the parasite samples. Comparison to the primate malaria Plasmodium reichenowi shows that these basic types have ancient origins. Globally, var2CSA genes are comprised of a mosaic of these ancestral polymorphic segments that have recombined extensively between var2CSA alleles. Three-dimensional modeling reveals that polymorphic segments concentrate in flexible loops at characteristic locations in the six VAR2CSA Duffy binding-like (DBL) adhesion domains. Individual DBL domain surfaces have distinct patterns of diversifying selection, suggesting that limited and differing portions of each DBL domain are targeted by host antibody. Since standard phylogenetic tree analysis is inadequate for highly recombining genes like var2CSA, we developed a novel phylogenetic approach that incorporates recombination and tracks new mutations in segment types. In the resulting tree, P. reichenowi is confirmed as an outlier and African and Asian P. falciparum isolates have slightly diverged. These findings validate a new approach to modeling protein evolution in the presence of frequent recombination and provide a clearer understanding of how var gene products function as immunoevasive binding ligands.


Subject(s)
Protozoan Antigens/genetics, Protozoan Antigens/immunology, Malaria/parasitology, Plasmodium falciparum/genetics, Genetic Polymorphism, Parasitic Pregnancy Complications/immunology, Genetic Selection, Animals, Protozoan Antigens/chemistry, Computational Biology/methods, Protozoan DNA/chemistry, Protozoan DNA/genetics, Female, Geography, Humans, Malaria/immunology, Malaria Vaccines/immunology, Molecular Models, Molecular Sequence Data, Phylogeny, Plasmodium falciparum/isolation & purification, Pregnancy, Parasitic Pregnancy Complications/prevention & control, Tertiary Protein Structure, DNA Sequence Analysis, Amino Acid Sequence Homology
13.
Bioinformatics ; 22(14): e227-35, 2006 Jul 15.
Article in English | MEDLINE | ID: mdl-16873476

ABSTRACT

MOTIVATION AND RESULTS: Motivated by the ability of a simple threading approach to predict MHC I-peptide binding, we developed a new and improved structure-based model for which parameters can be estimated from additional sources of data about MHC-peptide binding. In addition to the known 3D structures of a small number of MHC-peptide complexes that were used in the original threading approach, we included three other sources of information on peptide-MHC binding: (1) MHC class I sequences; (2) known binding energies for a large number of MHC-peptide complexes; and (3) an even larger binary dataset that contains information about strong binders (epitopes) and non-binders (peptides that have a low affinity for a particular MHC molecule). Our model significantly outperforms the standard threading approach in binding energy prediction. In our approach, which we call adaptive double threading, the parameters of the threading model are learnable, and both MHC and peptide sequences can be threaded onto structures of other alleles. These two properties make our model appropriate for predicting binding for alleles for which very little data (if any) is available beyond just their sequence, including prediction for alleles for which 3D structures are not available. The ability of our model to generalize beyond the MHC types for which training data is available also separates our approach from epitope prediction methods which treat MHC alleles as symbolic types, rather than biological sequences. We used the trained binding energy predictor to study viral infections in 246 HIV patients from the West Australian cohort, and over 1000 sequences in HIV clade B from Los Alamos National Laboratory database, capturing the course of HIV evolution over the last 20 years. Finally, we illustrate short-, medium-, and long-term adaptation of HIV to the human immune system. AVAILABILITY: http://www.research.microsoft.com/~jojic/hlaBinding.html.


Subject(s)
Algorithms, Histocompatibility Antigens Class I/chemistry, Chemical Models, Molecular Models, Peptides/chemistry, Protein Sequence Analysis/methods, Software, Amino Acid Sequence, Artificial Intelligence, Binding Sites, Computer Simulation, Molecular Sequence Data, Automated Pattern Recognition/methods, Protein Binding, Protein Conformation, Sequence Alignment/methods
14.
Artif Intell Med ; 70: 1-11, 2016 06.
Article in English | MEDLINE | ID: mdl-27431033

ABSTRACT

OBJECTIVE: High-throughput technologies have generated an unprecedented amount of high-dimensional gene expression data. Algorithmic approaches could be extremely useful to distill information and derive compact interpretable representations of the statistical patterns present in the data. This paper proposes a mining approach to extract an informative representation of gene expression profiles based on a generative model called the Counting Grid (CG). METHOD: Using the CG model, gene expression values are arranged on a discrete grid, learned in a way that "similar" co-expression patterns are arranged in close proximity, thus resulting in an intuitive visualization of the dataset. Moreover, the model makes it possible to identify the genes that distinguish between classes (e.g. different types of cancer). Finally, each sample can be characterized with a discriminative signature - extracted from the model - that can be effectively employed for classification. RESULTS: A thorough evaluation on several gene expression datasets demonstrates the suitability of the proposed approach from a twofold perspective: numerically, we reached state-of-the-art classification accuracies on 5 datasets out of 7, and similar results when the approach is tested in a gene selection setting (with a stability always above 0.87); clinically, by confirming that many of the genes highlighted by the model as significant also play a key role in cancer biology. CONCLUSION: The proposed framework can be successfully exploited to meaningfully visualize the samples; detect medically relevant genes; properly classify samples.


Subject(s)
Algorithms, Data Mining, Gene Expression Profiling, Cluster Analysis, Neoplasm Genes, Humans, Neoplasms/genetics
15.
AIDS ; 30(5): 701-11, 2016 Mar 13.
Article in English | MEDLINE | ID: mdl-26730570

ABSTRACT

OBJECTIVES: AIDS is caused by CD4 T-cell depletion. Although combination antiretroviral therapy can restore blood T-cell numbers, the clonal diversity of the reconstituting cells, critical for immunocompetence, is not well defined. METHODS: We performed an extensive analysis of parameters of thymic function in perinatally HIV-1-infected (n = 39) and control (n = 28) participants ranging from 13 to 23 years of age. CD4 T cells, including naive (CD27 CD45RA) and recent thymic emigrant (RTE) (CD31/CD45RA) cells, were quantified by flow cytometry. Deep sequencing was used to examine T-cell receptor (TCR) sequence diversity in sorted RTE CD4 T cells. RESULTS: Infected participants had reduced CD4 T-cell levels with predominant depletion of the memory subset and preservation of naive cells. RTE CD4 T-cell levels were normal in most infected individuals, and enhanced thymopoiesis was indicated by higher proportions of CD4 T cells containing TCR recombination excision circles. Memory CD4 T-cell depletion was highly associated with CD8 T-cell activation in HIV-1-infected persons and plasma interleukin-7 levels were correlated with naive CD4 T cells, suggesting activation-driven loss and compensatory enhancement of thymopoiesis. Deep sequencing of CD4 T-cell receptor sequences in well compensated infected persons demonstrated supranormal diversity, providing additional evidence of enhanced thymic output. CONCLUSION: Despite up to two decades of infection, many individuals have remarkable thymic reserve to compensate for ongoing CD4 T-cell loss, although there is ongoing viral replication and immune activation despite combination antiretroviral therapy. The longer term sustainability of this physiology remains to be determined.


Subject(s)
CD4-Positive T-Lymphocytes/immunology, HIV Infections/immunology, HIV-1/growth & development, T-Lymphocyte Subsets/immunology, Thymus Gland/physiology, Adolescent, CD4-Positive T-Lymphocytes/chemistry, CD4-Positive T-Lymphocytes/classification, Female, Flow Cytometry, Genetic Variation, HIV Infections/virology, High-Throughput Nucleotide Sequencing, Humans, Leukocyte Common Antigens/analysis, Male, Platelet Endothelial Cell Adhesion Molecule-1/analysis, T-Cell Antigen Receptors/genetics, DNA Sequence Analysis, T-Lymphocyte Subsets/chemistry, T-Lymphocyte Subsets/classification, Tumor Necrosis Factor Receptor Superfamily Member 7/analysis, Young Adult
16.
Bioinformatics ; 20 Suppl 1: i161-8, 2004 Aug 04.
Article in English | MEDLINE | ID: mdl-15262795

ABSTRACT

MOTIVATION: We consider models useful for learning an evolutionary or phylogenetic tree from data consisting of DNA sequences corresponding to the leaves of the tree. In particular, we consider a general probabilistic model described in Siepel and Haussler that we call the phylogenetic-HMM model which generalizes the classical probabilistic models of Neyman and Felsenstein. Unfortunately, computing the likelihood of phylogenetic-HMM models is intractable. We consider several approximations for computing the likelihood of such models including an approximation introduced in Siepel and Haussler, loopy belief propagation and several variational methods. RESULTS: We demonstrate that, unlike the other approximations, variational methods are accurate and are guaranteed to lower bound the likelihood. In addition, we identify a particular variational approximation to be best: one in which the posterior distribution is variationally approximated using the classic Neyman-Felsenstein model. The application of our best approximation to data from the cystic fibrosis transmembrane conductance regulator gene region across nine eutherian mammals reveals a CpG effect.


Subject(s)
Algorithms, Artificial Intelligence, Molecular Evolution, Genetic Models, Phylogeny, DNA Sequence Analysis/methods, Base Sequence, Computer Simulation, Genetic Databases, Markov Chains, Molecular Sequence Data, Automated Pattern Recognition/methods, Nucleic Acid Sequence Homology
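The claim in the abstract of entry 16 that variational methods are guaranteed to lower bound the likelihood follows from a standard identity, written out here for reference. The notation is an assumption made for illustration: x stands for the observed sequences at the leaves, h for the hidden variables, and q(h) for the approximating distribution (which, per the abstract, the best-performing method takes from the Neyman-Felsenstein model family).

```latex
\log p(x)
  = \log \sum_{h} p(x, h)
  \;\ge\; \sum_{h} q(h) \log \frac{p(x, h)}{q(h)}
  = \log p(x) - \mathrm{KL}\!\left( q(h) \,\|\, p(h \mid x) \right)
```

The inequality is Jensen's inequality applied to the concave logarithm. Because the KL divergence is non-negative, the bound holds for any choice of q and is tight exactly when q equals the true posterior.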
17.
IEEE Trans Pattern Anal Mach Intell ; 27(9): 1392-416, 2005 Sep.
Article in English | MEDLINE | ID: mdl-16173184

ABSTRACT

Research into methods for reasoning under uncertainty is currently one of the most exciting areas of artificial intelligence, largely because it has recently become possible to record, store, and process large amounts of data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition, face detection, speaker identification, and prediction of gene function, it is even more exciting that researchers are on the verge of introducing systems that can perform large-scale combinatorial analyses of data, decomposing the data into interacting components. For example, computational methods for automatic scene analysis are now emerging in the computer vision community. These methods decompose an input image into its constituent objects, lighting conditions, motion patterns, etc. Two of the main challenges are finding effective representations and models in specific applications and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms. We review exact techniques and various approximate, computationally efficient techniques, including iterated conditional modes, the expectation maximization (EM) algorithm, Gibbs sampling, the mean field method, variational techniques, structured variational techniques and the sum-product algorithm ("loopy" belief propagation). We describe how each technique can be applied in a vision model of multiple, occluding objects and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.


Subject(s)
Algorithms , Artificial Intelligence , Computer Graphics , Image Enhancement/methods , Computer-Assisted Image Interpretation/methods , Information Storage and Retrieval/methods , Automated Pattern Recognition/methods , Computer Simulation , Biological Models , Statistical Models , Reproducibility of Results , Sensitivity and Specificity
18.
IEEE Trans Pattern Anal Mach Intell ; 37(12): 2374-87, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26539844

ABSTRACT

In recent scene recognition research, images or large image regions are often represented as disorganized "bags" of features, which can then be analyzed using models originally developed to capture co-variation of word counts in text. However, image feature counts are likely to be constrained in different ways than word counts in text. For example, as a camera pans upwards from a building entrance over its first few floors and then further up into the sky (Fig. 1), feature counts change slightly as the field of view moves: the abundance of the "car" features is reduced, but the counts of the features found on building facades are increased. The counting grid model accounts for such changes naturally, and it can also account for images of different scenes.
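
The windowing idea behind the counting grid can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the grid is one-dimensional with wrap-around, the three features ("car", "facade", "sky") and their per-cell probabilities are made up, and the function names are hypothetical. Each window averages the feature distributions of the cells it covers, and a bag of feature counts is matched to the window with the highest multinomial log-likelihood.

```python
import math


def window_histogram(grid, start, width):
    """Average the per-cell feature distributions inside a wrap-around window."""
    n_cells = len(grid)
    n_features = len(grid[0])
    hist = [0.0] * n_features
    for offset in range(width):
        cell = grid[(start + offset) % n_cells]
        for z in range(n_features):
            hist[z] += cell[z] / width
    return hist


def log_likelihood(counts, hist, eps=1e-12):
    """Multinomial log-likelihood of a bag of counts, up to a constant."""
    return sum(c * math.log(h + eps) for c, h in zip(counts, hist))


def best_window(grid, counts, width):
    """Return the window start whose histogram best explains the counts."""
    scores = [
        log_likelihood(counts, window_histogram(grid, s, width))
        for s in range(len(grid))
    ]
    return max(range(len(grid)), key=scores.__getitem__)


# Toy grid (illustrative values): each cell is a distribution over
# the features ("car", "facade", "sky"), ordered from street level upward.
GRID = [
    [0.8, 0.2, 0.0],
    [0.4, 0.6, 0.0],
    [0.0, 0.9, 0.1],
    [0.0, 0.3, 0.7],
    [0.0, 0.0, 1.0],
]

street_view = [6, 4, 0]  # mostly "car" features
upper_view = [0, 5, 5]   # "facade" and "sky" features

street_start = best_window(GRID, street_view, width=2)
upper_start = best_window(GRID, upper_view, width=2)
```

In this toy setup the street-level bag maps to a window near the start of the grid and the upper-floor bag to a window further along, so overlapping windows share cells and their histograms change gradually as the window moves.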

19.
Science ; 347(6218): 1254806, 2015 Jan 09.
Article in English | MEDLINE | ID: mdl-25525159

ABSTRACT

To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.


Subject(s)
Artificial Intelligence , Pervasive Child Development Disorders/genetics , Hereditary Nonpolyposis Colorectal Neoplasms/genetics , Genome-Wide Association Study/methods , Molecular Sequence Annotation/methods , Spinal Muscular Atrophy/genetics , RNA Splicing/genetics , Signal Transducing Adaptor Proteins/genetics , Computer Simulation , DNA/genetics , Exons/genetics , Genetic Code , Genetic Markers , Genetic Variation , Humans , Introns/genetics , Genetic Models , MutL Protein Homolog 1 , Missense Mutation , Nuclear Proteins/genetics , Single Nucleotide Polymorphism , Quantitative Trait Loci , RNA Splice Sites/genetics , RNA-Binding Proteins/genetics