1.
G3 (Bethesda) ; 5(5): 839-47, 2015 Mar 09.
Article in English | MEDLINE | ID: mdl-25758824

ABSTRACT

Statistical factor analysis methods have previously been used to remove noise components from high-dimensional data prior to genetic association mapping and, in a guided fashion, to summarize biologically relevant sources of variation. Here, we show how the derived factors summarizing pathway expression can be used to analyze the relationships between expression, heritability, and aging. We used skin gene expression data from 647 twins from the MuTHER Consortium and applied factor analysis to concisely summarize patterns of gene expression to remove broad confounding influences and to produce concise pathway-level phenotypes. We derived 930 "pathway phenotypes" that summarized patterns of variation across 186 KEGG pathways (five phenotypes per pathway). We identified 69 significant associations of age with phenotype from 57 distinct KEGG pathways at a stringent Bonferroni threshold ([Formula: see text]). These phenotypes are more heritable ([Formula: see text]) than gene expression levels. On average, expression levels of 16% of genes within these pathways are associated with age. Several significant pathways relate to metabolizing sugars and fatty acids; others relate to insulin signaling. We have demonstrated that factor analysis methods combined with biological knowledge can produce more reliable phenotypes with less stochastic noise than the individual gene expression levels, which increases our power to discover biologically relevant associations. These phenotypes could also be applied to discover associations with other environmental factors.


Subjects
Aging/genetics , Aging/metabolism , Gene Expression Profiling , Gene Expression Regulation , Models, Biological , Phenotype , Signal Transduction , Adult , Aged , Aged, 80 and over , Humans , Middle Aged
2.
Respir Res ; 14: 60, 2013 May 30.
Article in English | MEDLINE | ID: mdl-23721360

ABSTRACT

BACKGROUND: Although a large body of literature is available that describes the effects of smoking, asthma and COPD on lung function, most studies are restricted to a small age range and to one factor. As a consequence, available results are incomplete and often difficult to compare, also due to the ways the effects are expressed. Furthermore, current approaches consider one type of measurement only or several types separately. METHODS: We propose a probabilistic model that expresses the effects as number of years added to chronological age or, in other words, that estimates the biological age of the lungs. Using biological age as a measure of the effects has the advantage of facilitating the understanding of their severity and comparison of results. In our model, chronological age and other factors affecting the health status of the lungs generate biological age, which in turn generates lung function measurements. This structure enables the use of multiple types of measurement to obtain a more precise estimate of the effects and parameter sharing for characterization over large age ranges and of co-occurrence of factors with little data. We treat the parameters that model smoking habits and lung diseases as random variables to obtain uncertainty in the estimated effects. RESULTS: We use the model to investigate the effects of smoking, asthma and COPD on the TwinsUK Registry. Our results suggest that the combination of smoking with lung disease(s) has higher effect than smoking or lung disease(s) alone, and that in smokers, co-occurrence of asthma and COPD is more detrimental than asthma or COPD alone. CONCLUSIONS: The proposed model or other models based on a similar approach could be of help in improving the understanding of factors affecting lung function by enabling characterizations over large age ranges and of co-occurrence of factors with little data and the use of multiple types of measurement. 
The software implementing the model can be downloaded at the first author's webpage.


Subjects
Asthma/epidemiology , Data Interpretation, Statistical , Proportional Hazards Models , Pulmonary Disease, Chronic Obstructive/epidemiology , Registries/statistics & numerical data , Smoking/epidemiology , Adolescent , Adult , Age Distribution , Aged , Aged, 80 and over , Comorbidity , Computer Simulation , Female , Humans , Incidence , Male , Middle Aged , Models, Statistical , Risk Assessment , Risk Factors , United Kingdom/epidemiology , Young Adult
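
Note on record 2: the abstract describes a chain in which chronological age plus other factors gives a biological age, which in turn drives several types of lung function measurement. A minimal sketch of that structure follows. Everything numeric here is hypothetical: the effect sizes, the measurement names and the linear form are illustrative assumptions, not values or equations from the paper, and the additive combination of factors is also an assumption (the paper treats these parameters as random variables and may model them differently).

```python
# Hypothetical "years added" per factor; NOT estimates from the paper.
YEARS_ADDED = {"smoking": 5.0, "asthma": 3.0, "copd": 8.0}

# Hypothetical linear measurement models: name -> (intercept, slope per year).
# Both measurement types share the same biological age.
MEASUREMENTS = {"measure_a": (5.0, -0.03), "measure_b": (6.0, -0.025)}


def biological_age(chronological_age, factors):
    """Chronological age plus the years added by each factor present."""
    return chronological_age + sum(YEARS_ADDED[f] for f in factors)


def expected_measurements(chronological_age, factors):
    """Expected value of each measurement type given the shared biological age."""
    age = biological_age(chronological_age, factors)
    return {name: a + b * age for name, (a, b) in MEASUREMENTS.items()}


healthy = expected_measurements(50, [])
smoker_with_copd = expected_measurements(50, ["smoking", "copd"])
```

Expressing every effect in years makes factors directly comparable, which is the advantage the abstract claims for the biological-age formulation.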
3.
PLoS Genet ; 8(5): e1002704, 2012.
Article in English | MEDLINE | ID: mdl-22589741

ABSTRACT

Small RNAs are functional molecules that modulate mRNA transcripts and have been implicated in the aetiology of several common diseases. However, little is known about the extent of their variability within the human population. Here, we characterise the extent, causes, and effects of naturally occurring variation in expression and sequence of small RNAs from adipose tissue in relation to genotype, gene expression, and metabolic traits in the MuTHER reference cohort. We profiled the expression of 15 to 30 base pair RNA molecules in subcutaneous adipose tissue from 131 individuals using high-throughput sequencing, and quantified levels of 591 microRNAs and small nucleolar RNAs. We identified three genetic variants and three RNA editing events. Highly expressed small RNAs are more conserved within mammals than average, as are those with highly variable expression. We identified 14 genetic loci significantly associated with nearby small RNA expression levels, seven of which also regulate an mRNA transcript level in the same region. In addition, these loci are enriched for variants significant in genome-wide association studies for body mass index. Contrary to expectation, we found no evidence for negative correlation between expression level of a microRNA and its target mRNAs. Trunk fat mass, body mass index, and fasting insulin were associated with more than twenty small RNA expression levels each, while fasting glucose had no significant associations. This study highlights the similar genetic complexity and shared genetic control of small RNA and mRNA transcripts, and gives a quantitative picture of small RNA expression variation in the human population.


Subjects
Genetic Variation , MicroRNAs , RNA, Messenger/genetics , RNA, Small Nucleolar , RNA, Small Untranslated/genetics , Subcutaneous Fat , Animals , Blood Glucose , Body Fat Distribution , Body Mass Index , Fasting , Female , Gene Expression Regulation , Genome-Wide Association Study , Genotype , High-Throughput Nucleotide Sequencing , Humans , Insulin/blood , MicroRNAs/genetics , MicroRNAs/metabolism , Middle Aged , Polymorphism, Single Nucleotide , RNA, Messenger/metabolism , RNA, Small Nucleolar/genetics , RNA, Small Nucleolar/metabolism , RNA, Small Untranslated/metabolism , Subcutaneous Fat/metabolism
4.
Bioinformatics ; 28(7): 1001-8, 2012 Apr 01.
Article in English | MEDLINE | ID: mdl-22333244

ABSTRACT

MOTIVATION: Accurate large-scale phenotyping has recently gained considerable importance in biology. For example, in genome-wide association studies technological advances have rendered genotyping cheap, leaving phenotype acquisition as the major bottleneck. Automatic image analysis is one major strategy to phenotype individuals in large numbers. Current approaches for visual phenotyping focus predominantly on summarizing statistics and geometric measures, such as height and width of an individual, or color histograms and patterns. However, more subtle, but biologically informative phenotypes, such as the local deformation of the shape of an individual with respect to the population mean cannot be automatically extracted and quantified by current techniques. RESULTS: We propose a probabilistic machine learning model that allows for the extraction of deformation phenotypes from biological images, making them available as quantitative traits for downstream analysis. Our approach jointly models a collection of images using a learned common template that is mapped onto each image through a deformable smooth transformation. In a case study, we analyze the shape deformations of 388 guppy fish (Poecilia reticulata). We find that the flexible shape phenotypes our model extracts are complementary to basic geometric measures. Moreover, these quantitative traits assort the observations into distinct groups and can be mapped to polymorphic genetic loci of the sample set. AVAILABILITY: Code is available under: http://bioweb.me/GEBI CONTACT: theofanis.karaletsos@tuebingen.mpg.de; oliver.stegle@tuebingen.mpg.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Artificial Intelligence , Image Processing, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Phenotype , Animals , Cluster Analysis , Computational Biology/methods , Male , Markov Chains , Models, Statistical , Poecilia
5.
Nat Protoc ; 7(3): 500-7, 2012 Feb 16.
Article in English | MEDLINE | ID: mdl-22343431

ABSTRACT

We present PEER (probabilistic estimation of expression residuals), a software package implementing statistical models that improve the sensitivity and interpretability of genetic associations in population-scale expression data. This approach builds on factor analysis methods that infer broad variance components in the measurements. PEER takes as input transcript profiles and covariates from a set of individuals, and then outputs hidden factors that explain much of the expression variability. Optionally, these factors can be interpreted as pathway or transcription factor activations by providing prior information about which genes are involved in the pathway or targeted by the factor. The inferred factors are used in genetic association analyses. First, they are treated as additional covariates, and are included in the model to increase detection power for mapping expression traits. Second, they are analyzed as phenotypes themselves to understand the causes of global expression variability. PEER extends previous related surrogate variable models and can be implemented within hours on a desktop computer.


Subjects
Algorithms , Gene Expression Profiling/methods , Genetic Association Studies/methods , Models, Statistical , Software , Factor Analysis, Statistical , Gene Expression Profiling/statistics & numerical data , Sensitivity and Specificity
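
Note on record 5: the abstract says inferred factors can be included as additional covariates to increase detection power when mapping expression traits. The sketch below illustrates only that general idea on toy data, using plain least-squares residualization. It is not the PEER software or its API: the factor is simply given rather than inferred, and the function names and data are hypothetical.

```python
from statistics import fmean


def residualize(y, x):
    """Residuals of y after least-squares regression on x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [yi - (my + slope * (xi - mx)) for xi, yi in zip(x, y)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = fmean(a), fmean(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = (sum((ai - ma) ** 2 for ai in a) * sum((bi - mb) ** 2 for bi in b)) ** 0.5
    return num / den


# Toy data: expression is a small genotype effect plus a large hidden-factor effect.
genotype = [0, 1, 2, 0, 1, 2, 0, 1]
factor = [3.0, -2.0, 1.0, -3.0, 2.0, -1.0, 0.5, -0.5]
expression = [0.5 * g + 2.0 * f for g, f in zip(genotype, factor)]

# Association without and with the factor accounted for as a covariate.
raw_r = pearson(genotype, expression)
adjusted_r = pearson(residualize(genotype, factor), residualize(expression, factor))
```

In this noise-free toy case the genotype signal is swamped by the factor until the factor is regressed out of both variables, at which point the association becomes clear.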
7.
Inf Process Med Imaging ; 22: 184-96, 2011.
Article in English | MEDLINE | ID: mdl-21761656

ABSTRACT

This work addresses the challenging problem of simultaneously segmenting multiple anatomical structures in highly varied CT scans. We propose the entangled decision forest (EDF) as a new discriminative classifier which augments the state of the art decision forest, resulting in higher prediction accuracy and shortened decision time. Our main contribution is two-fold. First, we propose entangling the binary tests applied at each tree node in the forest, such that the test result can depend on the result of tests applied earlier in the same tree and at image points offset from the voxel to be classified. This is demonstrated to improve accuracy and capture long-range semantic context. Second, during training, we propose injecting randomness in a guided way, in which node feature types and parameters are randomly drawn from a learned (nonuniform) distribution. This further improves classification accuracy. We assess our probabilistic anatomy segmentation technique using a labeled database of CT image volumes of 250 different patients from various scan protocols and scanner vendors. In each volume, 12 anatomical structures have been manually segmented. The database comprises highly varied body shapes and sizes, a wide array of pathologies, scan resolutions, and diverse contrast agents. Quantitative comparisons with state of the art algorithms demonstrate both superior test accuracy and computational efficiency.


Subjects
Algorithms , Artificial Intelligence , Pattern Recognition, Automated/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Cluster Analysis , Humans , Radiographic Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
8.
PLoS Genet ; 7(1): e1001276, 2011 Jan 20.
Article in English | MEDLINE | ID: mdl-21283789

ABSTRACT

Even within a defined cell type, the expression level of a gene differs in individual samples. The effects of genotype, measured factors such as environmental conditions, and their interactions have been explored in recent studies. Methods have also been developed to identify unmeasured intermediate factors that coherently influence transcript levels of multiple genes. Here, we show how to bring these two approaches together and analyse genetic effects in the context of inferred determinants of gene expression. We use a sparse factor analysis model to infer hidden factors, which we treat as intermediate cellular phenotypes that in turn affect gene expression in a yeast dataset. We find that the inferred phenotypes are associated with locus genotypes and environmental conditions and can explain genetic associations to genes in trans. For the first time, we consider and find interactions between genotype and intermediate phenotypes inferred from gene expression levels, complementing and extending established results.


Subjects
Gene Expression/genetics , Genetic Association Studies/statistics & numerical data , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Algorithms , Biosynthetic Pathways/genetics , Data Interpretation, Statistical , Databases, Protein/statistics & numerical data , Environment , Epistasis, Genetic/genetics , Gene Regulatory Networks , Genetic Variation/genetics , Genotype , Models, Statistical , Phenotype , Quantitative Trait Loci/genetics
9.
IEEE Trans Pattern Anal Mach Intell ; 33(1): 30-42, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21088317

ABSTRACT

This paper presents an automatic segmentation algorithm for video frames captured by a (monocular) webcam that closely approximates depth segmentation from a stereo camera. The frames are segmented into foreground and background layers that comprise a subject (participant) and other objects and individuals. The algorithm produces correct segmentations even in the presence of large background motion with a nearly stationary foreground. This research makes three key contributions: First, we introduce a novel motion representation, referred to as "motons," inspired by research in object recognition. Second, we propose estimating the segmentation likelihood from the spatial context of motion. The estimation is efficiently learned by random forests. Third, we introduce a general taxonomy of tree-based classifiers that facilitates both theoretical and experimental comparisons of several known classification algorithms and generates new ones. In our bilayer segmentation algorithm, diverse visual cues such as motion, motion context, color, contrast, and spatial priors are fused by means of a conditional random field (CRF) model. Segmentation is then achieved by binary min-cut. Experiments on many sequences of our videochat application demonstrate that our algorithm, which requires no initialization, is effective in a variety of scenes, and the segmentation results are comparable to those obtained by stereo systems.


Subjects
Artificial Intelligence , Image Processing, Computer-Assisted/methods , Algorithms , Computer Simulation , Humans , Motion , Pattern Recognition, Automated/methods , Video Recording/methods
10.
Neural Comput ; 23(3): 593-650, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21162663

ABSTRACT

Computer vision has grown tremendously in the past two decades. Despite all efforts, existing attempts at matching parts of the human visual system's extraordinary ability to understand visual scenes lack either scope or power. By combining the advantages of general low-level generative models and powerful layer-based and hierarchical models, this work aims at being a first step toward richer, more flexible models of images. After comparing various types of restricted Boltzmann machines (RBMs) able to model continuous-valued data, we introduce our basic model, the masked RBM, which explicitly models occlusion boundaries in image patches by factoring the appearance of any patch region from its shape. We then propose a generative model of larger images using a field of such RBMs. Finally, we discuss how masked RBMs could be stacked to form a deep model able to generate more complicated structures and suitable for various tasks such as segmentation or object recognition.

11.
PLoS Comput Biol ; 6(5): e1000770, 2010 May 06.
Article in English | MEDLINE | ID: mdl-20463871

ABSTRACT

Gene expression measurements are influenced by a wide range of factors, such as the state of the cell, experimental conditions and variants in the sequence of regulatory regions. To understand the effect of a variable of interest, such as the genotype of a locus, it is important to account for variation that is due to confounding causes. Here, we present VBQTL, a probabilistic approach for mapping expression quantitative trait loci (eQTLs) that jointly models contributions from genotype as well as known and hidden confounding factors. VBQTL is implemented within an efficient and flexible inference framework, making it fast and tractable on large-scale problems. We compare the performance of VBQTL with alternative methods for dealing with confounding variability on eQTL mapping datasets from simulations, yeast, mouse, and human. Employing Bayesian complexity control and joint modelling is shown to result in more precise estimates of the contribution of different confounding factors resulting in additional associations to measured transcript levels compared to alternative approaches. We present a threefold larger collection of cis eQTLs than previously found in a whole-genome eQTL scan of an outbred human population. Altogether, 27% of the tested probes show a significant genetic association in cis, and we validate that the additional eQTLs are likely to be real by replicating them in different sets of individuals. Our method is the next step in the analysis of high-dimensional phenotype data, and its application has revealed insights into genetic regulation of gene expression by demonstrating more abundant cis-acting eQTLs in human than previously shown. Our software is freely available online at http://www.sanger.ac.uk/resources/software/peer/.


Subjects
Bayes Theorem , Gene Expression , Models, Genetic , Quantitative Trait Loci , Software , Animals , Databases, Genetic , Humans , Internet , Markov Chains , Mice , Models, Statistical , Phenotype , Reproducibility of Results , Yeasts
12.
Am J Respir Crit Care Med ; 181(11): 1200-6, 2010 Jun 01.
Article in English | MEDLINE | ID: mdl-20167852

ABSTRACT

RATIONALE: The pattern of IgE response (over time or to specific allergens) may reflect different atopic vulnerabilities which are related to the presence of asthma in a fundamentally different way from current definition of atopy. OBJECTIVES: To redefine the atopic phenotype by identifying latent structure within a complex dataset, taking into account the timing and type of sensitization to specific allergens, and relating these novel phenotypes to asthma. METHODS: In a population-based birth cohort in which multiple skin and IgE tests have been taken throughout childhood, we used a machine learning approach to cluster children into multiple atopic classes in an unsupervised way. We then investigated the relation between these classes and asthma (symptoms, hospitalizations, lung function and airway reactivity). MEASUREMENTS AND MAIN RESULTS: A five-class model indicated a complex latent structure, in which children with atopic vulnerability were clustered into four distinct classes (Multiple Early [112/1053, 10.6%]; Multiple Late [171/1053, 16.2%]; Dust Mite [47/1053, 4.5%]; and Non-dust Mite [100/1053, 9.5%]), with a fifth class describing children with No Latent Vulnerability (623/1053, 59.2%). The association with asthma was considerably stronger for Multiple Early compared with other classes and conventionally defined atopy (odds ratio [95% CI]: 29.3 [11.1-77.2] versus 12.4 [4.8-32.2] versus 11.6 [4.8-27.9] for Multiple Early class versus Ever Atopic versus Atopic age 8). Lung function and airway reactivity were significantly poorer among children in Multiple Early class. Cox regression demonstrated a highly significant increase in risk of hospital admissions for wheeze/asthma after age 3 yr only among children in the Multiple Early class (HR 9.2 [3.5-24.0], P < 0.001). CONCLUSIONS: IgE antibody responses do not reflect a single phenotype of atopy, but several different atopic vulnerabilities which differ in their relation with asthma presence and severity.


Subjects
Asthma/classification , Animals , Asthma/epidemiology , Asthma/immunology , Child , Cluster Analysis , Cohort Studies , Disease Susceptibility , Female , Hospitalization , Humans , Immunoglobulin E/blood , Male , Mothers , Multivariate Analysis , Phenotype , Plethysmography, Whole Body , Pyroglyphidae , Respiratory Function Tests , Respiratory Sounds , Skin Tests , Smoking/epidemiology , Spirometry , United Kingdom/epidemiology
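
Note on record 12: the five class sizes reported in the abstract can be cross-checked against the stated denominator of 1053 children. The short sketch below recomputes the total and the percentages from the counts given in the abstract; the variable names are illustrative only.

```python
# Class counts as reported in the abstract (children per latent class).
class_counts = {
    "Multiple Early": 112,
    "Multiple Late": 171,
    "Dust Mite": 47,
    "Non-dust Mite": 100,
    "No Latent Vulnerability": 623,
}

total = sum(class_counts.values())

# Percentage of the cohort in each class, rounded to one decimal place.
percentages = {name: round(100 * n / total, 1) for name, n in class_counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {class_counts[name]}/{total} = {pct}%")
```

The counts sum to the stated cohort size and the rounded percentages agree with those quoted in the abstract.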
13.
IEEE Trans Pattern Anal Mach Intell ; 31(12): 2158-67, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19834138

ABSTRACT

This paper presents a novel method for location recognition, which exploits an epitomic representation to achieve both high efficiency and good generalization. A generative model based on epitomic image analysis captures the appearance and geometric structure of an environment while allowing for variations due to motion, occlusions, and non-Lambertian effects. The ability to model translation and scale invariance together with the fusion of diverse visual features yields enhanced generalization with economical training. Experiments on both existing and new labeled image databases result in recognition accuracy superior to state of the art with real-time computational performance.

14.
Bioinformatics ; 23(13): i212-21, 2007 Jul 01.
Article in English | MEDLINE | ID: mdl-17646299

ABSTRACT

MOTIVATION: With the recent availability of large-scale data sets profiling single nucleotide polymorphisms (SNPs) and quantitative traits data across different human subpopulations, there has been much attention directed towards discovering patterns of genetic variation and their connection to gene regulation and the onset/progression of disease. While previous work has focused primarily on correlating individual SNP markers with gene expression and disease, it has been suggested that using haplotype blocks instead of individual markers can significantly increase statistical power. RESULTS: We present BlockMapper, a probabilistic generative model for genotype data and quantitative traits data, such as gene expression or phenotype measurements. BlockMapper discovers the block structure of genotype data and associates these inferred blocks to patterns of variation in quantitative traits data, whilst accounting for non-genetic factors. Our model achieves high accuracy for predicting Crohn's disease phenotype in Chromosome 5q31 and reveals novel cis-associations between two haplotype blocks in the ENm006 genomic region and GDI1, a gene implicated in X-linked mental retardation. Our results underscore the importance of accounting for the influence of large sets of SNPs on patterns of regulatory/phenotypic variation and represent a step towards an understanding of human genetic variation.


Subjects
Chromosome Mapping/methods , Genetic Variation/genetics , Genetics, Population , Haplotypes/genetics , Models, Genetic , Quantitative Trait, Heritable , Regulatory Sequences, Nucleic Acid/genetics , Bayes Theorem , Biological Evolution , Genetic Markers/genetics , Humans , Phenotype