1.
Genome Biol ; 21(1): 107, 2020 May 07.
Article in English | MEDLINE | ID: mdl-32381040

ABSTRACT

BACKGROUND: Tumors comprise a complex microenvironment of interacting malignant and stromal cell types. Much of our understanding of the tumor microenvironment comes from in vitro studies isolating the interactions between malignant cells and a single stromal cell type, often along a single pathway. RESULTS: To develop a deeper understanding of the interactions between cells within human lung tumors, we perform RNA-seq profiling of flow-sorted malignant cells, endothelial cells, immune cells, fibroblasts, and bulk cells from freshly resected human primary non-small-cell lung tumors. We map the cell-specific differential expression of prognostically associated secreted factors and cell surface genes, and computationally reconstruct cross-talk between these cell types to generate a novel resource called the Lung Tumor Microenvironment Interactome (LTMI). Using this resource, we identify and validate a prognostically unfavorable influence of Gremlin-1 production by fibroblasts on proliferation of malignant lung adenocarcinoma cells. We also find a prognostically favorable association between infiltration of mast cells and less aggressive tumor cell behavior. CONCLUSION: These results illustrate the utility of the LTMI as a resource for generating hypotheses concerning tumor-microenvironment interactions that may have prognostic and therapeutic relevance.

2.
Genome Biol ; 20(1): 230, 2019 11 04.
Article in English | MEDLINE | ID: mdl-31684996

ABSTRACT

BACKGROUND: Molecular and cellular changes are intrinsic to aging and age-related diseases. Prior cross-sectional studies have investigated the combined effects of age and genetics on gene expression and alternative splicing; however, there has been no long-term, longitudinal characterization of these molecular changes, especially in older age. RESULTS: We perform RNA sequencing in whole blood from the same individuals at ages 70 and 80 to quantify how gene expression, alternative splicing, and their genetic regulation are altered during this 10-year period of advanced aging at a population and individual level. We observe that individuals are more similar to their own expression profiles later in life than to profiles of other individuals their own age. We identify 1291 and 294 genes differentially expressed and alternatively spliced with age, respectively, as well as 529 genes with outlying individual trajectories. Further, we observe a strong correlation of genetic effects on expression and splicing between the two ages, with a small subset of tested genes showing a reduction in genetic associations with expression and splicing in older age. CONCLUSIONS: These findings demonstrate that, although the transcriptome and its genetic regulation are mostly stable late in life, a small subset of genes is dynamic and is characterized by a reduction in genetic regulation, most likely due to increasing environmental variance with age.


Subjects
Aging/genetics; Alternative Splicing; Gene Expression Regulation; Aged; Aged, 80 and over; Aging/metabolism; Female; Humans; Male
3.
Nat Genet ; 51(10): 1494-1505, 2019 10.
Article in English | MEDLINE | ID: mdl-31570894

ABSTRACT

A hallmark of the immune system is the interplay among specialized cell types transitioning between resting and stimulated states. The gene regulatory landscape of this dynamic system has not been fully characterized in human cells. Here we collected assay for transposase-accessible chromatin using sequencing (ATAC-seq) and RNA sequencing data under resting and stimulated conditions for up to 32 immune cell populations. Stimulation caused widespread chromatin remodeling, including response elements shared between stimulated B and T cells. Furthermore, several autoimmune traits showed significant heritability in stimulation-responsive elements from distinct cell types, highlighting the importance of these cell states in autoimmunity. Allele-specific read mapping identified variants that alter chromatin accessibility in particular conditions, allowing us to observe evidence of function for a candidate causal variant that is undetected by existing large-scale studies in resting cells. Our results provide a resource of chromatin dynamics and highlight the need to characterize the effects of genetic variation in stimulated cells.


Subjects
B-Lymphocytes/immunology; Chromatin/genetics; Gene Expression Regulation/drug effects; Killer Cells, Natural/immunology; Response Elements/genetics; T-Lymphocytes/immunology; Allelic Imbalance; B-Lymphocytes/drug effects; B-Lymphocytes/metabolism; Cells, Cultured; Chromatin/drug effects; Chromatin/immunology; Epigenesis, Genetic; Gene Expression Regulation/genetics; Gene Expression Regulation/immunology; Humans; Interleukin-2/pharmacology; Interleukin-4/pharmacology; Killer Cells, Natural/drug effects; Killer Cells, Natural/metabolism; Polysaccharides/pharmacology; T-Lymphocytes/drug effects; T-Lymphocytes/metabolism; Transcriptome
4.
PLoS Comput Biol ; 15(5): e1006743, 2019 05.
Article in English | MEDLINE | ID: mdl-31136571

ABSTRACT

Drug screening studies typically involve assaying the sensitivity of a range of cancer cell lines across an array of anti-cancer therapeutics. Alongside these sensitivity measurements, high-dimensional molecular characterizations of the cell lines are typically available, including gene expression, copy number variation and genomic mutations. We propose a sparse multitask regression model that learns discriminative latent characteristics that predict drug sensitivity and are associated with specific molecular features. We use ideas from Bayesian nonparametrics to automatically infer the appropriate number of these latent characteristics. The resulting analysis couples high predictive performance with interpretability, since each latent characteristic involves a typically small set of drugs, cell lines and genomic features. Our model uncovers a number of drug-gene sensitivity associations missed by single-gene analyses. We functionally validate one such novel association: that increased expression of the cell-cycle regulator C/EBPδ decreases sensitivity to the histone deacetylase (HDAC) inhibitor panobinostat.


Subjects
Forecasting/methods; Neoplasms/genetics; Antineoplastic Agents/pharmacology; Bayes Theorem; Biomarkers, Pharmacological; CCAAT-Enhancer-Binding Protein-delta/genetics; Cell Line, Tumor; DNA Copy Number Variations; Genome; Genomics; Histone Deacetylase Inhibitors/pharmacology; Humans; Neoplasms/drug therapy; Panobinostat/pharmacology; Regression Analysis; Statistics, Nonparametric
5.
Nat Genet ; 51(4): 592-599, 2019 04.
Article in English | MEDLINE | ID: mdl-30926968

ABSTRACT

Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and gene expression datasets to identify gene-trait associations. In this Perspective, we explore properties of TWAS as a potential approach to prioritize causal genes at GWAS loci, by using simulations and case studies of literature-curated candidate causal genes for schizophrenia, low-density-lipoprotein cholesterol and Crohn's disease. We explore risk loci where TWAS accurately prioritizes the likely causal gene as well as loci where TWAS prioritizes multiple genes, some likely to be non-causal, owing to sharing of expression quantitative trait loci (eQTL). TWAS is especially prone to spurious prioritization with expression data from non-trait-related tissues or cell types, owing to substantial cross-cell-type variation in expression levels and eQTL strengths. Nonetheless, TWAS prioritizes candidate causal genes more accurately than simple baselines. We suggest best practices for causal-gene prioritization with TWAS and discuss future opportunities for improvement. Our results showcase the strengths and limitations of using eQTL datasets to determine causal genes at GWAS loci.


Subjects
Genetic Predisposition to Disease/genetics; Transcriptome/genetics; Crohn Disease/genetics; Genetic Variation/genetics; Genome-Wide Association Study/methods; Humans; Lipoproteins, LDL/genetics; Quantitative Trait Loci/genetics; Schizophrenia/genetics
6.
Elife ; 7, 2018 05 08.
Article in English | MEDLINE | ID: mdl-29737278

ABSTRACT

Anthracycline-induced cardiotoxicity (ACT) is a key limiting factor in setting optimal chemotherapy regimes, with almost half of patients expected to develop congestive heart failure given high doses. However, the genetic basis of sensitivity to anthracyclines remains unclear. We created a panel of iPSC-derived cardiomyocytes from 45 individuals and performed RNA-seq after 24 hr exposure to varying doxorubicin dosages. The transcriptomic response is substantial: the majority of genes are differentially expressed and over 6000 genes show evidence of differential splicing, the latter driven by reduced splicing fidelity in the presence of doxorubicin. We show that inter-individual variation in transcriptional response is predictive of in vitro cell damage, which in turn is associated with in vivo ACT risk. We detect 447 response-expression quantitative trait loci (QTLs) and 42 response-splicing QTLs, which are enriched in lower ACT GWAS p-values, supporting the in vivo relevance of our map of genetic regulation of cellular response to anthracyclines.


Subjects
Anthracyclines/toxicity; Cardiotoxicity; Myocytes, Cardiac/drug effects; Cells, Cultured; Doxorubicin/toxicity; Gene Expression Profiling; Genome-Wide Association Study; Humans; Quantitative Trait Loci; Sequence Analysis, RNA
7.
PLoS One ; 13(4): e0195788, 2018.
Article in English | MEDLINE | ID: mdl-29659628

ABSTRACT

From whole organisms to individual cells, responses to environmental conditions are influenced by genetic makeup, where the effect of genetic variation on a trait depends on the environmental context. RNA-sequencing quantifies gene expression as a molecular trait, and is capable of capturing both genetic and environmental effects. In this study, we explore opportunities of using allele-specific expression (ASE) to discover cis-acting genotype-environment interactions (GxE), that is, genetic effects on gene expression that depend on an environmental condition. Treating 17 common, clinical traits as approximations of the cellular environment of 267 skeletal muscle biopsies, we identify 10 candidate environmental response expression quantitative trait loci (reQTLs) across 6 traits (12 unique gene-environment trait pairs; 10% FDR per trait) including sex, systolic blood pressure, and low-density lipoprotein cholesterol. Although using ASE is in principle a promising approach to detect GxE effects, replication of such signals can be challenging as validation requires harmonization of environmental traits across cohorts and a sufficient sampling of heterozygotes for a transcribed SNP. Comprehensive discovery and replication will require large human transcriptome datasets, or the integration of multiple transcribed SNPs, coupled with standardized clinical phenotyping.


Subjects
Cellular Microenvironment; Gene Expression Regulation; Gene-Environment Interaction; Genetic Variation; Muscle Fibers, Skeletal/metabolism; Muscle, Skeletal/metabolism; Energy Metabolism; Genetic Association Studies; Genotype; Humans; Muscle, Skeletal/cytology; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci
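
The allele-specific expression (ASE) signal that entry 7 relies on can be illustrated with a minimal sketch. It assumes a plain two-sided exact binomial test of reference versus alternate read counts at a single heterozygous transcribed SNP against a balanced 0.5 ratio. The abstract does not state which test the authors used, and the function name `ase_binomial_p` is hypothetical. A GxE analysis would go one step further and ask whether the imbalance changes with an environmental trait.

```python
from math import comb


def ase_binomial_p(ref_reads: int, alt_reads: int) -> float:
    """Two-sided exact binomial p-value for allelic imbalance.

    Assumption (not from the abstract): under no ASE, each read at a
    heterozygous SNP comes from either allele with probability 0.5.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    k = min(ref_reads, alt_reads)
    # P(X <= k) for X ~ Binomial(n, 0.5), computed with exact integers
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # The null distribution is symmetric, so double the smaller tail
    return min(1.0, 2 * lower_tail)


if __name__ == "__main__":
    print(ase_binomial_p(5, 5))    # balanced counts
    print(ase_binomial_p(10, 0))   # fully imbalanced counts
```

Real ASE data are usually overdispersed relative to a binomial, so this test tends to be anti-conservative; it is meant only to show the shape of the computation.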
8.
Nat Genet ; 50(1): 151-158, 2018 01.
Article in English | MEDLINE | ID: mdl-29229983

ABSTRACT

The excision of introns from pre-mRNA is an essential step in mRNA processing. We developed LeafCutter to study sample and population variation in intron splicing. LeafCutter identifies variable splicing events from short-read RNA-seq data and finds events of high complexity. Our approach obviates the need for transcript annotations and circumvents the challenges in estimating relative isoform or exon usage in complex splicing events. LeafCutter can be used both to detect differential splicing between sample groups and to map splicing quantitative trait loci (sQTLs). Compared with contemporary methods, our approach identified 1.4-2.1 times more sQTLs, many of which helped us ascribe molecular effects to disease-associated variants. Transcriptome-wide associations between LeafCutter intron quantifications and 40 complex traits increased the number of associated disease genes at a 5% false discovery rate by an average of 2.1-fold compared with that detected through the use of gene expression levels alone. LeafCutter is fast, scalable, easy to use, and available online.


Subjects
Alternative Splicing; Sequence Analysis, RNA/methods; Software; Animals; Disease/genetics; Gene Expression Profiling; Genetic Variation; Introns; Molecular Sequence Annotation; Quantitative Trait Loci
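
To make the idea of annotation-free intron quantification in entry 8 concrete, the following is a rough sketch and not LeafCutter's actual code. It assumes that introns on one chromosome and strand are grouped into clusters whenever they share a splice site, and that usage is expressed as each intron's share of the reads in its cluster. The grouping rule, the function name `intron_usage`, and the input layout are illustrative assumptions.

```python
from collections import defaultdict


def intron_usage(counts):
    """Group introns sharing a start or end coordinate; return usage ratios.

    counts: dict mapping (start, end) -> number of split reads supporting
    that intron. Returns a list of dicts, one per cluster, mapping each
    intron to its fraction of the cluster's reads.
    """
    parent = {intron: intron for intron in counts}

    def find(x):
        # Union-find root lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Index introns by each splice site they use
    by_site = defaultdict(list)
    for start, end in counts:
        by_site[("start", start)].append((start, end))
        by_site[("end", end)].append((start, end))

    # Merge all introns that share a splice site into one cluster
    for group in by_site.values():
        for other in group[1:]:
            parent[find(other)] = find(group[0])

    clusters = defaultdict(dict)
    for intron, n in counts.items():
        clusters[find(intron)][intron] = n

    usage = []
    for members in clusters.values():
        total = sum(members.values())
        usage.append({i: (n / total if total else 0.0)
                      for i, n in members.items()})
    return usage


if __name__ == "__main__":
    example = {(100, 200): 30, (100, 300): 10, (500, 600): 5}
    for cluster in intron_usage(example):
        print(cluster)
```

Working with ratios inside a cluster sidesteps the need to assign reads to full-length isoforms, which is the difficulty the abstract says the method circumvents.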
9.
Am J Hum Genet ; 101(5): 686-699, 2017 Nov 02.
Article in English | MEDLINE | ID: mdl-29106824

ABSTRACT

Previous studies have prioritized trait-relevant cell types by looking for an enrichment of genome-wide association study (GWAS) signal within functional regions. However, these studies are limited in cell resolution by the lack of functional annotations from difficult-to-characterize or rare cell populations. Measurement of single-cell gene expression has become a popular method for characterizing novel cell types, and yet limited work has linked single-cell RNA sequencing (RNA-seq) to phenotypes of interest. To address this deficiency, we present RolyPoly, a regression-based polygenic model that can prioritize trait-relevant cell types and genes from GWAS summary statistics and gene expression data. RolyPoly is designed to use expression data from either bulk tissue or single-cell RNA-seq. In this study, we demonstrated RolyPoly's accuracy through simulation and validated previously known tissue-trait associations. We discovered a significant association between microglia and late-onset Alzheimer disease and an association between schizophrenia and oligodendrocytes and replicating fetal cortical cells. Additionally, RolyPoly computes a trait-relevance score for each gene to reflect the importance of expression specific to a cell type. We found that differentially expressed genes in the prefrontal cortex of individuals with Alzheimer disease were significantly enriched with genes ranked highly by RolyPoly gene scores. Overall, our method represents a powerful framework for understanding the effect of common variants on cell types contributing to complex traits.


Subjects
Alzheimer Disease/genetics; Microglia/metabolism; Oligodendroglia/metabolism; Schizophrenia/genetics; Single-Cell Analysis/statistics & numerical data; Software; Alzheimer Disease/diagnosis; Alzheimer Disease/pathology; Computer Simulation; Fetus; Genome-Wide Association Study; Humans; Microglia/pathology; Models, Genetic; Oligodendroglia/pathology; Prefrontal Cortex/metabolism; Prefrontal Cortex/pathology; Quantitative Trait Loci; Schizophrenia/diagnosis; Schizophrenia/pathology; Single-Cell Analysis/methods; Transcriptome
10.
Nat Methods ; 14(7): 699-702, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28530654

ABSTRACT

Identifying interactions between genetics and the environment (GxE) remains challenging. We have developed EAGLE, a hierarchical Bayesian model for identifying GxE interactions based on associations between environmental variables and allele-specific expression. Combining whole-blood RNA-seq with extensive environmental annotations collected from 922 human individuals, we identified 35 GxE interactions, compared with only four using standard GxE interaction testing. EAGLE provides new opportunities for researchers to identify GxE interactions using functional genomic data.


Subjects
Alleles; Epigenesis, Genetic; Gene Expression Regulation; Genetic Variation; Adult; Cohort Studies; Female; Humans; Male; Models, Genetic; Quantitative Trait Loci
11.
Nature ; 544(7650): 367-371, 2017 04 20.
Article in English | MEDLINE | ID: mdl-28405022

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is a rapidly progressing neurodegenerative disease that is characterized by motor neuron loss and that leads to paralysis and death 2-5 years after disease onset. Nearly all patients with ALS have aggregates of the RNA-binding protein TDP-43 in their brains and spinal cords, and rare mutations in the gene encoding TDP-43 can cause ALS. There are no effective TDP-43-directed therapies for ALS or related TDP-43 proteinopathies, such as frontotemporal dementia. Antisense oligonucleotides (ASOs) and RNA-interference approaches are emerging as attractive therapeutic strategies in neurological diseases. Indeed, treatment of a rat model of inherited ALS (caused by a mutation in Sod1) with ASOs against Sod1 has been shown to substantially slow disease progression. However, as SOD1 mutations account for only around 2-5% of ALS cases, additional therapeutic strategies are needed. Silencing TDP-43 itself is probably not appropriate, given its critical cellular functions. Here we present a promising alternative therapeutic strategy for ALS that involves targeting ataxin-2. A decrease in ataxin-2 suppresses TDP-43 toxicity in yeast and flies, and intermediate-length polyglutamine expansions in the ataxin-2 gene increase risk of ALS. We used two independent approaches to test whether decreasing ataxin-2 levels could mitigate disease in a mouse model of TDP-43 proteinopathy. First, we crossed ataxin-2 knockout mice with TDP-43 (also known as TARDBP) transgenic mice. The decrease in ataxin-2 reduced aggregation of TDP-43, markedly increased survival and improved motor function. Second, in a more therapeutically applicable approach, we administered ASOs targeting ataxin-2 to the central nervous system of TDP-43 transgenic mice. This single treatment markedly extended survival. Because TDP-43 aggregation is a component of nearly all cases of ALS, targeting ataxin-2 could represent a broadly effective therapeutic strategy.


Subjects
Amyotrophic Lateral Sclerosis/genetics; Amyotrophic Lateral Sclerosis/therapy; Ataxin-2/deficiency; DNA-Binding Proteins/metabolism; Longevity; Oligonucleotides, Antisense/therapeutic use; Protein Aggregation, Pathological/therapy; Amyotrophic Lateral Sclerosis/metabolism; Amyotrophic Lateral Sclerosis/physiopathology; Animals; Ataxin-2/genetics; Central Nervous System/metabolism; Cytoplasmic Granules/metabolism; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/genetics; Disease Progression; Female; Gene Knockdown Techniques; Humans; Male; Mice; Mice, Knockout; Mice, Transgenic; Motor Skills/physiology; Oligonucleotides, Antisense/administration & dosage; Oligonucleotides, Antisense/genetics; Protein Aggregation, Pathological/genetics; Stress, Physiological; Survival Analysis
12.
Sci Rep ; 7: 39921, 2017 01 03.
Article in English | MEDLINE | ID: mdl-28045081

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) can be used to characterize variation in gene expression levels at high resolution. However, the sources of experimental noise in scRNA-seq are not yet well understood. We investigated the technical variation associated with sample processing using the single-cell Fluidigm C1 platform. To do so, we processed three C1 replicates from three human induced pluripotent stem cell (iPSC) lines. We added unique molecular identifiers (UMIs) to all samples, to account for amplification bias. We found that the major source of variation in the gene expression data was driven by genotype, but we also observed substantial variation between the technical replicates. We observed that the conversion of reads to molecules using the UMIs was impacted by both biological and technical variation, indicating that UMI counts are not an unbiased estimator of gene expression levels. Based on our results, we suggest a framework for effective scRNA-seq studies.


Subjects
RNA/metabolism; Single-Cell Analysis; Gene Expression; High-Throughput Nucleotide Sequencing; Humans; Induced Pluripotent Stem Cells/cytology; Induced Pluripotent Stem Cells/metabolism; Principal Component Analysis; RNA/chemistry; RNA/isolation & purification; Sequence Analysis, RNA
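
The conversion of reads to molecules mentioned in entry 12 can be sketched as follows. This is a simplified illustration that assumes reads have already been assigned to a cell and a gene and that UMIs are error-free, so identical (cell, gene, UMI) triples are treated as amplification copies of one molecule. The function name `count_molecules` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def count_molecules(reads):
    """Collapse reads into UMI-based molecule counts.

    reads: iterable of (cell, gene, umi) tuples, one per aligned read.
    Returns a dict mapping (cell, gene) -> (read_count, molecule_count).
    """
    umis = defaultdict(set)
    n_reads = defaultdict(int)
    for cell, gene, umi in reads:
        umis[(cell, gene)].add(umi)   # duplicate UMIs collapse in the set
        n_reads[(cell, gene)] += 1
    return {key: (n_reads[key], len(umis[key])) for key in umis}


if __name__ == "__main__":
    reads = [
        ("c1", "GAPDH", "AAAC"),
        ("c1", "GAPDH", "AAAC"),  # PCR duplicate of the read above
        ("c1", "GAPDH", "GGTT"),
        ("c2", "GAPDH", "AAAC"),
    ]
    print(count_molecules(reads))
```

Keeping both counts side by side makes it possible to inspect the reads-to-molecules conversion rate per cell, which is the quantity the abstract reports as being affected by biological and technical variation.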
13.
G3 (Bethesda) ; 7(1): 31-39, 2017 01 05.
Article in English | MEDLINE | ID: mdl-27799337

ABSTRACT

Exosomes are small extracellular vesicles that carry heterogeneous cargo, including RNA, between cells. Increasing evidence suggests that exosomes are important mediators of intercellular communication and biomarkers of disease. Despite this, the variability of exosomal RNA between individuals has not been well quantified. To assess this variability, we sequenced the small RNA of cells and exosomes from a 17-member family. Across individuals, we show that selective export of miRNAs occurs not only at the level of specific transcripts, but that a cluster of 74 mature miRNAs on chromosome 14q32 is massively exported in exosomes while mostly absent from cells. We also observe more interindividual variability between exosomal samples than between cellular ones and identify four miRNA expression quantitative trait loci shared between cells and exosomes. Our findings indicate that genomically colocated miRNAs can be exported together and highlight the variability in exosomal miRNA levels between individuals as relevant for exosome use as diagnostics.


Subjects
Exosomes/genetics; MicroRNAs/genetics; Quantitative Trait Loci/genetics; Cell Line; Chromosomes, Human, Pair 14/genetics; Gene Expression Regulation; High-Throughput Nucleotide Sequencing; Humans; Lymphocyte Activation/genetics; RNA, Small Interfering/genetics; Sequence Analysis, RNA
14.
Genome Res ; 26(6): 768-77, 2016 06.
Article in English | MEDLINE | ID: mdl-27197214

ABSTRACT

The X Chromosome, with its unique mode of inheritance, contributes to differences between the sexes at a molecular level, including sex-specific gene expression and sex-specific impact of genetic variation. Improving our understanding of these differences offers to elucidate the molecular mechanisms underlying sex-specific traits and diseases. However, to date, most studies have either ignored the X Chromosome or had insufficient power to test for the sex-specific impact of genetic variation. By analyzing whole blood transcriptomes of 922 individuals, we have conducted the first large-scale, genome-wide analysis of the impact of both sex and genetic variation on patterns of gene expression, including comparison between the X Chromosome and autosomes. We identified a depletion of expression quantitative trait loci (eQTL) on the X Chromosome, especially among genes under high selective constraint. In contrast, we discovered an enrichment of sex-specific regulatory variants on the X Chromosome. To resolve the molecular mechanisms underlying such effects, we generated chromatin accessibility data through ATAC-sequencing to connect sex-specific chromatin accessibility to sex-specific patterns of expression and regulatory variation. As sex-specific regulatory variants discovered in our study can inform sex differences in heritable disease prevalence, we integrated our data with genome-wide association study data for multiple immune traits identifying several traits with significant sex biases in genetic susceptibilities. Together, our study provides genome-wide insight into how genetic variation, the X Chromosome, and sex shape human gene regulation and disease.


Subjects
Chromosomes, Human, X/genetics; Transcriptome; Female; Gene Expression Profiling; Gene Expression Regulation; Genetic Predisposition to Disease; Genome, Human; Humans; Male; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Sex Characteristics
15.
Science ; 352(6285): 600-4, 2016 Apr 29.
Article in English | MEDLINE | ID: mdl-27126046

ABSTRACT

Noncoding variants play a central role in the genetics of complex traits, but we still lack a full understanding of the molecular pathways through which they act. We quantified the contribution of cis-acting genetic effects at all major stages of gene regulation from chromatin to proteins, in Yoruba lymphoblastoid cell lines (LCLs). About 65% of expression quantitative trait loci (eQTLs) have primary effects on chromatin, whereas the remaining eQTLs are enriched in transcribed regions. Using a novel method, we also detected 2893 splicing QTLs, most of which have little or no effect on gene-level expression. These splicing QTLs are major contributors to complex traits, roughly on a par with variants that affect gene expression levels. Our study provides a comprehensive view of the mechanisms linking genetic variation to variation in human gene regulation.


Subjects
Gene Expression Regulation; Genetic Variation; Immune System Diseases/genetics; Quantitative Trait Loci; RNA Splicing/genetics; Cell Line; Chromatin/metabolism; Genome-Wide Association Study; Humans; Lymphocytes/immunology; Phenotype; Polymorphism, Single Nucleotide
16.
Am J Hum Genet ; 98(1): 216-24, 2016 Jan 07.
Article in English | MEDLINE | ID: mdl-26749306

ABSTRACT

Methods for multiple-testing correction in local expression quantitative trait locus (cis-eQTL) studies are a trade-off between statistical power and computational efficiency. Bonferroni correction, though computationally trivial, is overly conservative and fails to account for linkage disequilibrium between variants. Permutation-based methods are more powerful, though computationally far more intensive. We present an alternative correction method called eigenMT, which runs over 500 times faster than permutations and has adjusted p values that closely approximate empirical ones. To achieve this speed while also maintaining the accuracy of permutation-based methods, we estimate the effective number of independent variants tested for association with a particular gene, termed Meff, by using the eigenvalue decomposition of the genotype correlation matrix. We employ a regularized estimator of the correlation matrix to ensure Meff is robust and yields adjusted p values that closely approximate p values from permutations. Finally, using a common genotype matrix, we show that eigenMT can be applied with even greater efficiency to studies across tissues or conditions. Our method provides a simpler, more efficient approach to multiple-testing correction than existing methods and fits within existing pipelines for eQTL discovery.


Subjects
Linkage Disequilibrium; Quantitative Trait Loci; Humans
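
The effective-number-of-tests idea in entry 16 can be sketched in a few lines. This is an illustration under stated assumptions, not the eigenMT implementation. Here Meff is taken to be the number of leading eigenvalues of the genotype correlation matrix needed to explain a chosen share of the variance (0.99 is an assumed default), and a simple shrinkage toward the identity matrix stands in for the regularized estimator the abstract mentions. The names `effective_tests` and `adjust_p` are hypothetical, and NumPy is assumed to be installed.

```python
import numpy as np


def effective_tests(genotypes, var_explained=0.99, shrinkage=0.1):
    """Estimate the effective number of independent variants (Meff).

    genotypes: samples x variants array of allele dosages; every variant
    is assumed to vary across samples (no constant columns).
    """
    g = np.asarray(genotypes, dtype=float)
    corr = np.atleast_2d(np.corrcoef(g, rowvar=False))
    m = corr.shape[0]
    # Assumed stand-in for a regularized estimator: shrink toward identity
    corr = (1.0 - shrinkage) * corr + shrinkage * np.eye(m)
    # Eigenvalues of a symmetric matrix, sorted from largest to smallest
    eigvals = np.clip(np.linalg.eigvalsh(corr)[::-1], 0.0, None)
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Smallest count of eigenvalues reaching the variance threshold
    return int(np.searchsorted(cumulative, var_explained) + 1)


def adjust_p(p_value, m_eff):
    """Bonferroni-style correction using Meff instead of the raw test count."""
    return min(1.0, p_value * m_eff)


if __name__ == "__main__":
    # Three variants in perfect linkage disequilibrium behave like one test
    linked = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [1, 1, 1]]
    print(effective_tests(linked, shrinkage=0.0))
```

Multiplying the smallest p value for a gene by Meff rather than by the raw number of variants is what makes the correction less conservative than plain Bonferroni when variants are correlated.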
17.
IEEE Trans Pattern Anal Mach Intell ; 37(2): 271-89, 2015 Feb.
Article in English | MEDLINE | ID: mdl-26353241

ABSTRACT

In this paper we introduce the Pitman Yor Diffusion Tree (PYDT), a Bayesian non-parametric prior over tree structures which generalises the Dirichlet Diffusion Tree [30] and removes the restriction to binary branching structure. The generative process is described and shown to result in an exchangeable distribution over data points. We prove some theoretical properties of the model including showing its construction as the continuum limit of a nested Chinese restaurant process model. We then present two alternative MCMC samplers which allow us to model uncertainty over tree structures, and a computationally efficient greedy Bayesian EM search algorithm. Both algorithms use message passing on the tree structure. The utility of the model and algorithms is demonstrated on synthetic and real world data, both continuous and binary.

18.
IEEE Trans Pattern Anal Mach Intell ; 37(2): 462-74, 2015 Feb.
Article in English | MEDLINE | ID: mdl-26353254

ABSTRACT

Latent variable models for network data extract a summary of the relational structure underlying an observed network. The simplest possible models subdivide nodes of the network into clusters; the probability of a link between any two nodes then depends only on their cluster assignment. Currently available models can be classified by whether clusters are disjoint or are allowed to overlap. These models can explain a "flat" clustering structure. Hierarchical Bayesian models provide a natural approach to capture more complex dependencies. We propose a model in which objects are characterised by a latent feature vector. Each feature is itself partitioned into disjoint groups (subclusters), corresponding to a second layer of hierarchy. In experimental comparisons, the model achieves significantly improved predictive performance on social and biological link prediction tasks. The results indicate that models with a single layer hierarchy over-simplify real networks.


Subjects
Informatics/methods; Machine Learning; Models, Theoretical; Computer Simulation
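
The simplest model class described in entry 18, where the probability of a link depends only on the cluster assignments of the two nodes, can be written down directly. The sketch below assumes an undirected network without self-links and a fixed matrix of between-cluster link probabilities strictly between 0 and 1. It shows only this flat baseline, not the hierarchical latent-feature model the paper proposes, and the function names are hypothetical.

```python
from math import log


def link_prob(assign, block, i, j):
    """Probability of a link between nodes i and j given their clusters."""
    return block[assign[i]][assign[j]]


def log_likelihood(adj, assign, block):
    """Bernoulli log-likelihood of an undirected adjacency matrix.

    adj: square 0/1 matrix; assign: cluster index per node;
    block: symmetric matrix of link probabilities in (0, 1).
    """
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair once, no self-links
            p = link_prob(assign, block, i, j)
            total += log(p) if adj[i][j] else log(1.0 - p)
    return total


if __name__ == "__main__":
    adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    assign = [0, 0, 1]                   # nodes 0 and 1 share a cluster
    block = [[0.8, 0.1], [0.1, 0.8]]     # dense within, sparse between
    print(log_likelihood(adj, assign, block))
```

Comparing this log-likelihood on held-out links across different assignments is one way to evaluate the kind of link prediction task the abstract refers to.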
19.
Am J Hum Genet ; 95(3): 245-56, 2014 Sep 04.
Article in English | MEDLINE | ID: mdl-25192044

ABSTRACT

Recent and rapid human population growth has led to an excess of rare genetic variants that are expected to contribute to an individual's genetic burden of disease risk. To date, much of the focus has been on rare protein-coding variants, for which potential impact can be estimated from the genetic code, but determining the impact of rare noncoding variants has been more challenging. To improve our understanding of such variants, we combined high-quality genome sequencing and RNA sequencing data from a 17-individual, three-generation family to contrast expression quantitative trait loci (eQTLs) and splicing quantitative trait loci (sQTLs) within this family to eQTLs and sQTLs within a population sample. Using this design, we found that eQTLs and sQTLs with large effects in the family were enriched with rare regulatory and splicing variants (minor allele frequency < 0.01). They were also more likely to influence essential genes and genes involved in complex disease. In addition, we tested the capacity of diverse noncoding annotation to predict the impact of rare noncoding variants. We found that distance to the transcription start site, evolutionary constraint, and epigenetic annotation were considerably more informative for predicting the impact of rare variants than for predicting the impact of common variants. These results highlight that rare noncoding variants are important contributors to individual gene-expression profiles and further demonstrate a significant capability for genomic annotation to predict the impact of rare noncoding variants.


Subjects
Genome, Human; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci; RNA, Untranslated/genetics; Sequence Analysis, RNA; Transcriptome; European Continental Ancestry Group/genetics; Family; Haplotypes/genetics; High-Throughput Nucleotide Sequencing; Humans; Lymphocytes/metabolism
20.
PLoS Genet ; 10(5): e1004304, 2014 May.
Article in English | MEDLINE | ID: mdl-24786518

ABSTRACT

Personal exome and genome sequencing provides access to loss-of-function and rare deleterious alleles whose interpretation is expected to provide insight into individual disease burden. However, for each allele, accurate interpretation of its effect will depend on both its penetrance and the trait's expressivity. In this regard, an important factor that can modify the effect of a pathogenic coding allele is its level of expression; a factor which itself characteristically changes across tissues. To better inform the degree to which pathogenic alleles can be modified by expression level across multiple tissues, we have conducted exome, RNA and deep, targeted allele-specific expression (ASE) sequencing in ten tissues obtained from a single individual. By combining such data, we report the impact of rare and common loss-of-function variants on allelic expression exposing stronger allelic bias for rare stop-gain variants and informing the extent to which rare deleterious coding alleles are consistently expressed across tissues. This study demonstrates the potential importance of transcriptome data to the interpretation of pathogenic protein-coding variants.


Subjects
Alleles; Proteins/genetics; Exome; Humans; Polymerase Chain Reaction