Results 1 - 20 of 66
1.
Commun Biol ; 7(1): 873, 2024 Jul 17.
Article in English | MEDLINE | ID: mdl-39020054

ABSTRACT

Causal gene discovery methods are often evaluated using reference sets of causal genes, which are treated as gold standards (GS) for the purposes of evaluation. However, evaluation methods typically treat genes not in the GS positive set as known negatives rather than unknowns. This leads to inaccurate estimates of sensitivity, specificity, and AUC. Labeling biases in GS gene sets can also lead to inaccurate ordering of alternative causal gene discovery methods. We argue that the evaluation of causal gene discovery methods should rely on statistical techniques like those used for variant discovery rather than on comparison with GS gene sets.


Subjects
Reference Standards , Humans , Databases, Genetic
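
The labeling problem described in entry 1 can be illustrated with a small sketch. The gene names, set sizes, and the helper `sens_spec` below are hypothetical and are not taken from the article; the sketch only shows how scoring genes outside the gold-standard positive set as known negatives changes the apparent specificity of a method.

```python
def sens_spec(predicted, positives, universe):
    """Sensitivity and specificity of `predicted`, treating every gene in
    `universe` that is not in `positives` as a known negative."""
    negatives = universe - positives
    true_pos = len(predicted & positives)
    true_neg = len(negatives - predicted)
    return true_pos / len(positives), true_neg / len(negatives)


# Hypothetical toy data: 10 genes, 4 truly causal, only 2 of them in the GS set.
universe = {f"g{i}" for i in range(1, 11)}
true_causal = {"g1", "g2", "g3", "g4"}
gold_standard = {"g1", "g2"}          # incomplete positive set
predicted = {"g1", "g2", "g3", "g4"}  # a method that finds every causal gene

against_truth = sens_spec(predicted, true_causal, universe)
against_gs = sens_spec(predicted, gold_standard, universe)
print("vs. truth:", against_truth)
print("vs. gold standard:", against_gs)
```

In this toy setting the method is perfect against the truth, yet its specificity measured against the gold standard drops because two correct calls are scored as false positives.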
2.
Am J Hum Genet ; 111(5): 966-978, 2024 05 02.
Article in English | MEDLINE | ID: mdl-38701746

ABSTRACT

Replicability is the cornerstone of modern scientific research. Reliable identifications of genotype-phenotype associations that are significant in multiple genome-wide association studies (GWASs) provide stronger evidence for the findings. Current replicability analysis relies on the independence assumption among single-nucleotide polymorphisms (SNPs) and ignores the linkage disequilibrium (LD) structure. We show that such a strategy may produce either overly liberal or overly conservative results in practice. We develop an efficient method, ReAD, to detect replicable SNPs associated with the phenotype from two GWASs accounting for the LD structure. The local dependence structure of SNPs across two heterogeneous studies is captured by a four-state hidden Markov model (HMM) built on two sequences of p values. By incorporating information from adjacent locations via the HMM, our approach provides more accurate SNP significance rankings. ReAD is scalable, platform independent, and more powerful than existing replicability analysis methods with effective false discovery rate control. Through analysis of datasets from two asthma GWASs and two ulcerative colitis GWASs, we show that ReAD can identify replicable genetic loci that existing methods might otherwise miss.


Subjects
Asthma , Genome-Wide Association Study , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Humans , Asthma/genetics , Markov Chains , Colitis, Ulcerative/genetics , Reproducibility of Results , Phenotype , Genotype
3.
Elife ; 13, 2024 Feb 09.
Article in English | MEDLINE | ID: mdl-38334359

ABSTRACT

Genetic variants in gene regulatory sequences can modify gene expression and mediate the molecular response to environmental stimuli. In addition, genotype-environment interactions (GxE) contribute to complex traits such as cardiovascular disease. Caffeine is the most widely consumed stimulant and is known to produce a vascular response. To investigate GxE for caffeine, we treated vascular endothelial cells with caffeine and used a massively parallel reporter assay to measure allelic effects on gene regulation for over 43,000 genetic variants. We identified 665 variants with allelic effects on gene regulation and 6 variants that regulate the gene expression response to caffeine (GxE, false discovery rate [FDR] < 5%). When overlapping our GxE results with expression quantitative trait loci colocalized with coronary artery disease and hypertension, we dissected their regulatory mechanisms and showed a modulatory role for caffeine. Our results demonstrate that massively parallel reporter assay is a powerful approach to identify and molecularly characterize GxE in the specific context of caffeine consumption.


Subjects
Endothelial Cells , Gene-Environment Interaction , Caffeine/pharmacology , Gene Expression Regulation , Quantitative Trait Loci
4.
medRxiv ; 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37425837

ABSTRACT

Metabolites are small molecules that are useful for estimating disease risk and elucidating disease biology. Nevertheless, their causal effects on human diseases have not been evaluated comprehensively. We performed two-sample Mendelian randomization to systematically infer the causal effects of 1,099 plasma metabolites measured in 6,136 Finnish men from the METSIM study on risk of 2,099 binary disease endpoints measured in 309,154 Finnish individuals from FinnGen. We identified evidence for 282 causal effects of 70 metabolites on 183 disease endpoints (FDR<1%). We found 25 metabolites with potential causal effects across multiple disease domains, including ascorbic acid 2-sulfate affecting 26 disease endpoints in 12 disease domains. Our study suggests that N-acetyl-2-aminooctanoate and glycocholenate sulfate affect risk of atrial fibrillation through two distinct metabolic pathways and that N-methylpipecolate may mediate the causal effect of N6, N6-dimethyllysine on anxious personality disorder. This study highlights the broad causal impact of plasma metabolites and widespread metabolic connections across diseases.
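
As a rough illustration of two-sample Mendelian randomization as mentioned in entry 4, the sketch below computes a single-variant Wald ratio. The abstract does not say which estimator the authors used, so the Wald ratio, the function name `wald_ratio`, and the example numbers are assumptions chosen for illustration only.

```python
def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Wald ratio estimate of the causal effect.

    beta_exposure: variant effect on the exposure (e.g. a metabolite level)
    beta_outcome:  variant effect on the outcome (e.g. log odds of disease)
    se_outcome:    standard error of beta_outcome
    The standard error uses the first-order approximation that ignores
    uncertainty in beta_exposure.
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must have a non-zero effect on the exposure")
    estimate = beta_outcome / beta_exposure
    std_error = se_outcome / abs(beta_exposure)
    return estimate, std_error


# Hypothetical summary statistics from two non-overlapping samples.
estimate, std_error = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
print(estimate, std_error)
```

The two inputs come from different cohorts, which is what makes the design "two-sample": the exposure effect from one study and the outcome effect from another.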

5.
Genome Res ; 33(6): 839-856, 2023 06.
Article in English | MEDLINE | ID: mdl-37442575

ABSTRACT

Synthetic glucocorticoids, such as dexamethasone, have been used as a treatment for many immune conditions, such as asthma and, more recently, severe COVID-19. Single-cell data can capture more fine-grained details on transcriptional variability and dynamics to gain a better understanding of the molecular underpinnings of inter-individual variation in drug response. Here, we used single-cell RNA-seq to study the dynamics of the transcriptional response to glucocorticoids in activated peripheral blood mononuclear cells from 96 African American children. We used novel statistical approaches to calculate a mean-independent measure of gene expression variability and a measure of transcriptional response pseudotime. Using these approaches, we showed that glucocorticoids reverse the effects of immune stimulation on both gene expression mean and variability. Our novel measure of gene expression response dynamics, based on the diagonal linear discriminant analysis, separated individual cells by response status on the basis of their transcriptional profiles and allowed us to identify different dynamic patterns of gene expression along the response pseudotime. We identified genetic variants regulating gene expression mean and variability, including treatment-specific effects, and showed widespread genetic regulation of the transcriptional dynamics of the gene expression response.


Subjects
COVID-19 , Glucocorticoids , Child , Humans , Glucocorticoids/pharmacology , Glucocorticoids/metabolism , Leukocytes, Mononuclear/metabolism , COVID-19/genetics , Gene Expression Regulation
6.
Bioinformatics ; 39(6), 2023 06 01.
Article in English | MEDLINE | ID: mdl-37279733

ABSTRACT

MOTIVATION: Replicability is the cornerstone of scientific research. The current statistical method for high-dimensional replicability analysis either cannot control the false discovery rate (FDR) or is too conservative. RESULTS: We propose a statistical method, JUMP, for the high-dimensional replicability analysis of two studies. The input is a high-dimensional paired sequence of p-values from two studies and the test statistic is the maximum of p-values of the pair. JUMP uses four states of the p-value pairs to indicate whether they are null or non-null. Conditional on the hidden states, JUMP computes the cumulative distribution function of the maximum of p-values for each state to conservatively approximate the probability of rejection under the composite null of replicability. JUMP estimates unknown parameters and uses a step-up procedure to control FDR. By incorporating different states of composite null, JUMP achieves a substantial power gain over existing methods while controlling the FDR. Analyzing two pairs of spatially resolved transcriptomic datasets, JUMP makes biological discoveries that otherwise cannot be obtained by using existing methods. AVAILABILITY AND IMPLEMENTATION: An R package JUMP implementing the JUMP method is available on CRAN (https://CRAN.R-project.org/package=JUMP).


Subjects
Gene Expression Profiling , Transcriptome , Gene Expression Profiling/methods
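
The ingredients named in entry 6 (paired p-values, a maximum-of-pair test statistic, and four hidden states) can be sketched as follows. This is not the JUMP package or its API; the function names and example p-values are hypothetical, and the sketch stops before the parameter estimation and step-up procedure described in the abstract.

```python
from itertools import product

# Each study is either null (0) or non-null (1) for a given feature,
# which gives four joint states for a pair of studies.
STATES = list(product((0, 1), repeat=2))


def is_replicable(state):
    """Only the state where both studies are non-null counts as replicable;
    the other three states together form the composite null."""
    return state == (1, 1)


def max_p_statistics(p_study1, p_study2):
    """Test statistic for each feature: the larger of its two p-values."""
    if len(p_study1) != len(p_study2):
        raise ValueError("the two p-value sequences must be paired")
    return [max(a, b) for a, b in zip(p_study1, p_study2)]


composite_null = [s for s in STATES if not is_replicable(s)]
stats = max_p_statistics([0.001, 0.2, 0.03], [0.004, 0.0001, 0.5])
print(STATES, composite_null, stats)
```

A small maximum requires both p-values to be small, which is why the statistic targets replicability rather than significance in a single study.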
7.
Nat Commun ; 14(1): 2229, 2023 04 19.
Article in English | MEDLINE | ID: mdl-37076491

ABSTRACT

Expression quantitative trait locus (eQTL) studies illuminate genomic variants that regulate specific genes and contribute to fine-mapped loci discovered via genome-wide association studies (GWAS). Efforts to maximize their accuracy are ongoing. Using 240 glomerular (GLOM) and 311 tubulointerstitial (TUBE) micro-dissected samples from human kidney biopsies, we discovered 5371 GLOM and 9787 TUBE genes with at least one variant significantly associated with expression (eGene) by incorporating kidney single-nucleus open chromatin data and transcription start site distance as an "integrative prior" for Bayesian statistical fine-mapping. The use of an integrative prior resulted in higher resolution eQTLs illustrated by (1) smaller numbers of variants in credible sets with greater confidence, (2) increased enrichment of partitioned heritability for GWAS of two kidney traits, (3) an increased number of variants colocalized with the GWAS loci, and (4) enrichment of computationally predicted functional regulatory variants. A subset of variants and genes were validated experimentally in vitro and using a Drosophila nephrocyte model. More broadly, this study demonstrates that tissue-specific eQTL maps informed by single-nucleus open chromatin data have enhanced utility for diverse downstream analyses.


Subjects
Genome-Wide Association Study , Kidney Diseases , Humans , Genome-Wide Association Study/methods , Bayes Theorem , Kidney Diseases/genetics , Genomics , Chromatin/genetics , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease/genetics
8.
Nat Commun ; 14(1): 230, 2023 01 16.
Article in English | MEDLINE | ID: mdl-36646693

ABSTRACT

Puberty is an important developmental period marked by hormonal, metabolic and immune changes. Puberty also marks a shift in sex differences in susceptibility to asthma. Yet, little is known about the gene expression changes in immune cells that occur during pubertal development. Here we assess pubertal development and leukocyte gene expression in a longitudinal cohort of 251 children with asthma. We identify substantial gene expression changes associated with age and pubertal development. Gene expression changes between pre- and post-menarcheal females suggest a shift from predominantly innate to adaptive immunity. We show that genetic effects on gene expression change dynamically during pubertal development. Gene expression changes during puberty are correlated with gene expression changes associated with asthma and may explain sex differences in prevalence. Our results show that molecular data used to study the genetics of early onset diseases should consider pubertal development as an important factor that modifies the transcriptome.


Subjects
Asthma , Puberty , Humans , Male , Child , Female , Puberty/genetics , Menarche , Asthma/genetics , Asthma/epidemiology , Leukocytes , Age Factors , Longitudinal Studies
9.
Am J Hum Genet ; 110(1): 44-57, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36608684

ABSTRACT

Integrative genetic association methods have shown great promise in post-GWAS (genome-wide association study) analyses, in which one of the most challenging tasks is identifying putative causal genes and uncovering molecular mechanisms of complex traits. Recent studies suggest that prevailing computational approaches, including transcriptome-wide association studies (TWASs) and colocalization analysis, are individually imperfect, but their joint usage can yield robust and powerful inference results. This paper presents INTACT, a computational framework to integrate probabilistic evidence from these distinct types of analyses and implicate putative causal genes. This procedure is flexible and can work with a wide range of existing integrative analysis approaches. It has the unique ability to quantify the uncertainty of implicated genes, enabling rigorous control of false-positive discoveries. Taking advantage of this highly desirable feature, we further propose an efficient algorithm, INTACT-GSE, for gene set enrichment analysis based on the integrated probabilistic evidence. We examine the proposed computational methods and illustrate their improved performance over the existing approaches through simulation studies. We apply the proposed methods to analyze the multi-tissue eQTL data from the GTEx project and eight large-scale complex- and molecular-trait GWAS datasets from multiple consortia and the UK Biobank. Overall, we find that the proposed methods markedly improve the existing putative gene implication methods and are particularly advantageous in evaluating and identifying key gene sets and biological pathways underlying complex traits.


Subjects
Genome-Wide Association Study , Transcriptome , Humans , Transcriptome/genetics , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Quantitative Trait Loci/genetics , Computer Simulation , Polymorphism, Single Nucleotide/genetics , Genetic Predisposition to Disease
10.
Am J Hum Genet ; 109(10): 1727-1741, 2022 10 06.
Article in English | MEDLINE | ID: mdl-36055244

ABSTRACT

Transcriptomics data have been integrated with genome-wide association studies (GWASs) to help understand disease/trait molecular mechanisms. The utility of metabolomics, integrated with transcriptomics and disease GWASs, to understand molecular mechanisms for metabolite levels or diseases has not been thoroughly evaluated. We performed probabilistic transcriptome-wide association and locus-level colocalization analyses to integrate transcriptomics results for 49 tissues in 706 individuals from the GTEx project, metabolomics results for 1,391 plasma metabolites in 6,136 Finnish men from the METSIM study, and GWAS results for 2,861 disease traits in 260,405 Finnish individuals from the FinnGen study. We found that genetic variants that regulate metabolite levels were more likely to influence gene expression and disease risk compared to the ones that do not. Integrating transcriptomics with metabolomics results prioritized 397 genes for 521 metabolites, including 496 previously identified gene-metabolite pairs with strong functional connections and suggested 33.3% of such gene-metabolite pairs shared the same causal variants with genetic associations of gene expression. Integrating transcriptomics and metabolomics individually with FinnGen GWAS results identified 1,597 genes for 790 disease traits. Integrating transcriptomics and metabolomics jointly with FinnGen GWAS results helped pinpoint metabolic pathways from genes to diseases. We identified putative causal effects of UGT1A1/UGT1A4 expression on gallbladder disorders through regulating plasma (E,E)-bilirubin levels, of SLC22A5 expression on nasal polyps and plasma carnitine levels through distinct pathways, and of LIPC expression on age-related macular degeneration through glycerophospholipid metabolic pathways. Our study highlights the power of integrating multiple sets of molecular traits and GWAS results to deepen understanding of disease pathophysiology.


Subjects
Genome-Wide Association Study , Transcriptome , Bilirubin , Carnitine , Glycerophospholipids , Humans , Male , Metabolomics , Quantitative Trait Loci/genetics , Solute Carrier Family 22 Member 5/genetics , Transcriptome/genetics
11.
Am J Hum Genet ; 109(5): 825-837, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35523146

ABSTRACT

Transcriptome-wide association studies and colocalization analysis are popular computational approaches for integrating genetic-association data from molecular and complex traits. They show the unique ability to go beyond variant-level genetic-association evidence and implicate critical functional units, e.g., genes, in disease etiology. However, in practice, when the two approaches are applied to the same molecular and complex-trait data, the inference results can be markedly different. This paper systematically investigates the inferential reproducibility between the two approaches through theoretical derivation, numerical experiments, and analyses of four complex trait GWAS and GTEx eQTL data. We identify two classes of inconsistent inference results. We find that the first class of inconsistent results (i.e., genes with strong colocalization but weak transcriptome-wide association study [TWAS] signals) might suggest an interesting biological phenomenon, i.e., horizontal pleiotropy; thus, the two approaches are truly complementary. The inconsistency in the second class (i.e., genes with weak colocalization but strong TWAS signals) can be understood and effectively reconciled. To this end, we propose a computational approach for locus-level colocalization analysis. We demonstrate that the joint TWAS and locus-level colocalization analysis improves specificity and sensitivity for implicating biologically relevant genes.


Subjects
Genome-Wide Association Study , Transcriptome , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Reproducibility of Results , Transcriptome/genetics
12.
Nat Commun ; 13(1): 1644, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35347128

ABSTRACT

Few studies have explored the impact of rare variants (minor allele frequency < 1%) on highly heritable plasma metabolites identified in metabolomic screens. The Finnish population provides an ideal opportunity for such explorations, given the multiple bottlenecks and expansions that have shaped its history, and the enrichment for many otherwise rare alleles that has resulted. Here, we report genetic associations for 1391 plasma metabolites in 6136 men from the late-settlement region of Finland. We identify 303 novel association signals, more than one third at variants rare or enriched in Finns. Many of these signals identify genes not previously implicated in metabolite genome-wide association studies and suggest mechanisms for diseases and disease-related traits.


Subjects
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Alleles , Finland , Gene Frequency , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Humans , Male , Phenotype
13.
bioRxiv ; 2022 Oct 15.
Article in English | MEDLINE | ID: mdl-35313584

ABSTRACT

Synthetic glucocorticoids, such as dexamethasone, have been used as treatment for many immune conditions, such as asthma and more recently severe COVID-19. Single cell data can capture more fine-grained details on transcriptional variability and dynamics to gain a better understanding of the molecular underpinnings of inter-individual variation in drug response. Here, we used single cell RNA-seq to study the dynamics of the transcriptional response to glucocorticoids in activated Peripheral Blood Mononuclear Cells from 96 African American children. We employed novel statistical approaches to calculate a mean-independent measure of gene expression variability and a measure of transcriptional response pseudotime. Using these approaches, we demonstrated that glucocorticoids reverse the effects of immune stimulation on both gene expression mean and variability. Our novel measure of gene expression response dynamics, based on the diagonal linear discriminant analysis, separated individual cells by response status on the basis of their transcriptional profiles and allowed us to identify different dynamic patterns of gene expression along the response pseudotime. We identified genetic variants regulating gene expression mean and variability, including treatment-specific effects, and demonstrated widespread genetic regulation of the transcriptional dynamics of the gene expression response.

14.
Cell Rep ; 37(8): 110057, 2021 11 23.
Article in English | MEDLINE | ID: mdl-34818542

ABSTRACT

The gut microbiome exhibits extreme compositional variation between hominid hosts. However, it is unclear how this variation impacts host physiology across species and whether this effect can be mediated through microbial regulation of host gene expression in interacting epithelial cells. Here, we characterize the transcriptional response of human colonic epithelial cells in vitro to live microbial communities extracted from humans, chimpanzees, gorillas, and orangutans. We find that most host genes exhibit a conserved response, whereby they respond similarly to the four hominid microbiomes. However, hundreds of host genes exhibit a divergent response, whereby they respond only to microbiomes from specific host species. Such genes are associated with intestinal diseases in humans, including inflammatory bowel disease and Crohn's disease. Last, we find that inflammation-associated microbial species regulate the expression of host genes previously associated with inflammatory bowel disease, suggesting health-related consequences for species-specific host-microbiome interactions across hominids.


Subjects
Gastrointestinal Microbiome/genetics , Gene Expression Regulation/genetics , Hominidae/microbiology , Animals , Bacteria/genetics , Epithelial Cells/metabolism , Feces/microbiology , Gene Expression/genetics , Gorilla gorilla/microbiology , Hominidae/genetics , Humans , Inflammatory Bowel Diseases/genetics , Microbiota/genetics , Pan troglodytes/microbiology , Phylogeny , Pongo/microbiology , RNA, Ribosomal, 16S/genetics , Species Specificity
15.
Elife ; 10, 2021 06 18.
Article in English | MEDLINE | ID: mdl-34142656

ABSTRACT

Social interactions and the overall psychosocial environment have a demonstrated impact on health, particularly for people living in disadvantaged urban areas. Here, we investigated the effect of psychosocial experiences on gene expression in peripheral blood immune cells of children with asthma in Metro Detroit. Using RNA-sequencing and a new machine learning approach, we identified transcriptional signatures of 19 variables including psychosocial factors, blood cell composition, and asthma symptoms. Importantly, we found 169 genes associated with asthma or allergic disease that are regulated by psychosocial factors and 344 significant gene-environment interactions for gene expression levels. These results demonstrate that immune gene expression mediates the link between negative psychosocial experiences and asthma risk.


Subjects
Asthma , Gene-Environment Interaction , Adolescent , Asthma/epidemiology , Asthma/genetics , Asthma/metabolism , Asthma/psychology , Child , Female , Genotype , Humans , Longitudinal Studies , Male , Michigan , Transcriptome/genetics
16.
Elife ; 10, 2021 05 14.
Article in English | MEDLINE | ID: mdl-33988505

ABSTRACT

Genetic effects on gene expression and splicing can be modulated by cellular and environmental factors; yet interactions between genotypes, cell type, and treatment have not been comprehensively studied together. We used an induced pluripotent stem cell system to study multiple cell types derived from the same individuals and exposed them to a large panel of treatments. Cellular responses involved different genes and pathways for gene expression and splicing and were highly variable across contexts. For thousands of genes, we identified variable allelic expression across contexts and characterized different types of gene-environment interactions, many of which are associated with complex traits. Promoter functional and evolutionary features distinguished genes with elevated allelic imbalance mean and variance. On average, half of the genes with dynamic regulatory interactions were missed by large eQTL mapping studies, indicating the importance of exploring multiple treatments to reveal previously unrecognized regulatory loci that may be important for disease.


The activity of the genes in a cell depends on the type of cell they are in, the interactions with other genes, the environment and genetics. Active genes produce a greater number of mRNA molecules, which act as messenger molecules to instruct the cell to produce proteins. The amount of mRNA molecules in cells can be measured to assess the levels of gene activity. Genes produce mRNAs through a process called transcription, and the collection of all the mRNA molecules in a cell is called the transcriptome. Cells obtained from human samples can be grown in the lab under different conditions, and this can be used to transform them into different types of cells. These cells can then be exposed to different treatments, such as specific chemicals, to understand how the environment affects them. Cells derived from different people may respond differently to the same treatment based on their unique genetics. Exposing different types of cells from many people to different treatments can help explain how genetics, the environment and cell type affect gene activity. Findley et al. grew three different types of cells from six different people in the lab. The cells were exposed to 28 different treatments, which reflect different environmental changes. Studying all these different factors together allowed Findley et al. to understand how genetics, cell type and environment affect the activity of over 53,000 genes. Around half of the effects due to an interaction between genetics and the environment had not been seen in other larger studies of the transcriptome. Many of these newly observed changes are in genes that have connections to different diseases, including heart disease. The results of Findley et al. provide evidence indicating the extent to which lifestyle and the environment can interact with an individual's genetic makeup to impact gene activity and long-term health.
The more researchers can understand these factors, the more useful they can be in helping to predict, detect and treat illnesses. The findings also show how genes and the environment interact, which may be relevant to understanding disease development. There is more work to be done to understand a wider range of environmental factors across more cell types. It will also be important to establish how this work on cells grown in the lab translates to human health.


Subjects
Gene Expression Regulation/genetics , Induced Pluripotent Stem Cells/metabolism , Lymphocytes/metabolism , Myocytes, Cardiac/metabolism , Alternative Splicing , Cell Differentiation/genetics , Cell Line , Female , Humans , Induced Pluripotent Stem Cells/cytology , Lymphocytes/cytology , Myocytes, Cardiac/cytology , Quantitative Trait Loci , Sequence Analysis, RNA
17.
Cell ; 184(10): 2633-2648.e19, 2021 05 13.
Article in English | MEDLINE | ID: mdl-33864768

ABSTRACT

Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations to body mass index.


Subjects
Disease/genetics , Multifactorial Inheritance/genetics , Population/genetics , RNA, Long Noncoding/genetics , Transcriptome , Coronary Artery Disease/genetics , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 2/genetics , Gene Expression Profiling , Genetic Variation , Humans , Inflammatory Bowel Diseases/genetics , Organ Specificity/genetics , Quantitative Trait Loci
18.
G3 (Bethesda) ; 11(2), 2021 02 09.
Article in English | MEDLINE | ID: mdl-33585870

ABSTRACT

Over the last decade, GWAS meta-analyses have used a strict P-value threshold of 5 × 10⁻⁸ to classify associations as significant. Here, we use our current understanding of frequently studied traits including lipid levels, height, and BMI to revisit this genome-wide significance threshold. We compare the performance of studies using the P = 5 × 10⁻⁸ threshold in terms of true and false positive rate to other multiple testing strategies: (1) less stringent P-value thresholds, (2) controlling the FDR with the Benjamini-Hochberg and Benjamini-Yekutieli procedures, and (3) controlling the Bayesian FDR with posterior probabilities. We applied these procedures to re-analyze results from the Global Lipids and GIANT GWAS meta-analysis consortia and supported them with extensive simulation that mimics the empirical data. We observe in simulated studies with sample sizes ∼20,000 and >120,000 that relaxing the P-value threshold to 5 × 10⁻⁷ increased discovery at the cost of 18% and 8% of additional loci being false positive results, respectively. FDR and Bayesian FDR are well controlled for both sample sizes, with a few exceptions that disappear under a less stringent definition of true positives, and the two approaches yield similar results. Our work quantifies the value of using a relaxed P-value threshold in large studies to increase their true positive discovery but also shows the excess false positive rates due to such actions in modest-sized studies. These results may guide investigators considering different thresholds in replication studies and downstream work such as gene-set enrichment or pathway analysis. Finally, we demonstrate the viability of FDR-controlling procedures in GWAS.


Subjects
Genome-Wide Association Study , Bayes Theorem , Computer Simulation , Phenotype , Probability
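
The contrast in entry 18 between a fixed genome-wide threshold and FDR control can be sketched with the standard Benjamini-Hochberg step-up procedure. The code below is a generic textbook version, not the authors' analysis pipeline; the function names and the five example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


def fixed_threshold(p_values, threshold=5e-8):
    """Classic genome-wide significance rule: reject when p < threshold."""
    return [p < threshold for p in p_values]


# Hypothetical p-values for five variants.
p_values = [1e-9, 3e-7, 0.01, 0.2, 0.5]
bh_calls = benjamini_hochberg(p_values, alpha=0.05)
gw_calls = fixed_threshold(p_values)
print(bh_calls, gw_calls)
```

With only five tests the difference is exaggerated, but the pattern matches the abstract's point: the fixed threshold keeps a single variant, while the FDR procedure admits more discoveries at a controlled expected proportion of false positives.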
19.
Genome Biol ; 22(1): 49, 2021 01 26.
Article in English | MEDLINE | ID: mdl-33499903

ABSTRACT

The resources generated by the GTEx consortium offer unprecedented opportunities to advance our understanding of the biology of human diseases. Here, we present an in-depth examination of the phenotypic consequences of transcriptome regulation and a blueprint for the functional interpretation of genome-wide association study-discovered loci. Across a broad set of complex traits and diseases, we demonstrate widespread dose-dependent effects of RNA expression and splicing. We develop a data-driven framework to benchmark methods that prioritize causal genes and find no single approach outperforms the combination of multiple approaches. Using colocalization and association approaches that take into account the observed allelic heterogeneity of gene expression, we propose potential target genes for 47% (2519 out of 5385) of the GWAS loci examined.


Subjects
Gene Expression , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Genotype , Genes , Humans , Multifactorial Inheritance , Transcriptome
20.
Biometrics ; 77(2): 573-586, 2021 06.
Article in English | MEDLINE | ID: mdl-32627167

ABSTRACT

Directed acyclic mixed graphs (DAMGs) provide a useful representation of network topology with both directed and undirected edges subject to the restriction of no directed cycles in the graph. This graphical framework may arise in many biomedical studies, for example, when a directed acyclic graph (DAG) of interest is contaminated with undirected edges induced by some unobserved confounding factors (eg, unmeasured environmental factors). Directed edges in a DAG are widely used to evaluate causal relationships among variables in a network, but detecting them is challenging when the underlying causality is obscured by some shared latent factors. The objective of this paper is to develop an effective structural equation model (SEM) method to extract reliable causal relationships from a DAMG. The proposed approach, termed structural factor equation model (SFEM), uses the SEM to capture the network topology of the DAG while accounting for the undirected edges in the graph with a factor analysis model. The latent factors in the SFEM enable the identification and removal of undirected edges, leading to a simpler and more interpretable causal network. The proposed method is evaluated and compared to existing methods through extensive simulation studies, and illustrated through the construction of gene regulatory networks related to breast cancer.


Subjects
Models, Theoretical , Research Design , Causality , Factor Analysis, Statistical