Results 1 - 20 of 77
1.
medRxiv ; 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38699369

ABSTRACT

Multi-ancestry statistical fine-mapping of cis-molecular quantitative trait loci (cis-molQTL) aims to improve the precision of distinguishing causal cis-molQTLs from tagging variants. However, existing approaches fail to reflect shared genetic architectures. To solve this limitation, we present the Sum of Shared Single Effects (SuShiE) model, which leverages LD heterogeneity to improve fine-mapping precision, infer cross-ancestry effect size correlations, and estimate ancestry-specific expression prediction weights. We apply SuShiE to mRNA expression measured in PBMCs (n=956) and LCLs (n=814) together with plasma protein levels (n=854) from individuals of diverse ancestries in the TOPMed MESA and GENOA studies. We find SuShiE fine-maps cis-molQTLs for 16% more genes compared with baselines while prioritizing fewer variants with greater functional enrichment. SuShiE infers highly consistent cis-molQTL architectures across ancestries on average; however, we also find evidence of heterogeneity at genes with predicted loss-of-function intolerance, suggesting that environmental interactions may partially explain differences in cis-molQTL effect sizes across ancestries. Lastly, we leverage estimated cis-molQTL effect-sizes to perform individual-level TWAS and PWAS on six white blood cell-related traits in AOU Biobank individuals (n=86k), and identify 44 more genes compared with baselines, further highlighting its benefits in identifying genes relevant for complex disease risk. Overall, SuShiE provides new insights into the cis-genetic architecture of molecular traits.

2.
Cell Genom ; 4(4): 100526, 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38537633

ABSTRACT

Hispanic/Latino children have the highest risk of acute lymphoblastic leukemia (ALL) in the US compared to other racial/ethnic groups, yet the basis of this remains incompletely understood. Through genetic fine-mapping analyses, we identified a new independent childhood ALL risk signal near IKZF1 in self-reported Hispanic/Latino individuals, but not in non-Hispanic White individuals, with an effect size of ∼1.44 (95% confidence interval = 1.33-1.55) and a risk allele frequency of ∼18% in Hispanic/Latino populations and <0.5% in European populations. This risk allele was positively associated with Indigenous American ancestry, showed evidence of selection in human history, and was associated with reduced IKZF1 expression. We identified a putative causal variant in a downstream enhancer that is most active in pro-B cells and interacts with the IKZF1 promoter. This variant disrupts IKZF1 autoregulation at this enhancer and results in reduced enhancer activity in B cell progenitors. Our study reveals a genetic basis for the increased ALL risk in Hispanic/Latino children.


Subjects
Genetic Predisposition to Disease, Precursor Cell Lymphoblastic Leukemia-Lymphoma, Humans, Child, Genetic Predisposition to Disease/genetics, Single Nucleotide Polymorphism, Transcription Factors/genetics, Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics, Hispanic or Latino/genetics, Ikaros Transcription Factor/genetics
4.
medRxiv ; 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38370648

ABSTRACT

Asthma is a complex disease caused by genetic and environmental factors. Epidemiological studies have shown that in children, wheezing during rhinovirus infection (a cause of the common cold) is associated with asthma development during childhood. This has led scientists to hypothesize there could be a causal relationship between rhinovirus infection and asthma or that rhinovirus-induced wheezing identifies individuals at increased risk for asthma development. However, not all children who wheeze when they have a cold develop asthma. Genome-wide association studies (GWAS) have identified hundreds of genetic variants contributing to asthma susceptibility, with the vast majority of likely causal variants being non-coding. Integrative analyses with transcriptomic and epigenomic datasets have indicated that T cells drive asthma risk, which has been supported by mouse studies. However, the datasets ascertained in these integrative analyses lack airway epithelial cells. Furthermore, large-scale transcriptomic T cell studies have not identified the regulatory effects of most non-coding risk variants in asthma GWAS, indicating there could be additional cell types harboring these "missing regulatory effects". Given that airway epithelial cells are the first line of defense against rhinovirus, we hypothesized they could be mediators of genetic susceptibility to asthma. Here we integrate GWAS data with transcriptomic datasets of airway epithelial cells subject to stimuli that could induce activation states relevant to asthma. We demonstrate that epithelial cultures infected with rhinovirus significantly upregulate childhood-onset asthma-associated genes. We show that this upregulation occurs specifically in non-ciliated epithelial cells. This enrichment for genes in asthma risk loci, or 'asthma heritability enrichment', is also significant for epithelial genes upregulated with influenza infection, but not with SARS-CoV-2 infection or cytokine activation. Additionally, cells from patients with asthma showed a stronger heritability enrichment compared to cells from healthy individuals. Overall, our results suggest that rhinovirus infection is an environmental factor that interacts with genetic risk factors through non-ciliated airway epithelial cells to drive childhood-onset asthma.

5.
Science ; 383(6690): eabn3263, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38422184

ABSTRACT

Vocal production learning ("vocal learning") is a convergently evolved trait in vertebrates. To identify brain genomic elements associated with mammalian vocal learning, we integrated genomic, anatomical, and neurophysiological data from the Egyptian fruit bat (Rousettus aegyptiacus) with analyses of the genomes of 215 placental mammals. First, we identified a set of proteins evolving more slowly in vocal learners. Then, we discovered a vocal motor cortical region in the Egyptian fruit bat, an emergent vocal learner, and leveraged that knowledge to identify active cis-regulatory elements in the motor cortex of vocal learners. Machine learning methods applied to motor cortex open chromatin revealed 50 enhancers robustly associated with vocal learning whose activity tended to be lower in vocal learners. Our research implicates convergent losses of motor cortex regulatory elements in mammalian vocal learning evolution.


Subjects
Genetic Enhancer Elements, Eutheria, Molecular Evolution, Gene Expression Regulation, Motor Cortex, Motor Neurons, Proteins, Animal Vocalization, Animals, Chiroptera/genetics, Chiroptera/physiology, Animal Vocalization/physiology, Motor Cortex/cytology, Motor Cortex/physiology, Chromatin/metabolism, Motor Neurons/physiology, Larynx/physiology, Genetic Epigenesis, Genome, Proteins/genetics, Proteins/metabolism, Amino Acid Sequence, Eutheria/genetics, Eutheria/physiology, Machine Learning
6.
Cell Genom ; 4(1): 100469, 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38190103

ABSTRACT

Epigenetics underpins the regulation of genes known to play a key role in the adaptive and innate immune system (AIIS). We developed a method, EpiNN, that leverages epigenetic data to detect AIIS-relevant genomic regions and used it to detect 2,765 putative AIIS loci. Experimental validation of one of these loci, DNMT1, provided evidence for a novel AIIS-specific transcription start site. We built a genome-wide AIIS annotation and used linkage disequilibrium (LD) score regression to test whether it predicts regional heritability using association statistics for 176 traits. We detected significant heritability effects (average |τ∗|=1.65) for 20 out of 26 immune-relevant traits. In a meta-analysis, immune-relevant traits and diseases were 4.45× more enriched for heritability than other traits. The EpiNN annotation was also depleted of trans-ancestry genetic correlation, indicating ancestry-specific effects. These results underscore the effectiveness of leveraging supervised learning algorithms and epigenetic data to detect loci implicated in specific classes of traits and diseases.


Subjects
Genomics, Quantitative Trait Loci, Phenotype, Linkage Disequilibrium/genetics, Genetic Epigenesis/genetics
7.
medRxiv ; 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38106023

ABSTRACT

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs), even though these quantities are widely assumed to be equal. We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.

8.
medRxiv ; 2023 Oct 21.
Article in English | MEDLINE | ID: mdl-37905038

ABSTRACT

Multi-ancestry genome-wide association studies (GWAS) have highlighted the existence of variants with ancestry-specific effect sizes. Understanding where and why these ancestry-specific effects occur is fundamental to understanding the genetic basis of human diseases and complex traits. Here, we characterized genes differentially expressed across ancestries (ancDE genes) at the cell-type level by leveraging single-cell RNA-seq data in peripheral blood mononuclear cells for 21 individuals with East Asian (EAS) ancestry and 23 individuals with European (EUR) ancestry (172K cells); then, we tested if variants surrounding those genes were enriched in disease variants with ancestry-specific effect sizes by leveraging ancestry-matched GWAS of 31 diseases and complex traits (average N = 90K and 267K in EAS and EUR, respectively). We observed that ancDE genes tend to be cell-type-specific, to be enriched in genes interacting with the environment, and in variants with ancestry-specific disease effect sizes, suggesting the impact of shared cell-type-specific gene-by-environment (GxE) interactions between regulatory and disease architectures. Finally, we illustrated how GxE interactions might have led to ancestry-specific MCL1 expression in B cells, and ancestry-specific allele effect sizes in lymphocyte count GWAS for variants surrounding MCL1. Our results imply that large single-cell and GWAS datasets in diverse populations are required to improve our understanding on the effect of genetic variants on human diseases.

9.
Am J Hum Genet ; 110(11): 1863-1874, 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37879338

ABSTRACT

Genome-wide association studies (GWASs) across thousands of traits have revealed the pervasive pleiotropy of trait-associated genetic variants. While methods have been proposed to characterize pleiotropic components across groups of phenotypes, scaling these approaches to ultra-large-scale biobanks has been challenging. Here, we propose FactorGo, a scalable variational factor analysis model to identify and characterize pleiotropic components using biobank GWAS summary data. In extensive simulations, we observe that FactorGo outperforms the state-of-the-art (model-free) approach tSVD in capturing latent pleiotropic factors across phenotypes while maintaining a similar computational cost. We apply FactorGo to estimate 100 latent pleiotropic factors from GWAS summary data of 2,483 phenotypes measured in European-ancestry Pan-UK BioBank individuals (N = 420,531). Next, we find that factors from FactorGo are more enriched with relevant tissue-specific annotations than those identified by tSVD (p = 2.58E-10) and validate our approach by recapitulating brain-specific enrichment for BMI and the height-related connection between reproductive system and muscular-skeletal growth. Finally, our analyses suggest shared etiologies between rheumatoid arthritis and periodontal condition in addition to alkaline phosphatase as a candidate prognostic biomarker for prostate cancer. Overall, FactorGo improves our biological understanding of shared etiologies across thousands of GWASs.


Subjects
Rheumatoid Arthritis, Genome-Wide Association Study, Male, Humans, Genome-Wide Association Study/methods, Multifactorial Inheritance, Phenotype, Brain, Rheumatoid Arthritis/genetics, Single Nucleotide Polymorphism/genetics, Genetic Pleiotropy
10.
medRxiv ; 2023 Mar 29.
Article in English | MEDLINE | ID: mdl-37034739

ABSTRACT

Genome-wide association studies (GWAS) across thousands of traits have revealed the pervasive pleiotropy of trait-associated genetic variants. While methods have been proposed to characterize pleiotropic components across groups of phenotypes, scaling these approaches to ultra large-scale biobanks has been challenging. Here, we propose FactorGo, a scalable variational factor analysis model to identify and characterize pleiotropic components using biobank GWAS summary data. In extensive simulations, we observe that FactorGo outperforms the state-of-the-art (model-free) approach tSVD in capturing latent pleiotropic factors across phenotypes, while maintaining a similar computational cost. We apply FactorGo to estimate 100 latent pleiotropic factors from GWAS summary data of 2,483 phenotypes measured in European-ancestry Pan-UK BioBank individuals (N=420,531). Next, we find that factors from FactorGo are more enriched with relevant tissue-specific annotations than those identified by tSVD (P=2.58E-10), and validate our approach by recapitulating brain-specific enrichment for BMI and the height-related connection between reproductive system and muscular-skeletal growth. Finally, our analyses suggest novel shared etiologies between rheumatoid arthritis and periodontal condition, in addition to alkaline phosphatase as a candidate prognostic biomarker for prostate cancer. Overall, FactorGo improves our biological understanding of shared etiologies across thousands of GWAS.

11.
Science ; 380(6643): eabn7930, 2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37104580

ABSTRACT

Understanding the regulatory landscape of the human genome is a long-standing objective of modern biology. Using the reference-free alignment across 241 mammalian genomes produced by the Zoonomia Consortium, we charted evolutionary trajectories for 0.92 million human candidate cis-regulatory elements (cCREs) and 15.6 million human transcription factor binding sites (TFBSs). We identified 439,461 cCREs and 2,024,062 TFBSs under evolutionary constraint. Genes near constrained elements perform fundamental cellular processes, whereas genes near primate-specific elements are involved in environmental interaction, including odor perception and immune response. About 20% of TFBSs are transposable element-derived and exhibit intricate patterns of gains and losses during primate evolution whereas sequence variants associated with complex traits are enriched in constrained TFBSs. Our annotations illuminate the regulatory functions of the human genome.


Subjects
Molecular Evolution, Human Genome, Mammals, Transcriptional Regulatory Elements, Transcription Factors, Animals, Humans, Binding Sites, DNA Transposable Elements, Mammals/classification, Mammals/genetics, Primates/classification, Primates/genetics, Transcription Factors/genetics, Transcription Factors/metabolism, Phylogeny
12.
Science ; 380(6643): eabn2937, 2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37104612

ABSTRACT

Thousands of genomic regions have been associated with heritable human diseases, but attempts to elucidate biological mechanisms are impeded by an inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function, agnostic to cell type or disease mechanism. Single-base phyloP scores from 240 mammals identified 3.3% of the human genome as significantly constrained and likely functional. We compared phyloP scores to genome annotation, association studies, copy-number variation, clinical genetics findings, and cancer data. Constrained positions are enriched for variants that explain common disease heritability more than other functional annotations. Our results improve variant annotation but also highlight that the regulatory landscape of the human genome still needs to be further explored and linked to disease.


Subjects
Disease, Genetic Variation, Animals, Humans, Biological Evolution, Human Genome, Genome-Wide Association Study, Genomics, Molecular Sequence Annotation, Single Nucleotide Polymorphism, Disease/genetics
13.
Elife ; 12, 2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36939312

ABSTRACT

The genetic variants introduced into the ancestors of modern humans from interbreeding with Neanderthals have been suggested to contribute an unexpected extent to complex human traits. However, testing this hypothesis has been challenging due to the idiosyncratic population genetic properties of introgressed variants. We developed rigorous methods to assess the contribution of introgressed Neanderthal variants to heritable trait variation and applied these methods to analyze 235,592 introgressed Neanderthal variants and 96 distinct phenotypes measured in about 300,000 unrelated white British individuals in the UK Biobank. Introgressed Neanderthal variants make a significant contribution to trait variation (explaining 0.12% of trait variation on average). However, the contribution of introgressed variants tends to be significantly depleted relative to modern human variants matched for allele frequency and linkage disequilibrium (about 59% depletion on average), consistent with purifying selection on introgressed variants. Different from previous studies (McArthur et al., 2021), we find no evidence for elevated heritability across the phenotypes examined. We identified 348 independent significant associations of introgressed Neanderthal variants with 64 phenotypes. Previous work (Skov et al., 2020) has suggested that a majority of such associations are likely driven by statistical association with nearby modern human variants that are the true causal variants. Applying a customized fine-mapping led us to identify 112 regions across 47 phenotypes containing 4303 unique genetic variants where introgressed variants are highly likely to have a phenotypic effect. Examination of these variants reveals their substantial impact on genes that are important for the immune system, development, and metabolism.


Subjects
Hominidae, Neanderthals, Animals, Humans, Neanderthals/genetics, Multifactorial Inheritance, Hominidae/genetics, Gene Frequency, Population Genetics, Human Genome
14.
bioRxiv ; 2023 Mar 10.
Article in English | MEDLINE | ID: mdl-36945512

ABSTRACT

Although thousands of genomic regions have been associated with heritable human diseases, attempts to elucidate biological mechanisms are impeded by a general inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function that is agnostic to cell type or disease mechanism. Here, single base phyloP scores from the whole genome alignment of 240 placental mammals identified 3.5% of the human genome as significantly constrained, and likely functional. We compared these scores to large-scale genome annotation, genome-wide association studies (GWAS), copy number variation, clinical genetics findings, and cancer data sets. Evolutionarily constrained positions are enriched for variants explaining common disease heritability (more than any other functional annotation). Our results improve variant annotation but also highlight that the regulatory landscape of the human genome still needs to be further explored and linked to disease.

15.
Res Sq ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38168385

ABSTRACT

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs), even though these quantities are widely assumed to be equal. We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.

16.
Nat Genet ; 54(10): 1479-1492, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36175791

ABSTRACT

Genome-wide association studies provide a powerful means of identifying loci and genes contributing to disease, but in many cases, the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important for identifying pathogenic processes and developing therapeutics. In the present study, we introduce sc-linker, a framework for integrating single-cell RNA-sequencing, epigenomic SNP-to-gene maps and genome-wide association study summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. The inferred disease enrichments recapitulated known biology and highlighted notable cell-disease relationships, including γ-aminobutyric acid-ergic neurons in major depressive disorder, a disease-dependent M-cell program in ulcerative colitis and a disease-specific complement cascade process in multiple sclerosis. In autoimmune disease, both healthy and disease-dependent immune cell-type programs were associated, whereas only disease-dependent epithelial cell programs were prominent, suggesting a role in disease response rather than initiation. Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.


Subjects
Major Depressive Disorder, Genome-Wide Association Study, Major Depressive Disorder/genetics, Genetic Predisposition to Disease, Human Genetics, Humans, Single Nucleotide Polymorphism/genetics, RNA, gamma-Aminobutyric Acid
17.
Cell Genom ; 2(7), 2022 Jul 13.
Article in English | MEDLINE | ID: mdl-35873673

ABSTRACT

We assess contributions to autoimmune disease of genes whose regulation is driven by enhancer regions (enhancer-related) and genes that regulate other genes in trans (candidate master-regulator). We link these genes to SNPs using several SNP-to-gene (S2G) strategies and apply heritability analyses to draw three conclusions about 11 autoimmune/blood-related diseases/traits. First, several characterizations of enhancer-related genes using functional genomics data are informative for autoimmune disease heritability after conditioning on a broad set of regulatory annotations. Second, candidate master-regulator genes defined using trans-eQTL in blood are also conditionally informative for autoimmune disease heritability. Third, integrating enhancer-related and master-regulator gene sets with protein-protein interaction (PPI) network information magnified their disease signal. The resulting PPI-enhancer gene score produced >2-fold stronger heritability signal and >2-fold stronger enrichment for drug targets, compared with the recently proposed enhancer domain score. In each case, functionally informed S2G strategies produced 4.1- to 13-fold stronger disease signals than conventional window-based strategies.

18.
Nat Genet ; 54(6): 827-836, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35668300

ABSTRACT

Disease-associated single-nucleotide polymorphisms (SNPs) generally do not implicate target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis. Here, we developed a heritability-based framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk. Our optimal combined S2G strategy (cS2G) included seven constituent S2G strategies and achieved a precision of 0.75 and a recall of 0.33, more than doubling the recall of any individual strategy. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 5,095 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. We further applied cS2G to provide an empirical assessment of disease omnigenicity; we determined that the top 1% of genes explained roughly half of the SNP heritability linked to all genes and that gene-level architectures vary with variant allele frequency.


Subjects
Genome-Wide Association Study, Single Nucleotide Polymorphism, Genome-Wide Association Study/methods, Phenotype, Single Nucleotide Polymorphism/genetics
19.
Nat Genet ; 54(4): 450-458, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35393596

ABSTRACT

Polygenic risk scores suffer reduced accuracy in non-European populations, exacerbating health disparities. We propose PolyPred, a method that improves cross-population polygenic risk scores by combining two predictors: a new predictor that leverages functionally informed fine-mapping to estimate causal effects (instead of tagging effects), addressing linkage disequilibrium differences, and BOLT-LMM, a published predictor. When a large training sample is available in the non-European target population, we propose PolyPred+, which further incorporates the non-European training data. We applied PolyPred to 49 diseases/traits in four UK Biobank populations using UK Biobank British training data, and observed relative improvements versus BOLT-LMM ranging from +7% in south Asians to +32% in Africans, consistent with simulations. We applied PolyPred+ to 23 diseases/traits in UK Biobank east Asians using both UK Biobank British and Biobank Japan training data, and observed improvements of +24% versus BOLT-LMM and +12% versus PolyPred. Summary statistics-based analogs of PolyPred and PolyPred+ attained similar improvements.


Subjects
Genome-Wide Association Study, Multifactorial Inheritance, Humans, Linkage Disequilibrium, Multifactorial Inheritance/genetics, Single Nucleotide Polymorphism/genetics, Risk Factors
20.
Am J Hum Genet ; 109(4): 692-709, 2022 Apr 07.
Article in English | MEDLINE | ID: mdl-35271803

ABSTRACT

Recent works have shown that SNP heritability, which is dominated by low-effect common variants, may not be the most relevant quantity for localizing high-effect/critical disease genes. Here, we introduce methods to estimate the proportion of phenotypic variance explained by a given assignment of SNPs to a single gene ("gene-level heritability"). We partition gene-level heritability by minor allele frequency (MAF) to find genes whose gene-level heritability is explained exclusively by "low-frequency/rare" variants (0.5% ≤ MAF < 1%). Applying our method to ∼16K protein-coding genes and 25 quantitative traits in the UK Biobank (N = 290K "White British"), we find that, on average across traits, ∼2.5% of nonzero-heritability genes have a rare-variant component and only ∼0.8% (327 gene-trait pairs) have heritability exclusively from rare variants. Of these 327 gene-trait pairs, 114 (35%) were not detected by existing gene-level association testing methods. The additional genes we identify are significantly enriched for known disease genes, and we find several examples of genes that have been previously implicated in phenotypically related Mendelian disorders. Notably, the rare-variant component of gene-level heritability exhibits trends different from those of common-variant gene-level heritability. For example, while total gene-level heritability increases with gene length, the rare-variant component is significantly larger among shorter genes; the cumulative distributions of gene-level heritability also vary across traits and reveal differences in the relative contributions of rare/common variants to overall gene-level polygenicity. While nonzero gene-level heritability does not imply causality, if interpreted in the correct context, gene-level heritability can reveal useful insights into complex-trait genetic architecture.


Subjects
Genome-Wide Association Study, Multifactorial Inheritance, Gene Frequency/genetics, Genome-Wide Association Study/methods, Humans, Multifactorial Inheritance/genetics, Phenotype, Single Nucleotide Polymorphism/genetics