1.
PLoS Genet ; 19(7): e1010807, 2023 07.
Article in English | MEDLINE | ID: mdl-37418489

ABSTRACT

Germline mutation is the mechanism by which genetic variation in a population is created. Inferences derived from mutation rate models are fundamental to many population genetics methods. Previous models have demonstrated that nucleotides flanking polymorphic sites (the local sequence context) explain variation in the probability that a site is polymorphic. However, these models run into limitations as the local sequence context window expands. These include a lack of robustness to data sparsity at typical sample sizes, a lack of regularization to generate parsimonious models, and a lack of quantified uncertainty in estimated rates to facilitate comparison between models. To address these limitations, we developed Baymer, a regularized Bayesian hierarchical tree model that captures the heterogeneous effect of sequence contexts on polymorphism probabilities. Baymer implements an adaptive Metropolis-within-Gibbs Markov Chain Monte Carlo sampling scheme to estimate the posterior distributions of sequence-context-based probabilities that a site is polymorphic. We show that Baymer accurately infers polymorphism probabilities and well-calibrated posterior distributions, robustly handles data sparsity, appropriately regularizes to return parsimonious models, and scales computationally at least up to 9-mer context windows. We demonstrate application of Baymer in three ways: first, identifying differences in polymorphism probabilities between continental populations in the 1000 Genomes Phase 3 dataset; second, in a sparse data setting, examining the use of polymorphism models as a proxy for de novo mutation probabilities as a function of variant age, sequence context window size, and demographic history; and third, comparing model concordance between different great ape species. We find a shared context-dependent mutation rate architecture underlying our models, enabling a transfer-learning-inspired strategy for modeling germline mutations. In summary, Baymer is an accurate polymorphism probability estimation algorithm that automatically adapts to data sparsity at different sequence context levels, thereby making efficient use of the available data.
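As a point of reference for the quantity being modeled, the sketch below computes the naive empirical estimate of a context-dependent polymorphism probability: the fraction of sites with a given k-mer context that are polymorphic. This is not Baymer's algorithm (it has no hierarchy, regularization, or posterior uncertainty); the function name `context_polymorphism_rates` and the toy inputs are illustrative assumptions, and the sketch only shows why counts become sparse as k grows.

```python
from collections import Counter


def context_polymorphism_rates(sequence, polymorphic_positions, k=3):
    """Naive per-context polymorphism proportion (hypothetical helper).

    For every position with a full k-mer window centered on it, count how
    often that k-mer occurs and how often the central site is polymorphic.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so the site sits at the window center")
    flank = k // 2
    totals = Counter()  # occurrences of each k-mer context
    hits = Counter()    # occurrences where the central site is polymorphic
    for i in range(flank, len(sequence) - flank):
        context = sequence[i - flank:i + flank + 1]
        totals[context] += 1
        if i in polymorphic_positions:
            hits[context] += 1
    # Raw proportions; rare contexts yield noisy estimates, which is the
    # sparsity problem a regularized hierarchical model is meant to address.
    return {ctx: hits[ctx] / n for ctx, n in totals.items()}


# Toy example: a 9-base sequence with polymorphic sites at indices 1 and 4.
rates = context_polymorphism_rates("ACGACGACG", {1, 4}, k=3)
```

With 4^k possible contexts, a 9-mer window has 262,144 contexts, so many of them are observed only a handful of times at typical sample sizes.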


Subjects
Human Genome, Mutation Rate, Humans, Human Genome/genetics, Bayes Theorem, Mutation, Genetic Polymorphism, Markov Chains, Monte Carlo Method
2.
Ann Hum Biol ; 50(1): 258-266, 2023 Feb.
Article in English | MEDLINE | ID: mdl-37343163

ABSTRACT

CONTEXT: Like other complex phenotypes, human height reflects a combination of environmental and genetic factors, but is notable for being exceptionally easy to measure. Height has therefore been commonly used to make observations later generalised to other phenotypes, though the appropriateness of such generalisations is not always considered. OBJECTIVES: We aimed to assess height's suitability as a model for other complex phenotypes and review recent advances in height genetics with regard to their implications for complex phenotypes more broadly. METHODS: We conducted a comprehensive literature search in PubMed and Google Scholar for articles relevant to the genetics of height and its comparability to other phenotypes. RESULTS: Height is broadly similar to other phenotypes apart from its high heritability and ease of measurement. Recent genome-wide association studies (GWAS) have identified over 12,000 independent signals associated with height and saturated the common single-nucleotide-polymorphism-based heritability of height within a subset of the genome in individuals similar to European reference populations. CONCLUSIONS: Given the similarity of height to other complex traits, the saturation of GWAS's ability to discover additional height-associated variants signals potential limitations to the omnigenic model of complex-phenotype inheritance, indicates the likely future power of polygenic scores and risk scores, and highlights the increasing need for large-scale variant-to-gene mapping efforts.


Subjects
Genome-Wide Association Study, Multifactorial Inheritance, Humans, Phenotype, Human Genome, Single Nucleotide Polymorphism
3.
bioRxiv ; 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38562830

ABSTRACT

Over 1,100 independent signals have been identified with genome-wide association studies (GWAS) for bone mineral density (BMD), a key risk factor for mortality-increasing fragility fractures; however, the effector gene(s) for most remain unknown. Informed by a variant-to-gene mapping strategy implicating 89 non-coding elements predicted to regulate osteoblast gene expression at BMD GWAS loci, we executed a single-cell CRISPRi screen in human fetal osteoblast 1.19 cells (hFOBs). The BMD relevance of hFOBs was supported by heritability enrichment from cross-cell type stratified LD-score regression involving 98 cell types grouped into 15 tissues. Twenty-four genes showed perturbation in the screen, with four (ARID5B, CC2D1B, EIF4G2, and NCOA3) exhibiting consistent effects upon siRNA knockdown on three measures of osteoblast maturation and mineralization. Lastly, additional heritability enrichments, genetic correlations, and multi-trait fine-mapping revealed that many BMD GWAS signals are pleiotropic and likely mediate their effects via non-bone tissues that warrant attention in future screens.

4.
bioRxiv ; 2024 May 22.
Article in English | MEDLINE | ID: mdl-38826407

ABSTRACT

The expansion of biobanks has significantly propelled genomic discoveries, yet the sheer scale of data within these repositories poses formidable computational hurdles, particularly in handling the extensive matrix operations required by prevailing statistical frameworks. In this work, we introduce computational optimizations to the SAIGE (Scalable and Accurate Implementation of Generalized Mixed Model) algorithm, notably employing a GPU-based distributed computing approach to tackle these challenges. We applied these optimizations to conduct a large-scale genome-wide association study (GWAS) across 2,068 phenotypes derived from electronic health records of 635,969 diverse participants from the Veterans Affairs (VA) Million Veteran Program (MVP). Our strategies enabled scaling up the analysis to over 6,000 nodes on the Department of Energy (DOE) Oak Ridge Leadership Computing Facility (OLCF) Summit High-Performance Computer (HPC), resulting in a 20-fold acceleration compared to the baseline model. We also provide a Docker container with our optimizations that was successfully used on multiple cloud infrastructures with the UK Biobank and All of Us datasets, where we showed significant time and cost benefits over the baseline SAIGE model.

5.
Science ; 385(6706): eadj1182, 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39024449

ABSTRACT

One of the justifiable criticisms of human genetic studies is the underrepresentation of participants from diverse populations. Lack of inclusion must be addressed at scale to identify causal disease factors and understand the genetic causes of health disparities. We present genome-wide associations for 2068 traits from 635,969 participants in the Department of Veterans Affairs Million Veteran Program, a longitudinal study of diverse United States Veterans. Systematic analysis revealed 13,672 genomic risk loci; 1608 were only significant after including non-European populations. Fine-mapping identified causal variants at 6318 signals across 613 traits. One-third (n = 2069) were identified in participants from non-European populations. This reveals a broadly similar genetic architecture across populations, highlights genetic insights gained from underrepresented groups, and presents an extensive atlas of genetic associations.


Subjects
Genetic Predisposition to Disease, Genome-Wide Association Study, Quantitative Trait Loci, Veterans, Humans, Male, Genetic Variation, Longitudinal Studies, Single Nucleotide Polymorphism, United States, United States Department of Veterans Affairs, Female
6.
medRxiv ; 2023 Oct 02.
Article in English | MEDLINE | ID: mdl-37503172

ABSTRACT

Heart failure (HF) is a complex trait, influenced by environmental and genetic factors, that affects over 30 million individuals worldwide. Historically, the genetics of HF have been studied in Mendelian forms of disease, where rare genetic variants have been linked to familial cardiomyopathies. More recently, genome-wide association studies (GWAS) have successfully identified common genetic variants associated with risk of HF. However, the relative importance of genetic variants across the allele-frequency spectrum remains incompletely characterized. Here, we report the results of common- and rare-variant association studies of all-cause heart failure, applying recently developed methods to quantify the heritability of HF attributable to different classes of genetic variation. We combine GWAS data across multiple populations including 207,346 individuals with HF and 2,151,210 without, identifying 176 risk loci at genome-wide significance (p < 5×10⁻⁸). Signals at newly identified common-variant loci include coding variants in Mendelian cardiomyopathy genes (MYBPC3, BAG3), as well as regulators of lipoprotein (LPL) and glucose metabolism (GIPR, GLP1R), and are enriched in cardiac, muscle, nerve, and vascular tissues, as well as myocyte and adipocyte cell types. Gene burden studies across three biobanks (PMBB, UKB, AOU) including 27,208 individuals with HF and 349,126 without uncover exome-wide significant (p < 3.15×10⁻⁶) associations for HF and rare predicted loss-of-function (pLoF) variants in TTN, MYBPC3, FLNC, and BAG3. Total burden heritability of rare coding variants (2.2%, 95% CI 0.99-3.5%) is highly concentrated in a small set of Mendelian cardiomyopathy genes, and is lower than heritability attributable to common variants (4.3%, 95% CI 3.9-4.7%) which is more diffusely spread throughout the genome. Finally, we demonstrate that common-variant background, in the form of a polygenic risk score (PRS), significantly modifies the risk of HF among carriers of pathogenic truncating variants in the Mendelian cardiomyopathy gene TTN. These findings suggest a significant polygenic component to HF exists that is not captured by current clinical genetic testing.
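To put the two significance thresholds in this abstract on a common footing, the sketch below computes how many tests each would correspond to under a Bonferroni correction. The family-wise error rate of 0.05 and the Bonferroni interpretation are assumptions made for illustration; the abstract does not state how the thresholds were derived, and `implied_tests` is a hypothetical helper name.

```python
def implied_tests(threshold, alpha=0.05):
    """Number of tests m for which alpha / m equals the given threshold.

    Assumes a Bonferroni correction at family-wise error rate alpha,
    which is an illustrative assumption, not a statement from the abstract.
    """
    return alpha / threshold


# Thresholds as quoted in the abstract.
genome_wide = implied_tests(5e-8)    # common-variant GWAS threshold
exome_wide = implied_tests(3.15e-6)  # gene-burden threshold

print(f"genome-wide: ~{genome_wide:,.0f} tests")
print(f"exome-wide:  ~{exome_wide:,.0f} tests")
```

Under that assumption, the genome-wide threshold corresponds to roughly one million independent tests, while the gene-burden threshold corresponds to a number of tests in the tens of thousands.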

7.
medRxiv ; 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37425708

ABSTRACT

Genome-wide association studies (GWAS) have underrepresented individuals from non-European populations, impeding progress in characterizing the genetic architecture and consequences of health and disease traits. To address this, we present a population-stratified phenome-wide GWAS followed by a multi-population meta-analysis for 2,068 traits derived from electronic health records of 635,969 participants in the Million Veteran Program (MVP), a longitudinal cohort study of diverse U.S. Veterans genetically similar to the respective African (121,177), Admixed American (59,048), East Asian (6,702), and European (449,042) superpopulations defined by the 1000 Genomes Project. We identified 38,270 independent variants associating with one or more traits at experiment-wide significance (P < 4.6×10⁻¹¹) and fine-mapped 6,318 signals identified from 613 traits to single-variant resolution. Among these, a third (2,069) of the associations were found only among participants genetically similar to non-European reference populations, demonstrating the importance of expanding diversity in genetic studies. Our work provides a comprehensive atlas of phenome-wide genetic associations for future studies dissecting the architecture of complex traits in diverse populations.
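The group sizes quoted in this abstract can be checked against the stated cohort total with simple arithmetic. The sketch below uses only figures from the abstract; the variable names are illustrative, and the "share" values are plain proportions of participants and fine-mapped signals, not results reported by the authors.

```python
# Participant counts per 1000 Genomes superpopulation, as quoted in the abstract.
group_sizes = {
    "African": 121_177,
    "Admixed American": 59_048,
    "East Asian": 6_702,
    "European": 449_042,
}

total = sum(group_sizes.values())

# Proportion of participants genetically similar to non-European references.
non_european = total - group_sizes["European"]
non_european_share = non_european / total

# Proportion of fine-mapped signals found only in non-European groups
# (2,069 of 6,318, described in the abstract as "a third").
non_european_signal_share = 2_069 / 6_318

print(f"total participants: {total:,}")
print(f"non-European participant share: {non_european_share:.1%}")
print(f"non-European-only signal share: {non_european_signal_share:.1%}")
```

The four groups sum to the 635,969 participants stated in the abstract.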

8.
Front Genet ; 12: 752390, 2021.
Article in English | MEDLINE | ID: mdl-34804120

ABSTRACT

Alzheimer's Disease (AD) is a progressive neurologic disease and the most common form of dementia. While the causes of AD are not completely understood, genetics plays a key role in the etiology of AD, and thus finding genetic factors holds the potential to uncover novel AD mechanisms. For this study, we focus on copy number variation (CNV) detection and burden analysis. Leveraging whole-genome sequence (WGS) data released by the Alzheimer's Disease Sequencing Project (ADSP), we developed a scalable bioinformatics pipeline to identify CNVs. This pipeline was applied to 1,737 AD cases and 2,063 cognitively normal controls. As a result, we observed 237,306 and 42,767 deletions and duplications, respectively, with an average of 2,255 deletions and 1,820 duplications per subject. The burden tests show that Non-Hispanic-White cases have, on average, 16 more duplications than controls (p-value 2e-6), and Hispanic cases have larger deletions than controls (p-value 6.8e-5).
