Results 1 - 20 of 33
1.
Am J Hum Genet ; 111(5): 990-995, 2024 05 02.
Article in English | MEDLINE | ID: mdl-38636510

ABSTRACT

Since genotype imputation was introduced, researchers have been relying on the estimated imputation quality from imputation software to perform post-imputation quality control (QC). However, this quality estimate (denoted as Rsq) performs less well for lower-frequency variants. We recently published MagicalRsq, a machine-learning-based imputation quality calibration, which leverages additional typed markers from the same cohort and outperforms Rsq as a QC metric. In this work, we extended the original MagicalRsq to allow cross-cohort model training and named the new model MagicalRsq-X. We removed the cohort-specific estimated minor allele frequency and included linkage disequilibrium scores and recombination rates as additional features. Leveraging whole-genome sequencing data from TOPMed, specifically participants in the BioMe, JHS, WHI, and MESA studies, we performed comprehensive cross-cohort evaluations for predominantly European and African ancestral individuals based on their inferred global ancestry with the 1000 Genomes and Human Genome Diversity Project data as reference. Our results suggest MagicalRsq-X outperforms Rsq in almost every setting, with 7.3%-14.4% improvement in squared Pearson correlation with true R2, corresponding to 85-218 K variant gains. We further developed a metric to quantify the genetic distances of a target cohort relative to a reference cohort and showed that such metric largely explained the performance of MagicalRsq-X models. Finally, we found MagicalRsq-X saved up to 53 known genome-wide significant variants in one of the largest blood cell trait GWASs that would be missed using the original Rsq for QC. In conclusion, MagicalRsq-X shows superiority for post-imputation QC and benefits genetic studies by distinguishing well and poorly imputed lower-frequency variants.


Subjects
Gene Frequency, Genotype, Single Nucleotide Polymorphism, Software, Humans, Cohort Studies, Linkage Disequilibrium, Genome-Wide Association Study/methods, Human Genome, Quality Control, Machine Learning, Whole Genome Sequencing/standards, Whole Genome Sequencing/methods
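As a rough illustration of the evaluation metric named in the abstract above, the following Python sketch computes the squared Pearson correlation between an estimated imputation quality score and the true R2. The function name and the toy values are hypothetical and are not taken from MagicalRsq-X or its software; this is only a minimal sketch of the statistic itself.

```python
def squared_pearson(xs, ys):
    """Return the squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of the centered values.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical toy values: an estimated quality score and the true R2
# for four variants (illustrative only, not data from the study).
estimated_quality = [0.8, 0.6, 0.3, 0.6]
true_r2 = [0.9, 0.5, 0.2, 0.7]

print(squared_pearson(estimated_quality, true_r2))
```

Computing this value once for the raw Rsq and once for a calibrated score on the same variants gives the kind of comparison the abstract describes.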
2.
Hum Mol Genet ; 2024 May 15.
Article in English | MEDLINE | ID: mdl-38747556

ABSTRACT

Inflammation biomarkers can provide valuable insight into the role of inflammatory processes in many diseases and conditions. Sequencing-based analyses of such biomarkers can also serve as an exemplar of the genetic architecture of quantitative traits. To evaluate the biological insight that can be provided by a multi-ancestry, whole-genome-based association study, we performed a comprehensive analysis of 21 inflammation biomarkers from up to 38 465 individuals with whole-genome sequencing from the Trans-Omics for Precision Medicine (TOPMed) program (with varying sample size by trait, where the minimum sample size was n = 737 for MMP-1). We identified 22 distinct single-variant associations across 6 traits (E-selectin, intercellular adhesion molecule 1, interleukin-6, lipoprotein-associated phospholipase A2 activity and mass, and P-selectin) that remained significant after conditioning on previously identified associations for these inflammatory biomarkers. We further expanded upon known biomarker associations by pairing the single-variant analysis with a rare variant set-based analysis that identified 19 significant rare variant set-based associations with 5 traits. These signals were distinct from both significant single-variant association signals within TOPMed and genetic signals observed in prior studies, demonstrating the complementary value of performing both single and rare variant analyses when analyzing quantitative traits. We also confirm several previously reported signals from semi-quantitative proteomics platforms. Many of these signals demonstrate the extensive allelic heterogeneity and ancestry-differentiated variant-trait associations common for inflammation biomarkers, a characteristic we hypothesize will be increasingly observed with well-powered, large-scale analyses of complex traits.

3.
Am J Hum Genet ; 110(10): 1704-1717, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37802043

ABSTRACT

Long non-coding RNAs (lncRNAs) are known to perform important regulatory functions in lipid metabolism. Large-scale whole-genome sequencing (WGS) studies and new statistical methods for variant set tests now provide an opportunity to assess more associations between rare variants in lncRNA genes and complex traits across the genome. In this study, we used high-coverage WGS from 66,329 participants of diverse ancestries with measurement of blood lipids and lipoproteins (LDL-C, HDL-C, TC, and TG) in the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program to investigate the role of lncRNAs in lipid variability. We aggregated rare variants for 165,375 lncRNA genes based on their genomic locations and conducted rare-variant aggregate association tests using the STAAR (variant-set test for association using annotation information) framework. We performed STAAR conditional analysis adjusting for common variants in known lipid GWAS loci and rare-coding variants in nearby protein-coding genes. Our analyses revealed 83 rare lncRNA variant sets significantly associated with blood lipid levels, all of which were located in known lipid GWAS loci (in a ±500-kb window of a Global Lipids Genetics Consortium index variant). Notably, 61 out of 83 signals (73%) were conditionally independent of common regulatory variation and rare protein-coding variation at the same loci. We replicated 34 out of 61 (56%) conditionally independent associations using the independent UK Biobank WGS data. Our results expand the genetic architecture of blood lipids to rare variants in lncRNAs.


Subjects
Long Noncoding RNA, Humans, Long Noncoding RNA/genetics, Genome-Wide Association Study, Precision Medicine, Whole Genome Sequencing/methods, Lipids/genetics, Single Nucleotide Polymorphism/genetics
4.
Am J Hum Genet ; 109(3): 446-456, 2022 03 03.
Article in English | MEDLINE | ID: mdl-35216679

ABSTRACT

Attempts to identify and prioritize functional DNA elements in coding and non-coding regions, particularly through use of in silico functional annotation data, continue to increase in popularity. However, specific functional roles can vary widely from one variant to another, making it challenging to summarize different aspects of variant function with a one-dimensional rating. Here we propose multi-dimensional annotation-class integrative estimation (MACIE), an unsupervised multivariate mixed-model framework capable of integrating annotations of diverse origin to assess multi-dimensional functional roles for both coding and non-coding variants. Unlike existing one-dimensional scoring methods, MACIE views variant functionality as a composite attribute encompassing multiple characteristics and estimates the joint posterior functional probabilities of each genomic position. This estimate offers more comprehensive and interpretable information in the presence of multiple aspects of functionality. Applied to a variety of independent coding and non-coding datasets, MACIE demonstrates powerful and robust performance in discriminating between functional and non-functional variants. We also show an application of MACIE to fine-mapping and heritability enrichment analysis by using the lipids GWAS summary statistics data from the European Network for Genetic and Genomic Epidemiology Consortium.


Subjects
Human Genome, Genome-Wide Association Study, Human Genome/genetics, Genome-Wide Association Study/methods, Genomics, Humans, Molecular Sequence Annotation, Single Nucleotide Polymorphism/genetics, Probability
5.
Nat Methods ; 19(12): 1599-1611, 2022 12.
Article in English | MEDLINE | ID: mdl-36303018

ABSTRACT

Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.


Subjects
Genome-Wide Association Study, Genome, Humans, Genome-Wide Association Study/methods, Whole Genome Sequencing/methods, Phenotype, Genetic Variation
6.
Nucleic Acids Res ; 51(D1): D1300-D1311, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350676

ABSTRACT

Large biobank-scale whole genome sequencing (WGS) studies are rapidly identifying a multitude of coding and non-coding variants. They provide an unprecedented resource for illuminating the genetic basis of human diseases. Variant functional annotations play a critical role in WGS analysis, result interpretation, and prioritization of disease- or trait-associated causal variants. Existing functional annotation databases have limited scope to perform online queries and functionally annotate the genotype data of large biobank-scale WGS studies. We develop the Functional Annotation of Variants Online Resources (FAVOR) to meet these pressing needs. FAVOR provides a comprehensive multi-faceted variant functional annotation online portal that summarizes and visualizes findings of all possible nine billion single nucleotide variants (SNVs) across the genome. It allows for rapid variant-, gene- and region-level queries of variant functional annotations. FAVOR integrates variant functional information from multiple sources to describe the functional characteristics of variants and facilitates prioritizing plausible causal variants influencing human phenotypes. Furthermore, we provide a scalable annotation tool, FAVORannotator, to functionally annotate large-scale WGS studies and efficiently store the genotype and their variant functional annotation data in a single file using the annotated Genomic Data Structure (aGDS) format, making downstream analysis more convenient. FAVOR and FAVORannotator are available at https://favor.genohub.org.


Subjects
Human Genome, Software, Humans, Molecular Sequence Annotation, Genomics, Genotype, Genetic Variation
7.
Bioinformatics ; 38(11): 3116-3117, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35441669

ABSTRACT

SUMMARY: We developed the variant-Set Test for Association using Annotation infoRmation (STAAR) workflow description language (WDL) workflow to facilitate the analysis of rare variants in whole genome sequencing association studies. The open-access STAAR workflow written in the WDL allows a user to perform rare variant testing for both gene-centric and genetic region approaches, enabling genome-wide, candidate and conditional analyses. It incorporates functional annotations into the workflow as introduced in the STAAR method in order to boost the rare variant analysis power. This tool was specifically developed and optimized to be implemented on cloud-based platforms such as BioData Catalyst Powered by Terra. It provides easy-to-use functionality for rare variant analysis that can be incorporated into an exhaustive whole genome sequencing analysis pipeline. AVAILABILITY AND IMPLEMENTATION: The workflow is freely available from https://dockstore.org/workflows/github.com/sheilagaynor/STAAR_workflow. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Cloud Computing, Software, Workflow, Genome, Genome-Wide Association Study
8.
Genet Epidemiol ; 45(1): 99-114, 2021 02.
Article in English | MEDLINE | ID: mdl-32924180

ABSTRACT

Clinical trial results have recently demonstrated that inhibiting inflammation by targeting the interleukin-1β pathway can offer a significant reduction in lung cancer incidence and mortality, highlighting a pressing and unmet need to understand the benefits of inflammation-focused lung cancer therapies at the genetic level. While numerous genome-wide association studies (GWAS) have explored the genetic etiology of lung cancer, there remains a large gap between the type of information that may be gleaned from an association study and the depth of understanding necessary to explain and drive translational findings. Thus, in this study we jointly model and integrate extensive multiomics data sources, utilizing a total of 40 genome-wide functional annotations that augment previously published results from the International Lung Cancer Consortium (ILCCO) GWAS, to prioritize and characterize single nucleotide polymorphisms (SNPs) that increase risk of squamous cell lung cancer through the inflammatory and immune responses. Our work bridges the gap between correlative analysis and translational follow-up research, refining GWAS association measures in an interpretable and systematic manner. In particular, reanalysis of the ILCCO data highlights the impact of highly associated SNPs from nuclear factor-κB signaling pathway genes as well as major histocompatibility complex mediated variation in immune responses. One consequence of prioritizing likely functional SNPs is the pruning of variants that might be selected for follow-up work by over an order of magnitude, from potentially tens of thousands to hundreds. The strategies we introduce provide informative and interpretable approaches for incorporating extensive genome-wide annotation data in analysis of genetic association studies.


Subjects
Genome-Wide Association Study, Lung Neoplasms, Epithelial Cells, Genetic Predisposition to Disease, Humans, Inflammation/genetics, Lung Neoplasms/genetics, Genetic Models, Single Nucleotide Polymorphism
9.
Am J Hum Genet ; 104(5): 802-814, 2019 05 02.
Article in English | MEDLINE | ID: mdl-30982610

ABSTRACT

Whole-genome sequencing (WGS) studies are being widely conducted in order to identify rare variants associated with human diseases and disease-related traits. Classical single-marker association analyses for rare variants have limited power, and variant-set-based analyses are commonly used by researchers for analyzing rare variants. However, existing variant-set-based approaches need to pre-specify genetic regions for analysis; hence, they are not directly applicable to WGS data because of the large number of intergenic and intron regions that consist of a massive number of non-coding variants. The commonly used sliding-window method requires the pre-specification of fixed window sizes, which are often unknown a priori, are difficult to specify in practice, and are subject to limitations given that the sizes of genetic-association regions are likely to vary across the genome and phenotypes. We propose a computationally efficient and dynamic scan-statistic method (Scan the Genome [SCANG]) for analyzing WGS data; this method flexibly detects the sizes and the locations of rare-variant association regions without the need to specify a prior, fixed window size. The proposed method controls for the genome-wise type I error rate and accounts for the linkage disequilibrium among genetic variants. It allows the detected sizes of rare-variant association regions to vary across the genome. Through extensive simulated studies that consider a wide variety of scenarios, we show that SCANG substantially outperforms several alternative methods for detecting rare-variant associations while controlling for the genome-wise type I error rates. We illustrate SCANG by analyzing the WGS lipids data from the Atherosclerosis Risk in Communities (ARIC) study.


Subjects
Algorithms, Computational Biology/methods, Genetic Variation, Human Genome, Genome-Wide Association Study, Whole Genome Sequencing/methods, Humans, Linkage Disequilibrium, Genetic Models
10.
Hum Genet ; 138(3): 271-285, 2019 Mar.
Article in English | MEDLINE | ID: mdl-30805717

ABSTRACT

A growing number of studies clearly demonstrate a substantial link between metabolic dysfunction and the risk of Alzheimer's disease (AD), especially glucose-related dysfunction; one hypothesis for this comorbidity is the presence of a common genetic etiology. We conducted a large-scale cross-trait GWAS to investigate the genetic overlap between AD and ten metabolic traits. Among all the metabolic traits, fasting glucose, fasting insulin and HDL were found to be genetically associated with AD. Local genetic covariance analysis found that the 19q13 region had strong local genetic correlation between AD and T2D (P = 6.78 × 10⁻²²), LDL (P = 1.74 × 10⁻²⁵³) and HDL (P = 7.94 × 10⁻¹⁸). Cross-trait meta-analysis identified 4 loci that were associated with AD and fasting glucose, 3 loci that were associated with AD and fasting insulin, and 20 loci that were associated with AD and HDL (Pmeta < 1.6 × 10⁻⁸, single trait P < 0.05). Functional analysis revealed that the shared genes are enriched in amyloid metabolic process, lipoprotein remodeling and other related biological pathways; also in pancreas, liver, blood and other tissues. Our work identifies common genetic architectures shared between AD and fasting glucose, fasting insulin and HDL, and sheds light on molecular mechanisms underlying the association between metabolic dysregulation and AD.


Subjects
Alzheimer Disease/diagnosis, Alzheimer Disease/genetics, Energy Metabolism/genetics, Genetic Predisposition to Disease, Genome-Wide Association Study, Heritable Quantitative Trait, Alzheimer Disease/metabolism, Blood Glucose, Fasting, Genetic Association Studies, Humans, Insulin/blood, Linkage Disequilibrium, Metabolic Networks and Pathways, Phenotype, Quantitative Trait Loci
11.
Respir Res ; 20(1): 64, 2019 Apr 02.
Article in English | MEDLINE | ID: mdl-30940143

ABSTRACT

BACKGROUND: A growing number of studies clearly demonstrate a substantial association between chronic obstructive pulmonary disease (COPD) and cardiovascular diseases (CVD), although little is known about the shared genetics that contribute to this association. METHODS: We conducted a large-scale cross-trait genome-wide association study to investigate genetic overlap between COPD (Ncase = 12,550, Ncontrol = 46,368) from the International COPD Genetics Consortium and four primary cardiac traits: resting heart rate (RHR) (N = 458,969), high blood pressure (HBP) (Ncase = 144,793, Ncontrol = 313,761), coronary artery disease (CAD)(Ncase = 60,801, Ncontrol = 123,504), and stroke (Ncase = 40,585, Ncontrol = 406,111) from UK Biobank, CARDIoGRAMplusC4D Consortium, and International Stroke Genetics Consortium data. RESULTS: RHR and HBP had modest genetic correlation, and CAD had borderline evidence with COPD at a genome-wide level. We found evidence of local genetic correlation with particular regions of the genome. Cross-trait meta-analysis of COPD identified 21 loci jointly associated with RHR, 22 loci with HBP, and 3 loci with CAD. Functional analysis revealed that shared genes were enriched in smoking-related pathways and in cardiovascular, nervous, and immune system tissues. An examination of smoking-related genetic variants identified SNPs located in 15q25.1 region associated with cigarettes per day, with effects on RHR and CAD. A Mendelian randomization analysis showed a significant positive causal effect of COPD on RHR (causal estimate = 0.1374, P = 0.008). CONCLUSION: In a set of large-scale GWAS, we identify evidence of shared genetics between COPD and cardiac traits.


Subjects
Cardiovascular Diseases/genetics, Genetic Variation/genetics, Genome-Wide Association Study/methods, Single Nucleotide Polymorphism/genetics, Chronic Obstructive Pulmonary Disease/genetics, Cardiovascular Diseases/diagnosis, Cardiovascular Diseases/epidemiology, Genetic Databases/trends, Genetic Predisposition to Disease/epidemiology, Genetic Predisposition to Disease/genetics, Humans, Chronic Obstructive Pulmonary Disease/diagnosis, Chronic Obstructive Pulmonary Disease/epidemiology, Heritable Quantitative Trait
12.
J Environ Manage ; 237: 569-575, 2019 May 01.
Article in English | MEDLINE | ID: mdl-30826638

ABSTRACT

BACKGROUND: China and other developing countries in Asia follow similar economic growth patterns described by the flying geese (FG) model, which explains the "catching-up" process of industrialization in latecomer economies. Japan, newly industrialized economies, and China have followed this path, with similar economic development trajectories. Based on the FG model, we postulated a "flying S" hypothesis stating that if a country is located within an FG region and its energy matrix is relatively constant, its per capita CO2 emission curve will mirror that of "leading geese" countries in the same FG group. METHOD: Historical CO2 emissions data were obtained from literature review and national reports and were calculated using bottom-up methods. A sigmoid-shaped, non-linear mixed effect model was applied to examine ex post data with 1000 simulated predictions to construct 95% empirical bands from these fits. By multiplying by estimated population, we predicted total emissions of selected FG countries. RESULTS: Per capita CO2 emissions from the same FG group mirror each other, especially among second and third industrial sectors. We estimated an annual 18,252.24 million tons of CO2 emissions (MtCO2) (95% CI = 9458.88-23,972.88) in China and 8281.76 MtCO2 (95% CI = 2765.68-14,959.12) in India in 2030. CONCLUSION: This study bridges the macroeconomic FG paradigm to study climate change and proposes a "flying S" hypothesis to predict greenhouse gas emissions in East Asia. By applying our theory to empirical data, we provide an alternative framework to predict CO2 emissions in 2030 and beyond.


Subjects
Carbon Dioxide, Carbon, Asia, China, India, Japan
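The prediction step described in the abstract above (a sigmoid-shaped per capita curve multiplied by an estimated population) can be sketched roughly in Python as follows. This uses a plain logistic curve rather than the non-linear mixed effect model the authors fitted, and every parameter value below is a hypothetical placeholder, not an estimate from the study.

```python
import math


def logistic(year, ceiling, rate, midpoint):
    """Sigmoid-shaped per capita emissions (tons of CO2 per person) for a given year."""
    return ceiling / (1.0 + math.exp(-rate * (year - midpoint)))


def total_emissions_mt(per_capita_tons, population):
    """Total emissions in million tons (MtCO2): per capita tons times population."""
    return per_capita_tons * population / 1e6


# Hypothetical parameters for illustration only.
per_capita_2030 = logistic(2030, ceiling=12.0, rate=0.1, midpoint=2010)
print(total_emissions_mt(per_capita_2030, population=1_000_000_000))
```

At the midpoint year the curve sits at exactly half of its ceiling, which is a convenient check when choosing starting values for a fit.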
13.
Hum Genet ; 137(1): 15-30, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29288389

ABSTRACT

Over a decade of genome-wide association studies has made great strides toward the detection of genes and genetic mechanisms underlying complex traits. However, the majority of associated loci reside in non-coding regions that are, in general, functionally uncharacterized. Now, the availability of large-scale tissue and cell type-specific transcriptome and epigenome data enables us to elucidate how non-coding genetic variants can affect gene expression and are associated with phenotypic changes. Here, we provide an overview of this emerging field in human genomics, summarizing available data resources and state-of-the-art analytic methods to facilitate in silico prioritization of non-coding regulatory mutations. We also highlight the limitations of current approaches and discuss the direction of much-needed future research.


Subjects
Gene Regulatory Networks, Genetic Variation, Human Genome, Genomics, Genetic Association Studies, Genetic Loci, High-Throughput Nucleotide Sequencing, Humans, Mutation
15.
HGG Adv ; 5(3): 100320, 2024 Jul 18.
Article in English | MEDLINE | ID: mdl-38902927

ABSTRACT

The KRAS mutation is the most common oncogenic driver in patients with non-small cell lung cancer (NSCLC). However, a detailed understanding of how self-reported race and/or ethnicity (SIRE), genetically inferred ancestry (GIA), and their interaction affect KRAS mutation is largely unknown. Here, we investigated the associations between SIRE, quantitative GIA, and KRAS mutation and its allele-specific subtypes in a multi-ethnic cohort of 3,918 patients from the Boston Lung Cancer Survival cohort and the Chinese OrigiMed cohort with an independent validation cohort of 1,450 patients with NSCLC. This comprehensive analysis included detailed covariates such as age at diagnosis, sex, clinical stage, cancer histology, and smoking status. We report that SIRE is significantly associated with KRAS mutations, modified by sex, with SIRE-Asian patients showing lower rates of KRAS mutation, transversion substitution, and the allele-specific subtype KRASG12C compared to SIRE-White patients after adjusting for potential confounders. Moreover, GIA was found to correlate with KRAS mutations, where patients with a higher proportion of European ancestry had an increased risk of KRAS mutations, especially more transition substitutions and KRASG12D. Notably, among SIRE-White patients, an increase in European ancestry was linked to a higher likelihood of KRAS mutations, whereas an increase in admixed American ancestry was associated with a reduced likelihood, suggesting that quantitative GIA offers additional information beyond SIRE. The association of SIRE, GIA, and their interplay with KRAS driver mutations in NSCLC highlights the importance of incorporating both into population-based cancer research, aiming to refine clinical decision-making processes and mitigate health disparities.


Subjects
Alleles, Non-Small-Cell Lung Carcinoma, Lung Neoplasms, Mutation, Proto-Oncogene Proteins p21(ras), Humans, Non-Small-Cell Lung Carcinoma/genetics, Non-Small-Cell Lung Carcinoma/ethnology, Non-Small-Cell Lung Carcinoma/pathology, Proto-Oncogene Proteins p21(ras)/genetics, Lung Neoplasms/genetics, Lung Neoplasms/ethnology, Lung Neoplasms/pathology, Male, Female, Middle Aged, Aged, Prevalence, Ethnicity/genetics, Racial Groups/genetics, Genetic Predisposition to Disease
16.
Nat Commun ; 14(1): 3111, 2023 05 30.
Article in English | MEDLINE | ID: mdl-37253714

ABSTRACT

Circulating metabolite levels may reflect the state of the human organism in health and disease; however, the genetic architecture of metabolites is not fully understood. We have performed a whole-genome sequencing association analysis of both common and rare variants in up to 11,840 multi-ethnic participants from five studies with up to 1666 circulating metabolites. We have discovered 1985 novel variant-metabolite associations and validated 761 locus-metabolite associations reported previously. Seventy-nine novel variant-metabolite associations have been replicated, including three genetic loci located on the X chromosome, demonstrating its involvement in metabolic regulation. Gene-based analyses have provided further support for seven metabolite-replicated loci pairs and their biologically plausible genes. Among those novel replicated variant-metabolite pairs, follow-up analyses have revealed that 26 metabolites have colocalized with 21 tissues, seven metabolite-disease outcome associations have been putatively causal, and seven metabolites might be regulated by plasma protein levels. Our results have depicted the genetic contribution to circulating metabolite levels, providing additional insights into understanding human disease.


Subjects
Ethnicity, Quantitative Trait Loci, Humans, Ethnicity/genetics, Metabolome/genetics, Genome-Wide Association Study, Single Nucleotide Polymorphism
17.
Nat Genet ; 55(1): 154-164, 2023 01.
Article in English | MEDLINE | ID: mdl-36564505

ABSTRACT

Meta-analysis of whole genome sequencing/whole exome sequencing (WGS/WES) studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Existing rare variant meta-analysis approaches are not scalable to biobank-scale WGS data. Here we present MetaSTAAR, a powerful and resource-efficient rare variant meta-analysis framework for large-scale WGS/WES studies. MetaSTAAR accounts for relatedness and population structure, can analyze both quantitative and dichotomous traits and boosts the power of rare variant tests by incorporating multiple variant functional annotations. Through meta-analysis of four lipid traits in 30,138 ancestrally diverse samples from 14 studies of the Trans Omics for Precision Medicine (TOPMed) Program, we show that MetaSTAAR performs rare variant meta-analysis at scale and produces results comparable to using pooled data. Additionally, we identified several conditionally significant rare variant associations with lipid traits. We further demonstrate that MetaSTAAR is scalable to biobank-scale cohorts through meta-analysis of TOPMed WGS data and UK Biobank WES data of ~200,000 samples.


Subjects
Genome-Wide Association Study, Lipids, Genome-Wide Association Study/methods, Whole Genome Sequencing/methods, Exome Sequencing, Phenotype, Lipids/genetics
18.
medRxiv ; 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37425772

ABSTRACT

Long non-coding RNAs (lncRNAs) are known to perform important regulatory functions. Large-scale whole genome sequencing (WGS) studies and new statistical methods for variant set tests now provide an opportunity to assess the associations between rare variants in lncRNA genes and complex traits across the genome. In this study, we used high-coverage WGS from 66,329 participants of diverse ancestries with blood lipid levels (LDL-C, HDL-C, TC, and TG) in the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program to investigate the role of lncRNAs in lipid variability. We aggregated rare variants for 165,375 lncRNA genes based on their genomic locations and conducted rare variant aggregate association tests using the STAAR (variant-Set Test for Association using Annotation infoRmation) framework. We performed STAAR conditional analysis adjusting for common variants in known lipid GWAS loci and rare coding variants in nearby protein coding genes. Our analyses revealed 83 rare lncRNA variant sets significantly associated with blood lipid levels, all of which were located in known lipid GWAS loci (in a ±500 kb window of a Global Lipids Genetics Consortium index variant). Notably, 61 out of 83 signals (73%) were conditionally independent of common regulatory variations and rare protein coding variations at the same loci. We replicated 34 out of 61 (56%) conditionally independent associations using the independent UK Biobank WGS data. Our results expand the genetic architecture of blood lipids to rare variants in lncRNA, implicating new therapeutic opportunities.

19.
bioRxiv ; 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37961350

ABSTRACT

Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis, and also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data. We propose MultiSTAAR, a statistical framework and computationally-scalable analytical pipeline for functionally-informed multi-trait rare variant analysis in large-scale WGS studies. MultiSTAAR accounts for relatedness, population structure and correlation among phenotypes by jointly analyzing multiple traits, and further empowers rare variant association analysis by incorporating multiple functional annotations. We applied MultiSTAAR to jointly analyze three lipid traits (low-density lipoprotein cholesterol, high-density lipoprotein cholesterol and triglycerides) in 61,861 multi-ethnic samples from the Trans-Omics for Precision Medicine (TOPMed) Program. We discovered new associations with lipid traits missed by single-trait analysis, including rare variants within an enhancer of NIPSNAP3A and an intergenic region on chromosome 1.

20.
Circ Genom Precis Med ; 16(6): e004176, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38014529

ABSTRACT

BACKGROUND: Individuals with type 2 diabetes (T2D) have an increased risk of coronary artery disease (CAD), but questions remain about the underlying pathology. Identifying which CAD loci are modified by T2D in the development of subclinical atherosclerosis (coronary artery calcification [CAC], carotid intima-media thickness, or carotid plaque) may improve our understanding of the mechanisms leading to the increased CAD in T2D. METHODS: We compared the common and rare variant associations of known CAD loci from the literature on CAC, carotid intima-media thickness, and carotid plaque in up to 29 670 participants, including up to 24 157 normoglycemic controls and 5513 T2D cases leveraging whole-genome sequencing data from the Trans-Omics for Precision Medicine program. We included first-order T2D interaction terms in each model to determine whether CAD loci were modified by T2D. The genetic main and interaction effects were assessed using a joint test to determine whether a CAD variant, or gene-based rare variant set, was associated with the respective subclinical atherosclerosis measures and then further determined whether these loci had a significant interaction test. RESULTS: Using a Bonferroni-corrected significance threshold of P < 1.6 × 10⁻⁴, we identified 3 genes (ATP1B1, ARVCF, and LIPG) associated with CAC and 2 genes (ABCG8 and EIF2B2) associated with carotid intima-media thickness and carotid plaque, respectively, through gene-based rare variant set analysis. Both ATP1B1 and ARVCF also had significantly different associations for CAC in T2D cases versus controls. No significant interaction tests were identified through the candidate single-variant analysis. CONCLUSIONS: These results highlight T2D as an important modifier of rare variant associations in CAD loci with CAC.


Subjects
Atherosclerosis, Coronary Artery Disease, Type 2 Diabetes Mellitus, Atherosclerotic Plaque, Humans, Coronary Artery Disease/genetics, Type 2 Diabetes Mellitus/complications, Type 2 Diabetes Mellitus/genetics, Carotid Intima-Media Thickness, Risk Factors, Atherosclerosis/genetics, Genomics
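The abstract above reports a Bonferroni-corrected significance threshold without stating the number of tests behind it. As a general illustration of how such a threshold is formed, the Python sketch below divides a family-wise alpha by the number of tests. The function names, the alpha of 0.05, and the p-values are assumptions for illustration and do not reproduce the study's analysis.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant(p_values, alpha=0.05):
    """Flag each p-value that falls below the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Hypothetical p-values for three tests (illustrative only).
print(significant([0.001, 0.04, 0.2]))
```

With three tests the per-test threshold is 0.05 / 3, so only the first of these hypothetical p-values passes.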