2.
Nature ; 572(7769): 323-328, 2019 08.
Article in English | MEDLINE | ID: mdl-31367044

ABSTRACT

Exome-sequencing studies have generally been underpowered to identify deleterious alleles with a large effect on complex traits as such alleles are mostly rare. Because the population of northern and eastern Finland has expanded considerably and in isolation following a series of bottlenecks, individuals of these populations have numerous deleterious alleles at a relatively high frequency. Here, using exome sequencing of nearly 20,000 individuals from these regions, we investigate the role of rare coding variants in clinically relevant quantitative cardiometabolic traits. Exome-wide association studies for 64 quantitative traits identified 26 newly associated deleterious alleles. Of these 26 alleles, 19 are either unique to or more than 20 times more frequent in Finnish individuals than in other Europeans and show geographical clustering comparable to Mendelian disease mutations that are characteristic of the Finnish population. We estimate that sequencing studies of populations without this unique history would require hundreds of thousands to millions of participants to achieve comparable association power.


Subjects
Exome Sequencing , Genetic Association Studies/methods , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Quantitative Trait Loci/genetics , Alleles , Cholesterol, HDL/genetics , Cluster Analysis , Endpoint Determination , Finland , Geographic Mapping , Humans , Multifactorial Inheritance/genetics , Reproducibility of Results
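
The closing power comparison in this abstract can be illustrated with a standard textbook approximation. This is a sketch under stated assumptions, not the authors' own calculation: it assumes an additive single-variant test of a quantitative trait, and the symbols N (sample size), p (allele frequency), \beta (per-allele effect) and \sigma^2 (trait variance) are introduced here for illustration only.

```latex
% Approximate non-centrality parameter of a 1-df additive association test
\lambda \approx \frac{N \cdot 2p(1-p)\,\beta^{2}}{\sigma^{2}}

% Holding \lambda, \beta and \sigma^{2} fixed, the sample sizes needed in two
% populations with allele frequencies p_1 and p_2 satisfy
\frac{N_2}{N_1} = \frac{p_1(1-p_1)}{p_2(1-p_2)} \approx \frac{p_1}{p_2}
\quad \text{for rare alleles } (p_1, p_2 \ll 1)

% Example: an allele 20 times more frequent in population 1 (p_1 = 20\,p_2),
% studied there with N_1 \approx 20{,}000, would need roughly
N_2 \approx 20 \times 20{,}000 = 400{,}000
```

Under these assumptions, a 20-fold frequency enrichment alone already pushes the required sample size into the hundreds of thousands, which is the order of magnitude the abstract reports.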
3.
PLoS Genet ; 17(7): e1009584, 2021 07.
Article in English | MEDLINE | ID: mdl-34242216

ABSTRACT

Based on epidemiologic and embryologic patterns, nonsyndromic orofacial clefts (OFCs), the most common craniofacial birth defects in humans, are commonly categorized into cleft lip with or without cleft palate (CL/P) and cleft palate alone (CP), which are traditionally considered to be etiologically distinct. Although some evidence of shared genetic risk in the IRF6, GRHL3 and ARHGAP29 regions exists, only FOXE1 has been recognized as significantly associated with both CL/P and CP in genome-wide association studies (GWAS). We used a new statistical approach, PLACO (pleiotropic analysis under composite null), on a combined multi-ethnic GWAS of 2,771 CL/P and 611 CP case-parent trios. At the genome-wide significance threshold of 5 × 10⁻⁸, PLACO identified 1 locus in 1q32.2 (IRF6) that appears to increase risk for one OFC subgroup but decrease risk for the other. At a suggestive significance threshold of 10⁻⁶, we found 5 more loci with compelling candidate genes having opposite effects on CL/P and CP: 1p36.13 (PAX7), 3q29 (DLG1), 4p13 (LIMCH1), 4q21.1 (SHROOM3) and 17q22 (NOG). Additionally, we replicated the recognized shared locus 9q22.33 (FOXE1), and identified 2 loci in 19p13.12 (RAB8A) and 20q12 (MAFB) that appear to influence risk of both CL/P and CP in the same direction. We found locus-specific effects may vary by racial/ethnic group at these regions of genetic overlap, and failed to find evidence of sex-specific differences. We confirmed shared etiology of the two OFC subtypes comprising CL/P, and additionally found suggestive evidence of differences in their pathogenesis at 2 loci of genetic overlap. Our novel findings include 6 new loci of genetic overlap between CL/P and CP; 3 new loci between pairwise OFC subtypes; and 4 loci not previously implicated in OFCs. Our in-silico validation showed PLACO is robust to subtype-specific effects, and can achieve massive power gains over existing approaches for identifying genetic overlap between disease subtypes. In summary, we found suggestive evidence for new genetic regions and confirmed some recognized OFC genes either exerting shared risk or with opposite effects on risk to OFC subtypes.


Subjects
Cleft Lip/genetics , Cleft Palate/genetics , Genetic Pleiotropy , Computational Biology , Computer Simulation , Ethnicity , Female , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Male , Reproducibility of Results
4.
Genet Epidemiol ; 46(2): 122-138, 2022 03.
Article in English | MEDLINE | ID: mdl-35043453

ABSTRACT

Physical activity (PA) is an important risk factor for a wide range of diseases. Previous genome-wide association studies (GWAS), based on self-reported data or a small number of phenotypes derived from accelerometry, have identified a limited number of genetic loci associated with habitual PA and provided evidence for involvement of the central nervous system in mediating genetic effects. In this study, we derived 27 PA phenotypes from wrist accelerometry data obtained from 88,411 UK Biobank study participants. Single-variant association analysis based on mixed-effects models and transcriptome-wide association studies (TWAS) together identified 5 novel loci that were not detected by previous studies of PA, sleep duration and self-reported chronotype. For both novel and previously known loci, we discovered associations with novel phenotypes including active-to-sedentary transition probability, light-intensity PA, activity during different times of the day and proxy phenotypes to sleep and circadian patterns. Follow-up studies including TWAS, colocalization, tissue-specific heritability enrichment, gene-set enrichment and genetic correlation analyses indicated the role of the blood and immune system in modulating the genetic effects and a secondary role of the digestive and endocrine systems. Our findings provide important insights into the genetic architecture of PA and its underlying mechanisms.


Subjects
Genome-Wide Association Study , Models, Genetic , Accelerometry , Exercise/physiology , Genetic Loci , Genetic Predisposition to Disease , Humans
5.
Genet Epidemiol ; 46(5-6): 266-284, 2022 07.
Article in English | MEDLINE | ID: mdl-35451532

ABSTRACT

Genetic association studies of child health outcomes often employ family-based study designs. One of the most popular family-based designs is the case-parent trio design that considers the smallest possible nuclear family consisting of two parents and their affected child. This trio design is particularly advantageous for studying relatively rare disorders because it is less prone to type I error inflation due to population stratification compared to population-based study designs (e.g., case-control studies). However, obtaining genetic data from both parents is difficult, from a practical perspective, and many large studies predominantly measure genetic variants in mother-child dyads. While some statistical methods for analyzing parent-child dyad data (most commonly involving mother-child pairs) exist, it is not clear if they provide the same advantage as trio methods in protecting against population stratification, or if a specific dyad design (e.g., case-mother dyads vs. case-mother/control-mother dyads) is more advantageous. In this article, we review existing statistical methods for analyzing genome-wide marker data on dyads and perform extensive simulation experiments to benchmark their type I errors and statistical power under different scenarios. We extend our evaluation to existing methods for analyzing a combination of case-parent trios and dyads together. We apply these methods to genotyped and imputed data from multiethnic mother-child pairs only, case-parent trios only or combinations of both dyads and trios from the Gene, Environment Association Studies consortium (GENEVA), where each family was ascertained through a child affected by nonsyndromic cleft lip with or without cleft palate. Results from the GENEVA study corroborate the findings from our simulation experiments. Finally, we provide recommendations for using statistical genetic association methods for dyads.


Subjects
Cleft Lip , Cleft Palate , Benchmarking , Cleft Lip/genetics , Cleft Palate/genetics , Female , Genetic Association Studies , Humans , Models, Genetic , Mothers , Parent-Child Relations , Polymorphism, Single Nucleotide
6.
PLoS Genet ; 16(12): e1009218, 2020 12.
Article in English | MEDLINE | ID: mdl-33290408

ABSTRACT

There is increasing evidence that pleiotropy, the association of multiple traits with the same genetic variants/loci, is a very common phenomenon. Cross-phenotype association tests are often used to jointly analyze multiple traits from a genome-wide association study (GWAS). The underlying methods, however, are often designed to test the global null hypothesis that there is no association of a genetic variant with any of the traits, the rejection of which does not implicate pleiotropy. In this article, we propose a new statistical approach, PLACO, for specifically detecting pleiotropic loci between two traits by considering an underlying composite null hypothesis that a variant is associated with none or only one of the traits. We propose testing the null hypothesis based on the product of the Z-statistics of the genetic variants across two studies and derive a null distribution of the test statistic in the form of a mixture distribution that allows for fractions of variants to be associated with none or only one of the traits. We borrow approaches from the statistical literature on mediation analysis that allow asymptotic approximation of the null distribution avoiding estimation of nuisance parameters related to mixture proportions and variance components. Simulation studies demonstrate that the proposed method can maintain type I error and can achieve major power gain over alternative simpler methods that are typically used for testing pleiotropy. PLACO allows correlation in summary statistics between studies that may arise due to sharing of controls between disease traits. Application of PLACO to publicly available summary data from two large case-control GWAS of Type 2 Diabetes and of Prostate Cancer implicated a number of novel shared genetic regions: 3q23 (ZBTB38), 6q25.3 (RGS17), 9p22.1 (HAUS6), 9p13.3 (UBAP2), 11p11.2 (RAPSN), 14q12 (AKAP6), 15q15 (KNL1) and 18q23 (ZNF236).


Subjects
Diabetes Mellitus, Type 2/genetics , Genetic Pleiotropy , Genome-Wide Association Study/methods , Prostatic Neoplasms/genetics , Quantitative Trait Loci , Humans , Male , Models, Genetic
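
The test statistic described in this abstract, the product of a variant's Z-statistics from two studies, can be sketched as follows. This is a minimal illustration and not the PLACO implementation: it computes the tail probability only for the simplest case, in which both Z-statistics are independent standard normal (no association with either trait), whereas the abstract describes a mixture null that also covers association with exactly one trait. The function names `product_z` and `global_null_pvalue` are hypothetical.

```python
import math


def product_z(z1: float, z2: float) -> float:
    """Product of the two studies' Z-statistics for one variant."""
    return z1 * z2


def global_null_pvalue(t: float, steps: int = 20000, upper: float = 10.0) -> float:
    """P(|Z1 * Z2| >= |t|) for independent standard normal Z1 and Z2.

    Conditioning on Z2 = z gives P(|Z1| >= |t| / |z|) = erfc(|t| / (|z| * sqrt(2))).
    That conditional probability is averaged over the N(0, 1) density of z
    with a midpoint rule on (0, upper]; the factor 2 accounts for negative z.
    This is NOT the composite-null p-value described in the abstract.
    """
    t = abs(t)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h  # midpoints, so z is never exactly zero
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += math.erfc(t / (z * math.sqrt(2.0))) * density
    return min(1.0, 2.0 * total * h)
```

A call such as `global_null_pvalue(product_z(4.0, 3.5))` returns a small probability, while a product near zero returns a value close to 1. A large product therefore signals that both Z-statistics are far from zero, which is the intuition behind testing for pleiotropy with a product rather than a sum.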
7.
BMC Med Res Methodol ; 22(1): 143, 2022 05 19.
Article in English | MEDLINE | ID: mdl-35590267

ABSTRACT

BACKGROUND: Cohort collaborations often require meta-analysis of exposure-outcome association estimates across cohorts as an alternative to pooling individual-level data that requires a laborious process of data harmonization on individual-level data. However, it is likely that important confounders are not all measured uniformly across the cohorts due to differences in study protocols. This imbalance in measurement of confounders leads to association estimates that are not comparable across cohorts and impedes the meta-analysis of results. METHODS: In this article, we empirically show some asymptotic relations between fully adjusted and unadjusted exposure-outcome effect estimates, and provide theoretical justification for the same. We leverage these results to obtain fully adjusted estimates for the cohorts with no information on confounders by borrowing information from cohorts with complete measurement on confounders. We implement this novel method in CIMBAL (confounder imbalance), which additionally provides a meta-analyzed estimate that appropriately accounts for the dependence between estimates arising due to borrowing of information across cohorts. We perform extensive simulation experiments to study CIMBAL's statistical properties. We illustrate CIMBAL using National Children's Study (NCS) data to estimate association of maternal education and low birth weight in infants, adjusting for maternal age at delivery, race/ethnicity, marital status, and income. RESULTS: Our simulation studies indicate that estimates of exposure-outcome association from CIMBAL are closer to the truth than those from commonly-used approaches for meta-analyzing cohorts with disparate confounder measurements. CIMBAL is not too sensitive to heterogeneity in underlying joint distributions of exposure, outcome and confounders but is very sensitive to heterogeneity of confounding bias across cohorts. 
Application of CIMBAL to NCS data for a proof-of-concept analysis further illustrates the utility and advantages of CIMBAL. CONCLUSIONS: CIMBAL provides a practical approach for meta-analyzing cohorts with imbalance in measurement of confounders under a weak assumption that the cohorts are independently sampled from populations with the same confounding bias.


Subjects
Research Design , Bias , Child , Cohort Studies , Computer Simulation , Humans , Infant
8.
Cleft Palate Craniofac J ; : 10556656221135926, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36384317

ABSTRACT

Novel or rare damaging mutations have been implicated in the developmental pathogenesis of nonsyndromic cleft lip with or without cleft palate (nsCL ± P). Thus, we investigated the human genome for high-impact mutations that could explain the risk of nsCL ± P in our cohorts. We conducted next-generation sequencing (NGS) analysis of 130 nsCL ± P case-parent African trios to identify pathogenic variants that contribute to the risk of clefting. We replicated this analysis using whole-exome sequence data from a Brazilian nsCL ± P cohort. Computational analyses were then used to predict the mechanism by which these variants could result in increased risks for nsCL ± P. We discovered damaging mutations within the AFDN gene, which encodes a cell adhesion molecule (CAM) that was previously shown to contribute to cleft palate in mice. These mutations include p.Met1164Ile, p.Thr453Asn, p.Pro1638Ala, p.Arg669Gln, p.Ala1717Val, and p.Arg1596His. We also discovered a novel splicing p.Leu1588Leu mutation in this protein. Computational analysis suggests that these amino acid changes affect the interactions with other cleft-associated genes including nectins (PVRL1, PVRL2, PVRL3, and PVRL4), CDH1, CTNNA1, and CTNND1. This is the first report on the contribution of AFDN to the risk for nsCL ± P in humans. AFDN encodes AFADIN, an important CAM that forms calcium-independent complexes with nectins 1 and 4 (encoded by the genes PVRL1 and PVRL4). This discovery shows the power of NGS analysis of multiethnic cleft samples in combination with a computational approach in the understanding of the pathogenesis of nsCL ± P.

9.
BMC Infect Dis ; 21(1): 533, 2021 Jun 07.
Article in English | MEDLINE | ID: mdl-34098885

ABSTRACT

BACKGROUND: Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures, lockdowns, and other non-pharmaceutical interventions. We study how five epidemiological models forecast and assess the course of the pandemic in India: a baseline curve-fitting model, an extended SIR (eSIR) model, two extended SEIR (SAPHIRE and SEIR-fansy) models, and a semi-mechanistic Bayesian hierarchical model (ICM). METHODS: Using COVID-19 case-recovery-death count data reported in India from March 15 to October 15 to train the models, we generate predictions from each of the five models from October 16 to December 31. To compare prediction accuracy with respect to reported cumulative and active case counts and reported cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) for each of the five models. For reported cumulative cases and deaths, we compute Pearson's and Lin's correlation coefficients to investigate how well the projected and observed reported counts agree. We also present underreporting factors when available, and comment on uncertainty of projections from each model. RESULTS: For active case counts, SMAPE values are 35.14% (SEIR-fansy) and 37.96% (eSIR). For cumulative case counts, SMAPE values are 6.89% (baseline), 6.59% (eSIR), 2.25% (SAPHIRE) and 2.29% (SEIR-fansy). For cumulative death counts, the SMAPE values are 4.74% (SEIR-fansy), 8.94% (eSIR) and 0.77% (ICM). Three models (SAPHIRE, SEIR-fansy and ICM) return total (sum of reported and unreported) cumulative case counts as well. We compute underreporting factors as of October 31 and note that for cumulative cases, the SEIR-fansy model yields an underreporting factor of 7.25 and ICM model yields 4.54 for the same quantity. 
For total (sum of reported and unreported) cumulative deaths, the SEIR-fansy model reports an underreporting factor of 2.97. On October 31, we observe 8.18 million cumulative reported cases, while the projections (in millions) are 8.71 (95% credible interval: 8.63-8.80) from the baseline model, 8.35 (7.19-9.60) from eSIR, 8.17 (7.90-8.52) from SAPHIRE and 8.51 (8.18-8.85) from SEIR-fansy. Cumulative case projections from the eSIR model have the highest uncertainty in terms of width of 95% credible intervals, followed by those from SAPHIRE, the baseline model and finally SEIR-fansy. CONCLUSIONS: In this comparative paper, we describe five different models used to study the transmission dynamics of the SARS-CoV-2 virus in India. While simulation studies are the only gold standard way to compare the accuracy of the models, here we were uniquely poised to compare the projected case-counts against observed data on a test period. The largest variability across models is observed in predicting the "total" number of infections including reported and unreported cases (on which we have no validation data). The degree of under-reporting has been a major concern in India and is characterized in this report. Overall, the SEIR-fansy model appeared to be a good choice, with a publicly available R package and the desired flexibility and accuracy.


Subjects
COVID-19/epidemiology , COVID-19/transmission , Pandemics , Bayes Theorem , Communicable Disease Control/methods , Computer Simulation , Forecasting , Humans , India/epidemiology , Models, Statistical
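
The accuracy measures named in this abstract (SMAPE, Pearson's correlation and Lin's concordance correlation) can be sketched as below. The abstract does not give the exact formulas the authors used, so this assumes one common SMAPE definition and population (1/n) moments for both correlation coefficients. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def smape(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric mean absolute prediction error, in percent.

    Assumed definition: mean(|F - A| / ((|A| + |F|) / 2)) * 100.
    Other variants exist; a pair with A = F = 0 would divide by zero.
    """
    terms = [abs(f - a) / ((abs(a) + abs(f)) / 2.0)
             for a, f in zip(observed, predicted)]
    return 100.0 * sum(terms) / len(terms)


def _moments(x: Sequence[float], y: Sequence[float]) -> Tuple[float, ...]:
    """Means, population variances and population covariance of x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, vx, vy, cxy


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation: measures linear association only."""
    _, _, vx, vy, cxy = _moments(x, y)
    return cxy / math.sqrt(vx * vy)


def lins_ccc(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation: also penalizes shifts in mean and scale."""
    mx, my, vx, vy, cxy = _moments(x, y)
    return 2.0 * cxy / (vx + vy + (mx - my) ** 2)
```

The two coefficients answer different questions. Projections that track the observed counts perfectly but sit a constant amount too high still have a Pearson correlation of 1, whereas Lin's coefficient drops below 1 because the points no longer lie on the identity line.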
10.
Genet Epidemiol ; 42(2): 134-145, 2018 03.
Article in English | MEDLINE | ID: mdl-29226385

ABSTRACT

Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. 
When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggests that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of a common disease or traits.


Subjects
Genome-Wide Association Study , Meta-Analysis as Topic , Aged , Cholesterol, HDL/genetics , Cholesterol, LDL/genetics , Coronary Artery Disease/genetics , Humans , Male , Middle Aged , Phenotype , Polymorphism, Single Nucleotide/genetics , Triglycerides/genetics
11.
Am J Epidemiol ; 188(12): 2069-2077, 2019 12 31.
Article in English | MEDLINE | ID: mdl-31509181

ABSTRACT

The field of genetic epidemiology is relatively young and brings together genetics, epidemiology, and biostatistics to identify and implement the best study designs and statistical analyses for identifying genes controlling risk for complex and heterogeneous diseases (i.e., those where genes and environmental risk factors both contribute to etiology). The field has moved quickly over the past 40 years partly because the technology of genotyping and sequencing has forced it to adapt while adhering to the fundamental principles of genetics. In the last two decades, the available tools for genetic epidemiology have expanded from a genetic focus (considering 1 gene at a time) to a genomic focus (considering the entire genome), and now they must further expand to integrate information from other "-omics" (e.g., epigenomics, transcriptomics as measured by RNA expression) at both the individual and the population levels. Additionally, we can now also evaluate gene and environment interactions across populations to better understand exposure and the heterogeneity in disease risk. The future challenges facing genetic epidemiology are considerable both in scale and techniques, but the importance of the field will not diminish because by design it ties scientific goals with public health applications.


Subjects
Molecular Epidemiology/trends , Genomics/trends
12.
Genet Epidemiol ; 41(5): 413-426, 2017 07.
Article in English | MEDLINE | ID: mdl-28393390

ABSTRACT

In the past decade, many genome-wide association studies (GWASs) have been conducted to explore association of single nucleotide polymorphisms (SNPs) with complex diseases using a case-control design. These GWASs not only collect information on the disease status (primary phenotype, D) and the SNPs (genotypes, X), but also collect extensive data on several risk factors and traits. Recent literature and grant proposals point toward a trend in reusing existing large case-control data for exploring genetic associations of some additional traits (secondary phenotypes, Y) collected during the study. These secondary phenotypes may be correlated, and a proper analysis warrants a multivariate approach. Commonly used multivariate methods are not equipped to properly account for the non-random sampling scheme. Current ad hoc practices include analyses without any adjustment, and analyses with D adjusted as a covariate. Our theoretical and empirical studies suggest that the type I error for testing genetic association of secondary traits can be substantial when X as well as Y are associated with D, even when there is no association between X and Y in the underlying (target) population. Whether using D as a covariate helps maintain type I error depends heavily on the disease mechanism and the underlying causal structure (which is often unknown). To avoid grossly incorrect inference, we propose the proportional odds model adjusted for propensity score (POM-PS). It uses a proportional odds logistic regression of X on Y and adjusts for the estimated conditional probability of being diseased as a covariate. We demonstrate the validity and advantage of POM-PS, and compare to some existing methods in extensive simulation experiments mimicking plausible scenarios of dependency among Y, X, and D. Finally, we use POM-PS to jointly analyze four adiposity traits using a type 2 diabetes (T2D) case-control sample from the population-based Metabolic Syndrome in Men (METSIM) study. Only POM-PS analysis of the T2D case-control sample seems to provide valid association signals.


Subjects
Diabetes Mellitus, Type 2/physiopathology , Genetic Markers/genetics , Genome-Wide Association Study/methods , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait, Heritable , Adiposity/genetics , Aged , Case-Control Studies , Computer Simulation , Genotype , Humans , Longitudinal Studies , Male , Middle Aged
13.
Genet Epidemiol ; 40(1): 20-34, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26638693

ABSTRACT

Genome-wide association studies (GWASs) for complex diseases often collect data on multiple correlated endo-phenotypes. Multivariate analysis of these correlated phenotypes can improve the power to detect genetic variants. Multivariate analysis of variance (MANOVA) can perform such association analysis at a GWAS level, but the behavior of MANOVA under different trait models has not been carefully investigated. In this paper, we show that MANOVA is generally very powerful for detecting association but there are situations, such as when a genetic variant is associated with all the traits, where MANOVA may not have any detection power. In these situations, marginal model based methods, however, perform much better than multivariate methods. We investigate the behavior of MANOVA, both theoretically and using simulations, and derive the conditions where MANOVA loses power. Based on our findings, we propose a unified score-based test statistic USAT that can perform better than MANOVA in such situations and nearly as well as MANOVA elsewhere. Our proposed test reports an approximate asymptotic P-value for association and is computationally very efficient to implement at a GWAS level. We have studied through extensive simulations the performance of USAT, MANOVA, and other existing approaches and demonstrated the advantage of using the USAT approach to detect association between a genetic variant and multivariate phenotypes. We applied USAT to data from three correlated traits collected on 5,816 Caucasian individuals from the Atherosclerosis Risk in Communities (ARIC) Study (The ARIC Investigators) and detected some interesting associations.


Subjects
Atherosclerosis/genetics , Genome-Wide Association Study , Multivariate Analysis , Computer Simulation , Genotype , Humans , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide
14.
Hum Hered ; 79(2): 69-79, 2015.
Article in English | MEDLINE | ID: mdl-26044550

ABSTRACT

BACKGROUND: Genome-wide association studies (GWASs) have identified hundreds of genetic variants associated with complex diseases, but these variants appear to explain very little of the disease heritability. The typical single-locus association analysis in a GWAS fails to detect variants with small effect sizes and to capture higher-order interaction among these variants. Multilocus association analysis provides a powerful alternative by jointly modeling the variants within a gene or a pathway and by reducing the burden of multiple hypothesis testing in a GWAS. METHODS: Here, we propose a powerful and flexible dimension reduction approach to model multilocus association. We use a Bayesian partitioning model which clusters SNPs according to their direction of association, models higher-order interactions using a flexible scoring scheme and uses posterior marginal probabilities to detect association between the SNP set and the disease. RESULTS: We illustrate our method using extensive simulation studies and applying it to detect multilocus interaction in Atherosclerosis Risk in Communities (ARIC) GWAS with type 2 diabetes. CONCLUSION: We demonstrate that our approach has better power to detect multilocus interactions than several existing approaches. When applied to the ARIC study dataset with 9,328 individuals to study gene-based associations for type 2 diabetes, our method identified some novel variants not detected by conventional single-locus association analyses.


Subjects
Bayes Theorem , Case-Control Studies , Models, Genetic , Atherosclerosis/genetics , Computer Simulation , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study , Humans , Polymorphism, Single Nucleotide
15.
Hum Hered ; 80(1): 1-11, 2015.
Article in English | MEDLINE | ID: mdl-26159893

ABSTRACT

Studies of complex human diseases and traits associated with candidate genes are potentially vulnerable to bias (confounding) due to population stratification and inbreeding, especially in admixed populations. In GWAS, the principal components (PCs) method provides a global ancestry value per subject, allowing corrections for population stratification. However, these coefficients are typically estimated assuming unrelated individuals, and if family structure is present and ignored, such substructures may induce artifactual PCs. Extensions of the PCs method have been proposed by Konishi and Rao [Biometrika 1992;79:631-641], taking into account only siblings' relatedness, and by Oualkacha et al. [Stat Appl Genet Mol Biol 2012, DOI: 10.2202/1544-6115.1711], taking into account large pedigrees and high-dimensional phenotype data. In this work, we extend these methods to estimate the global individual ancestry coefficients from PCs derived from different variance component matrix estimators using SNPs from two simulated data sets and two real data sets: the GENOA sibship data consisting of European and African-American subjects and the Baependi Heart Study consisting of 80 extended Brazilian families, both with genotyping data from the Affymetrix 6.0 chip. Our results show that the family structure plays an important role in the estimation of the global individual ancestry value for extended pedigrees but not for sibships.


Subjects
Family , Genetic Predisposition to Disease , Genetics, Medical/methods , Models, Genetic , Polymorphism, Single Nucleotide , Female , Humans , Male
16.
Hum Hered ; 76(2): 53-63, 2013.
Article in English | MEDLINE | ID: mdl-24247328

ABSTRACT

OBJECTIVES: A gene-based genome-wide association study (GWAS) provides a powerful alternative to the traditional single-SNP (single nucleotide polymorphism) association analysis due to its substantial reduction in the multiple testing burden and possible gain in power due to modeling multiple SNPs within a gene. A gene-based association analysis on multivariate traits is often of interest, but it imposes substantial analytical as well as computational challenges to implement it at a genome-wide level. METHODS: We propose a rapid implementation of the multivariate multiple linear regression (RMMLR) approach in unrelated individuals as well as in families. Our approach allows for covariates. Moreover, the asymptotic distribution of the test statistic is not heavily influenced by the linkage disequilibrium (LD) among the SNPs and hence can be used efficiently to perform a gene-based GWAS. We have developed a corresponding R package to implement such multivariate gene-based GWAS with this RMMLR approach. RESULTS: Through extensive simulation, we compared several approaches for both single and multivariate traits. Our RMMLR approach maintained a correct type I error level even for sets of SNPs in strong LD. It also demonstrated a substantial gain in power to detect a gene when it is associated with a subset of the traits. We also studied performances of the approaches on the Minnesota Center for Twin Family Research dataset. CONCLUSIONS: In our overall comparison, our RMMLR approach provides an efficient and powerful tool to perform a gene-based GWAS with single or multivariate traits and maintains the type I error appropriately.


Subjects
Genes/genetics , Genome-Wide Association Study/methods , Genetic Models , Multifactorial Inheritance/genetics , Computer Simulation , Genotype , Humans , Linear Models , Multivariate Analysis , Single Nucleotide Polymorphism/genetics
17.
Diabetes ; 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38869630

ABSTRACT

Genetic studies of non-traditional glycemic biomarkers, glycated albumin and fructosamine, can shed light on unknown aspects of type 2 diabetes genetics and biology. We performed a multi-phenotype GWAS of glycated albumin and fructosamine from 7,395 White and 2,016 Black participants in the Atherosclerosis Risk in Communities (ARIC) study on common variants from genotyped/imputed data. We discovered 2 genome-wide significant loci, one mapping to a known type 2 diabetes gene (ARAP1/STARD10) and another mapping to a novel region (UGT1A complex of genes), using multi-omics gene-mapping strategies in diabetes-relevant tissues. We identified additional loci that were ancestry- and sex-specific (e.g., PRKCA in African ancestry, FCGRT in European ancestry, TEX29 in males). Further, we implemented multi-phenotype gene-burden tests on whole-exome sequence data from 6,590 White and 2,309 Black ARIC participants. Ten variant sets annotated to genes across different variant aggregation strategies were exome-wide significant only in multi-ancestry analysis, of which CD1D, EGFL7/AGPAT2 and MIR126 had notable enrichment of rare predicted loss-of-function variants in African ancestry despite smaller sample sizes. Overall, 8 out of 14 discovered loci and genes were implicated in influencing these biomarkers via glycemic pathways, and most of them were not previously implicated in studies of type 2 diabetes. This study illustrates improved locus discovery and potential effector gene discovery by leveraging joint patterns of related biomarkers across the entire allele frequency spectrum in multi-ancestry analysis. Future investigation of the loci and genes potentially acting through glycemic pathways may help us better understand the risk of developing type 2 diabetes.

18.
J Clin Endocrinol Metab ; 109(1): e306-e313, 2023 Dec 21.
Article in English | MEDLINE | ID: mdl-37453101

ABSTRACT

CONTEXT: Genome-wide association studies have identified germline variants associated with elevated papillary thyroid carcinoma (PTC) risk. It is also known that somatic driver mutations contribute to PTC development, and as such PTCs can be further categorized into different molecular subtypes based on their somatic alterations. However, it remains unknown whether identified germline variants predictive of PTC risk are associated with specific molecular subtypes. OBJECTIVE: The primary goal of the present study is to determine whether germline genetic risk, as assessed using a polygenic score (PGS), is associated with molecular subtypes of PTC, defined based on tumor driver mutation status. METHODS: This study was carried out using data from The Cancer Genome Atlas (TCGA) thyroid cancer study. A previously validated 10-single-nucleotide variation PGS for PTC derived from genome-wide association study hits was calculated to ascertain germline genetic risk. The primary molecular subtypes of interest were defined by tumor driver mutation status (BRAF V600E-mutated vs RAS-mutated vs "other"). We also explored associations between PGS and molecular subtypes defined by messenger RNA (mRNA) expression, microRNA expression, and DNA methylation patterns. Polytomous logistic regression analysis was used to assess the association between PGS and PTC molecular subtype with and without adjustment for clinical variables. Odds ratios (ORs) with their 95% CIs were estimated. RESULTS: A total of 359 patients were included in the study. PGS was significantly associated with specific tumor molecular subtypes defined by tumor driver mutation status. Increasing germline risk was associated with higher odds of BRAF V600E-mutated PTC compared to PTCs without driver mutations in the "other" category. No significant difference was detected in terms of PGS tumor categorization in the RAS subtype compared to BRAF V600E. In exploratory analyses, PGS was also associated with mRNA-, microRNA-, and DNA methylation-defined molecular subtypes, as defined by the TCGA PTC study. CONCLUSION: PGS has molecular subtype-specific associations in PTC, which has implications for their use in risk prediction.


Subjects
Papillary Carcinoma , MicroRNAs , Thyroid Neoplasms , Humans , Papillary Thyroid Cancer/genetics , Genome-Wide Association Study , Papillary Carcinoma/genetics , Papillary Carcinoma/pathology , Thyroid Neoplasms/pathology , MicroRNAs/genetics , Messenger RNA/metabolism , Mutation , Proto-Oncogene Proteins B-raf/genetics
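
For readers unfamiliar with polygenic scores such as the one used in the abstract above, a PGS is commonly computed as a weighted sum of risk-allele dosages. The sketch below shows that arithmetic only. The function name, the three variants, and their weights are hypothetical and are not the validated 10-variant PTC score from the study.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant).

    `weights` are per-variant effect sizes, e.g. log odds ratios from a GWAS.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))

# Hypothetical subject with three variants and made-up effect sizes.
score = polygenic_score([0, 1, 2], [0.5, 0.2, 0.1])
```

A score like this would then typically enter a regression model as a predictor, for example the polytomous logistic regression the abstract describes.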
19.
medRxiv ; 2023 Aug 13.
Article in English | MEDLINE | ID: mdl-37609313

ABSTRACT

DNA methylation studies of incident type 2 diabetes in US populations are limited, and to our knowledge none included individuals of African descent living in the US. We performed an epigenome-wide association analysis of blood-based methylation levels at CpG sites with incident type 2 diabetes using Cox regression in 2,091 Black and 1,029 White individuals from the Atherosclerosis Risk in Communities study. At an epigenome-wide significance threshold of 10^-7, we detected 7 novel diabetes-associated CpG sites in C1orf151 (cg05380846: HR = 0.89, p = 8.4 × 10^-12), ZNF2 (cg01585592: HR = 0.88, p = 1.6 × 10^-9), JPH3 (cg16696007: HR = 0.87, p = 7.8 × 10^-9), GPX6 (cg02793507: HR = 0.85, p = 2.7 × 10^-8 and cg00647063: HR = 1.20, p = 2.5 × 10^-8), chr17q25 (cg16865890: HR = 0.8, p = 6.9 × 10^-8), and chr11p15 (cg13738793: HR = 1.11, p = 7.7 × 10^-8). The CpG sites at C1orf151, ZNF2, JPH3 and GPX6 were identified in Black adults, chr17q25 was identified in White adults, and chr11p15 was identified upon meta-analyzing the two groups. The CpG sites at JPH3 and GPX6 were likely associated with incident type 2 diabetes independent of BMI. All the CpG sites, except at JPH3, were likely consequences of elevated glucose at baseline. We additionally replicated known type 2 diabetes-associated CpG sites including cg19693031 at TXNIP, cg00574958 at CPT1A, cg16567056 at PLBC2, cg11024682 at SREBF1, cg08857797 at VPS25, and cg06500161 at ABCG1, 3 of which were replicated in Black adults at the epigenome-wide threshold. We observed a modest increase in type 2 diabetes variance explained upon addition of the significantly associated CpG sites to a Cox model that included traditional type 2 diabetes risk factors and fasting glucose (increase from 26.2% to 30.5% in Black adults; increase from 36.9% to 39.4% in White adults). We examined whether groups of proximal CpG sites were associated with incident type 2 diabetes using a gene-region specific and a gene-region agnostic differentially methylated region (DMR) analysis. Our DMR analyses revealed several clusters of significant CpG sites, including a DMR consisting of a previously discovered CpG site at ADCY7 and promoter regions of TP63, which were differentially methylated across all race groups. This study illustrates improved discovery of CpG sites/regions by leveraging both individual CpG site and DMR analyses in an unexplored population. Our findings include genes linked to diabetes in experimental studies (e.g., GPX6, JPH3, and TP63), and future gene-specific methylation studies could elucidate the link between genes, environment, and methylation in the pathogenesis of type 2 diabetes.

20.
PLOS Glob Public Health ; 3(12): e0002063, 2023.
Article in English | MEDLINE | ID: mdl-38150465

ABSTRACT

There has been heated discussion and debate around the quality of COVID death data in South Asia. According to WHO, of the 5.5 million reported COVID-19 deaths from 2020-2021, 0.57 million (10%) were contributed by five low- and middle-income countries (LMICs) in the Global South: India, Pakistan, Bangladesh, Sri Lanka and Nepal. However, a number of excess death estimates show that the actual death toll from COVID-19 is significantly higher than the reported number of deaths. For example, the IHME and WHO both project around 14.9 million total deaths, of which 4.5-5.5 million were attributed to these five countries in 2020-2021. We examine the COVID-19 performance in 2020 and 2021 of these five countries, where 23.5% of the world population lives, through a counterfactual lens and ask to what extent the mortality of one LMIC would have been affected had it adopted the pandemic policies of another, similar country. We use a Bayesian semi-mechanistic model developed by Mishra et al. (2021) to compare both the reported and estimated total death tolls by permuting the time-varying reproduction number (Rt) across these countries over a similar time period. Our analysis shows that, in the first half of 2021, mortality in India in terms of reported deaths could have been reduced to 96 and 102 deaths per million, compared to the actual 170 reported deaths per million, had it adopted the policies of Nepal and Pakistan respectively. In terms of total deaths, India could have averted 481 and 466 deaths per million had it adopted the policies of Bangladesh and Pakistan. On the other hand, India had a lower number of reported COVID-19 deaths per million (48 deaths per million) and a lower estimated total deaths per million (80 deaths per million) in the second half of 2021, and LMICs other than Pakistan would have had lower reported mortality had they followed India's strategy. The gap between the reported and estimated total deaths highlights the varying level and extent of under-reporting of deaths across the subcontinent, and that model estimates are contingent on the accuracy of the death data. Our analysis shows the importance of timely public health intervention and vaccines for lowering mortality and the need for better coverage infrastructure for the death registration system in LMICs.
