1.
Hum Mol Genet ; 2024 May 15.
Article in English | MEDLINE | ID: mdl-38747556

ABSTRACT

Inflammation biomarkers can provide valuable insight into the role of inflammatory processes in many diseases and conditions. Sequencing-based analyses of such biomarkers can also serve as an exemplar of the genetic architecture of quantitative traits. To evaluate the biological insight that a multi-ancestry, whole-genome-based association study can provide, we performed a comprehensive analysis of 21 inflammation biomarkers from up to 38,465 individuals with whole-genome sequencing from the Trans-Omics for Precision Medicine (TOPMed) program (sample size varied by trait, with a minimum of n = 737 for MMP-1). We identified 22 distinct single-variant associations across 6 traits (E-selectin, intercellular adhesion molecule 1, interleukin-6, lipoprotein-associated phospholipase A2 activity and mass, and P-selectin) that remained significant after conditioning on previously identified associations for these inflammatory biomarkers. We further expanded upon known biomarker associations by pairing the single-variant analysis with a rare variant set-based analysis that identified 19 significant rare variant set-based associations with 5 traits. These signals were distinct from both significant single-variant association signals within TOPMed and genetic signals observed in prior studies, demonstrating the complementary value of performing both single and rare variant analyses when analyzing quantitative traits. We also confirm several previously reported signals from semi-quantitative proteomics platforms. Many of these signals demonstrate the extensive allelic heterogeneity and ancestry-differentiated variant-trait associations common for inflammation biomarkers, a characteristic we hypothesize will be increasingly observed with well-powered, large-scale analyses of complex traits.

3.
Nature ; 622(7984): 784-793, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37821707

ABSTRACT

The Mexico City Prospective Study is a prospective cohort of more than 150,000 adults recruited two decades ago from the urban districts of Coyoacán and Iztapalapa in Mexico City. Here we generated genotype and exome-sequencing data for all individuals and whole-genome sequencing data for 9,950 selected individuals. We describe high levels of relatedness and substantial heterogeneity in ancestry composition across individuals. Most sequenced individuals had admixed Indigenous American, European and African ancestry, with extensive admixture from Indigenous populations in central, southern and southeastern Mexico. Indigenous Mexican segments of the genome had lower levels of coding variation but an excess of homozygous loss-of-function variants compared with segments of African and European origin. We estimated ancestry-specific allele frequencies at 142 million genomic variants, with an effective sample size of 91,856 for Indigenous Mexican ancestry at exome variants, all available through a public browser. Using whole-genome sequencing, we developed an imputation reference panel that outperforms existing panels at common variants in individuals with high proportions of central, southern and southeastern Indigenous Mexican ancestry. Our work illustrates the value of genetic studies in diverse populations and provides foundational imputation and allele frequency resources for future genetic studies in Mexico and in the United States, where the Hispanic/Latino population is predominantly of Mexican descent.


Subjects
Exome Sequencing , Human Genome , Genotype , Hispanic or Latino , Adult , Humans , Africa/ethnology , Americas/ethnology , Europe/ethnology , Gene Frequency/genetics , Population Genetics , Human Genome/genetics , Genotyping Techniques , Hispanic or Latino/genetics , Homozygote , Loss of Function Mutation/genetics , Mexico , Prospective Studies
4.
bioRxiv ; 2023 Sep 12.
Article in English | MEDLINE | ID: mdl-37745480

ABSTRACT

Inflammation biomarkers can provide valuable insight into the role of inflammatory processes in many diseases and conditions. Sequencing-based analyses of such biomarkers can also serve as an exemplar of the genetic architecture of quantitative traits. To evaluate the biological insight that a multi-ancestry, whole-genome-based association study can provide, we performed a comprehensive analysis of 21 inflammation biomarkers from up to 38,465 individuals with whole-genome sequencing from the Trans-Omics for Precision Medicine (TOPMed) program. We identified 22 distinct single-variant associations across 6 traits (E-selectin, intercellular adhesion molecule 1, interleukin-6, lipoprotein-associated phospholipase A2 activity and mass, and P-selectin) that remained significant after conditioning on previously identified associations for these inflammatory biomarkers. We further expanded upon known biomarker associations by pairing the single-variant analysis with a rare variant set-based analysis that identified 19 significant rare variant set-based associations with 5 traits. These signals were distinct from both significant single-variant association signals within TOPMed and genetic signals observed in prior studies, demonstrating the complementary value of performing both single and rare variant analyses when analyzing quantitative traits. We also confirm several previously reported signals from semi-quantitative proteomics platforms. Many of these signals demonstrate the extensive allelic heterogeneity and ancestry-differentiated variant-trait associations common for inflammation biomarkers, a characteristic we hypothesize will be increasingly observed with well-powered, large-scale analyses of complex traits.

5.
Diabetes ; 72(5): 653-665, 2023 05 01.
Article in English | MEDLINE | ID: mdl-36791419

ABSTRACT

Few studies have demonstrated reproducible gene-diet interactions (GDIs) impacting metabolic disease risk factors, likely due in part to measurement error in dietary intake estimation and insufficient capture of rare genetic variation. We aimed to identify GDIs across the genetic frequency spectrum impacting the macronutrient-glycemia relationship in genetically and culturally diverse cohorts. We analyzed 33,187 participants free of diabetes from 10 National Heart, Lung, and Blood Institute Trans-Omics for Precision Medicine program cohorts with whole-genome sequencing, self-reported diet, and glycemic trait data. We fit cohort-specific, multivariable-adjusted linear mixed models for the effect of diet, modeled as an isocaloric substitution of carbohydrate for fat, and its interactions with common and rare variants genome-wide. In main effect meta-analyses, participants consuming more carbohydrate had modestly lower glycemic trait values (e.g., for glycated hemoglobin [HbA1c], -0.013% HbA1c/250 kcal substitution). In GDI meta-analyses, a common African ancestry-enriched variant (rs79762542) reached study-wide significance and replicated in the UK Biobank cohort, indicating a negative carbohydrate-HbA1c association among major allele homozygotes only. Simulations revealed that >150,000 samples may be necessary to identify similar macronutrient GDIs under realistic assumptions about effect size and measurement error. These results generate hypotheses for further exploration of modifiable metabolic disease risk in additional cohorts with African ancestry.

ARTICLE HIGHLIGHTS: We aimed to identify genetic modifiers of the dietary macronutrient-glycemia relationship using whole-genome sequence data from 10 Trans-Omics for Precision Medicine program cohorts. Substitution models indicated a modest reduction in glycemia associated with an increase in dietary carbohydrate at the expense of fat. Genome-wide interaction analysis identified one African ancestry-enriched variant near the FRAS1 gene that may interact with macronutrient intake to influence hemoglobin A1c. Simulation-based power calculations accounting for measurement error suggested that substantially larger sample sizes may be necessary to discover further gene-macronutrient interactions.


Subjects
Diabetes Mellitus , Diet , Humans , Glycated Hemoglobin/genetics , Diabetes Mellitus/genetics , Eating , Guanine Nucleotide Dissociation Inhibitors/genetics , Genome-Wide Association Study
6.
Biometrics ; 79(2): 1472-1484, 2023 06.
Article in English | MEDLINE | ID: mdl-35218565

ABSTRACT

Sample sizes vary substantially across tissues in the Genotype-Tissue Expression (GTEx) project, where considerably fewer samples are available from certain inaccessible tissues, such as the substantia nigra (SSN), than from accessible tissues, such as blood. This severely limits power for identifying tissue-specific expression quantitative trait loci (eQTL) in undersampled tissues. Here we propose Surrogate Phenotype Regression Analysis (Spray) for leveraging information from a correlated surrogate outcome (e.g., expression in blood) to improve inference on a partially missing target outcome (e.g., expression in SSN). Rather than regarding the surrogate outcome as a proxy for the target outcome, Spray jointly models the target and surrogate outcomes within a bivariate regression framework. Unobserved values of either outcome are treated as missing data. We describe and implement an expectation conditional maximization algorithm for performing estimation in the presence of bilateral outcome missingness. Spray estimates the same association parameter estimated by standard eQTL mapping and controls the type I error even when the target and surrogate outcomes are truly uncorrelated. We demonstrate analytically and empirically, using simulations and GTEx data, that in comparison with marginally modeling the target outcome, jointly modeling the target and surrogate outcomes increases estimation precision and improves power.


Subjects
Algorithms , Quantitative Trait Loci , Phenotype , Regression Analysis
7.
Nat Genet ; 55(1): 154-164, 2023 01.
Article in English | MEDLINE | ID: mdl-36564505

ABSTRACT

Meta-analysis of whole genome sequencing/whole exome sequencing (WGS/WES) studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Existing rare variant meta-analysis approaches are not scalable to biobank-scale WGS data. Here we present MetaSTAAR, a powerful and resource-efficient rare variant meta-analysis framework for large-scale WGS/WES studies. MetaSTAAR accounts for relatedness and population structure, can analyze both quantitative and dichotomous traits and boosts the power of rare variant tests by incorporating multiple variant functional annotations. Through meta-analysis of four lipid traits in 30,138 ancestrally diverse samples from 14 studies of the Trans Omics for Precision Medicine (TOPMed) Program, we show that MetaSTAAR performs rare variant meta-analysis at scale and produces results comparable to using pooled data. Additionally, we identified several conditionally significant rare variant associations with lipid traits. We further demonstrate that MetaSTAAR is scalable to biobank-scale cohorts through meta-analysis of TOPMed WGS data and UK Biobank WES data of ~200,000 samples.


Subjects
Genome-Wide Association Study , Lipids , Genome-Wide Association Study/methods , Whole Genome Sequencing/methods , Exome Sequencing , Phenotype , Lipids/genetics
8.
Nat Methods ; 19(12): 1599-1611, 2022 12.
Article in English | MEDLINE | ID: mdl-36303018

ABSTRACT

Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to study RV association. However, existing methods have limited ability in analyzing the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, to automatically annotate a whole-genome sequencing study and perform flexible noncoding RV association analysis, including gene-centric analysis and fixed window-based and dynamic window-based non-gene-centric analysis by incorporating variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.


Subjects
Genome-Wide Association Study , Genome , Humans , Genome-Wide Association Study/methods , Whole Genome Sequencing/methods , Phenotype , Genetic Variation
9.
Commun Biol ; 5(1): 756, 2022 07 28.
Article in English | MEDLINE | ID: mdl-35902682

ABSTRACT

The genetic determinants of fasting glucose (FG) and fasting insulin (FI) have been studied mostly through genome arrays, resulting in over 100 associated variants. We extended this work with high-coverage whole genome sequencing analyses from fifteen cohorts in NHLBI's Trans-Omics for Precision Medicine (TOPMed) program. Over 23,000 non-diabetic individuals from five race-ethnicities/populations (African, Asian, European, Hispanic and Samoan) were included. Eight variants were significantly associated with FG or FI across previously identified regions MTNR1B, G6PC2, GCK, GCKR and FOXA2. We additionally characterize suggestive associations with FG or FI near previously identified SLC30A8, TCF7L2, and ADCY5 regions as well as APOB, PTPRT, and ROBO1. Functional annotation resources including the Diabetes Epigenome Atlas were compiled for each signal (chromatin states, annotation principal components, and others) to elucidate variant-to-function hypotheses. We provide a catalog of nucleotide-resolution genomic variation spanning intergenic and intronic regions creating a foundation for future sequencing-based investigations of glycemic traits.


Subjects
Type 2 Diabetes Mellitus , Fasting , Type 2 Diabetes Mellitus/genetics , Glucose , Humans , Insulin/genetics , National Heart, Lung, and Blood Institute (U.S.) , Nerve Tissue Proteins/genetics , Single Nucleotide Polymorphism , Precision Medicine , Immunologic Receptors/genetics , United States
10.
Cell Rep Methods ; 2(5): 100218, 2022 05 23.
Article in English | MEDLINE | ID: mdl-35637906

ABSTRACT

Expression quantitative trait locus (eQTL) analysis associates SNPs with gene expression; these relationships can be represented as a bipartite network with association strength as "edge weights" between SNPs and genes. However, most eQTL networks use binary edge weights based on thresholded FDR estimates: definitions that influence reproducibility and downstream analyses. We constructed twenty-nine tissue-specific eQTL networks using GTEx data and evaluated a comprehensive set of network specifications based on false discovery rates, test statistics, and p values, focusing on the degree centrality, a metric of an SNP or gene node's potential network influence. We found a thresholded Benjamini-Hochberg q value weighted by the Z-statistic balances metric reproducibility and computational efficiency. Our estimated gene degrees positively correlate with gene degrees in gene regulatory networks, demonstrating that these networks are complementary in understanding regulation. Gene degrees also correlate with genetic diversity, and heritability analyses show that highly connected nodes are enriched for tissue-relevant traits.


Subjects
Gene Regulatory Networks , Quantitative Trait Loci , Quantitative Trait Loci/genetics , Reproducibility of Results , Gene Regulatory Networks/genetics , Phenotype , Genomics
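
The edge-weighting idea named in the abstract for this entry (a thresholded Benjamini-Hochberg q value weighted by the Z-statistic) can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the authors' implementation: the function names `bh_qvalues` and `weighted_degrees`, the 0.05 threshold, the use of |Z| as the edge weight, and the toy SNP and gene labels are all assumptions made for illustration.

```python
from collections import defaultdict


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        q[idx] = running_min
    return q


def weighted_degrees(edges, q_threshold=0.05):
    """Degree centrality on a bipartite SNP-gene network.

    edges: list of (snp, gene, z_statistic, p_value) tuples.
    An edge is kept only if its q value is <= q_threshold (assumed cutoff);
    a kept edge contributes |z| to the degree of both of its endpoints.
    """
    q = bh_qvalues([p for _, _, _, p in edges])
    snp_degree = defaultdict(float)
    gene_degree = defaultdict(float)
    for (snp, gene, z, _p), qv in zip(edges, q):
        if qv <= q_threshold:
            snp_degree[snp] += abs(z)
            gene_degree[gene] += abs(z)
    return dict(snp_degree), dict(gene_degree)


# Hypothetical toy associations, for illustration only.
toy_edges = [
    ("rs1", "G1", 3.0, 0.01),
    ("rs1", "G2", -2.1, 0.04),
    ("rs2", "G1", 2.2, 0.03),
    ("rs2", "G2", 0.7, 0.5),
]
snp_deg, gene_deg = weighted_degrees(toy_edges)
print(snp_deg, gene_deg)
```

Setting every kept edge's weight to 1 instead of |z| would recover the binary, thresholded network that the abstract contrasts against.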
11.
Bioinformatics ; 38(11): 3116-3117, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35441669

ABSTRACT

SUMMARY: We developed the variant-Set Test for Association using Annotation infoRmation (STAAR) workflow description language (WDL) workflow to facilitate the analysis of rare variants in whole genome sequencing association studies. The open-access STAAR workflow written in the WDL allows a user to perform rare variant testing for both gene-centric and genetic region approaches, enabling genome-wide, candidate and conditional analyses. It incorporates functional annotations into the workflow as introduced in the STAAR method in order to boost the rare variant analysis power. This tool was specifically developed and optimized to be implemented on cloud-based platforms such as BioData Catalyst Powered by Terra. It provides easy-to-use functionality for rare variant analysis that can be incorporated into an exhaustive whole genome sequencing analysis pipeline. AVAILABILITY AND IMPLEMENTATION: The workflow is freely available from https://dockstore.org/workflows/github.com/sheilagaynor/STAAR_workflow. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Cloud Computing , Software , Workflow , Genome , Genome-Wide Association Study
12.
Bioinformatics ; 38(9): 2661-2663, 2022 04 28.
Article in English | MEDLINE | ID: mdl-35244140

ABSTRACT

SUMMARY: Amidst the continuing spread of coronavirus disease-19 (COVID-19), real-time data analysis and visualization remain critical for the general public to track the pandemic's impact and to inform policy making by officials. Multiple metrics permit the evaluation of the spread, infection and mortality of infectious diseases. For example, numbers of new cases and deaths provide easily interpretable measures of absolute impact within a given population and time frame, while the effective reproduction rate provides an epidemiological measure of the rate of spread. By evaluating multiple metrics concurrently, users can leverage complementary insights into the impact and current state of the pandemic when formulating prevention and safety plans for themselves and others. We describe COVID-19 Spread Mapper, a unified framework for estimating and quantifying the uncertainty in the smoothed daily effective reproduction number, case rate and death rate in a region using log-linear models. We apply this framework to characterize COVID-19 impact at multiple geographic resolutions, including by US county and state as well as by country, demonstrating the variation across resolutions and the need for harmonized efforts to control the pandemic. We provide an open-source online dashboard for real-time analysis and visualization of multiple key metrics, which are critical to evaluate the impact of COVID-19 and make informed policy decisions. AVAILABILITY AND IMPLEMENTATION: Our model and tool are publicly available as implemented in R and hosted at https://metrics.covid19-analysis.org/. The source code is freely available from https://github.com/lin-lab/COVID19-Rt and https://github.com/lin-lab/COVID19-Viz. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
COVID-19 , Humans , COVID-19/epidemiology , SARS-CoV-2 , Pandemics/prevention & control , Software
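
As a rough illustration of the log-linear modelling idea mentioned in the abstract for this entry, the sketch below fits a straight line to the logarithm of daily case counts by ordinary least squares and reports the implied daily growth rate and doubling time. This is a deliberately simplified stand-in, not the COVID-19 Spread Mapper model, which estimates a smoothed effective reproduction number with uncertainty. The function names `fit_log_linear` and `doubling_time` and the synthetic counts are hypothetical.

```python
import math


def fit_log_linear(counts):
    """Least-squares fit of log(count) = intercept + slope * day.

    counts: strictly positive daily counts, one per consecutive day.
    Returns (intercept, slope); slope is the daily growth rate on the log scale.
    """
    if len(counts) < 2 or any(c <= 0 for c in counts):
        raise ValueError("need at least two strictly positive counts")
    n = len(counts)
    days = range(n)
    logs = [math.log(c) for c in counts]
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def doubling_time(slope):
    """Days for counts to double at a constant growth rate; inf if not growing."""
    return math.log(2) / slope if slope > 0 else math.inf


# Synthetic counts (illustrative only): exponential growth at rate 0.1 per day.
synthetic = [100 * math.exp(0.1 * t) for t in range(14)]
intercept, slope = fit_log_linear(synthetic)
print(f"growth rate per day: {slope:.3f}; doubling time: {doubling_time(slope):.1f} days")
```

A negative fitted slope indicates declining counts, in which case the doubling time is reported as infinite rather than as a negative number.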
13.
J Oral Facial Pain Headache ; 35(2): 105-112, 2021.
Article in English | MEDLINE | ID: mdl-34129655

ABSTRACT

AIMS: To determine the relationship between hormonal contraceptive (HC) use and painful symptoms, particularly those associated with headache and painful temporomandibular disorders (TMD). METHODS: Data from the Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) prospective cohort study were used. During the 2.5-year median follow-up period, quarterly health update (QHU) questionnaires were completed by 1,475 women aged 18 to 44 years who did not have TMD, menopause, hysterectomy, or hormone replacement therapy use at baseline. QHU questionnaires evaluated HC use, symptoms of headache and TMD, and pain of ≥ 1 day duration in 12 body regions. Participants who developed TMD symptoms were examined to classify clinical TMD. Headache symptoms were classified based on the International Classification of Headache Disorders 3 (ICHD-3). Associations between HC use and pain symptoms were analyzed using generalized estimating equations and Cox models. RESULTS: HC use, endorsed in 33.7% of QHU questionnaires, was significantly associated with concurrent symptoms of TMD (odds ratio [OR]: 1.20, 95% CI: 1.06 to 1.35) and headache (OR: 1.26, 95% CI: 1.11 to 1.43). HC use was also significantly associated with concurrent pain of ≥ 1 day duration in the head (OR: 1.38, 95% CI: 1.16 to 1.63), face (OR: 1.44, 95% CI: 1.13 to 1.83), and legs (OR: 1.22, 95% CI: 1.01 to 1.47), but not elsewhere. Initiation of HC use was associated with increased odds of subsequent TMD symptoms (OR: 1.37, 95% CI: 1.13 to 1.66) and pain of ≥ 1 day in the head (OR: 1.37, 95% CI: 1.01 to 1.85). Discontinuing HC use was associated with lower odds of subsequent headache (OR: 0.82, 95% CI: 0.67 to 0.99). HC use was not significantly associated with subsequent onset of examiner-classified TMD. CONCLUSION: These findings imply that HC influences craniofacial pain, and that this pain diminishes after cessation of HC use.


Subjects
Contraceptive Agents , Facial Pain , Facial Pain/chemically induced , Female , Headache/chemically induced , Humans , Prospective Studies , Risk Assessment , Risk Factors
14.
BMC Public Health ; 21(1): 1007, 2021 05 28.
Article in English | MEDLINE | ID: mdl-34049526

ABSTRACT

BACKGROUND: Identifying county-level characteristics associated with high coronavirus disease 2019 (COVID-19) burden can allow for data-driven, equitable allocation of public health intervention resources and reduce burdens on health care systems. METHODS: Synthesizing data from various government and nonprofit institutions for all 3142 United States (US) counties, we studied county-level characteristics that were associated with cumulative and weekly case and death rates through 12/21/2020. We used generalized linear mixed models to model cumulative and weekly (40 repeated measures per county) cases and deaths. Cumulative and weekly models included state fixed effects and county-specific random effects. Weekly models additionally allowed covariate effects to vary by season and included US Census region-specific B-splines to adjust for temporal trends. RESULTS: Rural counties, counties with more minorities and white/non-white segregation, and counties with more people with no high school diploma and with medical comorbidities were associated with higher cumulative COVID-19 case and death rates. In the spring, urban counties and counties with more minorities and white/non-white segregation were associated with increased weekly case and death rates. In the fall, rural counties were associated with larger weekly case and death rates. In the spring, summer, and fall, counties with more residents with socioeconomic disadvantage and medical comorbidities were associated with greater weekly case and death rates. CONCLUSIONS: These county-level associations are based on complete data from the entire country, come from a single modeling framework that longitudinally analyzes the US COVID-19 pandemic at the county level, and are applicable to guiding government resource allocation policies to different US counties.


Subjects
COVID-19 , Social Segregation , Humans , Pandemics , Rural Population , SARS-CoV-2 , United States/epidemiology
15.
Genet Epidemiol ; 45(1): 99-114, 2021 02.
Article in English | MEDLINE | ID: mdl-32924180

ABSTRACT

Clinical trial results have recently demonstrated that inhibiting inflammation by targeting the interleukin-1β pathway can offer a significant reduction in lung cancer incidence and mortality, highlighting a pressing and unmet need to understand the benefits of inflammation-focused lung cancer therapies at the genetic level. While numerous genome-wide association studies (GWAS) have explored the genetic etiology of lung cancer, there remains a large gap between the type of information that may be gleaned from an association study and the depth of understanding necessary to explain and drive translational findings. Thus, in this study we jointly model and integrate extensive multiomics data sources, utilizing a total of 40 genome-wide functional annotations that augment previously published results from the International Lung Cancer Consortium (ILCCO) GWAS, to prioritize and characterize single nucleotide polymorphisms (SNPs) that increase risk of squamous cell lung cancer through the inflammatory and immune responses. Our work bridges the gap between correlative analysis and translational follow-up research, refining GWAS association measures in an interpretable and systematic manner. In particular, reanalysis of the ILCCO data highlights the impact of highly associated SNPs from nuclear factor-κB signaling pathway genes as well as major histocompatibility complex mediated variation in immune responses. One consequence of prioritizing likely functional SNPs is the pruning of variants that might be selected for follow-up work by over an order of magnitude, from potentially tens of thousands to hundreds. The strategies we introduce provide informative and interpretable approaches for incorporating extensive genome-wide annotation data in analysis of genetic association studies.


Subjects
Genome-Wide Association Study , Lung Neoplasms , Epithelial Cells , Genetic Predisposition to Disease , Humans , Inflammation/genetics , Lung Neoplasms/genetics , Genetic Models , Single Nucleotide Polymorphism
16.
Pain ; 162(5): 1528-1538, 2021 05 01.
Article in English | MEDLINE | ID: mdl-33259458

ABSTRACT

Traditional classification and prognostic approaches for chronic pain conditions focus primarily on anatomically based clinical characteristics rather than on the underlying biopsychosocial factors that contribute to perception of clinical pain and future pain trajectories. Using a supervised clustering approach in a cohort of temporomandibular disorder cases and controls from the Orofacial Pain: Prospective Evaluation and Risk Assessment study, we recently developed and validated a rapid algorithm (ROPA) to pragmatically classify chronic pain patients into 3 groups that differed in clinical pain report, biopsychosocial profiles, functional limitations, and comorbid conditions. The present aim was to examine the generalizability of this clustering procedure in 2 additional cohorts: a cohort of patients with chronic overlapping pain conditions (Complex Persistent Pain Conditions study) and a real-world clinical population of patients seeking treatment at Duke Innovative Pain Therapies. In each cohort, we applied ROPA for cluster prediction, which requires only 4 input variables: pressure pain threshold and anxiety, depression, and somatization scales. In both the Complex Persistent Pain Conditions and Duke Innovative Pain Therapies cohorts, we distinguished 3 clusters, including one with more severe clinical characteristics and psychological distress. We observed strong concordance with observed cluster solutions, indicating the ROPA method allows for reliable subtyping of clinical populations with minimal patient burden. The ROPA clustering algorithm represents a rapid and valid stratification tool independent of anatomic diagnosis. ROPA holds promise in classifying patients based on pathophysiological mechanisms rather than structural or anatomical diagnoses. As such, this method of classifying patients will facilitate personalized pain medicine for patients with chronic pain.


Subjects
Chronic Pain , Anxiety Disorders , Chronic Pain/diagnosis , Cluster Analysis , Facial Pain , Humans , Prospective Studies
17.
medRxiv ; 2021 Jan 12.
Article in English | MEDLINE | ID: mdl-33300014

ABSTRACT

Identifying areas with high COVID-19 burden and their characteristics can help improve vaccine distribution and uptake, reduce burdens on health care systems, and allow for better allocation of public health intervention resources. Synthesizing data from various government and nonprofit institutions for 3,142 United States (US) counties as of 12/21/2020, we studied county-level characteristics that are associated with cumulative case and death rates using regression analyses. Our results showed that counties that are more rural, counties with more White/non-White segregation, and counties with higher percentages of people of color, in poverty, with no high school diploma, and with medical comorbidities such as diabetes and hypertension are associated with higher cumulative COVID-19 case and death rates. We identify the hardest-hit counties in the US using model-estimated case and death rates, which provide more reliable estimates of cumulative COVID-19 burdens than those using raw observed county-specific rates. Identification of counties with high disease burdens and understanding the characteristics of these counties can help inform policies to improve vaccine distribution, deployment and uptake, prevent overwhelming health care systems, and enhance testing access, personal protection equipment access, and other resource allocation efforts, all of which can help save more lives for vulnerable communities.

SIGNIFICANCE STATEMENT: We found that counties that are more rural, counties with more White/non-White segregation, and counties with higher percentages of people of color, in poverty, with no high school diploma, and with medical comorbidities such as diabetes and hypertension are associated with higher cumulative COVID-19 case and death rates. We also identified individual counties with high cumulative COVID-19 burden. Identification of counties with high disease burdens and understanding the characteristics of these counties can help inform policies to improve vaccine distribution, deployment and uptake, prevent overwhelming health care systems, and enhance testing access, personal protection equipment access, and other resource allocation efforts, all of which can help save more lives for vulnerable communities.

18.
Nat Genet ; 52(9): 969-983, 2020 09.
Article in English | MEDLINE | ID: mdl-32839606

ABSTRACT

Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce 'annotation principal components', multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.


Subjects
Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genome/genetics , LDL Cholesterol/genetics , Computer Simulation , Genome-Wide Association Study/methods , Humans , Genetic Models , Molecular Sequence Annotation/methods , Phenotype , Whole Genome Sequencing/methods
19.
Am J Hum Genet ; 106(1): 112-120, 2020 01 02.
Article in English | MEDLINE | ID: mdl-31883642

ABSTRACT

Whole-genome sequencing (WGS) can improve assessment of low-frequency and rare variants, particularly in non-European populations that have been underrepresented in existing genomic studies. The genetic determinants of C-reactive protein (CRP), a biomarker of chronic inflammation, have been extensively studied, with existing genome-wide association studies (GWASs) conducted in >200,000 individuals of European ancestry. In order to discover novel loci associated with CRP levels, we examined a multi-ancestry population (n = 23,279) with WGS (∼38× coverage) from the Trans-Omics for Precision Medicine (TOPMed) program. We found evidence for eight distinct associations at the CRP locus, including two variants that have not been identified previously (rs11265259 and rs181704186), both of which are non-coding and more common in individuals of African ancestry (∼10% and ∼1% minor allele frequency, respectively, and rare or monomorphic in 1000 Genomes populations of East Asian, South Asian, and European ancestry). We show that the minor (G) allele of rs181704186 is associated with lower CRP levels and decreased transcriptional activity and protein binding in vitro, providing a plausible molecular mechanism for this African ancestry-specific signal. The individuals homozygous for rs181704186-G have a mean CRP level of 0.23 mg/L, in contrast to individuals heterozygous for rs181704186 with mean CRP of 2.97 mg/L and major allele homozygotes with mean CRP of 4.11 mg/L. This study demonstrates the utility of WGS in multi-ethnic populations to drive discovery of complex trait associations of large effect and to identify functional alleles in noncoding regulatory regions.


Subjects
Asian People/genetics , Black People/genetics , C-Reactive Protein/genetics , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , White People/genetics , Whole Genome Sequencing/methods , Cohort Studies , Gene Frequency , Genome-Wide Association Study , Humans , Linkage Disequilibrium
20.
Bioinformatics ; 35(22): 4568-4576, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31062858

ABSTRACT

MOTIVATION: Cancer genomics studies frequently aim to identify genes that are differentially expressed between clinically distinct patient subgroups, generally by testing single genes one at a time. However, the results of any individual transcriptomic study are often not fully reproducible. A particular challenge impeding statistical analysis is the difficulty of distinguishing between differential expression comprising part of the genomic disease etiology and that induced by downstream effects. More robust analytical approaches that are well-powered to detect potentially causative genes, are less prone to discovering spurious associations, and can deliver reproducible findings across different studies are needed. RESULTS: We propose a set-based procedure for testing of differential expression and show that this set-based approach can produce more robust results by aggregating information across multiple, correlated genomic markers. Specifically, we adapt the Generalized Berk-Jones statistic to test for the transcription factors that may contribute to the progression of estrogen receptor positive breast cancer. We demonstrate the ability of our method to produce reproducible findings by applying the same analysis to 21 publicly available datasets, producing a similar list of significant transcription factors across most studies. Our Generalized Berk-Jones approach produces results that show improved consistency over three set-based testing algorithms: Generalized Higher Criticism, Gene Set Analysis and Gene Set Enrichment Analysis. AVAILABILITY AND IMPLEMENTATION: Data are in the MetaGxBreast R package. Code is available at github.com/ryanrsun/gaynor_sun_GBJ_breast_cancer. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gene Expression Profiling , Algorithms , Breast Neoplasms , Genome , Humans , Transcriptome