1.
Am J Hum Genet ; 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38897203

ABSTRACT

Type 2 diabetes (T2D) is a major risk factor for heart failure (HF) and has elevated incidence among individuals with HF. Since genetics and HF can independently influence T2D, collider bias may occur when T2D (i.e., the collider) is controlled for by design or analysis. Thus, we conducted a genome-wide association study (GWAS) of diabetes-related HF with correction for collider bias. We first performed a GWAS of HF to identify genetic instrumental variables (GIVs) for HF and to enable bidirectional Mendelian randomization (MR) analysis between T2D and HF. We identified 61 genomic loci significantly associated with all-cause HF in 114,275 individuals with HF and over 1.5 million controls of European ancestry. Using a two-sample bidirectional MR approach with 59 and 82 GIVs for HF and T2D, respectively, we estimated that T2D increased HF risk (odds ratio [OR] 1.07, 95% confidence interval [CI] 1.04-1.10), while HF also increased T2D risk (OR 1.60, 95% CI 1.36-1.88). Then we performed a GWAS of diabetes-related HF corrected for collider bias due to the study design of index cases. After removing the spurious association of the TCF7L2 locus due to collider bias, we identified two genome-wide significant loci close to PITX2 (chromosome 4) and CDKN2B-AS1 (chromosome 9) associated with diabetes-related HF in the Million Veteran Program and replicated the associations in the UK Biobank. Our MR findings provide strong evidence that HF increases T2D risk. As a result, collider bias leads to spurious genetic associations of diabetes-related HF, which can be effectively corrected to identify true positive loci.
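
As a rough illustration of how a two-sample MR estimate can be formed from summary statistics, the sketch below computes an inverse-variance weighted (IVW) estimate. The abstract does not say which MR estimator the authors used, so IVW is an assumption here, and the function name `ivw_estimate` and the toy inputs are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate (illustrative).

    Each list holds one value per genetic instrument:
      beta_exposure: SNP effect on the exposure
      beta_outcome:  SNP effect on the outcome (log odds for a binary outcome)
      se_outcome:    standard error of beta_outcome
    Returns (estimate, standard_error) on the outcome's log-odds scale.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per instrument")
    # Weighted regression of outcome effects on exposure effects through the origin.
    numerator = sum(bx * by / se**2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx**2 / se**2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Toy numbers only (not taken from the study): every instrument has ratio 0.5.
estimate, se = ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.02])
odds_ratio = math.exp(estimate)  # convert log odds to an odds ratio
```

Exponentiating the estimate gives an odds ratio per unit change in the exposure, which is the scale on which the abstract reports its results.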

2.
Am J Hum Genet ; 109(3): 433-445, 2022 03 03.
Article in English | MEDLINE | ID: mdl-35196515

ABSTRACT

Biobanks linked to massive, longitudinal electronic health record (EHR) data make numerous new genetic research questions feasible. One of these is the study of biomarker trajectories. For example, high blood pressure measurements over visits strongly predict stroke onset, and consistently high fasting glucose and HbA1c levels define diabetes. Recent research reveals that not only the mean level of biomarker trajectories but also their fluctuations, or within-subject (WS) variability, are risk factors for many diseases. Glycemic variation, for instance, has recently come to be considered an important clinical metric in diabetes management. It is crucial to identify the genetic factors that shift the mean or alter the WS variability of a biomarker trajectory. Compared to traditional cross-sectional studies, trajectory analysis utilizes more data points and captures a complete picture of the impact of time-varying factors, including medication history and lifestyle. Currently, there are no efficient tools for genome-wide association studies (GWASs) of biomarker trajectories at the biobank scale, even for just mean effects. We propose TrajGWAS, a linear mixed effect model-based method for testing genetic effects that shift the mean or alter the WS variability of a biomarker trajectory. It is scalable to biobank data with 100,000 to 1,000,000 individuals and many longitudinal measurements and robust to distributional assumptions. Simulation studies corroborate that TrajGWAS controls the type I error rate and is powerful. Analysis of eleven biomarkers measured longitudinally and extracted from UK Biobank primary care data for more than 150,000 participants with 1,800,000 observations reveals loci that significantly alter the mean or WS variability.


Subject(s)
Biological Specimen Banks , Genome-Wide Association Study , Biomarkers , Cross-Sectional Studies , Electronic Health Records , Humans , Longitudinal Studies
3.
Mol Ther ; 32(6): 1849-1874, 2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38584391

ABSTRACT

The clinical potential of current FDA-approved chimeric antigen receptor (CAR)-engineered T (CAR-T) cell therapy is encumbered by its autologous nature, which presents notable challenges related to manufacturing complexities, heightened costs, and limitations in patient selection. Therefore, there is a growing demand for off-the-shelf universal cell therapies. In this study, we have generated universal CAR-engineered NKT (UCAR-NKT) cells by integrating iNKT TCR engineering and HLA gene editing on hematopoietic stem cells (HSCs), along with an ex vivo, feeder-free HSC differentiation culture. The UCAR-NKT cells are produced with high yield, purity, and robustness, and they display a stable HLA-ablated phenotype that enables resistance to host cell-mediated allorejection. These UCAR-NKT cells exhibit potent antitumor efficacy to blood cancers and solid tumors, both in vitro and in vivo, employing a multifaceted array of tumor-targeting mechanisms. These cells are further capable of altering the tumor microenvironment by selectively depleting immunosuppressive tumor-associated macrophages and myeloid-derived suppressor cells. In addition, UCAR-NKT cells demonstrate a favorable safety profile with low risks of graft-versus-host disease and cytokine release syndrome. Collectively, these preclinical studies underscore the feasibility and significant therapeutic potential of UCAR-NKT cell products and lay a foundation for their translational and clinical development.


Subject(s)
Hematopoietic Stem Cells , Immunotherapy, Adoptive , Natural Killer T-Cells , Receptors, Chimeric Antigen , Humans , Hematopoietic Stem Cells/metabolism , Hematopoietic Stem Cells/cytology , Hematopoietic Stem Cells/immunology , Animals , Receptors, Chimeric Antigen/immunology , Receptors, Chimeric Antigen/genetics , Receptors, Chimeric Antigen/metabolism , Immunotherapy, Adoptive/methods , Mice , Natural Killer T-Cells/immunology , Natural Killer T-Cells/metabolism , Gene Editing , Xenograft Model Antitumor Assays , Neoplasms/therapy , Neoplasms/immunology , Cell Line, Tumor , Receptors, Antigen, T-Cell/metabolism , Receptors, Antigen, T-Cell/genetics , Receptors, Antigen, T-Cell/immunology
4.
Hum Mol Genet ; 31(22): 3873-3885, 2022 11 10.
Article in English | MEDLINE | ID: mdl-35766891

ABSTRACT

RATIONALE: Genetic variation has a substantial contribution to chronic obstructive pulmonary disease (COPD) and lung function measurements. Heritability estimates using genome-wide genotyping data can be biased if analyses do not appropriately account for the nonuniform distribution of genetic effects across the allele frequency and linkage disequilibrium (LD) spectrum. In addition, the contribution of rare variants has been unclear. OBJECTIVES: We sought to assess the heritability of COPD and lung function using whole-genome sequence data from the Trans-Omics for Precision Medicine program. METHODS: Using the genome-based restricted maximum likelihood method, we partitioned the genome into bins based on minor allele frequency and LD scores and estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio in 11 051 European ancestry and 5853 African-American participants. MEASUREMENTS AND MAIN RESULTS: In European ancestry participants, the estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio were 35.5%, 55.6% and 32.5%, of which 18.8%, 19.7%, 17.8% were from common variants, and 16.6%, 35.8%, and 14.6% were from rare variants. These estimates had wide confidence intervals, with common variants and some sets of rare variants showing a statistically significant contribution (P-value < 0.05). In African-Americans, common variant heritability was similar to European ancestry participants, but lower sample size precluded calculation of rare variant heritability. CONCLUSIONS: Our study provides updated and unbiased estimates of heritability for COPD and lung function, and suggests an important contribution of rare variants. Larger studies of more diverse ancestry will improve accuracy of these estimates.


Subject(s)
Genetic Predisposition to Disease , Pulmonary Disease, Chronic Obstructive , Humans , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/genetics , Genome-Wide Association Study , Phenotype
5.
Bioinformatics ; 39(4)2023 04 03.
Article in English | MEDLINE | ID: mdl-37067496

ABSTRACT

MOTIVATION: In a genome-wide association study, analyzing multiple correlated traits simultaneously is potentially superior to analyzing the traits one by one. Standard methods for multivariate genome-wide association study operate marker-by-marker and are computationally intensive. RESULTS: We present a sparsity constrained regression algorithm for multivariate genome-wide association study based on iterative hard thresholding and implement it in a convenient Julia package MendelIHT.jl. In simulation studies with up to 100 quantitative traits, iterative hard thresholding exhibits similar true positive rates, smaller false positive rates, and faster execution times than GEMMA's linear mixed models and mv-PLINK's canonical correlation analysis. On UK Biobank data with 470 228 variants, MendelIHT completed a three-trait joint analysis (n=185 656) in 20 h and an 18-trait joint analysis (n=104 264) in 53 h with an 80 GB memory footprint. In short, MendelIHT enables geneticists to fit a single regression model that simultaneously considers the effect of all SNPs and dozens of traits. AVAILABILITY AND IMPLEMENTATION: Software, documentation, and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelIHT.jl.


Subject(s)
Genome-Wide Association Study , Software , Algorithms , Computer Simulation , Phenotype , Polymorphism, Single Nucleotide
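
To make the idea of iterative hard thresholding concrete, here is a minimal single-trait least-squares sketch in plain Python. It is not the MendelIHT.jl API or its multivariate algorithm. The function names, the fixed step size, and the toy design matrix are assumptions chosen for illustration.

```python
def hard_threshold(vector, k):
    """Keep the k largest-magnitude entries of vector and zero out the rest."""
    keep = sorted(range(len(vector)), key=lambda j: abs(vector[j]), reverse=True)[:k]
    projected = [0.0] * len(vector)
    for j in keep:
        projected[j] = vector[j]
    return projected


def iht_least_squares(X, y, k, step, iterations=100):
    """Projected gradient descent for least squares with at most k nonzero effects.

    X is a list of rows (samples by predictors); y is the response.
    `step` is a fixed step size (an assumption; it should not exceed the
    reciprocal of the largest eigenvalue of X'X for the iteration to be stable).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iterations):
        residual = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # X' * residual is the negative gradient of 0.5 * ||y - X beta||^2.
        descent = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(p)]
        beta = hard_threshold([beta[j] + step * descent[j] for j in range(p)], k)
    return beta


# Toy design with orthogonal columns; only the second predictor has an effect.
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [2, -2, 2, -2]
beta_hat = iht_least_squares(X, y, k=1, step=0.25)
```

The sparsity level `k` plays the role that a penalty strength plays in lasso-type methods. In practice it would typically be chosen by cross-validation rather than fixed in advance.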
6.
Am J Respir Crit Care Med ; 206(10): 1220-1229, 2022 11 15.
Article in English | MEDLINE | ID: mdl-35771531

ABSTRACT

Rationale: A common MUC5B gene polymorphism, rs35705950-T, is associated with idiopathic pulmonary fibrosis (IPF), but its role in severe acute respiratory syndrome coronavirus 2 infection and disease severity is unclear. Objectives: To assess whether rs35705950-T confers differential risk for clinical outcomes associated with coronavirus disease (COVID-19) infection among participants in the Million Veteran Program (MVP). Methods: The MUC5B rs35705950-T allele was directly genotyped among MVP participants; clinical events and comorbidities were extracted from the electronic health records. Associations between the incidence or severity of COVID-19 and rs35705950-T were analyzed within each ancestry group in the MVP followed by transancestry meta-analysis. Replication and joint meta-analysis were conducted using summary statistics from the COVID-19 Host Genetics Initiative (HGI). Sensitivity analyses with adjustment for additional covariates (body mass index, Charlson comorbidity index, smoking, asbestosis, rheumatoid arthritis with interstitial lung disease, and IPF) and associations with post-COVID-19 pneumonia were performed in MVP subjects. Measurements and Main Results: The rs35705950-T allele was associated with fewer COVID-19 hospitalizations in transancestry meta-analyses within the MVP (Ncases = 4,325; Ncontrols = 507,640; OR, 0.89 [0.82-0.97]; P = 6.86 × 10⁻³) and joint meta-analyses with the HGI (Ncases = 13,320; Ncontrols = 1,508,841; OR, 0.90 [0.86-0.95]; P = 8.99 × 10⁻⁵). The rs35705950-T allele was not associated with reduced COVID-19 positivity in transancestry meta-analysis within the MVP (Ncases = 19,168; Ncontrols = 492,854; OR, 0.98 [0.95-1.01]; P = 0.06) but was nominally significant (P < 0.05) in the joint meta-analysis with the HGI (Ncases = 44,820; Ncontrols = 1,775,827; OR, 0.97 [0.95-1.00]; P = 0.03). Associations were not observed with severe outcomes or mortality. Among individuals of European ancestry in the MVP, rs35705950-T was associated with fewer post-COVID-19 pneumonia events (OR, 0.82 [0.72-0.93]; P = 0.001). Conclusions: The MUC5B variant rs35705950-T may confer protection in COVID-19 hospitalizations.


Subject(s)
COVID-19 , Idiopathic Pulmonary Fibrosis , Humans , COVID-19/epidemiology , COVID-19/genetics , Mucin-5B/genetics , Polymorphism, Genetic , Idiopathic Pulmonary Fibrosis/genetics , Genotype , Hospitalization , Genetic Predisposition to Disease/genetics
7.
Stat Sci ; 37(4): 494-518, 2022 Nov.
Article in English | MEDLINE | ID: mdl-37168541

ABSTRACT

Technological advances in the past decade, hardware and software alike, have made access to high-performance computing (HPC) easier than ever. We review these advances from a statistical computing perspective. Cloud computing makes access to supercomputers affordable. Deep learning software libraries make programming statistical algorithms easy and enable users to write code once and run it anywhere, from a laptop to a workstation with multiple graphics processing units (GPUs) or a supercomputer in a cloud. Highlighting how these developments benefit statisticians, we review recent optimization algorithms that are useful for high-dimensional models and can harness the power of HPC. Code snippets are provided to demonstrate the ease of programming. We also provide an easy-to-use distributed matrix data structure suitable for HPC. Employing this data structure, we illustrate various statistical applications including large-scale positron emission tomography and ℓ1-regularized Cox regression. Our examples easily scale up to an 8-GPU workstation and a 720-CPU-core cluster in a cloud. As a case in point, we analyze the onset of type-2 diabetes from the UK Biobank with 200,000 subjects and about 500,000 single nucleotide polymorphisms using the HPC ℓ1-regularized Cox regression. Fitting this half-million-variate model takes less than 45 minutes and reconfirms known associations. To our knowledge, this is the first demonstration of the feasibility of penalized regression of survival outcomes at this scale.

8.
Genet Epidemiol ; 44(3): 248-260, 2020 04.
Article in English | MEDLINE | ID: mdl-31879980

ABSTRACT

Logistic regression is the primary analysis tool for binary traits in genome-wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of clinical information and (b) derived phenotypes generated by specific phenotyping algorithms for electronic health records (EHR). GWAS of ordinal traits have been problematic. Dichotomizing can lead to a range of arbitrary cutoff values, generating inconsistent, hard to interpret results. Using multinomial regression ignores trait value hierarchy and potentially loses power. Treating ordinal data as quantitative can lead to misleading inference. To address these issues, we analyze ordinal traits with an ordered, multinomial model. This approach increases power and leads to more interpretable results. We derive efficient algorithms for computing test statistics, making ordinal trait GWAS computationally practical for Biobank scale data. Our method is available as a Julia package OrdinalGWAS.jl. Application to a COPDGene study confirms previously found signals based on binary case-control status, but with more significance. Additionally, we demonstrate the capability of our package to run on UK Biobank data by analyzing hypertension as an ordinal trait.


Subject(s)
Biological Specimen Banks , Genome-Wide Association Study , Algorithms , Case-Control Studies , Computer Simulation , Humans , Hypertension/genetics , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/genetics , Pulmonary Disease, Chronic Obstructive/physiopathology , Regression Analysis , Respiratory Function Tests
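
One common way to model an ordered trait is the proportional-odds (cumulative logit) model, sketched below. The abstract refers only to "an ordered, multinomial model", so the specific link function here is an assumption. The function name `ordinal_probabilities` and the cutpoint values are hypothetical and are not part of OrdinalGWAS.jl.

```python
import math


def ordinal_probabilities(eta, cutpoints):
    """Category probabilities under a proportional-odds model.

    eta:       linear predictor for one individual (e.g. covariates plus SNP effect)
    cutpoints: strictly increasing thresholds, one fewer than the number of categories
    Uses P(Y <= j) = 1 / (1 + exp(-(cutpoints[j] - eta))).
    """
    if any(b <= a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        # Each category's probability is the difference of adjacent cumulative values.
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Three ordered categories (for example normal, elevated, hypertensive).
baseline = ordinal_probabilities(0.0, [-1.0, 1.0])
shifted = ordinal_probabilities(1.0, [-1.0, 1.0])  # larger eta favors higher categories
```

Because a single linear predictor shifts all cumulative probabilities together, the model uses the ordering of the categories without requiring them to be dichotomized or treated as a quantitative scale.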
9.
Cardiovasc Diabetol ; 20(1): 232, 2021 12 08.
Article in English | MEDLINE | ID: mdl-34879878

ABSTRACT

AIMS: Low C-peptide levels, indicating beta-cell dysfunction, are associated with increased within-day glucose variation and hypoglycemia. In advanced type 2 diabetes, severe hypoglycemia and increased glucose variation predict cardiovascular disease (CVD) risk. The present study examined the association between C-peptide levels and CVD risk and whether it can be explained by visit-to-visit glucose variation and severe hypoglycemia. MATERIALS AND METHODS: Fasting C-peptide levels at baseline, composite CVD outcome, severe hypoglycemia, and visit-to-visit fasting glucose coefficient of variation (CV) and average real variability (ARV) were assessed in 1565 Veterans Affairs Diabetes Trial participants. RESULTS: There was a U-shaped relationship between C-peptide and CVD risk, with increased risk with declining levels in the low range (< 0.50 nmol/l, HR 1.30 [95% CI 1.05-1.60], p = 0.02) and with rising levels in the high range (> 1.23 nmol/l, 1.27 [1.00-1.63], p = 0.05). C-peptide levels were inversely associated with the risk of severe hypoglycemia (OR 0.68 [0.60-0.77]) and visit-to-visit glucose variation (CV, standardized beta-estimate -0.12 [SE 0.01]; ARV, -0.10 [0.01]) (p < 0.0001 for all). The association of low C-peptide levels with CVD risk was independent of cardiometabolic risk factors (1.48 [1.17-1.87], p = 0.001) and remained associated with CVD when tested in the same model with severe hypoglycemia and glucose CV. CONCLUSIONS: Low C-peptide levels were associated with increased CVD risk in advanced type 2 diabetes. The association was independent of increases in glucose variation or severe hypoglycemia. C-peptide levels may predict future glucose control patterns and CVD risk, and identify phenotypes influencing clinical decision making in advanced type 2 diabetes.


Subject(s)
Blood Glucose/metabolism , C-Peptide/blood , Cardiovascular Diseases/epidemiology , Diabetes Mellitus, Type 2/blood , Fasting/blood , Hypoglycemia/blood , Aged , Biomarkers/blood , Blood Glucose/drug effects , Cardiovascular Diseases/diagnosis , Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/drug therapy , Diabetes Mellitus, Type 2/epidemiology , Female , Glycemic Control , Heart Disease Risk Factors , Humans , Hypoglycemia/diagnosis , Hypoglycemia/epidemiology , Hypoglycemic Agents/therapeutic use , Male , Middle Aged , Prognosis , Randomized Controlled Trials as Topic , Risk Assessment , Severity of Illness Index , Time Factors , United States/epidemiology , United States Department of Veterans Affairs
10.
Curr Cardiol Rep ; 23(4): 25, 2021 03 02.
Article in English | MEDLINE | ID: mdl-33655430

ABSTRACT

PURPOSE OF REVIEW: There is evidence from epidemiologic studies that variability in cardiovascular risk factors influences risk of cardiovascular disease. We review new studies and novel findings in the relationship between visit-to-visit glycemic variability and blood pressure variability and risk of adverse outcomes. RECENT FINDINGS: Visit-to-visit glycemic variability is consistently linked to macrovascular disease. This relationship has been observed in both clinical trials and retrospective studies of electronic health records. Long-term blood pressure variability also predicts cardiovascular outcomes, and the association appears stronger in those with lower levels of systolic and diastolic function. As epidemiologic evidence increases in support of a role for metabolic risk factor variability in cardiovascular risk, there is a corresponding rise in interest in applying this information toward improving risk factor prediction and treatment. Future investigation of underlying mechanisms for these associations as well as implications for therapy is also warranted. The potential additive contribution of variability of multiple parameters also merits additional scrutiny. As our technology for capturing risk factor variability continues to improve, this will only enhance our understanding of its links with vascular disease and how to best utilize this information to reduce cardiovascular outcomes.


Subject(s)
Cardiovascular Diseases , Diabetes Mellitus, Type 2 , Blood Glucose , Blood Pressure , Cardiovascular Diseases/epidemiology , Humans , Retrospective Studies , Risk Factors
11.
Genet Epidemiol ; 43(3): 250-262, 2019 04.
Article in English | MEDLINE | ID: mdl-30623484

ABSTRACT

In metagenomic studies, testing the association between microbiome composition and clinical outcomes translates to testing the nullity of variance components. Motivated by a lung human immunodeficiency virus (HIV) microbiome project, we study longitudinal microbiome data by using variance component models with more than two variance components. Current testing strategies only apply to models with exactly two variance components and when sample sizes are large. Therefore, they are not applicable to longitudinal microbiome studies. In this paper, we propose exact tests (score test, likelihood ratio test, and restricted likelihood ratio test) to (a) test the association of the overall microbiome composition in a longitudinal design and (b) detect the association of one specific microbiome cluster while adjusting for the effects from related clusters. Our approach combines the exact tests for null hypothesis with a single variance component with a strategy of reducing multiple variance components to a single one. Simulation studies demonstrate that our method has a correct type I error rate and superior power compared to existing methods at small sample sizes and weak signals. Finally, we apply our method to a longitudinal pulmonary microbiome study of HIV-infected patients and reveal two interesting genera Prevotella and Veillonella associated with forced vital capacity. Our findings shed light on the impact of the lung microbiome on HIV complexities. The method is implemented in the open-source, high-performance computing language Julia and is freely available at https://github.com/JingZhai63/VCmicrobiome.


Subject(s)
Microbiota , Models, Genetic , Computer Simulation , Humans , Longitudinal Studies , Lung/microbiology
12.
Mol Cancer ; 19(1): 159, 2020 11 12.
Article in English | MEDLINE | ID: mdl-33176804

ABSTRACT

One unmet challenge in lung cancer diagnosis is to accurately differentiate lung cancer from other lung diseases with similar clinical symptoms and radiological features, such as pulmonary tuberculosis (TB). To identify reliable biomarkers for lung cancer screening, we leverage the recently discovered non-canonical small non-coding RNAs (i.e., tRNA-derived small RNAs [tsRNAs], rRNA-derived small RNAs [rsRNAs], and YRNA-derived small RNAs [ysRNAs]) in human peripheral blood mononuclear cells and develop a molecular signature composed of distinct ts/rs/ysRNAs (TRY-RNA). Our TRY-RNA signature precisely discriminates between control, lung cancer, and pulmonary TB subjects in both the discovery and validation cohorts and outperforms microRNA-based biomarkers, which bears the diagnostic potential for lung cancer screening.


Subject(s)
Biomarkers, Tumor/genetics , Gene Expression Regulation, Neoplastic , Leukocytes, Mononuclear/metabolism , Lung Neoplasms/diagnosis , RNA, Small Untranslated/genetics , Case-Control Studies , Cohort Studies , Humans , Lung Neoplasms/blood , Lung Neoplasms/genetics , Prognosis , RNA, Small Untranslated/blood
13.
Hum Genet ; 139(1): 61-71, 2020 Jan.
Article in English | MEDLINE | ID: mdl-30915546

ABSTRACT

Statistical methods for genome-wide association studies (GWAS) continue to improve. However, the increasing volume and variety of genetic and genomic data make computational speed and ease of data manipulation mandatory in future software. In our view, a collaborative effort of statistical geneticists is required to develop open source software targeted to genetic epidemiology. Our attempt to meet this need is called the OPENMENDEL project (https://openmendel.github.io). It aims to (1) enable interactive and reproducible analyses with informative intermediate results, (2) scale to big data analytics, (3) embrace parallel and distributed computing, (4) adapt to rapid hardware evolution, (5) allow cloud computing, (6) allow integration of varied genetic data types, and (7) foster easy communication between clinicians, geneticists, statisticians, and computer scientists. This article reviews and makes recommendations to the genetic epidemiology community in the context of the OPENMENDEL project.


Subject(s)
Computational Biology/methods , Genome, Human , Genome-Wide Association Study , Models, Statistical , Programming Languages , Algorithms , Humans , Polymorphism, Single Nucleotide , Software
14.
RNA ; 24(11): 1443-1456, 2018 11.
Article in English | MEDLINE | ID: mdl-30093490

ABSTRACT

Circular RNAs (circRNAs) are a novel class of regulatory RNAs. Here, we present a comprehensive investigation of circRNA expression profiles across 11 tissues and four developmental stages in rats, along with cross-species analyses in humans and mice. Although the expression of circRNAs is positively correlated with that of cognate mRNAs, highly expressed genes tend to splice a larger fraction of circular transcripts. Moreover, circRNAs exhibit higher tissue specificity than cognate mRNAs. Intriguingly, while we observed a monotonic increase of circRNA abundance with age in the rat brain, we further discovered a dynamic, age-dependent pattern of circRNA expression in the testes that is characterized by a dramatic increase with advancing stages of sexual maturity and a decrease with aging. The age-sensitive testicular circRNAs are highly associated with spermatogenesis, independent of cognate mRNA expression. The tissue/age implications of circRNAs suggest that they present unique physiological functions rather than simply occurring as occasional by-products of gene transcription.


Subject(s)
Gene Expression Regulation, Developmental , RNA/genetics , Transcriptome , Age Factors , Animals , Gene Expression Profiling , Male , Organ Specificity/genetics , RNA, Circular , Rats , Testis/metabolism
15.
Am J Hum Genet ; 96(5): 797-807, 2015 May 07.
Article in English | MEDLINE | ID: mdl-25957468

ABSTRACT

High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals' microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. "Optimal" MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for.


Subject(s)
Genetics, Population , Microbiota/genetics , Models, Statistical , Computer Simulation , High-Throughput Nucleotide Sequencing , Humans , Phylogeny , Polymorphism, Single Nucleotide , Software
18.
Hum Hered ; 79(2): 93-104, 2015.
Article in English | MEDLINE | ID: mdl-26111731

ABSTRACT

Many correlated disease variables are analyzed jointly in genetic studies in the hope of increasing power to detect causal genetic variants. One approach involves assessing the relationship between each phenotype and each SNP individually and using a Bonferroni correction for the effective number of tests conducted. Alternatively, one can apply a multivariate regression or a dimension reduction technique, such as principal component analysis, and test for the association with the principal components of the phenotypes rather than the individual phenotypes. Inspired by the previous approaches of combining phenotypes to maximize heritability at individual SNPs, in this paper, we propose to construct a maximally heritable (MaxH) phenotype by taking advantage of the estimated total heritability and co-heritability. The heritability and co-heritability only need to be estimated once; therefore, our method is applicable to genome-wide scans. The MaxH phenotype is a linear combination of the individual phenotypes with increased heritability and power over the phenotypes being combined. Simulations show that the heritability and power achieved agree well with the theory for large samples and two phenotypes. We compare our approach with commonly used methods and assess both the heritability and the power of the MaxH phenotype. Moreover, we provide suggestions for how to choose the phenotypes for combination. An application of our approach to a GWAS on chronic obstructive pulmonary disease shows its practical relevance.


Subject(s)
Models, Genetic , Phenotype , Genome-Wide Association Study , Humans , Pulmonary Disease, Chronic Obstructive/genetics
20.
Am J Respir Crit Care Med ; 188(8): 941-7, 2013 Oct 15.
Article in English | MEDLINE | ID: mdl-23972146

ABSTRACT

RATIONALE: Previous studies of chronic obstructive pulmonary disease (COPD) have suggested that genetic factors play an important role in the development of disease. However, single-nucleotide polymorphisms that are associated with COPD in genome-wide association studies have been shown to account for only a small percentage of the genetic variance in phenotypes of COPD, such as spirometry and imaging variables. These phenotypes are highly predictive of disease, and family studies have shown that spirometric phenotypes are heritable. OBJECTIVES: To assess the heritability and coheritability of four major COPD-related phenotypes (measurements of FEV1, FEV1/FVC, percent emphysema, and percent gas trapping), and COPD affection status in smokers of non-Hispanic white and African American descent using a population design. METHODS: Single-nucleotide polymorphisms from genome-wide association studies chips were used to calculate the relatedness of pairs of individuals and a mixed model was adopted to estimate genetic variance and covariance. MEASUREMENTS AND MAIN RESULTS: In the non-Hispanic whites, estimated heritabilities of FEV1 and FEV1/FVC were both about 37%, consistent with estimates in the literature from family-based studies. For chest computed tomography scan phenotypes, estimated heritabilities were both close to 25%. Heritability of COPD affection status was estimated as 37.7% in both populations. CONCLUSIONS: This study suggests that a large portion of the genetic risk of COPD is yet to be discovered and gives rationale for additional genetic studies of COPD. The estimates of coheritability (genetic covariance) for pairs of the phenotypes suggest considerable overlap of causal genetic loci.


Subject(s)
Pulmonary Disease, Chronic Obstructive/genetics , Smoking/adverse effects , Female , Forced Expiratory Volume/genetics , Genetic Predisposition to Disease/genetics , Humans , Male , Middle Aged , Oligonucleotide Array Sequence Analysis , Phenotype , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/etiology , Pulmonary Disease, Chronic Obstructive/physiopathology , Racial Groups/genetics , Smoking/physiopathology , Vital Capacity/genetics