1.
Nature ; 616(7958): 755-763, 2023 04.
Article in English | MEDLINE | ID: mdl-37046083

ABSTRACT

Mutations in a diverse set of driver genes increase the fitness of haematopoietic stem cells (HSCs), leading to clonal haematopoiesis [1]. These lesions are precursors for blood cancers [2-6], but the basis of their fitness advantage remains largely unknown, partly owing to a paucity of large cohorts in which the clonal expansion rate has been assessed by longitudinal sampling. Here, to circumvent this limitation, we developed a method to infer the expansion rate from data from a single time point. We applied this method to 5,071 people with clonal haematopoiesis. A genome-wide association study revealed that a common inherited polymorphism in the TCL1A promoter was associated with a slower expansion rate in clonal haematopoiesis overall, but the effect varied by driver gene. Those carrying this protective allele exhibited markedly reduced growth rates or prevalence of clones with driver mutations in TET2, ASXL1, SF3B1 and SRSF2, but this effect was not seen in clones with driver mutations in DNMT3A. TCL1A was not expressed in normal or DNMT3A-mutated HSCs, but the introduction of mutations in TET2 or ASXL1 led to the expression of TCL1A protein and the expansion of HSCs in vitro. The protective allele restricted TCL1A expression and expansion of mutant HSCs, as did experimental knockdown of TCL1A expression. Forced expression of TCL1A promoted the expansion of human HSCs in vitro and mouse HSCs in vivo. Our results indicate that the fitness advantage of several commonly mutated driver genes in clonal haematopoiesis may be mediated by TCL1A activation.


Subject(s)
Clonal Hematopoiesis , Hematopoietic Stem Cells , Animals , Humans , Mice , Alleles , Clonal Hematopoiesis/genetics , Genome-Wide Association Study , Hematopoiesis/genetics , Hematopoietic Stem Cells/cytology , Hematopoietic Stem Cells/metabolism , Mutation , Promoter Regions, Genetic
2.
Am J Hum Genet ; 109(5): 857-870, 2022 05 05.
Article in English | MEDLINE | ID: mdl-35385699

ABSTRACT

While polygenic risk scores (PRSs) enable early identification of genetic risk for chronic obstructive pulmonary disease (COPD), predictive performance is limited when the discovery and target populations are not well matched. Hypothesizing that the biological mechanisms of disease are shared across ancestry groups, we introduce a PrediXcan-derived polygenic transcriptome risk score (PTRS) to improve cross-ethnic portability of risk prediction. We constructed the PTRS using summary statistics from application of PrediXcan on large-scale GWASs of lung function (forced expiratory volume in 1 s [FEV1] and its ratio to forced vital capacity [FEV1/FVC]) in the UK Biobank. We examined prediction performance and cross-ethnic portability of PTRS through smoking-stratified analyses both on 29,381 multi-ethnic participants from TOPMed population/family-based cohorts and on 11,771 multi-ethnic participants from TOPMed COPD-enriched studies. Analyses were carried out for two dichotomous COPD traits (moderate-to-severe and severe COPD) and two quantitative lung function traits (FEV1 and FEV1/FVC). While the proposed PTRS showed weaker associations with disease than PRS for European ancestry, the PTRS showed stronger association with COPD than PRS for African Americans (e.g., odds ratio [OR] = 1.24 [95% confidence interval [CI]: 1.08-1.43] for PTRS versus 1.10 [0.96-1.26] for PRS among heavy smokers with ≥ 40 pack-years of smoking) for moderate-to-severe COPD. Cross-ethnic portability of the PTRS was significantly higher than the PRS (paired t test p < 2.2 × 10⁻¹⁶ with portability gains ranging from 5% to 28%) for both dichotomous COPD traits and across all smoking strata. Our study demonstrates the value of PTRS for improved cross-ethnic portability compared to PRS in predicting COPD risk.


Subject(s)
Pulmonary Disease, Chronic Obstructive , Transcriptome , Humans , Lung , National Heart, Lung, and Blood Institute (U.S.) , Pulmonary Disease, Chronic Obstructive/genetics , Risk Factors , United States/epidemiology
3.
Genome Res ; 32(10): 1918-1929, 2022 10.
Article in English | MEDLINE | ID: mdl-36220609

ABSTRACT

Extensive evidence indicates that the pathobiological processes of a complex disease are associated with perturbation in specific neighborhoods of the human protein-protein interaction (PPI) network (also known as the interactome), often referred to as the disease module. Many computational methods have been developed to integrate the interactome and omics profiles to extract context-dependent disease modules. Yet, existing methods all have fundamental limitations in terms of rigor and/or efficiency. Here, we developed a statistical physics approach based on the random-field Ising model (RFIM) for disease module detection, which is both mathematically rigorous and computationally efficient. We applied our RFIM approach to genome-wide association studies (GWAS) of ten complex diseases to examine its performance for disease module detection. We found that our RFIM approach outperforms existing methods in terms of computational efficiency, connectivity of disease modules, and robustness to the interactome incompleteness.


Subject(s)
Genome-Wide Association Study , Protein Interaction Maps , Humans , Genome-Wide Association Study/methods , Physics , Algorithms
4.
PLoS Genet ; 18(11): e1010464, 2022 11.
Article in English | MEDLINE | ID: mdl-36383614

ABSTRACT

The identification and understanding of gene-environment interactions can provide insights into the pathways and mechanisms underlying complex diseases. However, testing for gene-environment interaction remains a challenge since a.) statistical power is often limited and b.) modeling of environmental effects is nontrivial and such model misspecifications can lead to false positive interaction findings. To address the lack of statistical power, recent methods aim to identify interactions on an aggregated level using, for example, polygenic risk scores. While this strategy can increase the power to detect interactions, identifying contributing genes and pathways is difficult based on these relatively global results. Here, we propose RITSS (Robust Interaction Testing using Sample Splitting), a gene-environment interaction testing framework for quantitative traits that is based on sample splitting and robust test statistics. RITSS can incorporate sets of genetic variants and/or multiple environmental factors. Based on the user's choice of statistical/machine learning approaches, a screening step selects and combines potential interactions into scores with improved interpretability. In the testing step, the application of robust statistics minimizes the susceptibility to main effect misspecifications. Using extensive simulation studies, we demonstrate that RITSS controls the type 1 error rate in a wide range of scenarios, and we show how the screening strategy influences statistical power. In an application to lung function phenotypes and human height in the UK Biobank, RITSS identified highly significant interactions based on subcomponents of genetic risk scores. While the contributing single variant interaction signals are weak, our results indicate interaction patterns that result in strong aggregated effects, providing potential insights into underlying gene-environment interaction mechanisms.


Subject(s)
Models, Genetic , Polymorphism, Single Nucleotide , Humans , Genetic Loci , Gene-Environment Interaction , Phenotype , Computer Simulation , Genome-Wide Association Study
5.
Hum Mol Genet ; 31(22): 3873-3885, 2022 11 10.
Article in English | MEDLINE | ID: mdl-35766891

ABSTRACT

RATIONALE: Genetic variation has a substantial contribution to chronic obstructive pulmonary disease (COPD) and lung function measurements. Heritability estimates using genome-wide genotyping data can be biased if analyses do not appropriately account for the nonuniform distribution of genetic effects across the allele frequency and linkage disequilibrium (LD) spectrum. In addition, the contribution of rare variants has been unclear. OBJECTIVES: We sought to assess the heritability of COPD and lung function using whole-genome sequence data from the Trans-Omics for Precision Medicine program. METHODS: Using the genome-based restricted maximum likelihood method, we partitioned the genome into bins based on minor allele frequency and LD scores and estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio in 11 051 European ancestry and 5853 African-American participants. MEASUREMENTS AND MAIN RESULTS: In European ancestry participants, the estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio were 35.5%, 55.6% and 32.5%, of which 18.8%, 19.7%, 17.8% were from common variants, and 16.6%, 35.8%, and 14.6% were from rare variants. These estimates had wide confidence intervals, with common variants and some sets of rare variants showing a statistically significant contribution (P-value < 0.05). In African-Americans, common variant heritability was similar to European ancestry participants, but lower sample size precluded calculation of rare variant heritability. CONCLUSIONS: Our study provides updated and unbiased estimates of heritability for COPD and lung function, and suggests an important contribution of rare variants. Larger studies of more diverse ancestry will improve accuracy of these estimates.


Subject(s)
Genetic Predisposition to Disease , Pulmonary Disease, Chronic Obstructive , Humans , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/genetics , Genome-Wide Association Study , Phenotype
6.
Blood ; 139(3): 357-368, 2022 01 20.
Article in English | MEDLINE | ID: mdl-34855941

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is associated with age and smoking, but other determinants of the disease are incompletely understood. Clonal hematopoiesis of indeterminate potential (CHIP) is a common, age-related state in which somatic mutations in clonal blood populations induce aberrant inflammatory responses. Patients with CHIP have an elevated risk for cardiovascular disease, but the association of CHIP with COPD remains unclear. We analyzed whole-genome sequencing and whole-exome sequencing data to detect CHIP in 48 835 patients, of whom 8444 had moderate to very severe COPD, from four separate cohorts with COPD phenotyping and smoking history. We measured emphysema in murine models in which Tet2 was deleted in hematopoietic cells. In the COPDGene cohort, individuals with CHIP had risks of moderate-to-severe, severe, or very severe COPD that were 1.6 (adjusted 95% confidence interval [CI], 1.1-2.2) and 2.2 (adjusted 95% CI, 1.5-3.2) times greater than those for noncarriers. These findings were consistently observed in three additional cohorts and meta-analyses of all patients. CHIP was also associated with decreased FEV1% predicted in the COPDGene cohort (mean between-group differences, -5.7%; adjusted 95% CI, -8.8% to -2.6%), a finding replicated in additional cohorts. Smoke exposure was associated with a small but significant increased risk of having CHIP (odds ratio, 1.03 per 10 pack-years; 95% CI, 1.01-1.05 per 10 pack-years) in the meta-analysis of all patients. Inactivation of Tet2 in mouse hematopoietic cells exacerbated the development of emphysema and inflammation in models of cigarette smoke exposure. Somatic mutations in blood cells are associated with the development and severity of COPD, independent of age and cumulative smoke exposure.


Subject(s)
Clonal Hematopoiesis , Pulmonary Disease, Chronic Obstructive/genetics , Animals , Female , Humans , Male , Mice , Middle Aged , Odds Ratio , Pulmonary Disease, Chronic Obstructive/etiology , Risk Factors , Smoking/adverse effects , Exome Sequencing
7.
Genet Epidemiol ; 45(7): 685-693, 2021 10.
Article in English | MEDLINE | ID: mdl-34159627

ABSTRACT

SARS-CoV-2 mortality has been extensively studied in relation to host susceptibility. How sequence variations in the SARS-CoV-2 genome affect pathogenicity is poorly understood. Starting in October 2020, using the methodology of genome-wide association studies (GWAS), we looked at the association between whole-genome sequencing (WGS) data of the virus and COVID-19 mortality as a potential method of early identification of highly pathogenic strains to target for containment. Although continuously updating our analysis, in December 2020, we analyzed 7548 single-stranded SARS-CoV-2 genomes of COVID-19 patients in the GISAID database and associated variants with mortality using a logistic regression. In total, evaluating 29,891 sequenced loci of the viral genome for association with patient/host mortality, two loci, at 12,053 and 25,088 bp, achieved genome-wide significance (p values of 4.09e-09 and 4.41e-23, respectively), though only 25,088 bp remained significant in follow-up analyses. Our association findings were exclusively driven by the samples that were submitted from Brazil (p value of 4.90e-13 for 25,088 bp). The mutation frequency of 25,088 bp in the Brazilian samples on GISAID has rapidly increased from about 0.4 in October/December 2020 to 0.77 in March 2021. Although GWAS methodology is suitable for samples in which mutation frequencies vary between geographical regions, it cannot account for mutation frequencies that change rapidly over time, rendering a GWAS follow-up analysis of the GISAID samples that have been submitted after December 2020 invalid. The locus at 25,088 bp is located in the P.1 strain, which later (April 2021) became one of the distinguishing loci (precisely, substitution V1176F) of the Brazilian strain as defined by the Centers for Disease Control. Specifically, the mutations at 25,088 bp occur in the S2 subunit of the SARS-CoV-2 spike protein, which plays a key role in viral entry into target host cells.
Since the mutations alter amino acid coding sequences, they potentially impose structural changes that could enhance viral infectivity and symptom severity. Our analysis suggests that GWAS methodology can provide suitable analysis tools for the real-time detection of new, more transmissible and pathogenic viral strains in databases such as GISAID, though new approaches are needed to accommodate rapidly changing mutation frequencies over time, in the presence of simultaneously changing case/control ratios. Improvements of the associated metadata/patient information in terms of quality and availability will also be important to fully utilize the potential of GWAS methodology in this field.


Subject(s)
COVID-19 , Spike Glycoprotein, Coronavirus , Brazil , Genome-Wide Association Study , Humans , Mutation , Phylogeny , SARS-CoV-2 , Spike Glycoprotein, Coronavirus/genetics
8.
Eur Respir J ; 60(2)2022 08.
Article in English | MEDLINE | ID: mdl-34996830

ABSTRACT

INTRODUCTION: Loss-of-function variants in both copies of the cystic fibrosis transmembrane conductance regulator (CFTR) gene cause cystic fibrosis (CF); however, there is evidence that reduction in CFTR function due to the presence of one deleterious variant can have clinical consequences. Here, we hypothesise that CFTR variants in individuals with a history of smoking are associated with chronic obstructive pulmonary disease (COPD) and related phenotypes. METHODS: Whole-genome sequencing was performed through the National Heart, Lung, and Blood Institute TOPMed (TransOmics in Precision Medicine) programme in 8597 subjects from the COPDGene (Genetic Epidemiology of COPD) study, an observational study of current and former smokers. We extracted clinically annotated CFTR variants and performed single-variant and variant-set testing for COPD and related phenotypes. Replication was performed in 2118 subjects from the ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) study. RESULTS: We identified 301 coding variants within the CFTR gene boundary: 147 of these have been reported in individuals with CF, including 36 CF-causing variants. We found that CF-causing variants were associated with chronic bronchitis in variant-set testing in COPDGene (one-sided p=0.0025; OR 1.53) and in meta-analysis of COPDGene and ECLIPSE (one-sided p=0.0060; OR 1.52). Single-variant testing revealed that the F508del variant was associated with chronic bronchitis in COPDGene (one-sided p=0.015; OR 1.47). In addition, we identified 32 subjects with two or more CFTR variants on separate alleles and these subjects were enriched for COPD cases (p=0.010). CONCLUSIONS: Cigarette smokers who carry one deleterious CFTR variant have higher rates of chronic bronchitis, while presence of two CFTR variants may be associated with COPD. 
These results indicate that genetically mediated reduction in CFTR function contributes to COPD-related phenotypes, in particular chronic bronchitis.


Subject(s)
Bronchitis, Chronic , Cystic Fibrosis , Pulmonary Disease, Chronic Obstructive , Bronchitis, Chronic/complications , Cystic Fibrosis/complications , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Humans , Observational Studies as Topic , Pulmonary Disease, Chronic Obstructive/epidemiology , Smokers
9.
Eur Respir J ; 60(3)2022 09.
Article in English | MEDLINE | ID: mdl-35115341

ABSTRACT

BACKGROUND: Genetic susceptibility may be associated with earlier onset of chronic obstructive pulmonary disease (COPD). We hypothesised that a polygenic risk score (PRS) for COPD would be associated with earlier age of diagnosis of COPD. METHODS: In 6647 non-Hispanic White (NHW) and 2464 African American (AA) participants from COPDGene, and 6812 participants from the Framingham Heart Study (FHS), we tested the relationship of the PRS and age of COPD diagnosis. Age at diagnosis was determined by: 1) self-reported age at COPD diagnosis or 2) age at visits when moderate-to-severe airflow limitation (Global Initiative for Chronic Obstructive Lung Disease (GOLD) grade 2-4) was observed on spirometry. We used Cox regression to examine the overall and time-dependent effects of the PRS on incident COPD. In the COPDGene study, we also examined the PRS's predictive value for COPD at age <50 years (COPD50) using logistic regression and area under the curve (AUC) analyses, with and without the addition of other risk factors present at early life (e.g. childhood asthma). RESULTS: In Cox models, the PRS demonstrated age-dependent associations with incident COPD, with larger effects at younger ages in both cohorts. The PRS was associated with COPD50 (OR 1.55 (95% CI 1.41-1.71) for NHW, OR 1.23 (95% CI 1.05-1.43) for AA and OR 2.47 (95% CI 2.12-2.88) for FHS participants). In COPDGene, adding the PRS to known early-life risk factors improved prediction of COPD50 in NHW (AUC 0.69 versus 0.74; p<0.0001) and AA (AUC 0.61 versus 0.64; p=0.04) participants. CONCLUSIONS: A COPD PRS is associated with earlier age of diagnosis of COPD and retains predictive value when added to known early-life risk factors.


Subject(s)
Pulmonary Disease, Chronic Obstructive , Child , Genetic Predisposition to Disease , Humans , Lung , Middle Aged , Pulmonary Disease, Chronic Obstructive/diagnosis , Pulmonary Disease, Chronic Obstructive/genetics , Risk Factors , Spirometry
10.
Eur Respir J ; 60(2)2022 08.
Article in English | MEDLINE | ID: mdl-35115336

ABSTRACT

BACKGROUND: Interstitial lung abnormalities (ILA) share many features with idiopathic pulmonary fibrosis; however, it is not known if ILA are associated with decreased mean telomere length (MTL). METHODS: Telomere length was measured with quantitative PCR in the Genetic Epidemiology of Chronic Obstructive Pulmonary Disease (COPDGene) and Age Gene/Environment Susceptibility Reykjavik (AGES-Reykjavik) cohorts and Southern blot analysis was used in the Framingham Heart Study (FHS). Logistic and linear regression were used to assess the association between ILA and MTL; Cox proportional hazards models were used to assess the association between MTL and mortality. RESULTS: In all three cohorts, ILA were associated with decreased MTL. In the COPDGene and AGES-Reykjavik cohorts, after adjustment there was greater than twofold increase in the odds of ILA when comparing the shortest quartile of telomere length to the longest quartile (OR 2.2, 95% CI 1.5-3.4, p=0.0001, and OR 2.6, 95% CI 1.4-4.9, p=0.003, respectively). In the FHS, those with ILA had shorter telomeres than those without ILA (-767 bp, 95% CI 76-1584 bp, p=0.03). Although decreased MTL was associated with chronic obstructive pulmonary disease (OR 1.3, 95% CI 1.1-1.6, p=0.01) in COPDGene, the effect estimate was less than that noted with ILA. There was no consistent association between MTL and risk of death when comparing the shortest quartile of telomere length in COPDGene and AGES-Reykjavik (HR 0.82, 95% CI 0.4-1.7, p=0.6, and HR 1.2, 95% CI 0.6-2.2, p=0.5, respectively). CONCLUSION: ILA are associated with decreased MTL.


Subject(s)
Lung Diseases, Interstitial , Pulmonary Disease, Chronic Obstructive , Humans , Lung , Lung Diseases, Interstitial/epidemiology , Lung Diseases, Interstitial/genetics , Telomere/genetics , Tomography, X-Ray Computed
11.
J Allergy Clin Immunol ; 148(6): 1589-1595, 2021 12.
Article in English | MEDLINE | ID: mdl-34536413

ABSTRACT

BACKGROUND: Total serum IgE (tIgE) is an important intermediate phenotype of allergic disease. Whole genome genetic association studies across ancestries may identify important determinants of IgE. OBJECTIVE: We aimed to increase understanding of genetic variants affecting tIgE production across the ancestry and allergic disease spectrum by leveraging data from the National Heart, Lung and Blood Institute Trans-Omics for Precision Medicine program; the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA); and the Atopic Dermatitis Research Network (N = 21,901). METHODS: We performed genome-wide association within strata of study, disease, and ancestry groups, and we combined results via a meta-regression approach that models heterogeneity attributable to ancestry. We also tested for association between HLA alleles called from whole genome sequence data and tIgE, assessing replication of associations in HLA alleles called from genotype array data. RESULTS: We identified 6 loci at genome-wide significance (P < 5 × 10⁻⁹), including 4 loci previously reported as genome-wide significant for tIgE, as well as new regions in chr11q13.5 and chr15q22.2, which were also identified in prior genome-wide association studies of atopic dermatitis and asthma. In the HLA allele association study, HLA-A*02:01 was associated with decreased tIgE level (P[discovery] = 2 × 10⁻⁴; P[replication] = 5 × 10⁻⁴; P[discovery+replication] = 4 × 10⁻⁷), and HLA-DQB1*03:02 was strongly associated with decreased tIgE level in Hispanic/Latino ancestry populations (P[Hispanic/Latino discovery+replication] = 8 × 10⁻⁸). CONCLUSION: We performed the largest genome-wide association study and HLA association study of tIgE focused on ancestrally diverse populations and found several known tIgE and allergic disease loci that are relevant in non-European ancestry populations.


Subject(s)
Asthma/genetics , Dermatitis, Atopic/genetics , Ethnicity , Genotype , HLA-A2 Antigen/genetics , HLA-DQ beta-Chains/genetics , Adolescent , Adult , Aged , Child , Child, Preschool , Female , Gene Frequency , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Immunoglobulin E/blood , Male , Middle Aged , National Heart, Lung, and Blood Institute (U.S.) , United States , Whole Genome Sequencing , Young Adult
12.
Am J Respir Cell Mol Biol ; 65(5): 532-543, 2021 11.
Article in English | MEDLINE | ID: mdl-34166600

ABSTRACT

Chronic obstructive pulmonary disease (COPD) is a common, complex disease and a major cause of morbidity and mortality. Although multiple genetic determinants of COPD have been implicated by genome-wide association studies (GWASs), the pathophysiological significance of these associations remains largely unknown. From a COPD protein-protein interaction network module, we selected a network path between two COPD GWAS genes for validation studies: FAM13A (family with sequence similarity 13 member A)-AP3D1-CTGF-TGFβ2. We find that TGFβ2, FAM13A, and AP3D1 (but not CTGF) form a cellular protein complex. Functional characterization suggests that this complex mediates the secretion of TGFβ2 through an AP-3 (adaptor protein 3)-dependent pathway, with FAM13A acting as a negative regulator by targeting a late stage of this transport that involves the dissociation of coat-cargo interaction. Moreover, we find that TGFβ2 is a transmembrane protein that engages the AP-3 complex for delivery to the late endosomal compartments for subsequent secretion through exosomes. These results identify a pathophysiological context that unifies the biological network role of two COPD GWAS proteins and reveal novel mechanisms of cargo transport through an intracellular pathway.


Subject(s)
Adaptor Protein Complex 3/metabolism , Adaptor Protein Complex delta Subunits/metabolism , GTPase-Activating Proteins/metabolism , Pulmonary Disease, Chronic Obstructive/genetics , Pulmonary Disease, Chronic Obstructive/metabolism , Transforming Growth Factor beta2/metabolism , Adaptor Protein Complex 3/genetics , Adaptor Protein Complex delta Subunits/genetics , Cell Line , Exosomes/metabolism , GTPase-Activating Proteins/genetics , Genome-Wide Association Study , HEK293 Cells , Humans , Protein Interaction Maps/genetics , Protein Transport , Reproducibility of Results , Transforming Growth Factor beta2/genetics
13.
Genet Epidemiol ; 44(7): 785-794, 2020 10.
Article in English | MEDLINE | ID: mdl-32681690

ABSTRACT

Noncoding DNA contains gene regulatory elements that alter gene expression, and the function of these elements can be modified by genetic variation. Massively parallel reporter assays (MPRA) enable high-throughput identification and characterization of functional genetic variants, but the statistical methods to identify allelic effects in MPRA data have not been fully developed. In this study, we demonstrate how the baseline allelic imbalance in MPRA libraries can produce biased results, and we propose a novel, nonparametric, adaptive testing method that is robust to this bias. We compare the performance of this method with other commonly used methods, and we demonstrate that our novel adaptive method controls Type I error in a wide range of scenarios while maintaining excellent power. We have implemented these tests along with routines for simulating MPRA data in the Analysis Toolset for MPRA (@MPRA), an R package for the design and analyses of MPRA experiments. It is publicly available at http://github.com/redaq/atMPRA.


Subject(s)
DNA/genetics , Gene Expression/genetics , High-Throughput Nucleotide Sequencing/methods , RNA, Untranslated/genetics , Regulatory Sequences, Nucleic Acid/genetics , Alleles , Genetic Variation/genetics , Humans , Research Design , Software
14.
Hum Mol Genet ; 27(21): 3801-3812, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30060175

ABSTRACT

Chronic obstructive pulmonary disease (COPD), one of the leading causes of death worldwide, is substantially influenced by genetic factors. Alpha-1 antitrypsin deficiency demonstrates that rare coding variants of large effect can influence COPD susceptibility. To identify additional rare coding variants in patients with severe COPD, we conducted whole exome sequencing analysis in 2543 subjects from two family-based studies (Boston Early-Onset COPD Study and International COPD Genetics Network) and one case-control study (COPDGene). Applying a gene-based segregation test in the family-based data, we identified significant segregation of rare loss of function variants in TBC1D10A and RFPL1 (P-value < 2 × 10⁻⁶), but were unable to find similar variants in the case-control study. In single-variant, gene-based and pathway association analyses, we were unable to find significant findings that replicated or were significant in meta-analysis. However, we found that the top results in the two datasets were in proximity to each other in the protein-protein interaction network (P-value = 0.014), suggesting enrichment of these results for similar biological processes. A network of these association results and their neighbors was significantly enriched in the transforming growth factor beta-receptor binding and cilia-related pathways. Finally, in a more detailed examination of candidate genes, we identified individuals with putative high-risk variants, including patients harboring homozygous mutations in genes associated with cutis laxa and Niemann-Pick Disease Type C. Our results likely reflect heterogeneity of genetic risk for COPD along with limitations of statistical power and functional annotation, and highlight the potential of network analysis to gain insight into genetic association studies.


Subject(s)
Exome Sequencing , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Pulmonary Disease, Chronic Obstructive/genetics , Adolescent , Adult , Aged , Case-Control Studies , DNA Mutational Analysis , Female , Genetic Association Studies , Humans , Male , Middle Aged , Mutation , Young Adult
15.
Am J Respir Crit Care Med ; 199(1): 52-61, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30079747

ABSTRACT

RATIONALE: The identification of causal variants responsible for disease associations from genome-wide association studies (GWASs) facilitates functional understanding of the biological mechanisms by which those genetic variants influence disease susceptibility. OBJECTIVE: We aim to identify causal variants in or near the FAM13A (family with sequence similarity member 13A) GWAS locus associated with chronic obstructive pulmonary disease (COPD). METHODS: We used an integrated approach featuring conditional genetic analysis, massively parallel reporter assays (MPRAs), traditional reporter assays, chromatin conformation capture assays, and clustered regularly interspaced short palindromic repeats (CRISPR)-based gene editing to characterize COPD-associated regulatory variants in the FAM13A region in human bronchial epithelial cell lines. MEASUREMENTS AND MAIN RESULTS: Conditional genetic association suggests the presence of two independent COPD association signals in FAM13A. MPRAs identified 45 regulatory variants within FAM13A, among which six variants were prioritized for further investigation. Three COPD-associated variants demonstrated significant allele-specific activity in reporter assays. One of three variants, rs2013701, was tested in the endogenous genomic context by CRISPR-based genome editing that confirmed its allele-specific effects on FAM13A expression and on cell proliferation, providing functional characterization for this COPD-associated variant. CONCLUSIONS: The human GWAS association near FAM13A may contain independent association signals. MPRAs identified multiple functional variants in this region, including rs2013701, a putative COPD-causing variant with allele-specific regulatory activity.


Subject(s)
GTPase-Activating Proteins/genetics , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/genetics , CRISPR-Associated Protein 9 , CRISPR-Cas Systems , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , High-Throughput Nucleotide Sequencing , Humans , Quantitative Trait Loci/genetics
16.
Am J Hum Genet ; 99(4): 846-859, 2016 Oct 06.
Article in English | MEDLINE | ID: mdl-27666371

ABSTRACT

Recently, multiple studies have performed whole-exome or whole-genome sequencing to identify groups of rare variants associated with complex traits and diseases. They have primarily utilized case-control study designs that often require thousands of individuals to reach acceptable statistical power. Family-based studies can be more powerful because a rare variant can be enriched in an extended pedigree and segregate with the phenotype. Although many methods have been proposed for using family data to discover rare variants involved in a disease, a majority of them focus on a specific pedigree structure and are designed to analyze either binary or continuously measured outcomes. In this article, we propose RareIBD, a general and powerful approach to identifying rare variants involved in disease susceptibility. Our method can be applied to large extended families of arbitrary structure, including pedigrees with only affected individuals. The method accommodates both binary and quantitative traits. A series of simulation experiments suggest that RareIBD is a powerful test that outperforms existing approaches. In addition, our method accounts for individuals in top generations, which are not usually genotyped in extended families. In contrast to available statistical tests, RareIBD generates accurate p values even when genetic data from these individuals are missing. We applied RareIBD, as well as other methods, to two extended family datasets generated by different genotyping technologies and representing different ethnicities. The analysis of real data confirmed that RareIBD is the only method that properly controls type I error.


Subject(s)
Family , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Pedigree , Datasets as Topic , Ethnicity/genetics , Female , Genotype , Humans , Male , Models, Genetic , Phenotype , Research Design
17.
Am J Respir Cell Mol Biol ; 59(5): 614-622, 2018 11.
Article in English | MEDLINE | ID: mdl-29949718

ABSTRACT

Genome-wide association studies have identified common variants associated with chronic obstructive pulmonary disease (COPD). Whole-genome sequencing (WGS) offers comprehensive coverage of the entire genome, as compared with genotyping arrays or exome sequencing. We hypothesized that WGS in subjects with severe COPD and smoking control subjects with normal pulmonary function would allow us to identify novel genetic determinants of COPD. We sequenced 821 patients with severe COPD and 973 control subjects from the COPDGene and Boston Early-Onset COPD studies, including both non-Hispanic white and African American individuals. We performed single-variant and grouped-variant analyses, and in addition, we assessed the overlap of variants between sequencing- and array-based imputation. Our most significantly associated variant was in a known region near HHIP (combined P = 1.6 × 10⁻⁹); additional variants approaching genome-wide significance included previously described regions in CHRNA5, TNS1, and SERPINA6/SERPINA1 (the latter in African American individuals). None of our associations were clearly driven by rare variants, and we found minimal evidence of replication of genes identified by previously reported smaller sequencing studies. With WGS, we identified more than 20 million new variants, not seen with imputation, including more than 10,000 of potential importance in previously identified COPD genome-wide association study regions. WGS in severe COPD identifies a large number of potentially important functional variants, with the strongest associations being in known COPD risk loci, including HHIP and SERPINA1. Larger sample sizes will be needed to identify associated variants in novel regions of the genome.


Subject(s)
Genome-Wide Association Study , Lung/metabolism , Polymorphism, Single Nucleotide , Pulmonary Disease, Chronic Obstructive/genetics , Severity of Illness Index , Whole Genome Sequencing/methods , Black or African American/statistics & numerical data , Aged , Case-Control Studies , Cohort Studies , Female , Genetic Predisposition to Disease , Humans , Lung/pathology , Male , Middle Aged , Pulmonary Disease, Chronic Obstructive/ethnology , White People/statistics & numerical data
18.
Genet Epidemiol ; 41(4): 309-319, 2017 05.
Article in English | MEDLINE | ID: mdl-28191685

ABSTRACT

Whole-exome sequencing using family data has identified rare coding variants in Mendelian diseases or complex diseases with Mendelian subtypes, using filters based on variant novelty, functionality, and segregation with the phenotype within families. However, formal statistical approaches are limited. We propose a gene-based segregation test (GESE) that quantifies the uncertainty of the filtering approach. It is constructed using the probability of segregation events under the null hypothesis of Mendelian transmission. This test takes into account different degrees of relatedness in families, the number of functional rare variants in the gene, and their minor allele frequencies in the corresponding population. In addition, a weighted version of this test allows incorporating additional subject phenotypes to improve statistical power. We show via simulations that the GESE and weighted GESE tests maintain appropriate type I error rate, and have greater power than several commonly used region-based methods. We apply our method to whole-exome sequencing data from 49 extended pedigrees with severe, early-onset chronic obstructive pulmonary disease (COPD) in the Boston Early-Onset COPD study (BEOCOPD) and identify several promising candidate genes. Our proposed methods show great potential for identifying rare coding variants of large effect and high penetrance for family-based sequencing data. The proposed tests are implemented in an R package that is available on CRAN (https://cran.r-project.org/web/packages/GESE/).
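
The abstract above builds its test on the probability of segregation events under Mendelian transmission. As a rough illustration of that null model only (this is not the GESE statistic, which also accounts for relatedness, the number of rare variants per gene, and allele frequencies), the sketch below computes the chance that a rare allele carried by exactly one heterozygous parent reaches every affected sib in a family. The function names and the assumptions of a single carrier parent and independent families are hypothetical simplifications introduced here.

```python
from fractions import Fraction


def sibship_segregation_probability(n_affected_sibs: int) -> Fraction:
    """Probability that one heterozygous parent transmits a given rare
    allele to all n affected sibs, assuming Mendelian transmission
    (each child inherits the allele independently with probability 1/2)."""
    if n_affected_sibs < 0:
        raise ValueError("number of sibs must be non-negative")
    return Fraction(1, 2) ** n_affected_sibs


def combined_probability(sibship_sizes) -> Fraction:
    """Probability of complete segregation in every family, assuming the
    families are unrelated so that the events are independent."""
    prob = Fraction(1)
    for n in sibship_sizes:
        prob *= sibship_segregation_probability(n)
    return prob


if __name__ == "__main__":
    # Two hypothetical families with 2 and 3 affected sibs.
    print(combined_probability([2, 3]))
```

Exact fractions are used so that very small probabilities from large pedigrees are not lost to floating-point rounding.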


Subject(s)
Genetic Variation , Pulmonary Disease, Chronic Obstructive/genetics , Sequence Analysis, DNA/methods , Age of Onset , Boston , Computer Simulation , Databases, Genetic , Family , Genome, Human , Humans , Models, Genetic , Penetrance , Reference Standards
19.
Genet Epidemiol ; 40(6): 502-11, 2016 09.
Article in English | MEDLINE | ID: mdl-27312886

ABSTRACT

Family-based designs have been repeatedly shown to be powerful in detecting the significant rare variants associated with human diseases. Furthermore, human diseases are often defined by the outcomes of multiple phenotypes, and thus we expect multivariate family-based analyses may be very efficient in detecting associations with rare variants. However, few statistical methods implementing this strategy have been developed for family-based designs. In this report, we describe one such implementation: the multivariate family-based rare variant association tool (mFARVAT). mFARVAT is a quasi-likelihood-based score test for rare variant association analysis with multiple phenotypes, and tests both homogeneous and heterogeneous effects of each variant on multiple phenotypes. Simulation results show that the proposed method is generally robust and efficient for various disease models, and we identify some promising candidate genes associated with chronic obstructive pulmonary disease. The software of mFARVAT is freely available at http://healthstat.snu.ac.kr/software/mfarvat/, implemented in C++ and supported on Linux and MS Windows.


Subject(s)
Genetic Variation , Models, Genetic , Computer Simulation , Genetic Association Studies , Humans , Likelihood Functions , Phenotype
20.
Genet Epidemiol ; 40(6): 475-85, 2016 09.
Article in English | MEDLINE | ID: mdl-27325607

ABSTRACT

Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods.


Subject(s)
Genes, X-Linked , Models, Genetic , Alleles , Chromosomes, Human, X , Female , Genetic Variation , Humans , Pedigree , Phenotype , X Chromosome Inactivation