Results 1 - 20 of 49
1.
medRxiv ; 2024 May 20.
Article in English | MEDLINE | ID: mdl-38826461

ABSTRACT

Rationale: Genetic variants and gene expression predict risk of chronic obstructive pulmonary disease (COPD), but their effect on COPD heterogeneity is unclear. Objectives: Define high-risk COPD subtypes using both genetics (polygenic risk score, PRS) and blood gene expression (transcriptional risk score, TRS) and assess differences in clinical and molecular characteristics. Methods: We defined high-risk groups based on PRS and TRS quantiles by maximizing differences in protein biomarkers in a COPDGene training set and identified these groups in COPDGene and ECLIPSE test sets. We tested multivariable associations of subgroups with clinical outcomes and compared protein-protein interaction networks and drug repurposing analyses between high-risk groups. Measurements and Main Results: We examined two high-risk omics-defined groups in non-overlapping test sets (n=1,133 NHW COPDGene, n=299 African American (AA) COPDGene, n=468 ECLIPSE). We defined "high activity" (low PRS/high TRS) and "severe risk" (high PRS/high TRS) subgroups. Participants in both subgroups had lower body-mass index (BMI), lower lung function, and alterations in metabolic, growth, and immune signaling processes compared to a low-risk (low PRS, low TRS) reference subgroup. "High activity" but not "severe risk" participants had greater prospective FEV1 decline (COPDGene: -51 mL/year; ECLIPSE: -40 mL/year) and their proteomic profiles were enriched in gene sets perturbed by treatment with 5-lipoxygenase inhibitors and angiotensin-converting enzyme (ACE) inhibitors. Conclusions: Concomitant use of polygenic and transcriptional risk scores identified clinical and molecular heterogeneity amongst high-risk individuals. Proteomic and drug repurposing analysis identified subtype-specific enrichment for therapies and suggested that prior drug repurposing failures may be explained by patient selection.

2.
Genes (Basel) ; 15(5)2024 Apr 27.
Article in English | MEDLINE | ID: mdl-38790194

ABSTRACT

Depression is heritable, differs by sex, and has environmental risk factors such as cigarette smoking. However, the effect of single nucleotide polymorphisms (SNPs) on depression through cigarette smoking and the role of sex is unclear. In order to examine the association of SNPs with depression and smoking in the UK Biobank with replication in the COPDGene study, we used counterfactual-based mediation analysis to test the indirect or mediated effect of SNPs on broad depression through the log of pack-years of cigarette smoking, adjusting for age, sex, current smoking status, and genetic ancestry (via principal components). In secondary analyses, we adjusted for age, sex, current smoking status, genetic ancestry (via principal components), income, education, and living status (urban vs. rural). In addition, we examined sex-stratified mediation models and sex-moderated mediation models. For both analyses, we adjusted for age, current smoking status, and genetic ancestry (via principal components). In the UK Biobank, rs6424532 [LOC105378800] had a statistically significant indirect effect on broad depression through the log of pack-years of cigarette smoking (p = 4.0 × 10⁻⁴) among all participants and a marginally significant indirect effect among females (p = 0.02) and males (p = 4.0 × 10⁻³). Moreover, rs10501696 [GRM5] had a marginally significant indirect effect on broad depression through the log of pack-years of cigarette smoking (p = 0.01) among all participants and a significant indirect effect among females (p = 2.2 × 10⁻³). In the secondary analyses, the sex-moderated indirect effect was marginally significant for rs10501696 [GRM5] on broad depression through the log of pack-years of cigarette smoking (p = 0.01). In the COPDGene study, the effect of an SNP (rs10501696) in GRM5 on depressive symptoms and medication was mediated by log of pack-years (p = 0.02); however, no SNPs had a sex-moderated mediated effect on depressive symptoms.
In the UK Biobank, we found SNPs in two genes [LOC105378800, GRM5] with an indirect effect on broad depression through the log of pack-years of cigarette smoking. In addition, the indirect effect for GRM5 on broad depression through smoking may be moderated by sex. These results suggest that genetic regions associated with broad depression may be mediated by cigarette smoking and this relationship may be moderated by sex.


Subject(s)
Depression , Polymorphism, Single Nucleotide , Humans , Male , Female , Depression/genetics , Depression/epidemiology , Middle Aged , Aged , Smoking/genetics , Sex Factors , Genetic Predisposition to Disease , United Kingdom/epidemiology , Cigarette Smoking/genetics , Cigarette Smoking/adverse effects , Risk Factors
3.
Article in English | MEDLINE | ID: mdl-38737375

ABSTRACT

Released mitochondrial DNA (mtDNA) in cells activates the cGAS-STING pathway, which induces expression of interferon-stimulated genes (ISGs) and thereby promotes inflammation, as frequently seen in asthmatic airways. However, whether the genetic determinant, Gasdermin B (GSDMB), the most replicated asthma risk gene, regulates this pathway remains unknown. We set out to determine whether and how GSDMB regulates the mtDNA-activated cGAS-STING pathway and subsequent ISG induction in human airway epithelial cells. Using qPCR, ELISA, native polyacrylamide gel electrophoresis, co-immunoprecipitation and immunofluorescence assays, we evaluated the regulation of the cGAS-STING pathway by GSDMB in both BEAS-2B cells and primary normal human bronchial epithelial cells (nHBEs). mtDNA was extracted from plasma samples from human asthmatics and the correlation between mtDNA levels and eosinophil counts was analyzed. GSDMB is significantly associated with RANTES expression in asthmatic nasal epithelial brushing samples from the Genes-environments and Admixture in Latino Americans (GALA) II study. Over-expression of GSDMB promotes DNA-induced IFN and ISG expression in bronchial epithelial BEAS-2B cells and nHBEs. Conversely, knockout of GSDMB led to weakened induction of interferons (IFNs) and ISGs in BEAS-2B cells. Mechanistically, GSDMB interacts with the C-terminus of STING, promoting the translocation of STING to the Golgi, leading to the phosphorylation of IRF3 and induction of IFNs and ISGs. mtDNA copy number in serum from asthmatics was significantly correlated with blood eosinophil counts, especially in male subjects. GSDMB promotes mtDNA- and poly(dA:dT)-induced activation of the cGAS-STING pathway in airway epithelial cells, leading to enhanced induction of ISGs.

4.
Alzheimers Dement ; 20(5): 3397-3405, 2024 May.
Article in English | MEDLINE | ID: mdl-38563508

ABSTRACT

INTRODUCTION: Genome-wide association studies have identified numerous disease susceptibility loci (DSLs) for Alzheimer's disease (AD). However, only a limited number of studies have investigated the dependence of the genetic effect size of established DSLs on genetic ancestry. METHODS: We utilized the whole genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP) including 35,569 participants. A total of 25,459 subjects in four distinct populations (African ancestry, non-Hispanic White, admixed Hispanic, and Asian) were analyzed. RESULTS: We found that nine DSLs showed significant heterogeneity across populations. Single nucleotide polymorphism (SNP) rs2075650 in translocase of outer mitochondrial membrane 40 (TOMM40) showed the largest heterogeneity (Cochran's Q = 0.00, I² = 90.08), followed by other SNPs in apolipoprotein C1 (APOC1) and apolipoprotein E (APOE). Two additional loci, signal-induced proliferation-associated 1 like 2 (SIPA1L2) and solute carrier 24 member 4 (SLC24A4), showed significant heterogeneity across populations. DISCUSSION: We observed substantial heterogeneity for the APOE-harboring 19q13.32 region with TOMM40/APOE/APOC1 genes. The largest risk effect was seen among African Americans, while Asians showed a surprisingly small risk effect.


Subject(s)
Alzheimer Disease , Genetic Predisposition to Disease , Genome-Wide Association Study , Mitochondrial Precursor Protein Import Complex Proteins , Polymorphism, Single Nucleotide , Humans , Alzheimer Disease/genetics , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide/genetics , Apolipoproteins E/genetics , Female , Male , Apolipoprotein C-I/genetics , Aged , Membrane Transport Proteins/genetics , Genetic Loci/genetics
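
The heterogeneity statistics quoted in the abstract above (Cochran's Q and I²) can be illustrated with a short sketch. This is a generic fixed-effect (inverse-variance) calculation, not the authors' pipeline; the function name `cochran_q_i2` and the example numbers are hypothetical and are not taken from the study.

```python
def cochran_q_i2(effects, std_errors):
    """Return (pooled_effect, Q, I2_percent) for per-population effect estimates.

    effects     -- per-population effect sizes (e.g. log odds ratios)
    std_errors  -- matching standard errors
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two matched estimates")
    # Inverse-variance weights: more precise estimates count more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical log odds ratios for one SNP in two populations.
pooled, q, i2 = cochran_q_i2([0.1, 0.5], [0.1, 0.1])
print(round(pooled, 3), round(q, 3), round(i2, 1))
```

An I² near 90, as reported for rs2075650, would indicate that most of the observed variation in effect size across populations exceeds what sampling error alone would produce.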
5.
Eur Respir J ; 63(5)2024 May.
Article in English | MEDLINE | ID: mdl-38514093

ABSTRACT

RATIONALE: Respiratory virus-induced inflammation is the leading cause of asthma exacerbation, frequently accompanied by induction of interferon-stimulated genes (ISGs). How asthma-susceptibility genes modulate cellular response upon viral infection by fine-tuning ISG induction and subsequent airway inflammation in genetically susceptible asthma patients remains largely unknown. OBJECTIVES: To decipher the functions of gasdermin B (encoded by GSDMB) in respiratory virus-induced lung inflammation. METHODS: In two independent cohorts, we analysed expression correlation between GSDMB and ISGs. In human bronchial epithelial cell line or primary bronchial epithelial cells, we generated GSDMB-overexpressing and GSDMB-deficient cells. A series of quantitative PCR, ELISA and co-immunoprecipitation assays were performed to determine the function and mechanism of GSDMB for ISG induction. We also generated a novel transgenic mouse line with inducible expression of human unique GSDMB gene in airway epithelial cells and infected the mice with respiratory syncytial virus to determine the role of GSDMB in respiratory syncytial virus-induced lung inflammation in vivo. RESULTS: GSDMB is one of the most significant asthma-susceptibility genes at 17q21 and acts as a novel RNA sensor, promoting mitochondrial antiviral-signalling protein (MAVS)-TANK binding kinase 1 (TBK1) signalling and subsequent inflammation. In airway epithelium, GSDMB is induced by respiratory viral infections. Expression of GSDMB and ISGs significantly correlated in respiratory epithelium from two independent asthma cohorts. Notably, inducible expression of human GSDMB in mouse airway epithelium led to enhanced ISGs induction and increased airway inflammation with mucus hypersecretion upon respiratory syncytial virus infection. CONCLUSIONS: GSDMB promotes ISGs expression and airway inflammation upon respiratory virus infection, thereby conferring asthma risk in risk allele carriers.


Subject(s)
Adaptor Proteins, Signal Transducing , Asthma , Gasdermins , Protein Serine-Threonine Kinases , Signal Transduction , Animals , Humans , Asthma/metabolism , Asthma/genetics , Mice , Adaptor Proteins, Signal Transducing/metabolism , Adaptor Proteins, Signal Transducing/genetics , Protein Serine-Threonine Kinases/metabolism , Protein Serine-Threonine Kinases/genetics , Mice, Transgenic , Neoplasm Proteins/genetics , Neoplasm Proteins/metabolism , Genetic Predisposition to Disease , Respiratory Syncytial Virus Infections/metabolism , Respiratory Syncytial Virus Infections/genetics , Epithelial Cells/metabolism , Cell Line , Bronchi/metabolism , Bronchi/pathology , Pneumonia/metabolism , Pneumonia/genetics , Pneumonia/virology , Female , Lung/metabolism , Lung/pathology
6.
BMC Bioinformatics ; 25(1): 43, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38273228

ABSTRACT

The computation of a similarity measure for genomic data is a standard tool in computational genetics. The principal components of such matrices are routinely used to correct for biases due to confounding by population stratification, for instance in linear regressions. However, the calculation of both a similarity matrix and its singular value decomposition (SVD) is computationally intensive. The contribution of this article is threefold. First, we demonstrate that the calculation of three matrices (called the covariance matrix, the weighted Jaccard matrix, and the genomic relationship matrix) can be reformulated in a unified way which allows for the application of a randomized SVD algorithm, which is faster than the traditional computation. The fast SVD algorithm we present is adapted from an existing randomized SVD algorithm and ensures that all computations are carried out in sparse matrix algebra. The algorithm only assumes that row-wise and column-wise subtraction and multiplication of a vector with a sparse matrix is available, an operation that is efficiently implemented in common sparse matrix packages. An exception is the so-called Jaccard matrix, which does not have a structure applicable for the fast SVD algorithm. Second, an approximate Jaccard matrix is introduced to which the fast SVD computation is applicable. Third, we establish guaranteed theoretical bounds on the accuracy (in [Formula: see text] norm and angle) between the principal components of the Jaccard matrix and the ones of our proposed approximation, thus putting the proposed Jaccard approximation on a solid mathematical foundation, and derive the theoretical runtime of our algorithm. We illustrate that the approximation error is low in practice and empirically verify the theoretical runtime scalings on both simulated data and data of the 1000 Genomes Project.


Subject(s)
Genome , Genomics , Algorithms , Linear Models
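
The abstract above notes that the fast SVD needs only sparse matrix-vector products combined with row-wise or column-wise subtraction. A minimal sketch of that idea, assuming a column-centered genotype matrix: the product of the centered matrix with a vector can be formed without ever building the dense centered matrix. The sparse representation (a list of `{column: value}` dicts) and the name `centered_matvec` are illustrative assumptions, not the authors' implementation.

```python
def centered_matvec(rows, n_cols, v):
    """Compute (X - 1 * mu^T) @ v for a sparse matrix X without densifying it.

    rows    -- list of dicts mapping column index -> nonzero value (one dict per row)
    n_cols  -- number of columns of X
    v       -- dense vector of length n_cols
    mu is the vector of column means of X.
    """
    n_rows = len(rows)
    # Column means, accumulated from the nonzero entries only.
    mu = [0.0] * n_cols
    for row in rows:
        for j, x in row.items():
            mu[j] += x / n_rows
    # (1 * mu^T) @ v equals the scalar mu . v repeated in every row.
    shift = sum(m * vj for m, vj in zip(mu, v))
    # Sparse product X @ v, then subtract the same scalar from each entry.
    return [sum(x * v[j] for j, x in row.items()) - shift for row in rows]


# Tiny example: X = [[1, 0], [0, 2], [1, 2]] stored sparsely.
X = [{0: 1.0}, {1: 2.0}, {0: 1.0, 1: 2.0}]
print(centered_matvec(X, 2, [1.0, 1.0]))
```

Because centering contributes only a rank-one correction, the cost stays proportional to the number of nonzero entries, which is what makes repeated products inside a randomized SVD feasible for large sparse genotype matrices.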
7.
Epigenetics ; 18(1): 2257437, 2023 12.
Article in English | MEDLINE | ID: mdl-37731367

ABSTRACT

Background: Recent studies have identified thousands of associations between DNA methylation CpGs and complex diseases/traits, emphasizing the critical role of epigenetics in understanding disease aetiology and identifying biomarkers. However, association analyses based on methylation array data are susceptible to batch/slide effects, which can lead to inflated false positive rates or reduced statistical power. Results: We use multiple DNA methylation datasets based on the popular Illumina Infinium MethylationEPIC BeadChip array to describe consistent patterns and the joint distribution of slide effects across CpGs, confirming and extending previous results. The susceptible CpGs overlap with the Illumina Infinium HumanMethylation450 BeadChip array content. Conclusions: Our findings reveal systematic patterns in slide effects. The observations provide further insights into the characteristics of these effects and can improve existing adjustment approaches.


Subject(s)
DNA Methylation , Epigenesis, Genetic , Epigenomics , Multifactorial Inheritance
8.
Am J Respir Crit Care Med ; 208(7): 791-801, 2023 10 01.
Article in English | MEDLINE | ID: mdl-37523715

ABSTRACT

Rationale: In addition to rare genetic variants and the MUC5B locus, common genetic variants contribute to idiopathic pulmonary fibrosis (IPF) risk. The predictive power of common variants outside the MUC5B locus for IPF and interstitial lung abnormalities (ILAs) is unknown. Objectives: We tested the predictive value of IPF polygenic risk scores (PRSs) with and without the MUC5B region on IPF, ILA, and ILA progression. Methods: We developed PRSs that included (PRS-M5B) and excluded (PRS-NO-M5B) the MUC5B region (500-kb window around rs35705950-T) using an IPF genome-wide association study. We assessed PRS associations with area under the receiver operating characteristic curve (AUC) metrics for IPF, ILA, and ILA progression. Measurements and Main Results: We included 14,650 participants (1,970 IPF; 1,068 ILA) from six multi-ancestry population-based and case-control cohorts. In cases excluded from genome-wide association study, the PRS-M5B (odds ratio [OR] per SD of the score, 3.1; P = 7.1 × 10⁻⁹⁵) and PRS-NO-M5B (OR per SD, 2.8; P = 2.5 × 10⁻⁸⁷) were associated with IPF. Participants in the top PRS-NO-M5B quintile had approximately sevenfold odds for IPF compared with those in the first quintile. A clinical model predicted IPF (AUC, 0.61); rs35705950-T and PRS-NO-M5B demonstrated higher AUCs (0.73 and 0.7, respectively), and adding both genetic predictors to a clinical model yielded the highest performance (AUC, 0.81). The PRS-NO-M5B was associated with ILA (OR, 1.25) and ILA progression (OR, 1.16) in European ancestry participants. Conclusions: A common genetic variant risk score complements the MUC5B variant to identify individuals at high risk of interstitial lung abnormalities and pulmonary fibrosis.


Subject(s)
Genome-Wide Association Study , Idiopathic Pulmonary Fibrosis , Humans , Idiopathic Pulmonary Fibrosis/genetics , Risk Factors , Lung , Mucin-5B/genetics , Genetic Predisposition to Disease
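
The AUC values compared in the abstract above have a simple rank interpretation: the probability that a randomly chosen case has a higher score than a randomly chosen control. A minimal sketch under that definition follows; the function name `auc` and the toy scores are hypothetical and unrelated to the study's data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    scores -- risk scores (e.g. a polygenic risk score), higher = more risk
    labels -- 1 for case, 0 for control
    Ties between a case and a control count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data: two controls followed by two cases.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The pairwise loop is quadratic in sample size, so for cohorts of the size described a sort-based rank computation would be the practical choice; the definition is the same.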
9.
Respir Res ; 24(1): 63, 2023 Feb 26.
Article in English | MEDLINE | ID: mdl-36842969

ABSTRACT

BACKGROUND: Asthma is a heterogeneous disease with high morbidity. Advancement in high-throughput multi-omics approaches has enabled the collection of molecular assessments at different layers, providing a complementary perspective of complex diseases. Numerous computational methods have been developed for the omics-based patient classification or disease outcome prediction. Yet, a systematic benchmarking of those methods using various combinations of omics data for the prediction of asthma development is still lacking. OBJECTIVE: We aimed to investigate the computational methods in disease status prediction using multi-omics data. METHOD: We systematically benchmarked 18 computational methods using all the 63 combinations of six omics data (GWAS, miRNA, mRNA, microbiome, metabolome, DNA methylation) collected in The Vitamin D Antenatal Asthma Reduction Trial (VDAART) cohort. We evaluated each method using standard performance metrics for each of the 63 omics combinations. RESULTS: Our results indicate that overall Logistic Regression, Multi-Layer Perceptron, and MOGONET display superior performance, and the combination of transcriptional, genomic and microbiome data achieves the best prediction. Moreover, we find that including the clinical data can further improve the prediction performance for some but not all the omics combinations. CONCLUSIONS: Specific omics combinations can reach the optimal prediction of asthma development in children, and certain computational methods outperformed the others.


Subject(s)
Asthma , MicroRNAs , Pregnancy , Humans , Female , Child , Benchmarking , Genomics/methods , Asthma/diagnosis , Asthma/epidemiology , Asthma/genetics , Prognosis
10.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36585781

ABSTRACT

Genetic similarity matrices are commonly used to assess population substructure (PS) in genetic studies. Through simulation studies and by the application to whole-genome sequencing (WGS) data, we evaluate the performance of three genetic similarity matrices: the unweighted and weighted Jaccard similarity matrices and the genetic relationship matrix. We describe different scenarios that can create numerical pitfalls and lead to incorrect conclusions in some instances. We consider scenarios in which PS is assessed based on loci that are located across the genome ('globally') and based on loci from a specific genomic region ('locally'). We also compare scenarios in which PS is evaluated based on loci from different minor allele frequency bins: common (>5%), low-frequency (5-0.5%) and rare (<0.5%) single-nucleotide variations (SNVs). Overall, we observe that all approaches provide the best clustering performance when computed based on rare SNVs. The performance of the similarity matrices is very similar for common and low-frequency variants, but for rare variants, the unweighted Jaccard matrix provides preferable clustering features. Based on visual inspection and in terms of standard clustering metrics, its clusters are the densest and the best separated in the principal component analysis of variants with rare SNVs compared with the other methods and different allele frequency cutoffs. In an application, we assessed the role of rare variants on local and global PS, using WGS data from multiethnic Alzheimer's disease data sets and European or East Asian populations from the 1000 Genomes Project.


Subject(s)
Genome , Genomics , Principal Component Analysis , Gene Frequency , Computer Simulation , Genome-Wide Association Study , Polymorphism, Single Nucleotide
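
To make the unweighted Jaccard similarity matrix mentioned in the abstract above concrete, the sketch below treats each individual as the set of variant positions at which they carry a minor allele and computes pairwise intersection-over-union. The set representation, the function names, and the convention of returning 0.0 when both sets are empty are assumptions made for illustration; the paper's exact definition may differ.

```python
def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two sets of carried variants."""
    union = a | b
    if not union:
        # Convention chosen here: two individuals with no variants score 0.0.
        return 0.0
    return len(a & b) / len(union)


def jaccard_matrix(carriers):
    """Symmetric matrix of pairwise Jaccard indices.

    carriers -- list of sets, one per individual, holding the indices of the
                variants at which that individual carries a minor allele
    """
    n = len(carriers)
    return [[jaccard(carriers[i], carriers[j]) for j in range(n)]
            for i in range(n)]


# Three hypothetical individuals; the third shares no rare variants.
people = [{1, 2, 3}, {2, 3, 4}, {9}]
for row in jaccard_matrix(people):
    print(row)
```

The principal components of such a matrix are what the abstract refers to when assessing clustering by population; with rare variants, shared carriers are informative because few unrelated individuals share them by chance.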
11.
Hum Mol Genet ; 32(4): 696-707, 2023 01 27.
Article in English | MEDLINE | ID: mdl-36255742

ABSTRACT

BACKGROUND: Asthma is a heterogeneous common respiratory disease that remains poorly understood. The established genetic associations fail to explain the high estimated heritability, and the prevalence of asthma differs between populations and geographic regions. Robust association analyses incorporating different genetic ancestries and whole-genome sequencing data may identify novel genetic associations. METHODS: We performed family-based genome-wide association analyses of childhood-onset asthma based on whole-genome sequencing (WGS) data for the 'Genetic Epidemiology of Asthma in Costa Rica' study (GACRS) and the Childhood Asthma Management Program (CAMP). Based on parent-child trios with children diagnosed with asthma, we performed a single variant analysis using an additive and a recessive genetic model and a region-based association analysis of low-frequency and rare variants. RESULTS: Based on 1180 asthmatic trios (894 GACRS trios and 286 CAMP trios, a total of 3540 samples with WGS data), we identified three novel genetic loci associated with childhood-onset asthma: rs4832738 on 4p14 (P = 1.72 × 10⁻⁹, recessive model), rs1581479 on 8p22 (P = 1.47 × 10⁻⁸, additive model) and rs73367537 on 10q26 (P = 1.21 × 10⁻⁸, additive model in GACRS only). Integrative analyses suggested potential novel candidate genes underlying these associations: PGM2 on 4p14 and FGF20 on 8p22. CONCLUSION: Our family-based whole-genome sequencing analysis identified three novel genetic loci for childhood-onset asthma. Gene expression data and integrative analyses point to PGM2 on 4p14 and FGF20 on 8p22 as linked genes. Furthermore, region-based analyses suggest independent potential low-frequency/rare variant associations on 8p22. Follow-up analyses are needed to understand the functional mechanisms and generalizability of these associations.


Subject(s)
Asthma , Genome-Wide Association Study , Humans , Genetic Predisposition to Disease , Asthma/genetics , Genetic Loci , Whole Genome Sequencing , Polymorphism, Single Nucleotide/genetics , Fibroblast Growth Factors/genetics
12.
Eur Respir J ; 61(1)2023 01.
Article in English | MEDLINE | ID: mdl-35953101

ABSTRACT

BACKGROUND: Sex differences related to immune responses can influence atopic manifestations in childhood asthma. While genome-wide association studies have investigated a sex-specific genetic architecture of the immune response, gene-by-sex interactions have not been extensively analysed for atopy-related markers including allergy skin tests, IgE and eosinophils in asthmatic children. METHODS: We performed a genome-wide gene-by-sex interaction analysis for atopy-related markers using whole-genome sequencing data based on 889 trios from the Genetic Epidemiology of Asthma in Costa Rica Study (GACRS) and 284 trios from the Childhood Asthma Management Program (CAMP). We also tested the findings in UK Biobank participants with self-reported childhood asthma. Furthermore, downstream analyses in GACRS integrated gene expression to disentangle observed associations. RESULTS: Single nucleotide polymorphism (SNP) rs1255383 at 10q11.21 demonstrated a genome-wide significant gene-by-sex interaction (p for interaction = 9.08 × 10⁻¹⁰) for atopy (positive skin test) with opposite direction of effects between females and males. In the UK Biobank participants with a history of childhood asthma, the signal was consistently observed with the same sex-specific effect directions for high eosinophil count (p for interaction = 0.0058). Gene expression of ZNF33B (zinc finger protein 33B), located at 10q11.21, was moderately associated with atopy in girls, but not in boys. CONCLUSIONS: We report SNPs in/near a zinc finger gene as novel sex-differential loci for atopy-related markers with opposite effect directions in females and males. A potential role for ZNF33B should be studied further as an important driver of sex-divergent features of atopy in childhood asthma.


Subject(s)
Asthma , Hypersensitivity, Immediate , Child , Humans , Male , Female , Genome-Wide Association Study , Immunoglobulin E , Asthma/epidemiology , Hypersensitivity, Immediate/genetics , Hypersensitivity, Immediate/epidemiology , Eosinophils , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease
13.
BMC Bioinformatics ; 23(1): 547, 2022 Dec 19.
Article in English | MEDLINE | ID: mdl-36536276

ABSTRACT

As of June 2022, the GISAID database contains more than 11 million SARS-CoV-2 genomes, including several thousand nucleotide sequences for the most common variants such as delta or omicron. These SARS-CoV-2 strains have been collected from patients around the world since the beginning of the pandemic. We start by assessing the similarity of all pairs of nucleotide sequences using the Jaccard index and principal component analysis. As shown previously in the literature, an unsupervised cluster analysis applied to the SARS-CoV-2 genomes results in clusters of sequences according to certain characteristics such as their strain or their clade. Importantly, we observe that nucleotide sequences of common variants are often outliers in clusters of sequences stemming from variants identified earlier on during the pandemic. Motivated by this finding, we are interested in applying outlier detection to nucleotide sequences. We demonstrate that nucleotide sequences of common variants (such as alpha, delta, or omicron) can be identified solely based on a statistical outlier criterion. We argue that outlier detection might be a useful surveillance tool to identify emerging variants in real time as the pandemic progresses.


Subject(s)
COVID-19 , Humans , Base Sequence , SARS-CoV-2 , Cluster Analysis , Databases, Factual
14.
PLoS Genet ; 18(11): e1010464, 2022 11.
Article in English | MEDLINE | ID: mdl-36383614

ABSTRACT

The identification and understanding of gene-environment interactions can provide insights into the pathways and mechanisms underlying complex diseases. However, testing for gene-environment interaction remains a challenge since (a) statistical power is often limited and (b) modeling of environmental effects is nontrivial and such model misspecifications can lead to false positive interaction findings. To address the lack of statistical power, recent methods aim to identify interactions on an aggregated level using, for example, polygenic risk scores. While this strategy can increase the power to detect interactions, identifying contributing genes and pathways is difficult based on these relatively global results. Here, we propose RITSS (Robust Interaction Testing using Sample Splitting), a gene-environment interaction testing framework for quantitative traits that is based on sample splitting and robust test statistics. RITSS can incorporate sets of genetic variants and/or multiple environmental factors. Based on the user's choice of statistical/machine learning approaches, a screening step selects and combines potential interactions into scores with improved interpretability. In the testing step, the application of robust statistics minimizes the susceptibility to main effect misspecifications. Using extensive simulation studies, we demonstrate that RITSS controls the type 1 error rate in a wide range of scenarios, and we show how the screening strategy influences statistical power. In an application to lung function phenotypes and human height in the UK Biobank, RITSS identified highly significant interactions based on subcomponents of genetic risk scores. While the contributing single variant interaction signals are weak, our results indicate interaction patterns that result in strong aggregated effects, providing potential insights into underlying gene-environment interaction mechanisms.


Subject(s)
Models, Genetic , Polymorphism, Single Nucleotide , Humans , Genetic Loci , Gene-Environment Interaction , Phenotype , Computer Simulation , Genome-Wide Association Study
16.
Hum Mol Genet ; 31(22): 3873-3885, 2022 11 10.
Article in English | MEDLINE | ID: mdl-35766891

ABSTRACT

RATIONALE: Genetic variation has a substantial contribution to chronic obstructive pulmonary disease (COPD) and lung function measurements. Heritability estimates using genome-wide genotyping data can be biased if analyses do not appropriately account for the nonuniform distribution of genetic effects across the allele frequency and linkage disequilibrium (LD) spectrum. In addition, the contribution of rare variants has been unclear. OBJECTIVES: We sought to assess the heritability of COPD and lung function using whole-genome sequence data from the Trans-Omics for Precision Medicine program. METHODS: Using the genome-based restricted maximum likelihood method, we partitioned the genome into bins based on minor allele frequency and LD scores and estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio in 11 051 European ancestry and 5853 African-American participants. MEASUREMENTS AND MAIN RESULTS: In European ancestry participants, the estimated heritability of COPD, FEV1% predicted and FEV1/FVC ratio were 35.5%, 55.6% and 32.5%, of which 18.8%, 19.7%, 17.8% were from common variants, and 16.6%, 35.8%, and 14.6% were from rare variants. These estimates had wide confidence intervals, with common variants and some sets of rare variants showing a statistically significant contribution (P-value < 0.05). In African-Americans, common variant heritability was similar to European ancestry participants, but lower sample size precluded calculation of rare variant heritability. CONCLUSIONS: Our study provides updated and unbiased estimates of heritability for COPD and lung function, and suggests an important contribution of rare variants. Larger studies of more diverse ancestry will improve accuracy of these estimates.


Subject(s)
Genetic Predisposition to Disease , Pulmonary Disease, Chronic Obstructive , Humans , Polymorphism, Single Nucleotide/genetics , Pulmonary Disease, Chronic Obstructive/genetics , Genome-Wide Association Study , Phenotype
17.
PLoS One ; 17(5): e0266752, 2022.
Article in English | MEDLINE | ID: mdl-35544468

ABSTRACT

To increase power and minimize bias in statistical analyses, quantitative outcomes are often adjusted for precision and confounding variables using standard regression approaches. The outcome is modeled as a linear function of the precision variables and confounders; however, for many complex phenotypes, the assumptions of the linear regression models are not always met. As an alternative, we used neural networks for the modeling of complex phenotypes and covariate adjustments. We compared the prediction accuracy of the neural network models to that of classical approaches based on linear regression. Using data from the UK Biobank, COPDGene study, and Childhood Asthma Management Program (CAMP), we examined the features of neural networks in this context and compared them with traditional regression approaches for prediction of three outcomes: forced expiratory volume in one second (FEV1), age at smoking cessation, and log transformation of age at smoking cessation (due to age at smoking cessation being right-skewed). We used mean squared error to compare neural network and regression models, and found the models performed similarly unless the observed distribution of the phenotype was skewed, in which case the neural network had smaller mean squared error. Our results suggest neural network models have an advantage over standard regression approaches when the phenotypic distribution is skewed. However, when the distribution is not skewed, the approaches performed similarly. Our findings are relevant to studies that analyze phenotypes that are skewed by nature or where the phenotype of interest is skewed as a result of the ascertainment condition.


Subject(s)
Neural Networks, Computer , Smoking , Forced Expiratory Volume/genetics , Phenotype , Spirometry
18.
Nat Commun ; 13(1): 2979, 2022 05 27.
Article in English | MEDLINE | ID: mdl-35624101

ABSTRACT

Neutralization capacity of antibodies against Omicron after a prior SARS-CoV-2 infection in children and adolescents is not well studied. Therefore, we evaluated virus-neutralizing capacity against SARS-CoV-2 Alpha, Beta, Gamma, Delta and Omicron variants by age-stratified analyses (<5, 5-11, 12-21 years) in 177 pediatric patients hospitalized with severe acute COVID-19, acute MIS-C, and in convalescent samples of outpatients with mild COVID-19 during 2020 and early 2021. Across all patients, fewer than 10% show neutralizing antibody titers against Omicron. Children <5 years of age hospitalized with severe acute COVID-19 have lower neutralizing antibodies to SARS-CoV-2 variants compared with patients >5 years of age. As expected, convalescent pediatric COVID-19 and MIS-C cohorts demonstrate higher neutralization titers than hospitalized acute COVID-19 patients. Overall, children and adolescents show some loss of cross-neutralization against all variants, with the most pronounced loss against Omicron. In contrast to SARS-CoV-2 infection, children vaccinated twice demonstrated higher titers against Alpha, Beta, Gamma, Delta and Omicron. These findings can influence transmission, re-infection and the clinical disease outcome from emerging SARS-CoV-2 variants and support the need for vaccination in children.


Subject(s)
COVID-19 , SARS-CoV-2 , Adolescent , Antibodies, Viral , COVID-19/complications , Child , Child, Preschool , Humans , Membrane Glycoproteins , Neutralization Tests , Spike Glycoprotein, Coronavirus , Systemic Inflammatory Response Syndrome , Viral Envelope Proteins
19.
Mol Psychiatry ; 27(4): 1963-1969, 2022 04.
Article in English | MEDLINE | ID: mdl-35246634

ABSTRACT

Alzheimer's disease (AD) is a genetically complex disease for which nearly 40 loci have now been identified via genome-wide association studies (GWAS). We attempted to identify groups of rare variants (alternate allele frequency <0.01) associated with AD in a region-based, whole-genome sequencing (WGS) association study (rvGWAS) of two independent AD family datasets (NIMH/NIA; 2247 individuals; 605 families). Employing a sliding window approach across the genome, we identified several regions that achieved association p values <10⁻⁶, using the burden test or the SKAT statistic. The genomic region around the dystobrevin beta (DTNB) gene was identified with the burden and SKAT test and replicated in case/control samples from the ADSP study reaching genome-wide significance after meta-analysis (p_meta = 4.74 × 10⁻⁸). SKAT analysis also revealed region-based association around the Discs large homolog 2 (DLG2) gene and replicated in case/control samples from the ADSP study (p_meta = 1 × 10⁻⁶). In conclusion, in a region-based rvGWAS of AD we identified two novel AD genes, DLG2 and DTNB, based on association with rare variants.
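The sliding-window, region-based idea in this abstract can be sketched as follows: variants are grouped into overlapping genomic windows, and a burden score collapses each individual's alternate alleles within a window into a single count. The function names, the window size and step parameters, and the genotype layout (one row per individual, entries 0, 1 or 2) are assumptions for illustration. The abstract does not state the window parameters, and the family-based burden and SKAT statistics used in the study are not reproduced here.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    positions: Sequence[int], window_size: int, step: int
) -> List[Tuple[int, int, List[int]]]:
    """Return (start, end, variant_indices) for each non-empty window.

    A variant at position pos belongs to a window when start <= pos < end.
    """
    windows = []
    start = min(positions)
    last = max(positions)
    while start <= last:
        end = start + window_size
        members = [i for i, pos in enumerate(positions) if start <= pos < end]
        if members:  # skip windows that contain no variants
            windows.append((start, end, members))
        start += step
    return windows


def burden_scores(
    genotypes: Sequence[Sequence[int]], variant_indices: Sequence[int]
) -> List[int]:
    """Sum alternate-allele counts over the given variants for each individual."""
    return [sum(row[j] for j in variant_indices) for row in genotypes]
```

In a full analysis, the per-window burden scores would then be tested for association with disease status; that step depends on the study design and is left out of this sketch.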


Subject(s)
Alzheimer Disease , Dystrophin-Associated Proteins/genetics , Neuropeptides/genetics , Alzheimer Disease/genetics , Dithionitrobenzoic Acid , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Genomics , Guanylate Kinases/genetics , Humans , Polymorphism, Single Nucleotide/genetics , Tumor Suppressor Proteins/genetics , Whole Genome Sequencing
20.
Genet Epidemiol ; 45(7): 685-693, 2021 10.
Article in English | MEDLINE | ID: mdl-34159627

ABSTRACT

SARS-CoV-2 mortality has been extensively studied in relation to host susceptibility. How sequence variations in the SARS-CoV-2 genome affect pathogenicity is poorly understood. Starting in October 2020, using the methodology of genome-wide association studies (GWAS), we looked at the association between whole-genome sequencing (WGS) data of the virus and COVID-19 mortality as a potential method of early identification of highly pathogenic strains to target for containment. Although continuously updating our analysis, in December 2020, we analyzed 7548 single-stranded SARS-CoV-2 genomes of COVID-19 patients in the GISAID database and associated variants with mortality using a logistic regression. In total, evaluating 29,891 sequenced loci of the viral genome for association with patient/host mortality, two loci, at 12,053 and 25,088 bp, achieved genome-wide significance (p values of 4.09e-09 and 4.41e-23, respectively), though only 25,088 bp remained significant in follow-up analyses. Our association findings were exclusively driven by the samples that were submitted from Brazil (p value of 4.90e-13 for 25,088 bp). The mutation frequency of 25,088 bp in the Brazilian samples on GISAID has rapidly increased from about 0.4 in October/December 2020 to 0.77 in March 2021. Although GWAS methodology is suitable for samples in which mutation frequencies vary between geographical regions, it cannot account for mutation frequencies that change rapidly over time, rendering a GWAS follow-up analysis of the GISAID samples that have been submitted after December 2020 invalid. The locus at 25,088 bp is located in the P.1 strain, which later (April 2021) became one of the distinguishing loci (precisely, substitution V1176F) of the Brazilian strain as defined by the Centers for Disease Control. Specifically, the mutations at 25,088 bp occur in the S2 subunit of the SARS-CoV-2 spike protein, which plays a key role in viral entry of target host cells. Since the mutations alter amino acid coding sequences, they potentially impose structural changes that could enhance viral infectivity and symptom severity. Our analysis suggests that GWAS methodology can provide suitable analysis tools for the real-time detection of new, more transmissible and pathogenic viral strains in databases such as GISAID, though new approaches are needed to accommodate rapidly changing mutation frequencies over time, in the presence of simultaneously changing case/control ratios. Improvements of the associated metadata/patient information in terms of quality and availability will also be important to fully utilize the potential of GWAS methodology in this field.
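The per-locus association test described in this abstract can be illustrated with a simplified case. For a logistic regression with a single binary predictor and no covariates, the fitted coefficient equals the log odds ratio of the 2x2 table of allele by outcome, and its Wald standard error is the square root of the summed reciprocal cell counts. The sketch below assumes that unadjusted setting; the function and argument names are hypothetical, the study's actual model may have included covariates, and the Bonferroni cutoff is shown only as one common way to define genome-wide significance, not as the threshold the authors used.

```python
import math
from typing import Tuple


def allele_mortality_association(
    dead_alt: int, dead_ref: int, alive_alt: int, alive_ref: int
) -> Tuple[float, float, float]:
    """Return (log odds ratio, standard error, two-sided Wald p value).

    Equivalent to an unadjusted logistic regression of mortality on a
    binary alternate-allele indicator. All four cell counts must be positive.
    """
    cells = (dead_alt, dead_ref, alive_alt, alive_ref)
    if any(c <= 0 for c in cells):
        raise ValueError("all cell counts must be positive")
    log_or = math.log((dead_alt * alive_ref) / (dead_ref * alive_alt))
    se = math.sqrt(sum(1.0 / c for c in cells))
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return log_or, se, p_value


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance cutoff after Bonferroni correction (an assumption)."""
    return alpha / n_tests
```

A locus would be flagged under this sketch when its p value falls below bonferroni_threshold(29891), where 29,891 is the number of loci evaluated according to the abstract.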


Subject(s)
COVID-19 , Spike Glycoprotein, Coronavirus , Brazil , Genome-Wide Association Study , Humans , Mutation , Phylogeny , SARS-CoV-2 , Spike Glycoprotein, Coronavirus/genetics