ABSTRACT
SARS-CoV-2 mortality has been extensively studied in relation to host susceptibility. How sequence variations in the SARS-CoV-2 genome affect pathogenicity is poorly understood. Starting in October 2020, using the methodology of genome-wide association studies (GWAS), we looked at the association between whole-genome sequencing (WGS) data of the virus and COVID-19 mortality as a potential method of early identification of highly pathogenic strains to target for containment. Although we updated our analysis continuously, in December 2020 we analyzed 7548 single-stranded SARS-CoV-2 genomes of COVID-19 patients in the GISAID database and associated variants with mortality using a logistic regression. In total, evaluating 29,891 sequenced loci of the viral genome for association with patient/host mortality, two loci, at 12,053 and 25,088 bp, achieved genome-wide significance (p values of 4.09e-09 and 4.41e-23, respectively), though only 25,088 bp remained significant in follow-up analyses. Our association findings were exclusively driven by the samples that were submitted from Brazil (p value of 4.90e-13 for 25,088 bp). The mutation frequency of 25,088 bp in the Brazilian samples on GISAID has rapidly increased from about 0.4 in October/December 2020 to 0.77 in March 2021. Although GWAS methodology is suitable for samples in which mutation frequencies vary between geographical regions, it cannot account for mutation frequencies that change rapidly over time, rendering a GWAS follow-up analysis of the GISAID samples submitted after December 2020 invalid. The locus at 25,088 bp is found in the P.1 strain and later (April 2021) became one of the distinguishing loci (precisely, substitution V1176F) of the Brazilian strain as defined by the Centers for Disease Control. Specifically, the mutations at 25,088 bp occur in the S2 subunit of the SARS-CoV-2 spike protein, which plays a key role in viral entry into target host cells.
Since the mutations alter amino acid coding sequences, they potentially impose structural changes that could enhance viral infectivity and symptom severity. Our analysis suggests that GWAS methodology can provide suitable analysis tools for the real-time detection of new, more transmissible and pathogenic viral strains in databases such as GISAID, though new approaches are needed to accommodate rapidly changing mutation frequencies over time, in the presence of simultaneously changing case/control ratios. Improvements of the associated metadata/patient information in terms of quality and availability will also be important to fully utilize the potential of GWAS methodology in this field.
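As a rough illustration of the per-locus test described in this abstract, the sketch below fits the simplest possible case: a logistic regression of mortality on a single binary variant indicator, for which the maximum-likelihood log odds ratio and its Wald standard error have closed forms from the 2x2 table. It also compares the reported p values against a Bonferroni threshold. The abstract does not state how genome-wide significance was defined or which covariates were used, so the 0.05 family-wise level, the Bonferroni correction, and the function names are assumptions made for illustration only.

```python
import math

N_LOCI = 29_891  # number of loci tested, as reported in the abstract
ALPHA = 0.05     # ASSUMPTION: family-wise error rate, not stated in the abstract


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def wald_test_binary(a: int, b: int, c: int, d: int):
    """Single-variant logistic regression via its closed form.

    a = deceased with variant,    b = survived with variant,
    c = deceased without variant, d = survived without variant.
    Returns (log odds ratio, standard error, two-sided Wald p value).
    All four counts must be positive.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return log_or, se, p


threshold = bonferroni_threshold(ALPHA, N_LOCI)

# p values reported in the abstract, keyed by genome position (bp)
reported = {12053: 4.09e-09, 25088: 4.41e-23}
significant = {pos: p < threshold for pos, p in reported.items()}
```

Under these assumptions the threshold is about 1.67e-06, and both reported loci fall below it. A real analysis would include covariates such as age, sex, and submitting region, which this closed form cannot accommodate.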
Subject(s)
COVID-19 , Coronavirus Spike Glycoprotein , Brazil , Genome-Wide Association Study , Humans , Mutation , Phylogeny , SARS-CoV-2 , Coronavirus Spike Glycoprotein/genetics
ABSTRACT
BACKGROUND: Asthma is a common respiratory disorder with a highly heterogeneous nature that remains poorly understood. The objective was to use whole genome sequencing (WGS) data to identify regions of common genetic variation contributing to lung function in individuals with a diagnosis of asthma. METHODS: WGS data were generated for 1,053 individuals from trios and extended pedigrees participating in the family-based Genetic Epidemiology of Asthma in Costa Rica study. Asthma affection status was defined through a physician's diagnosis of asthma, and most participants with asthma also had airway hyperresponsiveness (AHR) to methacholine. Family-based association tests for single variants were performed to assess the associations with lung function phenotypes. RESULTS: A genome-wide significant association was identified between baseline FEV1/FVC ratio and a single-nucleotide polymorphism in the top hit cysteine-rich secretory protein LCCL domain-containing 2 (CRISPLD2) (rs12051168; P = 3.6 × 10⁻⁸ in the unadjusted model) that retained suggestive significance in the covariate-adjusted model (P = 5.6 × 10⁻⁶). The variant rs12051168 was also nominally associated with other related phenotypes: baseline FEV1 (P = 3.3 × 10⁻³), postbronchodilator (PB) FEV1 (P = 7.3 × 10⁻³), and PB FEV1/FVC ratio (P = 2.7 × 10⁻³). The identified association between baseline FEV1/FVC ratio and rs12051168 was meta-analyzed and replicated in three independent cohorts in which most participants with asthma also had confirmed AHR (combined weighted z-score P = .015) but not in cohorts without information about AHR. CONCLUSIONS: These findings suggest that using specific asthma characteristics, such as AHR, can help identify more genetically homogeneous asthma subgroups with genotype-phenotype associations that may not be observed in all children with asthma. CRISPLD2 also may be important for baseline lung function in individuals with asthma who also may have AHR.
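The "combined weighted z-score" used for replication in this abstract can be sketched as a Stouffer-type meta-analysis. The version below converts each cohort's two-sided p value and effect direction into a signed z-score, weights it by the square root of the cohort sample size, and combines the results. The weighting scheme, the function names, and any inputs passed to them are assumptions made for illustration; the abstract does not report the per-cohort statistics or the exact weights used.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def p_to_signed_z(p: float, direction: int) -> float:
    """Convert a two-sided p value (0 < p <= 1) and an effect direction
    (+1 or -1) into a signed z-score."""
    return direction * _STD_NORMAL.inv_cdf(1 - p / 2)


def weighted_z_meta(p_values, directions, sample_sizes):
    """Sample-size-weighted z-score combination across cohorts.

    ASSUMPTION: weights are sqrt(n) per cohort, a common convention;
    the abstract does not say which weights were used.
    Returns (combined z, two-sided combined p value).
    """
    zs = [p_to_signed_z(p, d) for p, d in zip(p_values, directions)]
    ws = [math.sqrt(n) for n in sample_sizes]
    z = sum(w * zi for w, zi in zip(ws, zs)) / math.sqrt(sum(w * w for w in ws))
    p = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    return z, p
```

For example, `weighted_z_meta([0.05, 0.05, 0.05], [1, 1, 1], [100, 100, 100])` combines three hypothetical cohorts with concordant effects into a stronger joint signal, whereas two equally sized cohorts with opposite directions cancel to a combined z of zero.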
Subject(s)
Asthma/genetics , Asthma/physiopathology , Cell Adhesion Molecules/genetics , Forced Expiratory Volume/genetics , Interferon Regulatory Factors/genetics , Vital Capacity/genetics , Whole Genome Sequencing , Adolescent , Adult , Child , Preschool Child , Costa Rica , Female , Humans , Male , Middle Aged , Respiratory Physiological Phenomena/genetics , Young Adult
ABSTRACT
INTRODUCTION: Cigarette smoking is a major environmental risk factor for many diseases, including chronic obstructive pulmonary disease (COPD). There are shared genetic influences on cigarette smoking and COPD. Genetic risk factors for cigarette smoking in cohorts enriched for COPD are largely unknown. METHODS: We performed genome-wide association analyses for average cigarettes per day (CPD) across the Genetic Epidemiology of COPD (COPDGene) non-Hispanic white (NHW) (n = 6659) and African American (AA) (n = 3260), GenKOLS (the Genetics of Chronic Obstructive Lung Disease) (n = 1671), and ECLIPSE (the Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) (n = 1942) cohorts. In addition, we performed exome array association analyses across the COPDGene NHW and AA cohorts. We considered analyses across the entire cohort and stratified by COPD case-control status. RESULTS: We identified genome-wide significant associations for CPD on chromosome 15q25 across all cohorts (lowest p = 1.78 × 10⁻¹⁵), except in the COPDGene AA cohort alone. Previously reported associations on chromosome 19 had suggestive and directionally consistent associations (RAB4, p = 1.95 × 10⁻⁶; CYP2A7, p = 7.50 × 10⁻⁵; CYP2B6, p = 4.04 × 10⁻⁴). When we stratified by COPD case-control status, single nucleotide polymorphisms on chromosome 15q25 were nominally associated with both NHW COPD cases (β = 0.11, p = 5.58 × 10⁻⁴) and controls (β = 0.12, p = 3.86 × 10⁻⁵). For the gene-based exome array association analysis of rare variants, there were no exome-wide significant associations. For these previously replicated associations, the most significant results were among COPDGene NHW subjects for CYP2A7 (p = 5.2 × 10⁻⁴).
CONCLUSIONS: In a large genome-wide association study of both common variants and a gene-based association of rare coding variants in ever-smokers, we found genome-wide significant associations on chromosome 15q25 with CPD for common variants, but not for rare coding variants. These results were directionally consistent among COPD cases and controls. IMPLICATIONS: We examined both common and rare coding variants associated with CPD in a large population of heavy smokers with and without COPD of NHW and AA descent. We replicated genome-wide significant associations on chromosome 15q25 with CPD for common variants among NHW subjects, but not for rare variants. We demonstrated for the first time that common variants on chromosome 15q25 associated with CPD are similar among COPD cases and controls. Previously reported associations on chromosome 19 showed suggestive and directionally consistent associations among common variants (RAB4, CYP2A7, and CYP2B6) and for rare variants (CYP2A7) among COPDGene NHW subjects. Although the genetic effect sizes for these single nucleotide polymorphisms on chromosome 15q25 are modest, we show that this creates a substantial smoking burden over the lifetime of a smoker.
Subject(s)
Ethnicity/genetics , Genetic Markers , Genetic Predisposition to Disease , Single Nucleotide Polymorphism , Chronic Obstructive Pulmonary Disease/etiology , Smokers/statistics & numerical data , Smoking/genetics , Adult , Aged , Aged 80 and over , Aryl Hydrocarbon Hydroxylases/genetics , Case-Control Studies , Cytochrome P-450 CYP2B6/genetics , Cytochrome P450 Family 2/genetics , Europe/epidemiology , Female , Genome-Wide Association Study/methods , Humans , Longitudinal Studies , Male , Middle Aged , Prevalence , Prognosis , Chronic Obstructive Pulmonary Disease/epidemiology , Chronic Obstructive Pulmonary Disease/pathology , Smoking/adverse effects , Smoking/epidemiology , United States/epidemiology , rab4 GTP-Binding Proteins/genetics
ABSTRACT
RATIONALE: Traditional genome-wide association studies (GWASs) of large cohorts of subjects with chronic obstructive pulmonary disease (COPD) have successfully identified novel candidate genes, but several other plausible loci do not meet strict criteria for genome-wide significance after correction for multiple testing. OBJECTIVES: The authors hypothesise that by applying unbiased weights derived from unique populations, additional COPD susceptibility loci can be identified. METHODS: The authors performed a homozygosity haplotype analysis on a group of subjects with and without COPD to identify regions of conserved homozygosity haplotype (RCHHs). Weights were constructed based on the frequency of these RCHHs in cases versus controls, and used to adjust the p values from a large collaborative GWAS of COPD. RESULTS: The authors identified 2318 RCHHs, of which 576 were significantly (p<0.05) over-represented in cases. After applying the weights constructed from these regions to a collaborative GWAS of COPD, the authors identified two single nucleotide polymorphisms (SNPs) in a novel gene (fibroblast growth factor-7 (FGF7)) that gained genome-wide significance by the false discovery rate method. In a follow-up analysis, both SNPs (rs12591300 and rs4480740) were significantly associated with COPD in an independent population (combined p values of 7.9E-7 and 2.8E-6, respectively). In another independent population, increased lung tissue FGF7 expression was associated with worse measures of lung function. CONCLUSION: Weights constructed from a homozygosity haplotype analysis of an isolated population successfully identify novel genetic associations from a GWAS on a separate population. This method can be used to identify promising candidate genes that fail to meet strict correction for multiple testing.
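One way to picture how externally derived weights can change which loci pass a false discovery rate criterion is a weighted Benjamini-Hochberg procedure: each p value is divided by its weight (with weights rescaled to average 1) before the usual step-up rule is applied. The sketch below is a generic version of that idea, not the authors' method. The abstract does not describe how the RCHH-based weights were constructed or exactly how they entered the FDR calculation, so the function name, the rescaling, and the 0.05 FDR level are assumptions.

```python
def weighted_bh(p_values, weights, q=0.05):
    """Weighted Benjamini-Hochberg step-up procedure.

    ASSUMPTION: weights are positive and are rescaled here to have mean 1;
    each p value is divided by its weight before the standard BH rule.
    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    mean_w = sum(weights) / m
    norm_w = [w / mean_w for w in weights]
    adjusted = [min(1.0, p / w) for p, w in zip(p_values, norm_w)]

    # Find the largest rank k with adjusted p_(k) <= q * k / m
    order = sorted(range(m), key=lambda i: adjusted[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if adjusted[idx] <= q * rank / m:
            k = rank

    rejected = set(order[:k])
    return [i in rejected for i in range(m)]
```

With hypothetical inputs `p_values = [0.03, 0.03, 0.5, 0.5]`, equal weights reject nothing at q = 0.05, while weights of `[3, 0.5, 0.25, 0.25]` reject only the first hypothesis. This mirrors the abstract's point that up-weighting loci inside prioritised regions can lift a borderline signal over the threshold, at the cost of down-weighting loci elsewhere.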