ABSTRACT
SARS-CoV-2 mortality has been extensively studied in relation to host susceptibility. How sequence variations in the SARS-CoV-2 genome affect pathogenicity is poorly understood. Starting in October 2020, using the methodology of genome-wide association studies (GWAS), we looked at the association between whole-genome sequencing (WGS) data of the virus and COVID-19 mortality as a potential method of early identification of highly pathogenic strains to target for containment. Although we updated our analysis continuously, in December 2020 we analyzed 7548 single-stranded SARS-CoV-2 genomes of COVID-19 patients in the GISAID database and associated variants with mortality using a logistic regression. In total, evaluating 29,891 sequenced loci of the viral genome for association with patient/host mortality, two loci, at 12,053 and 25,088 bp, achieved genome-wide significance (p values of 4.09e-09 and 4.41e-23, respectively), though only 25,088 bp remained significant in follow-up analyses. Our association findings were exclusively driven by the samples that were submitted from Brazil (p value of 4.90e-13 for 25,088 bp). The mutation frequency of 25,088 bp in the Brazilian samples on GISAID has rapidly increased from about 0.4 in October/December 2020 to 0.77 in March 2021. Although GWAS methodology is suitable for samples in which mutation frequencies vary between geographical regions, it cannot account for mutation frequencies that change rapidly over time, rendering a GWAS follow-up analysis of the GISAID samples submitted after December 2020 invalid. The locus at 25,088 bp is present in the P.1 strain and later (April 2021) became one of the distinguishing loci (specifically, substitution V1176F) of the Brazilian strain as defined by the Centers for Disease Control. Specifically, the mutations at 25,088 bp occur in the S2 subunit of the SARS-CoV-2 spike protein, which plays a key role in viral entry into target host cells.
Since the mutations alter amino acid coding sequences, they potentially impose structural changes that could enhance viral infectivity and symptom severity. Our analysis suggests that GWAS methodology can provide suitable analysis tools for the real-time detection of new, more transmissible and pathogenic viral strains in databases such as GISAID, though new approaches are needed to accommodate mutation frequencies that change rapidly over time in the presence of simultaneously changing case/control ratios. Improvements in the quality and availability of the associated metadata/patient information will also be important to fully utilize the potential of GWAS methodology in this field.
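The per-locus association test described above can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the abstract: the model has a single binary predictor (mutant vs. reference allele) and no covariates, in which case the logistic regression Wald test reduces to a test on the log odds ratio of a 2x2 table; genome-wide significance is taken as a Bonferroni threshold of 0.05 divided by the 29,891 loci; and the counts are hypothetical, not taken from GISAID. The function name `single_locus_wald_test` is illustrative only.

```python
import math


def single_locus_wald_test(mut_dead, mut_alive, ref_dead, ref_alive):
    """Wald test on the log odds ratio of a 2x2 allele-by-outcome table.

    For a logistic regression with one binary predictor and no covariates,
    the fitted coefficient equals this log odds ratio and its standard error
    equals sqrt(1/a + 1/b + 1/c + 1/d). All four counts must be non-zero.
    """
    log_or = math.log((mut_dead * ref_alive) / (mut_alive * ref_dead))
    se = math.sqrt(1 / mut_dead + 1 / mut_alive + 1 / ref_dead + 1 / ref_alive)
    z = log_or / se
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return log_or, z, p


# Assumed Bonferroni correction over the number of tested loci.
N_LOCI = 29_891
ALPHA = 0.05
bonferroni_threshold = ALPHA / N_LOCI

# Hypothetical counts for one locus (not real data).
log_or, z, p = single_locus_wald_test(
    mut_dead=30, mut_alive=70, ref_dead=10, ref_alive=90
)
is_significant = p < bonferroni_threshold
print(f"log OR = {log_or:.3f}, z = {z:.2f}, p = {p:.2e}, "
      f"genome-wide significant: {is_significant}")
```

A real analysis would typically adjust for covariates such as region or sampling date, which requires fitting the full regression model rather than this closed-form shortcut.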
Subject(s)
COVID-19 , Spike Glycoprotein, Coronavirus , Brazil , Genome-Wide Association Study , Humans , Mutation , Phylogeny , SARS-CoV-2 , Spike Glycoprotein, Coronavirus/genetics
ABSTRACT
BACKGROUND: Asthma is a common, highly heterogeneous respiratory disorder that remains poorly understood. The objective was to use whole genome sequencing (WGS) data to identify regions of common genetic variation contributing to lung function in individuals with a diagnosis of asthma. METHODS: WGS data were generated for 1,053 individuals from trios and extended pedigrees participating in the family-based Genetic Epidemiology of Asthma in Costa Rica study. Asthma affection status was defined through a physician's diagnosis of asthma, and most participants with asthma also had airway hyperresponsiveness (AHR) to methacholine. Family-based association tests for single variants were performed to assess the associations with lung function phenotypes. RESULTS: A genome-wide significant association was identified between baseline FEV1/FVC ratio and a single-nucleotide polymorphism in the top-hit gene, cysteine-rich secretory protein LCCL domain-containing 2 (CRISPLD2) (rs12051168; P = 3.6 × 10^-8 in the unadjusted model), that retained suggestive significance in the covariate-adjusted model (P = 5.6 × 10^-6). Rs12051168 was also nominally associated with other related phenotypes: baseline FEV1 (P = 3.3 × 10^-3), postbronchodilator (PB) FEV1 (P = 7.3 × 10^-3), and PB FEV1/FVC ratio (P = 2.7 × 10^-3). The identified association between baseline FEV1/FVC ratio and rs12051168 was meta-analyzed and replicated in three independent cohorts in which most participants with asthma also had confirmed AHR (combined weighted z-score P = .015) but not in cohorts without information about AHR. CONCLUSIONS: These findings suggest that using specific asthma characteristics, such as AHR, can help identify more genetically homogeneous asthma subgroups with genotype-phenotype associations that may not be observed in all children with asthma. CRISPLD2 also may be important for baseline lung function in individuals with asthma who also may have AHR.
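The combined weighted z-score used for replication can be sketched as follows. This is a generic Stouffer-style combination, assuming each cohort is weighted by the square root of its sample size; the abstract does not state which weights were used, and the z-scores and sample sizes below are hypothetical. The function name `weighted_z_meta` is illustrative only.

```python
import math
from statistics import NormalDist


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort z-scores into one weighted z-score.

    Assumed weighting: w_i = sqrt(n_i). The combined statistic is
    sum(w_i * z_i) / sqrt(sum(w_i ** 2)), which is standard normal
    under the null hypothesis when the cohorts are independent.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z = numerator / denominator
    # Two-sided p value for the combined z-score.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical z-scores (signed by effect direction) and cohort sizes.
combined_z, combined_p = weighted_z_meta(
    z_scores=[1.2, 1.8, 0.9], sample_sizes=[400, 250, 600]
)
print(f"combined z = {combined_z:.3f}, p = {combined_p:.4f}")
```

Signing each z-score by the direction of effect keeps cohorts with opposite effects from reinforcing each other in the combined statistic.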
Subject(s)
Asthma/genetics , Asthma/physiopathology , Cell Adhesion Molecules/genetics , Forced Expiratory Volume/genetics , Interferon Regulatory Factors/genetics , Vital Capacity/genetics , Whole Genome Sequencing , Adolescent , Adult , Child , Child, Preschool , Costa Rica , Female , Humans , Male , Middle Aged , Respiratory Physiological Phenomena/genetics , Young Adult
ABSTRACT
Bipolar disorder (BD) is a major psychiatric illness affecting around 1% of the global population. BD is characterized by recurrent manic and depressive episodes, and has an estimated heritability of around 70%. Research has identified the first BD susceptibility genes. However, the underlying pathways and regulatory networks remain largely unknown. Research suggests that the cumulative impact of common alleles with small effects explains only around 25-38% of the phenotypic variance for BD. A plausible hypothesis therefore is that rare, high penetrance variants may contribute to BD risk. The present study investigated the role of rare, nonsynonymous, and potentially functional variants via whole exome sequencing in 15 BD cases from two large, multiply affected families from Cuba. The high prevalence of BD in these pedigrees renders them promising in terms of the identification of genetic risk variants with large effect sizes. In addition, SNP array data were used to calculate polygenic risk scores for affected and unaffected family members. After correction for multiple testing, no significant increase in polygenic risk scores for common, BD-associated genetic variants was found in BD cases compared to healthy relatives. Exome sequencing identified a total of 17 rare and potentially damaging variants in 17 genes. The identified variants were shared by all investigated BD cases in the respective pedigree. The most promising variant was located in the gene SERPING1 (p.L349F), which has been reported previously as a genome-wide significant risk gene for schizophrenia. The present data suggest novel candidate genes for BD susceptibility, and may facilitate the discovery of disease-relevant pathways and regulatory networks.
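A polygenic risk score of the kind mentioned above is, in its simplest form, a weighted sum of risk-allele counts. The sketch below assumes an additive model with per-variant effect sizes (for example, log odds ratios from a reference GWAS) and allele dosages of 0, 1, or 2; the abstract does not describe the scoring procedure in detail, and all values shown are hypothetical. The function name `polygenic_risk_score` is illustrative only.

```python
def polygenic_risk_score(effect_sizes, dosages):
    """Additive polygenic risk score: sum of effect size times allele dosage.

    effect_sizes: per-variant weights (assumed here to be log odds ratios).
    dosages: per-variant risk-allele counts for one individual (0, 1, or 2).
    """
    if len(effect_sizes) != len(dosages):
        raise ValueError("effect_sizes and dosages must have the same length")
    return sum(beta * dose for beta, dose in zip(effect_sizes, dosages))


# Hypothetical weights for three variants and genotypes for two relatives.
weights = [0.10, 0.05, -0.02]
affected_score = polygenic_risk_score(weights, [2, 1, 0])
unaffected_score = polygenic_risk_score(weights, [1, 0, 2])
print(f"affected: {affected_score:.2f}, unaffected: {unaffected_score:.2f}")
```

Scores computed this way for affected and unaffected family members can then be compared with a suitable test, taking relatedness into account.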