ABSTRACT
During the course of the COVID-19 pandemic, large-scale genome sequencing of SARS-CoV-2 has been useful in tracking its spread and in identifying variants of concern (VOC). Viral and host factors could contribute to variability within a host that can be captured in next-generation sequencing reads as intra-host single nucleotide variations (iSNVs). Analysing 1347 samples collected until June 2020, we recorded 16 410 iSNV sites throughout the SARS-CoV-2 genome. We found ~42% of the iSNV sites to be reported as SNVs by 30 September 2020 in consensus sequences submitted to GISAID, which increased to ~80% by 30 June 2021. Following this, analysis of another set of 1774 samples sequenced in India between November 2020 and May 2021 revealed that the majority of the Delta (B.1.617.2) and Kappa (B.1.617.1) lineage-defining variations appeared as iSNVs before becoming fixed in the population. In addition, mutations in RdRp as well as RNA editing by APOBEC and ADAR deaminases seem to contribute to the differential prevalence of iSNVs in hosts. We also observe hyper-variability at functionally critical residues in the Spike protein that could alter antigenicity and may contribute to immune escape. Thus, tracking and functional annotation of iSNVs in ongoing genome surveillance programs could be important for early identification of potential variants of concern and actionable interventions.
Subject(s)
Molecular Evolution, Genetic Variation/genetics, Viral Genome/genetics, Host-Pathogen Interactions/genetics, SARS-CoV-2/genetics, APOBEC-1 Deaminase/genetics, Adenosine Deaminase/genetics, Animals, COVID-19/epidemiology, COVID-19/prevention & control, COVID-19/virology, Chlorocebus aethiops, Coronavirus RNA-Dependent RNA Polymerase/genetics, Genetic Databases, Immune Evasion/genetics, India/epidemiology, Phylogeny, RNA-Binding Proteins/genetics, SARS-CoV-2/classification, SARS-CoV-2/growth & development, Coronavirus Spike Glycoprotein/genetics, Vero Cells
ABSTRACT
The recent release of SARS-CoV-2 genomic data from several countries has provided clues into the potential antigenic drift of the coronavirus population. In particular, the genomic instability observed in the spike protein necessitates immediate action and further exploration in the context of viral-host interactions. By temporally tracking 527,988 SARS-CoV-2 genomes, we identified invariant and hypervariable regions within the spike protein. We evaluated combinations of mutations from SARS-CoV-2 lineages and found that the largest number of lineage-defining mutations were present in the N-terminal domain (NTD). Based on mutant 3D structural models of known Variants of Concern (VOCs), we found that structural properties such as accessibility, secondary structure type, and intra-protein interactions at local mutation sites are greatly altered. Further, we observed significant differences between the intra-protein networks of the wild type and the Delta mutant, with the latter showing dense intra-protein contacts. Extensive molecular dynamics simulations of the D614G mutant spike structure with hACE2 further revealed dynamic features, with 47.7% of mutations mapping to flexible regions of the spike protein. Thus, we propose that significant changes within the spike protein structure have occurred that may impact SARS-CoV-2 pathogenesis, and that repositioning of vaccine candidates is required to contain the spread of the COVID-19 pathogen.