ABSTRACT
Background: The current SARS-CoV-2 pandemic has emphasized the utility of viral whole genome sequencing in the surveillance and control of the pathogen. An unprecedented ongoing global initiative is producing hundreds of thousands of sequences worldwide. However, the complex circumstances in which viruses are sequenced, along with the demand for urgent results, cause a high rate of incomplete, and therefore useless, sequences. Nevertheless, viral sequences evolve in the context of a complex phylogeny, and therefore different positions along the genome are in linkage disequilibrium. Consequently, an imputation method should be able to predict missing positions from the available sequencing data. Results: We developed impuSARS, an application that includes Minimac, the most widely used strategy for genomic data imputation, and, taking advantage of the enormous amount of SARS-CoV-2 whole genome sequences available, built a reference panel containing 239,301 sequences. The impuSARS application was tested in a wide range of conditions (continuous fragments, amplicons or sparse individual positions missing), showing great fidelity when reconstructing the original sequences. The impuSARS application is also able to impute whole genomes from commercial kits covering less than 20% of the genome, or from the Spike protein alone, with a precision of 0.96. It also recovers the lineage with 100% precision for almost all lineages, even in very poorly covered genomes (< 20%). Conclusions: Imputation can improve the pace of SARS-CoV-2 sequencing production by recovering many incomplete or low-quality sequences that would otherwise be discarded. impuSARS can be incorporated in any primary data processing pipeline for SARS-CoV-2 whole genome sequencing.
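To illustrate why linkage between genome positions makes imputation feasible, a toy sketch follows. It is not Minimac and not the method used by impuSARS: it simply copies missing bases from the reference sequence that best matches the observed positions. The function name, the use of `N` for a missing base and the three-sequence panel are all hypothetical.

```python
def impute_from_nearest(query, panel, missing="N"):
    """Fill missing bases in `query` from the closest sequence in `panel`.

    Toy nearest-neighbour imputation (NOT the Minimac algorithm).
    All sequences are assumed to be aligned and of equal length.
    """
    # Positions where the query actually carries a base call
    observed = [i for i, base in enumerate(query) if base != missing]

    def mismatches(ref):
        # Hamming distance restricted to the observed positions
        return sum(query[i] != ref[i] for i in observed)

    best = min(panel, key=mismatches)
    # Keep observed bases; borrow missing ones from the best match
    return "".join(r if q == missing else q for q, r in zip(query, best))


# Hypothetical aligned reference panel and a query with three missing bases
panel = ["ACGTACGT", "ACGTTCGA", "TCGAACGT"]
query = "TCNNACNT"
imputed = impute_from_nearest(query, panel)
print(imputed)  # TCGAACGT
```

Because the observed bases match the third panel sequence exactly, its bases fill the gaps. Real imputation tools model mosaics of reference haplotypes rather than a single best match.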
ABSTRACT
Importance: The current demand for SARS-CoV-2 diagnosis is a challenge for clinical laboratories. Sample pooling may help reduce their workload. Objective: To evaluate the efficacy of sample pooling compared with individual analysis for the diagnosis of COVID-19, using different commercial platforms for nucleic acid extraction and amplification. Design and setting: Observational, prospective, multicentre study across 9 Spanish clinical microbiology laboratories, including SARS-CoV-2 RNA testing performed in April 2020 during the first three days after acceptance to participate. Participants and Methods: A total of 3519 naso-oro-pharyngeal samples received at the participating laboratories were processed individually and in pools (351 pools) according to the existing methodology in each of the centres. Results: We found that 253 pools (2519 samples) were negative and 99 pools (990 samples) were positive; with 241 positive samples (6.85%), our pooling strategy would have saved 2167 PCR tests. For 29 pools (comprising 290 samples), we found discordant results when compared with their corresponding individual samples: in 24/29 pools (30 samples), minor discordances were found; in five pools (5 samples), we found major discordances. Sensitivity, specificity, and positive and negative predictive values for pooling were 97.93%, 100%, 100% and 99.85%, respectively; accuracy was 99.86% and the kappa concordance coefficient was 0.988. As a result of the sample dilution effect of pooling, a loss of 2-3 Cts was observed for the E, N or RdRP genes. Conclusion: We show a high efficiency of pooling strategies for SARS-CoV-2 RNA testing across different RNA extraction and amplification platforms, with excellent performance in terms of sensitivity, specificity, and positive and negative predictive values.
We believe that our results may help clinical laboratories to respond to the current demand and clinical need for SARS-CoV-2 testing, especially for the screening of low-prevalence populations. Key points. Question: Can clinical laboratories implement sample pooling as an efficient and safe strategy for SARS-CoV-2 RT-PCR screening? Findings: Sensitivity, specificity, and positive and negative predictive values for pooling were 97.93%, 100%, 100% and 99.85%, respectively; accuracy was 99.86% and the kappa concordance coefficient was 0.988. Meaning: Sample pooling can be used safely at clinical laboratories, especially for the screening of low-prevalence populations.
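The reported pooling metrics can be approximately reproduced from a 2×2 table. The sketch below assumes a sample-level reading that the abstract does not state explicitly: 241 positive and 3278 negative samples out of 3519, with the five major discordances counted as false negatives and no false positives. The function and variable names are illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 confusion table."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


# Assumed counts: 241 positives, of which 5 were missed by pooling;
# 3519 - 241 = 3278 negatives, none called positive by pooling.
metrics = diagnostic_metrics(tp=236, fp=0, fn=5, tn=3278)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions, sensitivity, negative predictive value and accuracy round to the reported 97.93%, 99.85% and 99.86%, and kappa comes out close to the reported 0.988.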
ABSTRACT
After more than two years of the COVID-19 pandemic, SARS-CoV-2 remains a global public health problem. Successive waves of infection have produced new SARS-CoV-2 variants with new mutations whose impact on COVID-19 severity and patient survival is uncertain. A total of 764 SARS-CoV-2 genomes sequenced from COVID-19 patients hospitalized from 19th February 2020 to 30th April 2021, along with their clinical data, were used for survival analysis. A significant association of B.1.1.7, the Alpha lineage, with patient mortality (log hazard ratio, LHR = 0.51; CI = [0.14, 0.88]) was found after adjustment for all the covariates known to affect COVID-19 prognosis. Moreover, survival analysis of mutations in the SARS-CoV-2 genome identified 27 that were significantly associated with higher patient mortality. Most of these mutations were located in the S, ORF8 and N proteins. This study illustrates how a combination of genomic and clinical data provides solid evidence on the impact of viral lineage on patient survival.
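Because the association is reported on the log scale, exponentiating gives the hazard ratio itself. A small worked conversion, with variable names chosen only for illustration:

```python
import math

# Values reported in the abstract (log hazard ratio and its interval)
log_hr = 0.51
ci_low, ci_high = 0.14, 0.88

# Exponentiate to move from the log scale to the hazard-ratio scale
hazard_ratio = math.exp(log_hr)
hr_ci = (math.exp(ci_low), math.exp(ci_high))

print(f"HR = {hazard_ratio:.2f}, CI = [{hr_ci[0]:.2f}, {hr_ci[1]:.2f}]")
# HR = 1.67, CI = [1.15, 2.41]
```

The interval on the log scale excludes 0, so the interval on the hazard-ratio scale excludes 1, which is what makes the association significant.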
ABSTRACT
Given the highly variable clinical phenotype of Coronavirus disease 2019 (COVID-19), a deeper analysis of the host genetic contribution to severe COVID-19 is important to improve our understanding of underlying disease mechanisms. Here, we describe an extended GWAS meta-analysis of a well-characterized cohort of 3,260 COVID-19 patients with respiratory failure and 12,483 population controls from Italy, Spain, Norway and Germany/Austria, including stratified analyses based on age, sex and disease severity, as well as targeted analyses of chromosome Y haplotypes, the human leukocyte antigen (HLA) region and the SARS-CoV-2 peptidome. By inversion imputation, we traced a reported association at 17q21.31 to a highly pleiotropic ~0.9-Mb inversion polymorphism and characterized the potential effects of the inversion in detail. Our data, together with the 5th release of summary statistics from the COVID-19 Host Genetics Initiative, also identified a new locus at 19q13.33, including NAPSA, a gene expressed primarily in alveolar cells responsible for gas exchange in the lung.