Results 1 - 17 of 17
1.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37779249

ABSTRACT

To contain infectious diseases, it is crucial to determine the origin and transmission routes of the pathogen, as well as how the virus evolves. With the development of genome sequencing technology, genome epidemiology has emerged as a powerful approach for investigating the source and transmission of pathogens. In this study, we first presented the rationale for genomic tracing of SARS-CoV-2 and the challenges we currently face. Identifying the most genetically similar reference sequence to the query sequence is a critical step in genome tracing, typically achieved using either a phylogenetic tree or a sequence similarity search. However, these methods become inefficient or computationally prohibitive when dealing with tens of millions of sequences in the reference database, as we encountered during the COVID-19 pandemic. To address this challenge, we developed a novel genomic tracing algorithm capable of processing 6 million SARS-CoV-2 sequences in less than a minute. Instead of constructing a giant phylogenetic tree, we devised a weighted scoring system based on mutation characteristics to quantify sequence similarity. The developed method demonstrated superior performance compared to previous methods. Additionally, an online platform was developed to facilitate genomic tracing and visualization of the spatiotemporal distribution of sequences. The method will be a valuable addition to standard epidemiological investigations, enabling more efficient genomic tracing. Furthermore, the computational framework can be easily adapted to other pathogens, paving the way for routine genomic tracing of infectious diseases.
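
The abstract does not spell out how the weighted scoring system is defined. As a rough illustration of the general idea only, the sketch below represents each genome as a set of mutations relative to a common reference and weights each mutation by its rarity in the database. The inverse-frequency weights, the penalty for unshared mutations, and all function names are assumptions made for this example, not the authors' method.

```python
from math import log

def mutation_weights(reference_db):
    """Assumed weighting: rare mutations are more informative than common ones."""
    n = len(reference_db)
    counts = {}
    for mutations in reference_db.values():
        for m in mutations:
            counts[m] = counts.get(m, 0) + 1
    return {m: log((n + 1) / c) for m, c in counts.items()}

def similarity(query, ref, weights, default_weight):
    """Reward shared mutations and penalize mutations found in only one genome."""
    shared = query & ref
    differing = query ^ ref
    reward = sum(weights.get(m, default_weight) for m in shared)
    penalty = sum(weights.get(m, default_weight) for m in differing)
    return reward - penalty

def closest(query, reference_db):
    """Return the name of the reference genome with the highest score."""
    weights = mutation_weights(reference_db)
    # Mutations absent from the database get the weight of a singleton.
    default_weight = log(len(reference_db) + 1)
    return max(
        reference_db,
        key=lambda name: similarity(query, reference_db[name], weights, default_weight),
    )

# Hypothetical toy database: genome name -> set of mutations.
reference_db = {
    "refA": {"C241T", "A23403G"},
    "refB": {"C241T", "A23403G", "G28881A"},
    "refC": {"C241T"},
}
query = {"C241T", "A23403G", "G28881A"}
best = closest(query, reference_db)
```

Because the score is computed from set operations on mutations rather than from a tree, each comparison is independent of the others, which is the kind of property that makes a scan over millions of references feasible.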


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/genetics , Phylogeny , Pandemics , Viral Genome , Genomics/methods
2.
Bioinformatics ; 40(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38426352

ABSTRACT

MOTIVATION: Intra-host variants refer to genetic variations or mutations that occur within an individual host organism. These variants are typically studied in the context of viruses, bacteria, or other pathogens to understand the evolution of pathogens. Moreover, intra-host variants are also explored in the field of tumor biology and mitochondrial biology to characterize somatic mutations and inherited heteroplasmic mutations. Intra-host variants can involve long insertions, deletions, and combinations of different mutation types, which poses challenges in their identification. The performance of current methods in detecting complex intra-host variants is unknown. RESULTS: First, we simulated a dataset comprising 10 samples with 1869 intra-host variants involving various mutation patterns and benchmarked current variant detection software. The results indicated that although current software can detect most variants with F1-scores between 0.76 and 0.97, their performance in detecting long indels and low frequency variants was limited. Thus, we developed new software, PySNV, for the detection of complex intra-host variations. On the simulated dataset, PySNV successfully detected 1863 variant cases (F1-score: 0.99) and exhibited the highest Pearson correlation coefficient (PCC: 0.99) to the ground truth in predicting variant frequencies. The results demonstrated that PySNV delivered promising performance even for long indels and low frequency variants, while maintaining computational speed comparable to other methods. Finally, we tested its performance on SARS-CoV-2 replicate sequencing data and found that it reported 21% more variants compared to LoFreq, the best-performing benchmarked software, while showing higher consistency (62% over 54%) within replicates. The discrepancies mostly exist in low-depth regions and low frequency variants. AVAILABILITY AND IMPLEMENTATION: https://github.com/bnuLyndon/PySNV/.
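
For readers unfamiliar with the two benchmark metrics named above, the following sketch shows how an F1-score and a Pearson correlation coefficient can be computed for a variant caller against a simulated ground truth. It is a generic illustration, not code from PySNV; the tuple layout for a variant and the function names are assumptions made here.

```python
from math import sqrt

def f1_score(truth, called):
    """F1 of a call set against the truth set; variants are hashable tuples."""
    tp = len(truth & called)   # true positives: called and real
    fp = len(called - truth)   # false positives: called but not real
    fn = len(truth - called)   # false negatives: real but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (undefined if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical variants as (position, ref allele, alt allele).
truth = {(100, "A", "G"), (200, "C", "T"), (300, "G", "GA")}
called = {(100, "A", "G"), (200, "C", "T"), (400, "T", "C")}
score = f1_score(truth, called)

# Hypothetical true vs. estimated frequencies for the shared variants.
r = pearson([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])
```

The F1-score measures whether the right variants were reported, while the correlation measures how well the reported frequencies track the simulated ones, so a tool can do well on one and poorly on the other.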


Subject(s)
High-Throughput Nucleotide Sequencing , Software , High-Throughput Nucleotide Sequencing/methods , Mutation , INDEL Mutation , Genetic Variation
3.
BMC Biol ; 20(1): 225, 2022 10 08.
Article in English | MEDLINE | ID: mdl-36209213

ABSTRACT

BACKGROUND: Shotgun metagenomic sequencing has greatly expanded the understanding of microbial communities in various biological niches. However, it is still challenging to efficiently convert sub-nanogram DNA to high-quality metagenomic libraries and obtain high-fidelity data, hindering the exploration of niches with low microbial biomass. RESULTS: To cope with this challenge comprehensively, we evaluated the performance of various library preparation methods on 0.5 pg-5 ng synthetic microbial community DNA, characterized contaminants, and further applied different in silico decontamination methods. First, we discovered that whole genome amplification (WGA) prior to library construction led to worse outcomes than preparing libraries directly. Among the non-WGA-based library preparation methods, the endonuclease-based method performed well across different amounts of template, and the tagmentation-based method showed specific advantages with 0.5 pg of template, based on evaluation metrics including fidelity, proportion of designated reads, and reproducibility. The load of contaminating DNA introduced by library preparation varied from 0.01 to 15.59 pg for different kits and accounted for 0.05 to 45.97% of total reads. A considerable fraction of the contaminating reads were mapped to human commensal and pathogenic microbes, thus potentially leading to erroneous conclusions in human microbiome studies. Furthermore, the best performing in silico decontamination method in our evaluation, Decontam-either, was capable of recovering the real microbial community from libraries where contaminants accounted for less than 10% of total reads, but not from libraries with heavy and highly varied contaminants. CONCLUSIONS: This study demonstrates that high-quality metagenomic data can be obtained from samples with sub-nanogram microbial DNA by combining appropriate library preparation and in silico decontamination methods and provides a general reference for method selection for samples with varying microbial biomass.


Subject(s)
Decontamination , Metagenomics , DNA/genetics , Endonucleases/genetics , Gene Library , High-Throughput Nucleotide Sequencing/methods , Humans , Metagenomics/methods , Reproducibility of Results , DNA Sequence Analysis/methods
4.
Clin Infect Dis ; 71(15): 713-720, 2020 07 28.
Article in English | MEDLINE | ID: mdl-32129843

ABSTRACT

BACKGROUND: A novel coronavirus (CoV), severe acute respiratory syndrome (SARS)-CoV-2, has infected >75 000 individuals and spread to >20 countries. It is still unclear how fast the virus evolved and how it interacts with other microorganisms in the lung. METHODS: We have conducted metatranscriptome sequencing for bronchoalveolar lavage fluid samples from 8 patients with SARS-CoV-2, and also analyzed data from 25 patients with community-acquired pneumonia (CAP), and 20 healthy controls for comparison. RESULTS: The median number of intrahost variants was 1-4 in SARS-CoV-2-infected patients, ranging from 0 to 51 in different samples. The distribution of variants on genes was similar to that observed in the population data. However, very few intrahost variants were observed in the population as polymorphisms, implying either a bottleneck or purifying selection involved in the transmission of the virus, or a consequence of the limited diversity represented in the current polymorphism data. Although current evidence did not support the transmission of intrahost variants in a possible person-to-person spread, the risk should not be overlooked. Microbiotas in SARS-CoV-2-infected patients were similar to those in CAP, either dominated by the pathogens or with elevated levels of oral and upper respiratory commensal bacteria. CONCLUSION: SARS-CoV-2 evolves in vivo after infection, which may affect its virulence, infectivity, and transmissibility. Although how the intrahost variant spreads in the population is still elusive, it is necessary to strengthen the surveillance of the viral evolution in the population and associated clinical changes.


Subject(s)
Coronavirus Infections/epidemiology , Coronavirus , Pandemics , Viral Pneumonia/epidemiology , Severe Acute Respiratory Syndrome , Betacoronavirus , COVID-19 , Genetic Variation , Genomics , Humans , SARS-CoV-2
6.
Cell Host Microbe ; 32(1): 25-34.e5, 2024 Jan 10.
Article in English | MEDLINE | ID: mdl-38029742

ABSTRACT

Emerging SARS-CoV-2 sub-lineages like XBB.1.5, XBB.1.16, EG.5, HK.3 (FLip), and XBB.2.3 and the variant BA.2.86 have recently been identified. Understanding the efficacy of current vaccines on these emerging variants is critical. We evaluate the serum neutralization activities of participants who received COVID-19 inactivated vaccine (CoronaVac), those who received the recently approved tetravalent protein vaccine (SCTV01E), or those who had contracted a breakthrough infection with BA.5/BF.7/XBB virus. Neutralization profiles against a broad panel of 30 sub-lineages reveal that BQ.1.1, CH.1.1, and all the XBB sub-lineages exhibit heightened resistance to neutralization compared to previous variants. However, despite their extra mutations, BA.2.86 and the emerging XBB sub-lineages do not demonstrate significantly increased resistance to neutralization over XBB.1.5. Encouragingly, the SCTV01E booster consistently induces higher neutralizing titers against all these variants than breakthrough infection does. Cellular immunity assays also show that the SCTV01E booster elicits a higher frequency of virus-specific memory B cells. Our findings support the development of multivalent vaccines to combat future variants.


Subject(s)
Breakthrough Infections , COVID-19 Vaccines , COVID-19 , Secondary Immunization , Humans , COVID-19/prevention & control , SARS-CoV-2/genetics , Neutralizing Antibodies , Antiviral Antibodies
7.
Nat Ecol Evol ; 7(9): 1457-1466, 2023 09.
Article in English | MEDLINE | ID: mdl-37443189

ABSTRACT

Mutations in the SARS-CoV-2 genome could confer resistance to pre-existing antibodies and/or increased transmissibility. The recently emerged Omicron subvariants exhibit a strong tendency for immune evasion, suggesting adaptive evolution. However, because previous studies have been limited to specific lineages or subsets of mutations, the overall evolutionary trajectory of SARS-CoV-2 and the underlying driving forces are still not fully understood. Here we analysed all open-access SARS-CoV-2 genomes (up to November 2022) and correlated the mutation incidence and fitness changes with the impacts of mutations on immune evasion and ACE2 binding affinity. Our results show that the Omicron lineage had an accelerated mutation rate in the RBD region, while the mutation incidence in other genomic regions did not change dramatically over time. Mutations in the RBD region exhibited a lineage-specific pattern and tended to become more aggregated over time, and the mutation incidence was positively correlated with the strength of antibody pressure. Additionally, mutation incidence was positively correlated with changes in ACE2 binding affinity, but with a lower correlation coefficient than with immune evasion. In contrast, the effect of mutations on fitness was more closely correlated with changes in ACE2 binding affinity than with immune evasion. Our findings suggest that immune evasion and ACE2 binding affinity play significant and diverse roles in the evolution of SARS-CoV-2.


Subject(s)
COVID-19 , Immune Evasion , Humans , Angiotensin-Converting Enzyme 2 , Mutation , SARS-CoV-2/genetics
8.
Microbiol Spectr ; 11(1): e0342622, 2023 02 14.
Article in English | MEDLINE | ID: mdl-36622170

ABSTRACT

SARS-CoV-2 has infected more than 600 million people. However, the origin of the virus is still unclear; knowing where the virus came from could help us prevent future zoonotic epidemics. Sequencing data, particularly metagenomic data, can profile the genomes of all species in the sample, including those not recognized at the time, thus allowing for the identification of the progenitor of SARS-CoV-2 in samples collected before the pandemic. We analyzed the data from 5,196 SARS-CoV-2-positive sequencing runs in the NCBI's SRA database with collection dates prior to 2020 or unknown. We found that the mutation patterns obtained from these suspicious SARS-CoV-2 reads did not match the genome characteristics of an unknown progenitor of the virus, suggesting that they may derive from circulating SARS-CoV-2 variants or other coronaviruses. Despite a negative result for tracking the progenitor of SARS-CoV-2, the methods developed in the study could assist in pinpointing the origin of various pathogens in the future. IMPORTANCE Sequences that are homologous to the SARS-CoV-2 genome were found in numerous sequencing runs that were not associated with the SARS-CoV-2 studies in the public database. It is unclear whether they are derived from the possible progenitor of SARS-CoV-2 or contamination of more recent SARS-CoV-2 variants circulated in the population due to the lack of information on the collection, library preparation, and sequencing processes. We have developed a computational framework to infer the evolutionary relationship between sequences based on the comparison of mutations, which enabled us to rule out the possibility that these suspicious sequences originate from unknown progenitors of SARS-CoV-2.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Metagenomics , Mutation , Viral Genome
9.
Biosaf Health ; 5(1): 62-67, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36320662

ABSTRACT

We analyzed variations in the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genome during a flight-related cluster outbreak of coronavirus disease 2019 (COVID-19) in Shenzhen, China, to explore the characteristics of SARS-CoV-2 transmission and intra-host single nucleotide variations (iSNVs) in a confined space. Thirty-three patients with COVID-19 were sampled, and 14 were resampled 3-31 days later. All 47 nasopharyngeal swabs were deep-sequenced. iSNVs and similarities in the consensus genome sequence were analyzed. Three SARS-CoV-2 variants of concern, Delta (n = 31), Beta (n = 1), and C.1.2 (n = 1), were detected among the 33 patients. The viral genome sequences from 30 Delta-positive patients had similar SNVs; 14 of these patients provided two successive samples. Overall, the 47 sequenced genomes contained 164 iSNVs. Of the 14 paired (successive) samples, the second samples (T2) contained more iSNVs (median: 3; 95% confidence interval [95% CI]: 2.77-10.22) than did the first samples (T1; median: 2; 95% CI: 1.63-3.74; Wilcoxon test, P = 0.021). Thirty-eight iSNVs were detected in T1 samples, of which only seven were also detectable in T2 samples. Notably, T2 samples from two of the 14 paired samples had more mutations than the T1 samples. The iSNVs of the SARS-CoV-2 genome exhibited rapid dynamic changes during a flight-related cluster outbreak event. Intra-host diversity increased gradually with time, and new site mutations occurred in vivo without a population transmission bottleneck. Therefore, we could not determine the generational relationship from the mutation site changes alone.

10.
Genomics Proteomics Bioinformatics ; 20(1): 60-69, 2022 02.
Article in English | MEDLINE | ID: mdl-35033679

ABSTRACT

A new variant of concern for SARS-CoV-2, Omicron (B.1.1.529), was designated by the World Health Organization on November 26, 2021. This study analyzed the viral genome sequencing data of 108 samples collected from patients infected with Omicron. First, we found that the enrichment efficiency of viral nucleic acids was reduced due to mutations in the region where the primers anneal. Second, the Omicron variant possesses an excessive number of mutations compared to other variants circulating at the same time (median: 62 vs. 45), especially in the Spike gene. Mutations in the Spike gene confer alterations in 32 amino acid residues, more than those observed in other SARS-CoV-2 variants. Moreover, a large number of nonsynonymous mutations occur in the codons for the amino acid residues located on the surface of the Spike protein, which could potentially affect the replication, infectivity, and antigenicity of SARS-CoV-2. Third, there are 53 mutations between the Omicron variant and its closest sequences available in public databases. Many of these mutations were rarely observed in public databases and had a low mutation rate. In addition, the linkage disequilibrium between these mutations was low, with a limited number of mutations concurrently observed in the same genome, suggesting that the Omicron variant would be in a different evolutionary branch from the currently prevalent variants. To improve our ability to detect and track the source of new variants rapidly, it is imperative to further strengthen genomic surveillance and data sharing globally in a timely manner.


Subject(s)
COVID-19 , Nucleic Acids , Amino Acids , Genomics , Humans , SARS-CoV-2/genetics , Coronavirus Spike Glycoprotein/genetics
11.
Emerg Microbes Infect ; 11(1): 552-555, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35081877

ABSTRACT

We identified an individual who was coinfected with two SARS-CoV-2 variants of concern, the Beta and Delta variants. The ratio of the relative abundance between the two variants was maintained at 1:9 (Beta:Delta) over 14 days. Furthermore, possible evidence of recombination in the Orf1ab and Spike genes was found.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , Genetic Recombination , Coronavirus Spike Glycoprotein/genetics
12.
Virol Sin ; 37(6): 804-812, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36167254

ABSTRACT

The continuous emergence of SARS-CoV-2 variants, from B.1.1.7 (Alpha), B.1.351 (Beta), P.1 (Gamma), and B.1.617.2 (Delta) to B.1.1.529 (Omicron), has posed a great threat to public health globally. The emergence or re-emergence of SARS-CoV-2 variants of concern calls for constant monitoring of their epidemic spread, pathogenicity and immune escape. In this study, we aimed to characterize the replication and pathogenicity of Alpha and Delta variant strains isolated from patients infected in Laos. The amino acid mutations within the spike fragment of the isolates were determined via sequencing. The Alpha and Delta isolates replicated more efficiently than the prototype SARS-CoV-2 in Calu-3 and Caco-2 cells, whereas this was not observed in Huh-7, Vero E6 and HPA-3 cells. We used two animal models, human ACE2 (hACE2) transgenic mice and hamsters, to evaluate the pathogenesis of the isolates. The Alpha and Delta isolates replicated well in multiple organs and caused moderate to severe lung pathology in these animals. In conclusion, the spike protein of the isolated Alpha and Delta variant strains was characterized, and the replication and pathogenicity of the strains in the cells and animal models were also evaluated.


Subject(s)
COVID-19 , SARS-CoV-2 , Animals , Cricetinae , Humans , Mice , Angiotensin-Converting Enzyme 2 , Caco-2 Cells , COVID-19/virology , Transgenic Mice , SARS-CoV-2/pathogenicity , Coronavirus Spike Glycoprotein , Virulence
13.
Genomics Proteomics Bioinformatics ; 19(5): 727-740, 2021 10.
Article in English | MEDLINE | ID: mdl-34695600

ABSTRACT

COVID-19 has swept the globe, and Pakistan is no exception. To investigate the initial introductions and transmissions of SARS-CoV-2 in Pakistan, we performed the largest genomic epidemiology study of COVID-19 in Pakistan and generated 150 complete SARS-CoV-2 genome sequences from samples collected from March 16 to June 1, 2020. We identified a total of 347 mutated positions, 31 of which were over-represented in Pakistan. Meanwhile, we found over 1000 intra-host single-nucleotide variants (iSNVs). Several of them occurred concurrently, indicating possible interactions among them or coevolution. Some of the high-frequency iSNVs in Pakistan were not observed in the global population, suggesting strong purifying selection. The genomic epidemiology revealed five distinctive spreading clusters. The largest cluster consisted of 74 viruses which were derived from different geographic locations of Pakistan and formed a deep hierarchical structure, indicating an extensive and persistent nation-wide transmission of the virus that was probably attributable to a signature mutation (G8371T in ORF1ab) of this cluster. Furthermore, 28 putative international introductions were identified, several of which are consistent with the epidemiological investigations. In all, this study has inferred the possible pathways of introductions and transmissions of SARS-CoV-2 in Pakistan, which could aid ongoing and future viral surveillance and COVID-19 control.


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , Viral Genome , Genomics , Humans , Pakistan/epidemiology , Phylogeny , SARS-CoV-2/genetics
14.
J Genet Genomics ; 47(10): 610-617, 2020 10 20.
Article in English | MEDLINE | ID: mdl-33388272

ABSTRACT

In response to the current coronavirus disease 2019 (COVID-19) pandemic, it is crucial to understand the origin, transmission, and evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which relies on close surveillance of genomic diversity in clinical samples. Although mutations at the population level have been extensively investigated, how the mutations evolve at the individual level is largely unknown. Eighteen time-series fecal samples were collected from nine patients with COVID-19 during the convalescent phase. The nucleic acids of SARS-CoV-2 were enriched by the hybrid capture method. First, we demonstrated the outstanding performance of the hybrid capture method in detecting intra-host variants. We identified 229 intra-host variants at 182 sites in 18 fecal samples. Among them, nineteen variants presented frequency changes > 0.3 within 1-5 days, reflecting highly dynamic intra-host viral populations. Moreover, the evolution of the viral genome demonstrated that the virus was probably viable in the gastrointestinal tract during the convalescent period. Meanwhile, we also found that the same mutation showed a distinct pattern of frequency changes in different individuals, indicating a strong random drift. In summary, dramatic changes of the SARS-CoV-2 genome were detected in fecal samples during the convalescent period; whether the viral load in feces is sufficient to establish an infection warrants further investigation.
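
The criterion quoted above (a frequency change > 0.3 within 1-5 days) can be expressed as a simple filter over time-series allele frequencies. The sketch below is a minimal illustration under assumptions made here: the input layout (variant name mapped to a list of (day, frequency) pairs for one patient), the comparison of consecutive samples only, and the function name are all hypothetical and not taken from the study.

```python
def dynamic_variants(series, min_change=0.3, min_days=1, max_days=5):
    """Return variants whose frequency shifts by more than min_change
    between two consecutive samples taken min_days..max_days apart."""
    flagged = []
    for variant, points in series.items():
        ordered = sorted(points)  # sort by sampling day
        for (d0, f0), (d1, f1) in zip(ordered, ordered[1:]):
            if min_days <= d1 - d0 <= max_days and abs(f1 - f0) > min_change:
                flagged.append(variant)
                break  # one qualifying interval is enough
    return flagged

# Hypothetical frequencies for one patient: variant -> [(day, frequency), ...]
series = {
    "C100T": [(0, 0.10), (3, 0.55)],   # +0.45 in 3 days: flagged
    "G200A": [(0, 0.20), (2, 0.30)],   # +0.10 in 2 days: stable
    "A300G": [(0, 0.90), (10, 0.10)],  # large change, but over 10 days
}
flagged = dynamic_variants(series)
```

Running the same filter per patient and comparing the flagged sets would be one way to see whether a given mutation behaves differently across individuals.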


Subject(s)
COVID-19/prevention & control , Feces/virology , Viral Genome/genetics , SARS-CoV-2/genetics , COVID-19/epidemiology , COVID-19/virology , Convalescence , Gene Expression Profiling/methods , Genomics/methods , Haplotypes , High-Throughput Nucleotide Sequencing/methods , Humans , Mutation , Pandemics , Single Nucleotide Polymorphism , SARS-CoV-2/physiology , Time Factors
16.
J Proteomics ; 197: 53-59, 2019 04 15.
Article in English | MEDLINE | ID: mdl-30790687

ABSTRACT

Peptide-spectrum match (PSM) scoring between experimental and theoretical spectra is a key step in the identification of proteins using mass spectrometry (MS)-based proteomics analyses. Efficient protein identification using MS/MS data remains a challenge. The strategy of using RNA-seq data increases the number of proteins identified by re-constructing the custom search database and integrating mRNA abundance into the false discovery rate of post-PSM. However, this process lacks an algorithm that can allow the incorporation of mRNA abundance into the key scoring model of PSM. Therefore, we developed a novel PSM scoring model, which incorporates mRNA abundance for improved peptide and protein identification. In the new algorithm, abundance information of mRNA was transformed to the prior probability of protein identification and integrated into PSM re-scoring using the binomial probability distribution model. Compared with other algorithms using five MS/MS datasets, the results showed that the minimum improvement ratios of peptide and protein groups were 3.39%-9.79% and 0.48%-8.16% in different datasets (human, rat, zebrafish, yeast, and Arabidopsis thaliana). The new strategy offers an effective solution for MS-based identification of peptides and proteins. SIGNIFICANCE: The new algorithm identifies proteins by quantifying mRNA abundance (FPKM) and incorporating it into a scoring model for peptide-spectrum matches. It is important to improve peptide and protein identification from MS/MS datasets in proteomics research.
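
The abstract gives no formulas, so the following is only a sketch of how a binomial PSM score might be combined with an abundance-derived prior. The binomial tail gives the probability of matching at least k of n theoretical fragment peaks by chance; the log-scaled mapping from FPKM to a prior, the way the two terms are combined, and every function and parameter name are assumptions made for illustration, not the published model.

```python
from math import comb, log10

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more random peak matches."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def fpkm_prior(fpkm, max_fpkm, floor=0.01):
    """Hypothetical prior in [floor, 1]: log-scaled mRNA abundance (max_fpkm > 0)."""
    scaled = log10(1 + fpkm) / log10(1 + max_fpkm)
    return max(floor, min(1.0, scaled))

def psm_score(n_peaks, n_matched, p_random, fpkm, max_fpkm):
    """Assumed combination: -10*log10 of the chance probability, shifted by
    10*log10(prior), so PSMs from low-abundance transcripts score lower."""
    p_chance = binomial_tail(n_peaks, n_matched, p_random)
    prior = fpkm_prior(fpkm, max_fpkm)
    return -10 * log10(p_chance) + 10 * log10(prior)

# Same spectrum evidence, different transcript abundance (hypothetical values).
high_abundance = psm_score(20, 8, 0.05, fpkm=100, max_fpkm=1000)
low_abundance = psm_score(20, 8, 0.05, fpkm=1, max_fpkm=1000)
```

Under these assumptions two PSMs with identical peak matches are separated only by the transcript abundance of their parent protein, which is the effect the abstract describes at a high level.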


Subject(s)
Algorithms , Arabidopsis/metabolism , Nucleic Acid Databases , Fungal RNA/metabolism , Messenger RNA/metabolism , Plant RNA/metabolism , Saccharomyces cerevisiae/metabolism , Zebrafish/metabolism , Animals , Humans , Rats , Tandem Mass Spectrometry