ABSTRACT
This study retrospectively analyzed the genetic characteristics of influenza A H3N2 (A/H3N2) viruses circulating in New South Wales (NSW), the Australian state with the highest number of influenza cases in 2022, and explored the phylodynamics of A/H3N2 transmission within Australia during this period. Sequencing was performed on 217 archived specimens, and A/H3N2 evolution and spread within Australia were analyzed using phylogenetic and phylodynamic methods. Hemagglutinin genes of all analyzed NSW viruses belonged to subclade 3C.2a1b.2a.2 and clustered together with the 2022 vaccine strain. Complete genome analysis of NSW viruses revealed highly frequent interclade reassortments between subclades 3C.2a1b.2a.2 and 3C.2a1b.1a. The estimated earliest introduction time of the dominant subgroup 3C.2a1b.2a.2a.1 in Australia was February 22, 2022 (95% highest posterior density: December 19, 2021-March 13, 2022), following the easing of Australian travel restrictions, suggesting a possible international source. Phylogeographic analysis revealed that Victoria drove the transmission of A/H3N2 viruses across the country during this season, while NSW did not have a dominant role in viral dissemination to other regions. This study highlights the importance of continuous surveillance and genomic characterization of influenza viruses in the postpandemic era, which can inform public health decision-making and enable early detection of novel strains with pandemic potential.
Subjects
COVID-19 , Influenza A Virus Subtype H3N2 , Human Influenza , Phylogeny , Humans , Influenza A Virus Subtype H3N2/genetics , Influenza A Virus Subtype H3N2/classification , Influenza A Virus Subtype H3N2/isolation & purification , Human Influenza/epidemiology , Human Influenza/virology , Human Influenza/transmission , Retrospective Studies , COVID-19/epidemiology , COVID-19/transmission , COVID-19/virology , COVID-19/prevention & control , Australia/epidemiology , New South Wales/epidemiology , SARS-CoV-2/genetics , SARS-CoV-2/classification , Phylogeography , Seasons , Viral Genome/genetics , Influenza Virus Hemagglutinin Glycoproteins/genetics , Reassortant Viruses/genetics , Reassortant Viruses/classification
ABSTRACT
Acute respiratory infection is the third most frequent cause of mortality worldwide, causing over 4.25 million deaths annually. Although most diagnosed acute respiratory infections are thought to be of viral origin, the aetiology often remains unclear. The advent of next-generation sequencing (NGS) has revolutionised the field of virus discovery and identification, particularly in the detection of unknown respiratory viruses. We systematically reviewed the application of NGS technologies for detecting respiratory viruses from clinical samples and outline potential barriers to the routine clinical introduction of NGS. Five databases were searched for studies published in English from 01 January 2010 to 01 February 2021, which led to the inclusion of 52 studies. A total of 14 different models of NGS platforms were summarised from the included studies. Among these models, second-generation sequencing platforms (e.g., Illumina sequencers) were used in the majority of studies (41/52, 79%). Moreover, NGS platforms have proven successful in detecting a variety of respiratory viruses, including influenza A/B viruses (9/52, 17%), SARS-CoV-2 (21/52, 40%), parainfluenza virus (3/52, 6%), respiratory syncytial virus (1/52, 2%), human metapneumovirus (2/52, 4%), or a viral panel including other respiratory viruses (16/52, 31%). The review of NGS technologies used in previous studies indicates the advantages of NGS technologies in novel virus detection, virus typing, mutation identification, and infection cluster assessment. Although there remain some technical and ethical challenges associated with NGS use in clinical laboratories, NGS is a promising future tool to improve understanding of respiratory viruses and provide a more accurate diagnosis with simultaneous virus characterisation.
Subjects
COVID-19 , Influenza A Virus , Respiratory Tract Infections , High-Throughput Nucleotide Sequencing , Humans , Influenza B Virus , Respiratory Tract Infections/diagnosis , SARS-CoV-2
ABSTRACT
SUMMARY: We present GeoBoost2, a natural language-processing pipeline for extracting the location of infected hosts to enrich metadata in nucleotide sequence repositories such as the National Center for Biotechnology Information's GenBank for downstream analyses including phylogeography and genomic epidemiology. The increasing number of pathogen sequences requires complementary information extraction methods for focused research, including surveillance within countries and across borders. In this article, we describe the enhancements over our earlier release, including improvements in end-to-end extraction performance and speed, a fully functional web interface, and state-of-the-art deep learning methods for location extraction. AVAILABILITY AND IMPLEMENTATION: The application is freely available on the web at https://zodo.asu.edu/geoboost2. Source code, usage examples and annotated data for GeoBoost2 are freely available at https://github.com/ZooPhy/geoboost2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Nucleic Acid Databases , Metadata , Genomics , Phylogeography , Software
ABSTRACT
BACKGROUND: Local transmission of seasonal influenza viruses (IVs) can be difficult to resolve. Here, we study whether coupling high-throughput sequencing (HTS) of hemagglutinin (HA) and neuraminidase (NA) genes with variant analysis can resolve strains from local transmission that have identical consensus genomes. We analyzed 24 samples collected over four days in January 2020 at a large university in the US. We amplified complete HA and NA genomic segments and sequenced them on an Illumina platform. We identified consensus complete HA and NA segments using BLASTn and performed variant analysis on strains whose HA and NA segments were 100% identical. RESULTS: Twelve of the 24 samples were PCR positive, and we detected complete HA and/or NA segments by de novo assembly in 83.33% (10/12) of them. Similarity and phylogenetic analysis showed that 70% (7/10) of the strains were distinct while the remaining 30% (3/10) had identical consensus sequences. These three samples also had IAV and IBV co-infection. However, subsequent variant analysis showed that they had distinct variant profiles. While the IAV HA of one sample had no variant, another had a T663C mutation and the third had both C1379T and C1589A. CONCLUSION: In this study, we showed that HTS coupled with variant analysis of only HA and NA genes can help resolve variants that are closely related. We also provide evidence that during a short time period in the 2019-2020 season, co-infection of IAV and IBV occurred on the university campus and both the 2020/2021 and 2021/2022 WHO-recommended H1N1 vaccine strains were co-circulating.
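The resolution step described in this abstract can be illustrated with a minimal sketch: samples whose consensus sequences are identical are told apart by comparing their sets of called variants. The sample labels below are hypothetical placeholders; only the three variant profiles (no variant, T663C, and C1379T plus C1589A) come from the abstract, and the comparison logic is an illustrative assumption rather than the authors' pipeline.

```python
from itertools import combinations

# Hypothetical sample labels; the variant calls are those reported in the abstract.
variant_profiles = {
    "sample_1": frozenset(),                      # no variant in IAV HA
    "sample_2": frozenset({"T663C"}),
    "sample_3": frozenset({"C1379T", "C1589A"}),
}


def distinct_pairs(profiles):
    """Return the sample pairs whose variant sets differ."""
    return [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if profiles[a] != profiles[b]
    ]


def all_resolved(profiles):
    """True when every pair of samples has a different variant profile."""
    n = len(profiles)
    return len(distinct_pairs(profiles)) == n * (n - 1) // 2


print(all_resolved(variant_profiles))
```

Using sets makes the comparison independent of the order in which variants were called; a real analysis would also need to account for variant frequency and coverage, which this sketch ignores.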
Subjects
Coinfection/diagnosis , Influenza Virus Hemagglutinin Glycoproteins/genetics , High-Throughput Nucleotide Sequencing/methods , Influenza A Virus Subtype H1N1/genetics , Influenza A Virus Subtype H1N1/isolation & purification , Human Influenza/diagnosis , Human Influenza/virology , Neuraminidase/genetics , Consensus Sequence , Genetic Variation/genetics , Hemagglutinins , Humans , Human Influenza/genetics , Phylogeny , Seasons
ABSTRACT
Variola virus is at risk of re-emergence either through accidental release, bioterrorism, or synthetic biology. The use of phylogenetics and phylogeography to support epidemic field response is expected to grow as sequencing technology becomes miniaturized, cheap, and ubiquitous. In this study, we aimed to explore the use of common VARV diagnostic targets hemagglutinin (HA), cytokine response modifier B (CrmB), and A-type inclusion protein (ATI) for phylogenetic characterization as well as the representativeness of modelling strategies in phylogeography to support epidemic response should smallpox re-emerge. We used Bayesian discrete-trait phylogeography using the most complete data set currently available of whole genome (n = 51) and partially sequenced (n = 20) VARV isolates. We show that multilocus models combining HA, ATI, and CrmB genes may represent a useful heuristic to differentiate between VARV Major and subclades of VARV Minor which have been associated with variable case-fatality rates. Where whole genome sequencing is unavailable, phylogeography models of HA, ATI, and CrmB may provide preliminary but uncertain estimates of transmission, while supplementing whole genome models with additional isolates sequenced only for HA can improve sample representativeness, maintaining similar support for transmission relative to whole genome models. We have also provided empirical evidence delineating historic international VARV transmission using phylogeography. Due to the persistent threat of re-emergence, our results provide important research for smallpox epidemic preparedness in the posteradication era as recommended by the World Health Organisation.
Subjects
Viral Hemagglutinins/genetics , Phylogeny , Serpins/genetics , Variola Virus/genetics , Viral Proteins/genetics , Bayes Theorem , Phylogeography , Variola Virus/pathogenicity
ABSTRACT
Motivation: Virus phylogeographers rely on DNA sequences of viruses and the locations of the infected hosts found in public sequence databases like GenBank for modeling virus spread. However, the locations in GenBank records are often only at the country or state level, and may require phylogeographers to scan the journal articles associated with the records to identify more localized geographic areas. To automate this process, we present a named entity recognizer (NER) for detecting locations in biomedical literature. We built the NER using a deep feedforward neural network to determine whether a given token is a toponym or not. To overcome the limited human annotated data available for training, we use distant supervision techniques to generate additional samples to train our NER. Results: Our NER achieves an F1-score of 0.910 and significantly outperforms the previous state-of-the-art system. Using the additional data generated through distant supervision further boosts the performance of the NER achieving an F1-score of 0.927. The NER presented in this research improves over previous systems significantly. Our experiments also demonstrate the NER's capability to embed external features to further boost the system's performance. We believe that the same methodology can be applied for recognizing similar biomedical entities in scientific literature.
Subjects
Deep Learning , Information Storage and Retrieval/methods , Phylogeography/methods , Viruses/genetics , Nucleic Acid Databases , Humans
ABSTRACT
Summary: GeoBoost is a command-line software package developed to address sparse or incomplete metadata in GenBank sequence records that relate to the location of the infected host (LOIH) of viruses. Given a set of GenBank accession numbers corresponding to virus GenBank records, GeoBoost extracts, integrates and normalizes geographic information reflecting the LOIH of the viruses using integrated information from GenBank metadata and related full-text publications. In addition, to facilitate probabilistic geospatial modeling, GeoBoost assigns probability scores for each possible LOIH. Availability and implementation: Binaries and resources required for running GeoBoost are packed into a single zipped file and freely available for download at https://tinyurl.com/geoboost. A video tutorial is included to help users quickly and easily install and run the software. The software is implemented in Java 1.8, and supported on MS Windows and Linux platforms. Contact: gragon@upenn.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects
Metadata , Viruses , Nucleic Acid Databases , Software
ABSTRACT
Ancestral state reconstructions in Bayesian phylogeography of virus pandemics have been improved by utilizing a Bayesian stochastic search variable selection (BSSVS) framework. Recently, this framework has been extended to model the transition rate matrix between discrete states as a generalized linear model (GLM) of genetic, geographic, demographic, and environmental predictors of interest to the virus and incorporating BSSVS to estimate the posterior inclusion probabilities of each predictor. Although the latter appears to enhance the biological validity of ancestral state reconstruction, there has yet to be a comparison of phylogenies created by the two methods. In this paper, we compare these two methods, while also using a primitive method without BSSVS, and highlight the differences in phylogenies created by each. We test six coalescent priors and six random sequence samples of H3N2 influenza during the 2014-15 flu season in the U.S. We show that the GLMs yield significantly greater root state posterior probabilities than the two alternative methods under five of the six priors, and significantly greater Kullback-Leibler divergence values than the two alternative methods under all priors. Furthermore, the GLMs strongly implicate temperature and precipitation as driving forces of this flu season and nearly unanimously identified a single root state, which exhibits the most tropical climate during a typical flu season in the U.S. The GLM, however, appears to be highly susceptible to sampling bias compared with the other methods, which casts doubt on whether its reconstructions should be favored over those created by alternate methods. We report that a BSSVS approach with a Poisson prior demonstrates less bias toward sample size under certain conditions than the GLMs or primitive models, and believe that the connection between reconstruction method and sampling bias warrants further investigation.
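The Kullback-Leibler divergence used in this comparison can be sketched as follows. The abstract does not say which two distributions are compared; the sketch assumes the common convention of measuring how far the posterior root state probabilities depart from a uniform prior over the discrete states. The function name and the example probabilities are illustrative, not values from the study.

```python
import math


def kl_divergence(posterior, prior=None):
    """KL(posterior || prior) in nats; the prior defaults to uniform (an assumption)."""
    k = len(posterior)
    if prior is None:
        prior = [1.0 / k] * k
    # Terms with zero posterior mass contribute nothing to the sum.
    return sum(p * math.log(p / q) for p, q in zip(posterior, prior) if p > 0)


# Illustrative root state posteriors over four locations.
flat = [0.25, 0.25, 0.25, 0.25]      # posterior equals the uniform prior
peaked = [0.85, 0.05, 0.05, 0.05]    # one location strongly favoured

print(kl_divergence(flat))
print(kl_divergence(peaked))
```

Under this convention, a value near zero means the data added little information about the root location beyond the prior, while a larger value indicates a root state posterior concentrated on few locations.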
Subjects
Influenza A Virus Subtype H3N2/genetics , Human Influenza/virology , Statistical Models , Phylogeography/methods , Seasons , Bayes Theorem , Computer Simulation , Disease Outbreaks , Molecular Evolution , Genetic Variation/genetics , Humans , Incidence , Influenza A Virus Subtype H3N2/isolation & purification , Human Influenza/epidemiology , Phylogeny , Population Surveillance , Risk Factors , Spatio-Temporal Analysis , Species Specificity , United States/epidemiology , Weather
ABSTRACT
BACKGROUND: Zoonotic diseases account for a substantial portion of infectious disease outbreaks and place a burden on public health programs to maintain surveillance and preventative measures. Taking advantage of new modeling approaches and data sources has become necessary in an interconnected global community. To facilitate data collection, analysis, and decision-making, the number of spatial decision support systems reported in the last 10 years has increased. This systematic review aims to describe characteristics of spatial decision support systems developed to assist public health officials in the management of zoonotic disease outbreaks. METHODS: A systematic search of the Google Scholar database was undertaken for published articles written between 2008 and 2018, with no language restriction. A manual search of titles and abstracts using Boolean logic and keyword search terms was undertaken using predefined inclusion and exclusion criteria. Data extraction included items such as spatial database management, visualizations, and report generation. RESULTS: For this review we screened 34 full-text articles. Design and reporting quality were assessed, resulting in a final set of 12 articles, which were evaluated on proposed interventions and whose identifying characteristics were described. Multisource data integration and user-centered design were inconsistently applied, though the articles indicated diverse utilization of modeling techniques. CONCLUSIONS: The characteristics, data sources, development and modeling techniques implemented in the design of recent spatial decision support systems (SDSS) that target zoonotic disease outbreaks were described. There are still many challenges to address during the design process to effectively utilize the value of emerging data sources and modeling methods. In the future, development should adhere to comparable standards for functionality and system development, such as user input for system requirements and flexible interfaces to visualize data that exist on different scales.
PROSPERO registration number: CRD42018110466.
Subjects
Decision Support Techniques , Disease Outbreaks , Public Health Informatics/methods , Zoonoses/epidemiology , Animals , Decision Making , Disease Outbreaks/prevention & control , Humans , Risk Factors , Zoonoses/diagnosis
Subjects
Betacoronavirus/genetics , Coronavirus Infections/virology , Viral Pneumonia/virology , Viral Proteins/genetics , Arizona/epidemiology , COVID-19 , Coronavirus Infections/epidemiology , Molecular Evolution , Viral Genome , Humans , Pandemics , Viral Pneumonia/epidemiology , SARS-CoV-2 , Sentinel Surveillance , Sequence Deletion
ABSTRACT
Diseases caused by zoonotic viruses (viruses transmittable between humans and animals) are a major threat to public health throughout the world. By studying virus migration and mutation patterns, the field of phylogeography provides a valuable tool for improving their surveillance. A key component in phylogeographic analysis of zoonotic viruses involves identifying the specific locations of relevant viral sequences. This is usually accomplished by querying public databases such as GenBank and examining the geospatial metadata in the record. When sufficient detail is not available, a logical next step is for the researcher to conduct a manual survey of the corresponding published articles. MOTIVATION: In this article, we present a system for detection and disambiguation of locations (toponym resolution) in full-text articles to automate the retrieval of sufficient metadata. Our system has been tested on a manually annotated corpus of journal articles related to phylogeography using integrated heuristics for location disambiguation including a distance heuristic, a population heuristic and a novel heuristic utilizing knowledge obtained from GenBank metadata (i.e. a 'metadata heuristic'). RESULTS: For detecting and disambiguating locations, our system performed best using the metadata heuristic (0.54 Precision, 0.89 Recall and 0.68 F-score). Precision reaches 0.88 when examining only the disambiguation of location names. Our error analysis showed that a noticeable increase in the accuracy of toponym resolution is possible by improving the geospatial location detection. By improving these fundamental automated tasks, our system can be a useful resource to phylogeographers who rely on geospatial metadata of GenBank sequences.
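The relationship between the three reported metrics can be checked with a short worked example. The abstract does not define its F-score; the sketch assumes the balanced F1, i.e. the harmonic mean of precision and recall. With the rounded inputs 0.54 and 0.89 it gives about 0.67, slightly below the reported 0.68, which may simply reflect the authors computing from unrounded precision and recall.

```python
def f_score(precision, recall):
    """Balanced F1: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported in the abstract for the metadata heuristic.
value = f_score(0.54, 0.89)
print(round(value, 3))  # 0.672
```

Because the harmonic mean is pulled toward the smaller of its two inputs, the low precision dominates the combined score even though recall is high.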
Subjects
Phylogeography/methods , Viruses/genetics , Nucleic Acid Databases , Sequence Analysis
ABSTRACT
Emerging and re-emerging infectious diseases of zoonotic origin like highly pathogenic avian influenza pose a significant threat to human and animal health due to their elevated transmissibility. Identifying the drivers of such viruses is challenging, and estimation of spatial diffusion is complicated by the fact that the variability of viral spread from locations could be caused by a complex array of unknown factors. Several techniques exist to help identify these drivers, including bioinformatics, phylogeography, and spatial epidemiology, but these methods are generally evaluated separately and do not consider the complementary nature of each other. Here, we studied an approach that integrates these techniques and identifies the most important drivers of viral spread by focusing on H5N1 influenza A virus in Egypt because of its recent emergence as an epicenter for the disease. We used a Bayesian phylogeographic generalized linear model (GLM) to reconstruct spatiotemporal patterns of viral diffusion while simultaneously assessing the impact of factors contributing to transmission. We also calculated the cross-species transmission rates among hosts in order to identify the species driving transmission. The densities of both human and avian species were supported contributors, along with latitude, longitude, elevation, and several meteorological variables. Also supported was the presence of a genetic motif found near the hemagglutinin cleavage site. Various genetic, geographic, demographic, and environmental predictors each play a role in H5N1 diffusion. Further development and expansion of phylogeographic GLMs such as this will enable health agencies to identify variables that can curb virus diffusion and reduce morbidity and mortality.
Subjects
Influenza A Virus Subtype H5N1 , Avian Influenza/virology , Human Influenza/virology , Animals , Bayes Theorem , Birds , Egypt/epidemiology , Humans , Avian Influenza/epidemiology , Human Influenza/epidemiology , Phylogeography , Zoonoses
ABSTRACT
BACKGROUND: Time series models can play an important role in disease prediction. Incidence data can be used to predict the future occurrence of disease events. Developments in modeling approaches provide an opportunity to compare different time series models for predictive power. RESULTS: We applied ARIMA and Random Forest time series models to incidence data of outbreaks of highly pathogenic avian influenza (H5N1) in Egypt, available through the online EMPRES-I system. We found that the Random Forest model outperformed the ARIMA model in predictive ability. Furthermore, we found that the Random Forest model is effective for predicting outbreaks of H5N1 in Egypt. CONCLUSIONS: Random Forest time series modeling provides enhanced predictive ability over existing time series models for the prediction of infectious disease outbreaks. This result, along with those showing the concordance between bird and human outbreaks (Rabinowitz et al. 2012), provides a new approach to predicting these dangerous outbreaks in bird populations based on existing, freely available data. Our analysis uncovers the time-series structure of outbreak severity for highly pathogenic avian influenza (H5N1) in Egypt.
Subjects
Artificial Intelligence , Computational Biology/methods , Disease Outbreaks/statistics & numerical data , Influenza A Virus Subtype H5N1/physiology , Avian Influenza/epidemiology , Human Influenza/epidemiology , Statistical Models , Animals , Birds/virology , Egypt/epidemiology , Humans
ABSTRACT
Background: There has been an unprecedented effort to sequence the SARS-CoV-2 virus and examine its molecular evolution. This has been facilitated by the availability of publicly accessible databases, the Global Initiative on Sharing All Influenza Data (GISAID) and GenBank, which collectively hold millions of SARS-CoV-2 sequence records. Genomic epidemiology, however, seeks to go beyond phylogenetic analysis by linking genetic information to patient characteristics and disease outcomes, enabling a comprehensive understanding of transmission dynamics and disease impact. While these repositories include fields reflecting patient-related metadata for a given sequence, inclusion of these demographic and clinical details is scarce. The extent to which patient-related metadata is reported in published sequencing studies and its quality remains largely unexplored. Methods: The NIH's LitCovid collection will be used for automated classification of articles reporting having deposited SARS-CoV-2 sequences in public repositories, while an independent search will be conducted in PubMed for validation. Data extraction will be conducted using Covidence. The extracted data will be synthesized and summarized to quantify the availability of patient metadata in the published literature of SARS-CoV-2 sequencing studies. For the bibliometric analysis, relevant data points, such as author affiliations and citation metrics will be extracted. Discussion: This scoping review will report on the extent and types of patient-related metadata reported in genomic viral sequencing studies of SARS-CoV-2, identify gaps in this reporting, and make recommendations for improving the quality and consistency of reporting in this area. The bibliometric analysis will uncover trends and patterns in the reporting of patient-related metadata, including differences in reporting based on study types or geographic regions. Co-occurrence networks of author keywords will also be presented.
The insights gained from this study may help improve the quality and consistency of reporting patient metadata, enhancing the utility of sequence metadata and facilitating future research on infectious diseases.
ABSTRACT
BACKGROUND: During the 2019 severe influenza season, New South Wales (NSW) experienced the highest number of cases in Australia. This study retrospectively investigated the genetic characteristics of influenza viruses circulating in NSW in 2019 and identified genetic markers related to antiviral resistance and potential virulence. METHODS: The complete genomes of influenza A and B viruses were amplified using reverse transcription-polymerase chain reaction (PCR) and sequenced with an Illumina MiSeq platform. RESULTS: When comparing the sequencing data with the vaccine strains and reference sequences, the phylogenetic analysis revealed that most NSW A/H3N2 viruses (n = 68; 94%) belonged to 3C.2a1b and a minority (n = 4; 6%) belonged to 3C.3a. These viruses all diverged from the vaccine strain A/Switzerland/8060/2017. All A/H1N1pdm09 viruses (n = 20) showed genetic dissimilarity from vaccine strain A/Michigan/45/2015, with subclades 6B.1A.5 and 6B.1A.2 identified. All B/Victoria-lineage viruses (n = 21) aligned with clade V1A.3, presenting triple amino acid deletions at positions 162-164 in the hemagglutinin protein, significantly diverging from the vaccine strain B/Colorado/06/2017. Multiple amino acid substitutions were also found in the internal proteins of influenza viruses, some of which have been previously reported in hospitalized influenza patients in Thailand. Notably, the oseltamivir-resistant marker H275Y was present in one immunocompromised patient infected with A/H1N1pdm09 and the resistance-related mutation I222V was detected in another A/H3N2-infected patient. CONCLUSIONS: Considering antigenic drift and the constant evolution of circulating A and B strains, we believe continuous monitoring of influenza viruses in NSW via the high-throughput sequencing approach provides timely and pivotal information for both public health surveillance and clinical treatment.
Subjects
Cercopithecine Herpesvirus 1 , Influenza Vaccines , Human Influenza , Humans , Retrospective Studies , Cercopithecine Herpesvirus 1/genetics , Influenza A Virus Subtype H3N2/genetics , New South Wales/epidemiology , Phylogeny , Influenza Virus Hemagglutinin Glycoproteins/genetics , Australia , Seasons , Whole Genome Sequencing
ABSTRACT
We describe four complete coding sequences (cCDSs) of canine picornavirus from wastewater in Arizona, USA, detected by coupling cCDS single-contig (~7.5 kb) reverse-transcriptase polymerase chain reaction (RT-PCR) and low-cost long-read high-throughput sequencing. For viruses of medical/veterinary importance, this workflow expands the possibilities of wastewater-based genomic epidemiology for exploring virus evolutionary dynamics, especially in low-resource settings.
Subjects
Picornaviridae Infections , Picornaviridae , Animals , Dogs , Reverse Transcriptase Polymerase Chain Reaction , Wastewater , Picornaviridae/genetics , Phylogeny
ABSTRACT
We describe nine Rhizobium microvirus genomes identified in wastewater in Tempe, AZ, USA, between October 2019 and March 2020. The major capsid proteins (MCPs) encoded in these genomes cluster together phylogenetically and are distinct from the MCPs of Rhizobium microviruses identified in Mexico and Argentina.
ABSTRACT
The SARS-CoV-2 pandemic resulted in a scale-up of viral genomic surveillance globally. However, the wet lab constraints (economic, infrastructural, and personnel) of translating novel virus variant sequence information to meaningful immunological and structural insights that are valuable for the development of broadly acting countermeasures (especially for emerging and re-emerging viruses) remain a challenge in many resource-limited settings. Here, we describe a workflow that couples wastewater surveillance, high-throughput sequencing, phylogenetics, immuno-informatics, and virus capsid structure modeling for the genotype-to-serotype characterization of uncultivated picornavirus sequences identified in wastewater. Specifically, we analyzed canine picornaviruses (CanPVs), which are uncultivated and yet-to-be-assigned members of the family Picornaviridae that cause systemic infections in canines. We analyzed 118 archived (stored at -20 °C) wastewater (WW) samples representing a population of ~700,000 persons in southwest USA, collected from October 2019 to March 2020 and from October 2020 to March 2021. Samples were pooled into 12 two-liter volumes by month, partitioned (into filter-trapped solids [FTSs] and filtrates) using 450 nm membrane filters, and subsequently concentrated to 2 mL (1000×) using 10,000 Da MW cutoff centrifugal filters. The 24 concentrates were subjected to RNA extraction, CanPV complete capsid single-contig RT-PCR, Illumina sequencing, phylogenetics, immuno-informatics, and structure prediction. We detected CanPVs in 58.3% (14/24) of the samples and generated 13,824,046 trimmed Illumina reads and 27 CanPV contigs. Phylogenetic and pairwise identity analyses showed eight CanPV genotypes (intragenotype divergence <14%) belonging to four clusters, with intracluster divergence of <20%.
Similarity analysis, immuno-informatics, and virus protomer and capsid structure prediction suggested that the four clusters were likely distinct serological types, with predicted cluster-distinguishing B-cell epitopes clustered in the northern and southern rims of the canyon surrounding the 5-fold axis of symmetry. Our approach allows for genotype-to-serotype characterization of uncultivated picornavirus sequences by coupling phylogenetics, immuno-informatics, and virus capsid structure prediction. This consequently bypasses a major wet lab-associated bottleneck, thereby allowing resource-limited settings to leapfrog from wastewater-sourced genomic data to valuable immunological insights necessary for the development of prophylaxis and other mitigation measures.
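The two divergence figures reported in this abstract (intragenotype <14%, intracluster <20%) can be read as a two-level grouping rule. The sketch below applies them as simple cutoffs on a single pairwise divergence value. This is an illustrative assumption: the abstract does not state that the authors assigned genotypes or clusters by pairwise thresholding, and the function and constant names are hypothetical.

```python
# Thresholds taken from the abstract; their use as pairwise cutoffs is an assumption.
GENOTYPE_MAX_DIVERGENCE = 0.14
CLUSTER_MAX_DIVERGENCE = 0.20


def relationship(divergence):
    """Classify a pairwise divergence (a fraction between 0 and 1)."""
    if not 0.0 <= divergence <= 1.0:
        raise ValueError("divergence must be a fraction between 0 and 1")
    if divergence < GENOTYPE_MAX_DIVERGENCE:
        return "same genotype"
    if divergence < CLUSTER_MAX_DIVERGENCE:
        return "same cluster"
    return "different cluster"


for d in (0.05, 0.17, 0.30):
    print(d, relationship(d))
```

A real assignment would work on a full pairwise identity matrix and a phylogeny rather than on isolated pairs, so this only shows how the two percentages nest.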
Subjects
High-Throughput Nucleotide Sequencing , Phylogeny , Picornaviridae , Wastewater , Picornaviridae/genetics , Picornaviridae/classification , Picornaviridae/isolation & purification , Animals , Dogs , Wastewater/virology , Capsid Proteins/genetics , Capsid Proteins/chemistry , Viral Genome , Capsid/immunology , Capsid/chemistry , United States/epidemiology , Picornaviridae Infections/veterinary , Picornaviridae Infections/virology , Picornaviridae Infections/epidemiology , Dog Diseases/virology , Dog Diseases/epidemiology , Genotype , Genetic Variation
ABSTRACT
BACKGROUND: Influenza A H5N1 has killed millions of birds and raises serious public health concern because of its potential to spread to humans and cause a global pandemic. While the early focus was in Asia, recent evidence suggests that Egypt is a new epicenter for the disease. This includes characterization of a variant clade 2.2.1.1, which has been found almost exclusively in Egypt. We analyzed 226 HA and 92 NA sequences with an emphasis on the H5N1 2.2.1.1 strains in Egypt using a Bayesian discrete phylogeography approach. This allowed modeling of virus dispersion between Egyptian governorates including the most likely origin. RESULTS: Phylogeography models of hemagglutinin (HA) and neuraminidase (NA) suggest Ash Sharqiyah as the origin of virus spread; however, the support is weak based on Kullback-Leibler values of 0.09 for HA and 0.01 for NA. Association Index (AI) values and Parsimony Scores (PS) were significant (p-value < 0.05), indicating that dispersion of H5N1 in Egypt was geographically structured. In addition, the Ash Sharqiyah to Al Gharbiyah and Al Fayyum to Al Qalyubiyah routes had the strongest statistical support. CONCLUSION: We found that the majority of routes with strong statistical support were in the heavily populated Delta region. In particular, the Al Qalyubiyah governorate appears to represent a popular location for virus transition as it represented a large portion of branches in both trees. However, there remains uncertainty about virus dispersion to and from this location and thus more research needs to be conducted in order to examine this. Phylogeography can highlight the drivers of H5N1 emergence and spread. This knowledge can be used to target public health efforts to reduce morbidity and mortality. For Egypt, future work should focus on using data about vaccination and live bird markets in phylogeography models to study their impact on H5N1 diffusion within the country.