Search | BVS - MINISTÉRIO DA SAÚDE

1.

Predicting variable gene content in Escherichia coli using conserved genes.

Nguyen, Marcus; Elmore, Zachary; Ihle, Clay; Moen, Francesco S; Slater, Adam D; Turner, Benjamin N; Parrello, Bruce; Best, Aaron A; Davis, James J.

mSystems ; 8(4): e0005823, 2023 08 31.

Article in English | MEDLINE | ID: mdl-37314210

ABSTRACT

Having the ability to predict the protein-encoding gene content of an incomplete genome or metagenome-assembled genome is important for a variety of bioinformatic tasks. In this study, as a proof of concept, we built machine learning classifiers for predicting variable gene content in Escherichia coli genomes using only the nucleotide k-mers from a set of 100 conserved genes as features. Protein families were used to define orthologs, and a single classifier was built for predicting the presence or absence of each protein family occurring in 10%-90% of all E. coli genomes. The resulting set of 3,259 extreme gradient boosting classifiers had a per-genome average macro F1 score of 0.944 [0.943-0.945, 95% CI]. We show that the F1 scores are stable across multi-locus sequence types and that the trend can be recapitulated by sampling a smaller number of core genes or diverse input genomes. Surprisingly, the presence or absence of poorly annotated proteins, including "hypothetical proteins", was accurately predicted (F1 = 0.902 [0.898-0.906, 95% CI]). Models for proteins with horizontal gene transfer-related functions had slightly lower F1 scores but were still accurate (F1s = 0.895, 0.872, 0.824, and 0.841 for transposon, phage, plasmid, and antimicrobial resistance-related functions, respectively). Finally, using a holdout set of 419 diverse E. coli genomes that were isolated from freshwater environmental sources, we observed an average per-genome F1 score of 0.880 [0.876-0.883, 95% CI], demonstrating the extensibility of the models. Overall, this study provides a framework for predicting variable gene content using a limited amount of input sequence data.

IMPORTANCE Having the ability to predict the protein-encoding gene content of a genome is important for assessing genome quality, binning genomes from shotgun metagenomic assemblies, and assessing risk due to the presence of antimicrobial resistance and other virulence genes. In this study, we built a set of binary classifiers for predicting the presence or absence of variable genes occurring in 10%-90% of all publicly available E. coli genomes. Overall, the results show that a large portion of the E. coli variable gene content can be predicted with high accuracy, including genes with functions relating to horizontal gene transfer. This study offers a strategy for predicting gene content using limited input sequence data.

Subjects

Anti-Infective Agents; Escherichia coli Proteins; Escherichia coli/genetics; Genome, Bacterial/genetics; Plasmids; Escherichia coli Proteins/genetics
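
The headline metric in the abstract above is a per-genome average macro F1 score. The following is a minimal sketch of how such a score could be computed, assuming each genome is represented by two parallel lists of presence (1) or absence (0) calls, one entry per protein family. The function names and the toy data are hypothetical and are not taken from the paper, and the paper's exact averaging procedure may differ.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention assumed here: an undefined F1 (no positives at all) counts as 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores (absence and presence)."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


def mean_per_genome_macro_f1(genomes):
    """Average the macro F1 over (y_true, y_pred) pairs, one pair per genome."""
    scores = [macro_f1(t, p) for t, p in genomes]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Two toy genomes, four protein families each (illustrative data only).
    toy = [
        ([1, 1, 0, 0], [1, 0, 0, 0]),
        ([1, 0, 1, 0], [1, 0, 1, 0]),
    ]
    print(round(mean_per_genome_macro_f1(toy), 3))
```

Averaging the two class-wise F1 scores gives absence calls the same weight as presence calls, which matters when a protein family is present in only a small share of genomes.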

2.

Introducing the Bacterial and Viral Bioinformatics Resource Center (BV-BRC): a resource combining PATRIC, IRD and ViPR.

Olson, Robert D; Assaf, Rida; Brettin, Thomas; Conrad, Neal; Cucinell, Clark; Davis, James J; Dempsey, Donald M; Dickerman, Allan; Dietrich, Emily M; Kenyon, Ronald W; Kuscuoglu, Mehmet; Lefkowitz, Elliot J; Lu, Jian; Machi, Dustin; Macken, Catherine; Mao, Chunhong; Niewiadomska, Anna; Nguyen, Marcus; Olsen, Gary J; Overbeek, Jamie C; Parrello, Bruce; Parrello, Victoria; Porter, Jacob S; Pusch, Gordon D; Shukla, Maulik; Singh, Indresh; Stewart, Lucy; Tan, Gene; Thomas, Chris; VanOeffelen, Margo; Vonstein, Veronika; Wallace, Zachary S; Warren, Andrew S; Wattam, Alice R; Xia, Fangfang; Yoo, Hyunseung; Zhang, Yun; Zmasek, Christian M; Scheuermann, Richard H; Stevens, Rick L.

Nucleic Acids Res ; 51(D1): D678-D689, 2023 01 06.

Article in English | MEDLINE | ID: mdl-36350631

ABSTRACT

The National Institute of Allergy and Infectious Diseases (NIAID) established the Bioinformatics Resource Center (BRC) program to assist researchers with analyzing the growing body of genome sequence and other omics-related data. In this report, we describe the merger of the PAThosystems Resource Integration Center (PATRIC), the Influenza Research Database (IRD) and the Virus Pathogen Database and Analysis Resource (ViPR) BRCs to form the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) https://www.bv-brc.org/. The combined BV-BRC leverages the functionality of the bacterial and viral resources to provide a unified data model, enhanced web-based visualization and analysis tools, bioinformatics services, and a powerful suite of command line tools that benefit the bacterial and viral research communities.

Subjects

Genomics; Software; Viruses; Humans; Bacteria/genetics; Computational Biology; Databases, Genetic; Influenza, Human; Viruses/genetics

3.

SARS-CoV-2 Omicron (B.1.1.529) Infection of Wild White-Tailed Deer in New York City.

Vandegrift, Kurt J; Yon, Michele; Surendran Nair, Meera; Gontu, Abhinay; Ramasamy, Santhamani; Amirthalingam, Saranya; Neerukonda, Sabarinath; Nissly, Ruth H; Chothe, Shubhada K; Jakka, Padmaja; LaBella, Lindsey; Levine, Nicole; Rodriguez, Sophie; Chen, Chen; Sheersh Boorla, Veda; Stuber, Tod; Boulanger, Jason R; Kotschwar, Nathan; Aucoin, Sarah Grimké; Simon, Richard; Toal, Katrina L; Olsen, Randall J; Davis, James J; Bold, Dashzeveg; Gaudreault, Natasha N; Dinali Perera, Krishani; Kim, Yunjeong; Chang, Kyeong-Ok; Maranas, Costas D; Richt, Juergen A; Musser, James M; Hudson, Peter J; Kapur, Vivek; Kuchipudi, Suresh V.

Viruses ; 14(12), 2022 12 12.

Article in English | MEDLINE | ID: mdl-36560774

ABSTRACT

There is mounting evidence of SARS-CoV-2 spillover from humans into many domestic, companion, and wild animal species. Research indicates that humans have infected white-tailed deer, and that deer-to-deer transmission has occurred, indicating that deer could be a wildlife reservoir and a source of novel SARS-CoV-2 variants. We examined the hypothesis that the Omicron variant is actively and asymptomatically infecting the free-ranging deer of New York City. Between December 2021 and February 2022, 155 deer on Staten Island, New York, were anesthetized and examined for gross abnormalities and illnesses. Paired nasopharyngeal swabs and blood samples were collected and analyzed for the presence of SARS-CoV-2 RNA and antibodies. Of 135 serum samples, 19 (14.1%) indicated SARS-CoV-2 exposure, and 11 reacted most strongly to the wild-type B.1 lineage. Of the 71 swabs, 8 were positive for SARS-CoV-2 RNA (4 Omicron and 4 Delta). Two of the animals had active infections and robust neutralizing antibodies, revealing evidence of reinfection or early seroconversion in deer. Variants of concern continue to circulate among and may reinfect US deer populations, and establish enzootic transmission cycles in the wild: this warrants a coordinated One Health response, to proactively surveil, identify, and curtail variants of concern before they can spill back into humans.

Subjects

COVID-19; Deer; Humans; Animals; New York City/epidemiology; RNA, Viral/genetics; SARS-CoV-2/genetics; COVID-19/epidemiology; COVID-19/veterinary; Animals, Wild

4.

GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics.

Zvyagin, Maxim; Brace, Alexander; Hippe, Kyle; Deng, Yuntian; Zhang, Bin; Bohorquez, Cindy Orozco; Clyde, Austin; Kale, Bharat; Perez-Rivera, Danilo; Ma, Heng; Mann, Carla M; Irvin, Michael; Pauloski, J Gregory; Ward, Logan; Hayot-Sasson, Valerie; Emani, Murali; Foreman, Sam; Xie, Zhen; Lin, Diangen; Shukla, Maulik; Nie, Weili; Romero, Josh; Dallago, Christian; Vahdat, Arash; Xiao, Chaowei; Gibbs, Thomas; Foster, Ian; Davis, James J; Papka, Michael E; Brettin, Thomas; Stevens, Rick; Anandkumar, Anima; Vishwanath, Venkatram; Ramanathan, Arvind.

bioRxiv ; 2022 Nov 23.

Article in English | MEDLINE | ID: mdl-36451881

ABSTRACT

We seek to transform how new and emergent variants of pandemic-causing viruses, specifically SARS-CoV-2, are identified and classified. By adapting large language models (LLMs) for genomic data, we build genome-scale language models (GenSLMs) which can learn the evolutionary landscape of SARS-CoV-2 genomes. By pre-training on over 110 million prokaryotic gene sequences and fine-tuning a SARS-CoV-2-specific model on 1.5 million genomes, we show that GenSLMs can accurately and rapidly identify variants of concern. Thus, to our knowledge, GenSLMs represents one of the first whole genome scale foundation models which can generalize to other prediction tasks. We demonstrate scaling of GenSLMs on GPU-based supercomputers and AI-hardware accelerators utilizing 1.63 Zettaflops in training runs with a sustained performance of 121 PFLOPS in mixed precision and peak of 850 PFLOPS. We present initial scientific insights from examining GenSLMs in tracking evolutionary dynamics of SARS-CoV-2, paving the path to realizing this on large biological data.

5.

Classification of bacterial plasmid and chromosome derived sequences using machine learning.

Zou, Xiaohui; Nguyen, Marcus; Overbeek, Jamie; Cao, Bin; Davis, James J.

PLoS One ; 17(12): e0279280, 2022.

Article in English | MEDLINE | ID: mdl-36525447

ABSTRACT

Plasmids are important genetic elements that facilitate horizontal gene transfer between bacteria and contribute to the spread of virulence and antimicrobial resistance. Most bacterial genome sequences in the public archives exist in draft form with many contigs, making it difficult to determine if a contig is of chromosomal or plasmid origin. Using a training set of contigs comprising 10,584 chromosomes and 10,654 plasmids from the PATRIC database, we evaluated several machine learning models including random forest, logistic regression, XGBoost, and a neural network for their ability to classify chromosomal and plasmid sequences using nucleotide k-mers as features. Based on the methods tested, a neural network model that used nucleotide 6-mers as features and was trained on randomly selected chromosomal and plasmid subsequences 5 kb in length achieved the best performance, outperforming existing out-of-the-box methods, with an average accuracy of 89.38% ± 2.16% over a 10-fold cross validation. The model accuracy can be improved to 92.08% by using a voting strategy when classifying holdout sequences. In both plasmids and chromosomes, subsequences encoding functions involved in horizontal gene transfer (including hypothetical proteins, transporters, phage, mobile elements, and CRISPR elements) were most likely to be misclassified by the model. This study provides a straightforward approach for identifying plasmid-encoding sequences in short read assemblies without the need for sequence alignment-based tools.

Subjects

Chromosomes, Bacterial; Genome, Bacterial; Plasmids/genetics; Chromosomes, Bacterial/genetics; Bacteria/genetics; Machine Learning; Nucleotides
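
The abstract above describes two ingredients that are easy to illustrate: nucleotide 6-mer features and a voting step over subsequence-level predictions. The sketch below assumes relative 6-mer frequencies over the alphabet ACGT (4^6 = 4,096 features) and a simple majority vote. The paper does not spell out its normalization or voting rule here, so both choices, along with the function names, are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=6):
    """Relative frequency of every ACGT k-mer, in lexicographic order.

    k-mers containing other symbols (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def majority_vote(labels):
    """Most common label; ties go to the label that was seen first."""
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    features = kmer_frequencies("ACGT" * 100)
    print(len(features))
    # Hypothetical per-subsequence predictions for one contig.
    print(majority_vote(["plasmid", "chromosome", "plasmid"]))
```

In practice the per-subsequence labels would come from a trained classifier applied to each 5 kb window of a contig; the vote then assigns one label to the whole contig.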

6.

SourceFinder: a Machine-Learning-Based Tool for Identification of Chromosomal, Plasmid, and Bacteriophage Sequences from Assemblies.

Aytan-Aktug, Derya; Grigorjev, Vladislav; Szarvas, Judit; Clausen, Philip T L C; Munk, Patrick; Nguyen, Marcus; Davis, James J; Aarestrup, Frank M; Lund, Ole.

Microbiol Spectr ; 10(6): e0264122, 2022 12 21.

Article in English | MEDLINE | ID: mdl-36377945

ABSTRACT

High-throughput genome sequencing technologies enable the investigation of complex genetic interactions, including the horizontal gene transfer of plasmids and bacteriophages. However, identifying these elements from assembled reads remains challenging due to genome sequence plasticity and the difficulty in assembling complete sequences. In this study, we developed a classifier, using random forest, to identify whether sequences originated from bacterial chromosomes, plasmids, or bacteriophages. The classifier was trained on a diverse collection of 23,211 chromosomal, plasmid, and bacteriophage sequences from hundreds of bacterial species. In order to adapt the classifier to incomplete sequences, each complete sequence was subsampled into 5,000 nucleotide fragments and further subdivided into k-mers. This three-class classifier succeeded in identifying chromosomes, plasmids, and bacteriophages using k-mer distributions of complete and partial genome sequences, including simulated metagenomic scaffolds with minimum performance of 0.939 area under the receiver operating characteristic curve (AUC). This classifier, implemented as SourceFinder, has been made available as an online web service to help the community with predicting the chromosomal, plasmid, and bacteriophage sources of assembled bacterial sequence data (https://cge.food.dtu.dk/services/SourceFinder/).

IMPORTANCE Extra-chromosomal genes encoding antimicrobial resistance, metal resistance, and virulence provide selective advantages for bacterial survival under stress conditions and pose serious threats to human and animal health. These accessory genes can impact the composition of microbiomes by providing selective advantages to their hosts. Accurately identifying extra-chromosomal elements in genome sequence data is critical for understanding gene dissemination trajectories and taking preventative measures. Therefore, in this study, we developed a random forest classifier for identifying the source of bacterial chromosomal, plasmid, and bacteriophage sequences.

Subjects

Bacteriophages; Genome, Bacterial; Humans; Bacteriophages/genetics; Plasmids/genetics; Chromosomes, Bacterial/genetics; Machine Learning
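
The preprocessing step described in the abstract above (subsampling complete sequences into 5,000 nucleotide fragments, then subdividing into k-mers) can be sketched as follows. This is an illustration only: the random sampling scheme, the number of fragments, the seed, the value of k, and the function names are all assumptions, since the abstract does not give those details.

```python
import random
from collections import Counter


def sample_fragments(seq, size=5000, n=10, seed=0):
    """Draw n random fixed-length fragments from seq (windows may overlap).

    Sequences shorter than `size` are returned whole as a single fragment.
    """
    if len(seq) < size:
        return [seq]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    starts = [rng.randrange(len(seq) - size + 1) for _ in range(n)]
    return [seq[s:s + size] for s in starts]


def kmer_counts(fragment, k=4):
    """Raw counts of every overlapping k-mer in one fragment."""
    return Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))


if __name__ == "__main__":
    genome = "ACGT" * 5000  # toy 20,000 nt sequence
    fragments = sample_fragments(genome, n=4)
    print([len(f) for f in fragments])
    print(kmer_counts(fragments[0]).most_common(2))
```

Training on fragments instead of complete replicons is what lets a classifier of this kind be applied to partial sequences such as contigs and scaffolds.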

7.

Transmission history of SARS-CoV-2 in humans and white-tailed deer.

Willgert, Katriina; Didelot, Xavier; Surendran-Nair, Meera; Kuchipudi, Suresh V; Ruden, Rachel M; Yon, Michele; Nissly, Ruth H; Vandegrift, Kurt J; Nelli, Rahul K; Li, Lingling; Jayarao, Bhushan M; Levine, Nicole; Olsen, Randall J; Davis, James J; Musser, James M; Hudson, Peter J; Kapur, Vivek; Conlan, Andrew J K.

Sci Rep ; 12(1): 12094, 2022 07 15.

Article in English | MEDLINE | ID: mdl-35840592

ABSTRACT

The emergence of a novel pathogen in a susceptible population can cause rapid spread of infection. High prevalence of SARS-CoV-2 infection in white-tailed deer (Odocoileus virginianus) has been reported in multiple locations, likely resulting from several human-to-deer spillover events followed by deer-to-deer transmission. Knowledge of the risk and direction of SARS-CoV-2 transmission between humans and potential reservoir hosts is essential for effective disease control and prioritisation of interventions. Using genomic data, we reconstruct the transmission history of SARS-CoV-2 in humans and deer, estimate the case finding rate and attempt to infer relative rates of transmission between species. We found no evidence of direct or indirect transmission from deer to human. However, with an estimated case finding rate of only 4.2%, spillback to humans cannot be ruled out. The extensive transmission of SARS-CoV-2 within deer populations and the large number of unsampled cases highlight the need for active surveillance at the human-animal interface.

Subjects

COVID-19; Deer; SARS-CoV-2; Viral Zoonoses; Animals; COVID-19/epidemiology; COVID-19/prevention & control; COVID-19/transmission; COVID-19/veterinary; Deer/virology; Environmental Monitoring; Humans; Risk Assessment; SARS-CoV-2/genetics; SARS-CoV-2/isolation & purification; Viral Zoonoses/epidemiology; Viral Zoonoses/transmission; Viral Zoonoses/virology

8.

PlasmidHostFinder: Prediction of Plasmid Hosts Using Random Forest.

Aytan-Aktug, Derya; Clausen, Philip T L C; Szarvas, Judit; Munk, Patrick; Otani, Saria; Nguyen, Marcus; Davis, James J; Lund, Ole; Aarestrup, Frank M.

mSystems ; 7(2): e0118021, 2022 04 26.

Article in English | MEDLINE | ID: mdl-35382558

ABSTRACT

Plasmids play a major role in facilitating the spread of antimicrobial resistance between bacteria. Understanding the host range and dissemination trajectories of plasmids is critical for surveillance and prevention of antimicrobial resistance. Identification of plasmid host ranges could be improved using automated pattern detection methods compared to homology-based methods due to the diversity and genetic plasticity of plasmids. In this study, we developed a method for predicting the host range of plasmids using machine learning, specifically random forests. We trained the models with 8,519 plasmids from 359 different bacterial species per taxonomic level; the models achieved Matthews correlation coefficients of 0.662 and 0.867 at the species and order levels, respectively. Our results suggest that despite the diverse nature and genetic plasticity of plasmids, our random forest model can accurately distinguish between plasmid hosts. This tool is available online through the Center for Genomic Epidemiology (https://cge.cbs.dtu.dk/services/PlasmidHostFinder/).

IMPORTANCE Antimicrobial resistance is a global health threat to humans and animals, causing high mortality and morbidity while effectively ending decades of success in fighting against bacterial infections. Plasmids confer extra genetic capabilities to the host organisms through accessory genes that can encode antimicrobial resistance and virulence. In addition to lateral inheritance, plasmids can be transferred horizontally between bacterial taxa. Therefore, detection of the host range of plasmids is crucial for understanding and predicting the dissemination trajectories of extrachromosomal genes and bacterial evolution as well as taking effective countermeasures against antimicrobial resistance.

Subjects

Anti-Infective Agents; Random Forest; Animals; Humans; Plasmids; Bacteria/genetics; Genomics
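
The abstract above reports performance as Matthews correlation coefficients (MCC). For readers unfamiliar with the metric, the sketch below computes the standard binary form from true and predicted labels. It is a simplification: the paper evaluates host prediction across many taxa, and how it aggregates MCC over those classes is not stated in the abstract. The function name and the toy labels are hypothetical.

```python
from math import sqrt


def matthews_corrcoef(y_true, y_pred, positive=1):
    """Binary Matthews correlation coefficient, in the range [-1, 1]."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy labels: 1 = plasmid found in the host taxon, 0 = not found.
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    print(round(matthews_corrcoef(y_true, y_pred), 3))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when one class is much rarer than the other.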

9.

Detection of SARS-CoV-2 Omicron variant (B.1.1.529) infection of white-tailed deer.

Vandegrift, Kurt J; Yon, Michele; Surendran-Nair, Meera; Gontu, Abhinay; Amirthalingam, Saranya; Nissly, Ruth H; Levine, Nicole; Stuber, Tod; DeNicola, Anthony J; Boulanger, Jason R; Kotschwar, Nathan; Aucoin, Sarah Grimké; Simon, Richard; Toal, Katrina; Olsen, Randall J; Davis, James J; Bold, Dashzeveg; Gaudreault, Natasha N; Richt, Juergen A; Musser, James M; Hudson, Peter J; Kapur, Vivek; Kuchipudi, Suresh V.

bioRxiv ; 2022 Feb 07.

Article in English | MEDLINE | ID: mdl-35169802

ABSTRACT

White-tailed deer (Odocoileus virginianus) are highly susceptible to infection by SARS-CoV-2, with multiple reports of widespread spillover of virus from humans to free-living deer. While the recently emerged SARS-CoV-2 B.1.1.529 Omicron variant of concern (VoC) has been shown to be notably more transmissible amongst humans, its ability to cause infection and spillover to non-human animals remains a challenge of concern. We found that 19 of the 131 (14.5%; 95% CI: 0.10-0.22) white-tailed deer opportunistically sampled on Staten Island, New York, between December 12, 2021, and January 31, 2022, were positive for SARS-CoV-2 specific serum antibodies using a surrogate virus neutralization assay, indicating prior exposure. The results also revealed strong evidence of age-dependence in antibody prevalence. A significantly (χ², p < 0.001) greater proportion of yearling deer possessed neutralizing antibodies as compared with fawns (OR=12.7; 95% CI 4-37.5). Importantly, SARS-CoV-2 nucleic acid was detected in nasal swabs from seven of 68 (10.29%; 95% CI: 0.0-0.20) of the sampled deer, and whole-genome sequencing showed that the SARS-CoV-2 Omicron VoC (B.1.1.529) is circulating amongst the white-tailed deer on Staten Island. Phylogenetic analyses revealed the deer Omicron sequences clustered closely with other, recently reported Omicron sequences recovered from infected humans in New York City and elsewhere, consistent with human-to-deer spillover. Interestingly, one individual deer was positive for viral RNA and had a high level of neutralizing antibodies, suggesting either rapid serological conversion during an ongoing infection or a "breakthrough" infection in a previously exposed animal. Together, our findings show that the SARS-CoV-2 B.1.1.529 Omicron VoC can infect white-tailed deer and highlight an urgent need for comprehensive surveillance of susceptible animal species to identify ecological transmission networks and better assess the potential risks of spillback to humans.

KEY FINDINGS: These studies provide strong evidence of infection of free-living white-tailed deer with the SARS-CoV-2 B.1.1.529 Omicron variant of concern on Staten Island, New York, and highlight an urgent need for investigations on human-to-animal-to-human spillovers/spillbacks as well as on better defining the expanding host-range of SARS-CoV-2 in non-human animals and the environment.
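
The seroprevalence figures in the abstract above are proportions with 95% confidence intervals (for example, 19 of 131 deer). As a worked illustration, the sketch below computes a Wilson score interval for a binomial proportion. The abstract does not say which interval method the authors used, so the Wilson method is an assumption and the bounds it gives need not match the reported ones exactly. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


if __name__ == "__main__":
    low, high = wilson_interval(19, 131)
    print(f"{19 / 131:.3f} ({low:.3f}-{high:.3f})")
```

The Wilson interval is often preferred over the simple normal approximation for small samples or proportions near 0 or 1, because its bounds always stay within [0, 1].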

10.

Signals of Significantly Increased Vaccine Breakthrough, Decreased Hospitalization Rates, and Less Severe Disease in Patients with Coronavirus Disease 2019 Caused by the Omicron Variant of Severe Acute Respiratory Syndrome Coronavirus 2 in Houston, Texas.

Christensen, Paul A; Olsen, Randall J; Long, S Wesley; Snehal, Richard; Davis, James J; Ojeda Saavedra, Matthew; Reppond, Kristina; Shyer, Madison N; Cambric, Jessica; Gadd, Ryan; Thakur, Rashi M; Batajoo, Akanksha; Mangham, Regan; Pena, Sindy; Trinh, Trina; Kinskey, Jacob C; Williams, Guy; Olson, Robert; Gollihar, Jimmy; Musser, James M.

Am J Pathol ; 192(4): 642-652, 2022 04.

Article in English | MEDLINE | ID: mdl-35123975

ABSTRACT

Genetic variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) continue to dramatically alter the landscape of the coronavirus disease 2019 (COVID-19) pandemic. The recently described variant of concern designated Omicron (B.1.1.529) has rapidly spread worldwide and is now responsible for the majority of COVID-19 cases in many countries. Because Omicron was recognized recently, many knowledge gaps exist about its epidemiology, clinical severity, and disease course. A genome sequencing study of SARS-CoV-2 in the Houston Methodist health care system identified 4468 symptomatic patients with infections caused by Omicron from late November 2021 through January 5, 2022. Omicron rapidly increased in only 3 weeks to cause 90% of all new COVID-19 cases, and at the end of the study period caused 98% of new cases. Compared with patients infected with either Alpha or Delta variants in our health care system, Omicron patients were significantly younger, had significantly increased vaccine breakthrough rates, and were significantly less likely to be hospitalized. Omicron patients required less intense respiratory support and had a shorter length of hospital stay, consistent with on average decreased disease severity. Two patients with Omicron stealth sublineage BA.2 also were identified. The data document the unusually rapid spread and increased occurrence of COVID-19 caused by the Omicron variant in metropolitan Houston, Texas, and address the lack of information about disease character among US patients.

Subjects

COVID-19; Vaccines; COVID-19/epidemiology; Hospitalization; Humans; SARS-CoV-2/genetics; Texas/epidemiology

11.

Multiple spillovers from humans and onward transmission of SARS-CoV-2 in white-tailed deer.

Kuchipudi, Suresh V; Surendran-Nair, Meera; Ruden, Rachel M; Yon, Michele; Nissly, Ruth H; Vandegrift, Kurt J; Nelli, Rahul K; Li, Lingling; Jayarao, Bhushan M; Maranas, Costas D; Levine, Nicole; Willgert, Katriina; Conlan, Andrew J K; Olsen, Randall J; Davis, James J; Musser, James M; Hudson, Peter J; Kapur, Vivek.

Proc Natl Acad Sci U S A ; 119(6), 2022 02 08.

Article in English | MEDLINE | ID: mdl-35078920

ABSTRACT

Many animal species are susceptible to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and could act as reservoirs; however, transmission in free-living animals has not been documented. White-tailed deer, the predominant cervid in North America, are susceptible to SARS-CoV-2 infection, and experimentally infected fawns can transmit the virus. To test the hypothesis that SARS-CoV-2 is circulating in deer, 283 retropharyngeal lymph node (RPLN) samples collected from 151 free-living and 132 captive deer in Iowa from April 2020 through January of 2021 were assayed for the presence of SARS-CoV-2 RNA. Ninety-four of the 283 (33.2%) deer samples were positive for SARS-CoV-2 RNA as assessed by RT-PCR. Notably, following the November 2020 peak of human cases in Iowa, and coinciding with the onset of winter and the peak deer hunting season, SARS-CoV-2 RNA was detected in 80 of 97 (82.5%) RPLN samples collected over a 7-wk period. Whole genome sequencing of all 94 positive RPLN samples identified 12 SARS-CoV-2 lineages, with B.1.2 (n = 51; 54.5%) and B.1.311 (n = 19; 20%) accounting for ~75% of all samples. The geographic distribution and nesting of clusters of deer and human lineages strongly suggest multiple human-to-deer transmission events followed by subsequent deer-to-deer spread. These discoveries have important implications for the long-term persistence of the SARS-CoV-2 pandemic. Our findings highlight an urgent need for a robust and proactive "One Health" approach to obtain enhanced understanding of the ecology, molecular evolution, and dissemination of SARS-CoV-2.

Subjects

COVID-19/transmission; Deer/virology; SARS-CoV-2/isolation & purification; Zoonoses/virology; Animals; COVID-19/virology; Disease Reservoirs/virology; Humans; SARS-CoV-2/genetics

12.

Delta Variants of SARS-CoV-2 Cause Significantly Increased Vaccine Breakthrough COVID-19 Cases in Houston, Texas.

Christensen, Paul A; Olsen, Randall J; Long, S Wesley; Subedi, Sishir; Davis, James J; Hodjat, Parsa; Walley, Debbie R; Kinskey, Jacob C; Ojeda Saavedra, Matthew; Pruitt, Layne; Reppond, Kristina; Shyer, Madison N; Cambric, Jessica; Gadd, Ryan; Thakur, Rashi M; Batajoo, Akanksha; Mangham, Regan; Pena, Sindy; Trinh, Trina; Yerramilli, Prasanti; Nguyen, Marcus; Olson, Robert; Snehal, Richard; Gollihar, Jimmy; Musser, James M.

Am J Pathol ; 192(2): 320-331, 2022 02.

Article in English | MEDLINE | ID: mdl-34774517

ABSTRACT

Genetic variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have repeatedly altered the course of the coronavirus disease 2019 (COVID-19) pandemic. Delta variants are now the focus of intense international attention because they are causing widespread COVID-19 globally and are associated with vaccine breakthrough cases. We sequenced 16,965 SARS-CoV-2 genomes from samples acquired March 15, 2021, through September 20, 2021, in the Houston Methodist hospital system. This sample represents 91% of all Methodist system COVID-19 patients during the study period. Delta variants increased rapidly from late April onward to cause 99.9% of all COVID-19 cases and spread throughout the Houston metroplex. Compared with all other variants combined, Delta caused a significantly higher rate of vaccine breakthrough cases (23.7% for Delta compared with 6.6% for all other variants combined). Importantly, significantly fewer fully vaccinated individuals required hospitalization. Vaccine breakthrough cases caused by Delta had a low median PCR cycle threshold value (a proxy for high virus load). This value was similar to the median cycle threshold value for unvaccinated patients with COVID-19 caused by Delta variants, suggesting that fully vaccinated individuals can transmit SARS-CoV-2 to others. Patients infected with Alpha and Delta variants had several significant differences. The integrated analysis indicates that vaccines used in the United States are highly effective in decreasing severe COVID-19, hospitalizations, and deaths.

Subjects

COVID-19/virology; SARS-CoV-2; Adult; COVID-19 Vaccines; Female; Humans; Male; Middle Aged; Texas

13.

Analysis of the ARTIC Version 3 and Version 4 SARS-CoV-2 Primers and Their Impact on the Detection of the G142D Amino Acid Substitution in the Spike Protein.

Davis, James J; Long, S Wesley; Christensen, Paul A; Olsen, Randall J; Olson, Robert; Shukla, Maulik; Subedi, Sishir; Stevens, Rick; Musser, James M.

Microbiol Spectr ; 9(3): e0180321, 2021 12 22.

Article in English | MEDLINE | ID: mdl-34878296

ABSTRACT

The ARTIC Network provides a common resource of PCR primer sequences and recommendations for amplifying SARS-CoV-2 genomes. The initial tiling strategy was developed with the reference genome Wuhan-01, and subsequent iterations have addressed areas of low amplification and sequence drop out. Recently, a new version (V4) was released, based on new variant genome sequences, in response to the realization that some V3 primers were located in regions with key mutations. Herein, we compare the performance of the ARTIC V3 and V4 primer sets with a matched set of 663 SARS-CoV-2 clinical samples sequenced with an Illumina NovaSeq 6000 instrument. We observe general improvements in sequencing depth and quality, and improved resolution of the SNP causing the D950N variation in the spike protein. Importantly, we also find nearly universal presence of spike protein substitution G142D in Delta-lineage samples. Due to the prior release and widespread use of the ARTIC V3 primers during the initial surge of the Delta variant, it is likely that the G142D amino acid substitution is substantially underrepresented among early Delta variant genomes deposited in public repositories. In addition to the improved performance of the ARTIC V4 primer set, this study also illustrates the importance of the primer scheme in downstream analyses.

IMPORTANCE ARTIC Network primers are commonly used by laboratories worldwide to amplify and sequence SARS-CoV-2 present in clinical samples. As new variants have evolved and spread, it was found that the V3 primer set poorly amplified several key mutations. In this report, we compare the results of sequencing a matched set of samples with the V3 and V4 primer sets. We find that adoption of the ARTIC V4 primer set is critical for accurate sequencing of the SARS-CoV-2 spike region. The absence of metadata describing the primer scheme used will negatively impact the downstream use of publicly available SARS-CoV-2 sequencing reads and assembled genomes.

Subjects

Amino Acid Substitution; COVID-19/virology; SARS-CoV-2/classification; SARS-CoV-2/genetics; SARS-CoV-2/isolation & purification; Spike Glycoprotein, Coronavirus/genetics; Base Sequence; Genome, Viral; Humans; Mutation; Whole Genome Sequencing

14.

A genomic data resource for predicting antimicrobial resistance from laboratory-derived antimicrobial susceptibility phenotypes.

VanOeffelen, Margo; Nguyen, Marcus; Aytan-Aktug, Derya; Brettin, Thomas; Dietrich, Emily M; Kenyon, Ronald W; Machi, Dustin; Mao, Chunhong; Olson, Robert; Pusch, Gordon D; Shukla, Maulik; Stevens, Rick; Vonstein, Veronika; Warren, Andrew S; Wattam, Alice R; Yoo, Hyunseung; Davis, James J.

Brief Bioinform ; 22(6), 2021 11 05.

Article in English | MEDLINE | ID: mdl-34379107

ABSTRACT

Antimicrobial resistance (AMR) is a major global health threat that affects millions of people each year. Funding agencies worldwide and the global research community have expended considerable capital and effort tracking the evolution and spread of AMR by isolating and sequencing bacterial strains and performing antimicrobial susceptibility testing (AST). For the last several years, we have been capturing these efforts by curating data from the literature and data resources and building a set of assembled bacterial genome sequences that are paired with laboratory-derived AST data. This collection currently contains AST data for over 67 000 genomes encompassing approximately 40 genera and over 100 species. In this paper, we describe the characteristics of this collection, highlighting areas where sampling is comparatively deep or shallow, and showing areas where attention is needed from the research community to improve sampling and tracking efforts. In addition to supporting efforts to track the evolution and spread of AMR, the collection serves as a useful starting point for building machine learning models for predicting AMR phenotypes. We demonstrate this by describing two machine learning models that are built from the entire dataset to show where the predictive power is comparatively high or low. This AMR metadata collection is freely available and maintained on the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) FTP site (ftp://ftp.bvbrc.org/RELEASE_NOTES/PATRIC_genomes_AMR.txt).

Subjects

Computational Biology/methods; Databases, Genetic; Drug Resistance, Microbial; Genomics/methods; Microbial Sensitivity Tests; Artificial Intelligence; Bacteria/drug effects; Bacteria/genetics; Genome, Bacterial; Humans; Laboratories; Machine Learning; Phenotype

15.

Trajectory of Growth of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Variants in Houston, Texas, January through May 2021, Based on 12,476 Genome Sequences.

Olsen, Randall J; Christensen, Paul A; Long, S Wesley; Subedi, Sishir; Hodjat, Parsa; Olson, Robert; Nguyen, Marcus; Davis, James J; Yerramilli, Prasanti; Saavedra, Matthew O; Pruitt, Layne; Reppond, Kristina; Shyer, Madison N; Cambric, Jessica; Gadd, Ryan; Thakur, Rashi M; Batajoo, Akanksha; Finkelstein, Ilya J; Gollihar, Jimmy; Musser, James M.

Am J Pathol ; 191(10): 1754-1773, 2021 10.

Article in English | MEDLINE | ID: mdl-34303698

ABSTRACT

Certain genetic variants of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are of substantial concern because they may be more transmissible or detrimentally alter the pandemic course and disease features in individual patients. SARS-CoV-2 genome sequences from 12,476 patients in the Houston Methodist health care system diagnosed from January 1 through May 31, 2021 are reported here. Prevalence of the B.1.1.7 (Alpha) variant increased rapidly and caused 63% to 90% of new cases in the latter half of May. Eleven B.1.1.7 genomes had an E484K replacement in spike protein, a change also identified in other SARS-CoV-2 lineages. Compared with non-B.1.1.7-infected patients, individuals with B.1.1.7 had a significantly lower cycle threshold (a proxy for higher virus load) and significantly higher hospitalization rate. Other variants [eg, B.1.429 and B.1.427 (Epsilon), P.1 (Gamma), P.2 (Zeta), and R.1] also increased rapidly, although the magnitude was less than that in B.1.1.7. Twenty-two patients infected with B.1.617.1 (Kappa) or B.1.617.2 (Delta) variants had a high rate of hospitalization. Breakthrough cases (n = 207) in fully vaccinated patients were caused by a heterogeneous array of virus genotypes, including many not currently designated variants of interest or concern. In the aggregate, this study delineates the trajectory of SARS-CoV-2 variants circulating in a major metropolitan area, documents B.1.1.7 as the major cause of new cases in Houston, TX, and heralds the arrival of B.1.617 variants in the metroplex.

Subjects

COVID-19/epidemiology; Genome, Viral; Mutation; SARS-CoV-2/genetics; COVID-19/genetics; COVID-19/transmission; COVID-19/virology; Female; Humans; Male; Middle Aged; SARS-CoV-2/isolation & purification; Texas/epidemiology

16.

Sequence Analysis of 20,453 Severe Acute Respiratory Syndrome Coronavirus 2 Genomes from the Houston Metropolitan Area Identifies the Emergence and Widespread Distribution of Multiple Isolates of All Major Variants of Concern.

Long, S Wesley; Olsen, Randall J; Christensen, Paul A; Subedi, Sishir; Olson, Robert; Davis, James J; Saavedra, Matthew Ojeda; Yerramilli, Prasanti; Pruitt, Layne; Reppond, Kristina; Shyer, Madison N; Cambric, Jessica; Finkelstein, Ilya J; Gollihar, Jimmy; Musser, James M.

Am J Pathol ; 191(6): 983-992, 2021 06.

Article in English | MEDLINE | ID: mdl-33741335

ABSTRACT

Since the beginning of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, there has been international concern about the emergence of virus variants with mutations that increase transmissibility, enhance escape from the human immune response, or otherwise alter biologically important phenotypes. In late 2020, several variants of concern emerged globally, including the UK variant (B.1.1.7), the South Africa variant (B.1.351), Brazil variants (P.1 and P.2), and two related California variants of interest (B.1.429 and B.1.427). These variants are believed to have enhanced transmissibility. For the South Africa and Brazil variants, there is evidence that mutations in spike protein permit it to escape from some vaccines and therapeutic monoclonal antibodies. On the basis of our extensive genome sequencing program involving 20,453 coronavirus disease 2019 patient samples collected from March 2020 to February 2021, we report identification of all six of these SARS-CoV-2 variants among Houston Methodist Hospital (Houston, TX) patients residing in the greater metropolitan area. Although these variants are currently at relatively low frequency (aggregate of 1.1%) in the population, they are geographically widespread. Houston is the first city in the United States in which active circulation of all six current variants of concern has been documented by genome sequencing. As vaccine deployment accelerates, increased genomic surveillance of SARS-CoV-2 is essential to understanding the presence, frequency, and medical impact of consequential variants and their patterns and trajectory of dissemination.

Subjects

COVID-19; Mutation; Pandemics; SARS-CoV-2/genetics; COVID-19/epidemiology; COVID-19/genetics; COVID-19/transmission; Female; Humans; Male; SARS-CoV-2/isolation & purification; Texas/epidemiology

17.

Erratum for Pincus et al., "A Genome-Based Model to Predict the Virulence of Pseudomonas aeruginosa Isolates".

Pincus, Nathan B; Ozer, Egon A; Allen, Jonathan P; Nguyen, Marcus; Davis, James J; Winter, Deborah R; Chuang, Chih-Hsien; Chiu, Cheng-Hsun; Zamorano, Laura; Oliver, Antonio; Hauser, Alan R.

mBio ; 12(1), 2021 Feb 23.

Article in English | MEDLINE | ID: mdl-33622735

18.

Predicting antimicrobial susceptibility from the bacterial genome: A new paradigm for one health resistance monitoring.

McDermott, Patrick F; Davis, James J.

J Vet Pharmacol Ther ; 44(2): 223-237, 2021 Mar.

Article in English | MEDLINE | ID: mdl-33010049

ABSTRACT

The laboratory identification of antibacterial resistance is a cornerstone of infectious disease medicine. In vitro antimicrobial susceptibility testing has long been based on the growth response of organisms in pure culture to a defined concentration of antimicrobial agents. By comparing individual isolates to wild-type susceptibility patterns, strains with acquired resistance can be identified. Acquired resistance can also be detected genetically. After many decades of research, the inventory of genes underlying antimicrobial resistance is well known for several pathogenic genera including zoonotic enteric organisms such as Salmonella and Campylobacter and continues to grow substantially for others. With the decline in costs for large scale DNA sequencing, it is now practicable to characterize bacteria using whole genome sequencing, including the carriage of resistance genes in individual microorganisms and those present in complex biological samples. With genomics, we can generate comprehensive, detailed information on the bacterium, the mechanisms of antibiotic resistance, clues to its source, and the nature of mobile DNA elements by which resistance spreads. These developments point to a new paradigm for antimicrobial resistance detection and tracking for both clinical and public health purposes.

Subjects

One Health; Animals; Anti-Bacterial Agents/pharmacology; Bacteria/genetics; Drug Resistance, Bacterial/genetics; Genome, Bacterial; Microbial Sensitivity Tests/veterinary; Whole Genome Sequencing/veterinary

19.

Molecular Architecture of Early Dissemination and Massive Second Wave of the SARS-CoV-2 Virus in a Major Metropolitan Area.

Long, S Wesley; Olsen, Randall J; Christensen, Paul A; Bernard, David W; Davis, James J; Shukla, Maulik; Nguyen, Marcus; Saavedra, Matthew Ojeda; Yerramilli, Prasanti; Pruitt, Layne; Subedi, Sishir; Kuo, Hung-Che; Hendrickson, Heather; Eskandari, Ghazaleh; Nguyen, Hoang A T; Long, J Hunter; Kumaraswami, Muthiah; Goike, Jule; Boutz, Daniel; Gollihar, Jimmy; McLellan, Jason S; Chou, Chia-Wei; Javanmardi, Kamyab; Finkelstein, Ilya J; Musser, James M.

mBio ; 11(6), 2020 10 30.

Article in English | MEDLINE | ID: mdl-33127862

ABSTRACT

We sequenced the genomes of 5,085 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) strains causing two coronavirus disease 2019 (COVID-19) disease waves in metropolitan Houston, TX, an ethnically diverse region with 7 million residents. The genomes were from viruses recovered in the earliest recognized phase of the pandemic in Houston and from viruses recovered in an ongoing massive second wave of infections. The virus was originally introduced into Houston many times independently. Virtually all strains in the second wave have a Gly614 amino acid replacement in the spike protein, a polymorphism that has been linked to increased transmission and infectivity. Patients infected with the Gly614 variant strains had significantly higher virus loads in the nasopharynx on initial diagnosis. We found little evidence of a significant relationship between virus genotype and altered virulence, stressing the linkage between disease severity, underlying medical conditions, and host genetics. Some regions of the spike protein (the primary target of global vaccine efforts) are replete with amino acid replacements, perhaps indicating the action of selection. We exploited the genomic data to generate defined single amino acid replacements in the receptor binding domain of spike protein that, importantly, produced decreased recognition by the neutralizing monoclonal antibody CR3022. Our report represents the first analysis of the molecular architecture of SARS-CoV-2 in two infection waves in a major metropolitan region. The findings will help us to understand the origin, composition, and trajectory of future infection waves and the potential effect of the host immune response and therapeutic maneuvers on SARS-CoV-2 evolution. IMPORTANCE There is concern about second and subsequent waves of COVID-19 caused by the SARS-CoV-2 coronavirus occurring in communities globally that had an initial disease wave. Metropolitan Houston, TX, with a population of 7 million, is experiencing a massive second disease wave that began in late May 2020. To understand SARS-CoV-2 molecular population genomic architecture and evolution and the relationship between virus genotypes and patient features, we sequenced the genomes of 5,085 SARS-CoV-2 strains from these two waves. Our report provides the first molecular characterization of SARS-CoV-2 strains causing two distinct COVID-19 disease waves.

Subjects

Betacoronavirus/genetics; Coronavirus Infections/virology; Pneumonia, Viral/virology; Spike Glycoprotein, Coronavirus/chemistry; Spike Glycoprotein, Coronavirus/genetics; Amino Acid Sequence; Amino Acid Substitution; Antibodies, Neutralizing/immunology; Base Sequence; Betacoronavirus/immunology; COVID-19; COVID-19 Testing; Clinical Laboratory Techniques; Coronavirus Infections/diagnosis; Coronavirus Infections/epidemiology; Coronavirus Infections/immunology; Coronavirus RNA-Dependent RNA Polymerase; Genome, Viral; Genotype; Humans; Machine Learning; Models, Molecular; Molecular Diagnostic Techniques; Pandemics; Phylogeny; Pneumonia, Viral/epidemiology; Pneumonia, Viral/immunology; RNA-Dependent RNA Polymerase/chemistry; RNA-Dependent RNA Polymerase/genetics; SARS-CoV-2; Sequence Analysis, Protein; Spike Glycoprotein, Coronavirus/immunology; Texas/epidemiology; Viral Nonstructural Proteins/chemistry; Viral Nonstructural Proteins/genetics

20.

Predicting antimicrobial resistance using conserved genes.

Nguyen, Marcus; Olson, Robert; Shukla, Maulik; VanOeffelen, Margo; Davis, James J.

PLoS Comput Biol ; 16(10): e1008319, 2020 10.

Article in English | MEDLINE | ID: mdl-33075053

ABSTRACT

A growing number of studies are using machine learning models to accurately predict antimicrobial resistance (AMR) phenotypes from bacterial sequence data. Although these studies are showing promise, the models are typically trained using features derived from comprehensive sets of AMR genes or whole genome sequences and may not be suitable for use when genomes are incomplete. In this study, we explore the possibility of predicting AMR phenotypes using incomplete genome sequence data. Models were built from small sets of randomly-selected core genes after removing the AMR genes. For Klebsiella pneumoniae, Mycobacterium tuberculosis, Salmonella enterica, and Staphylococcus aureus, we report that it is possible to classify susceptible and resistant phenotypes with average F1 scores ranging from 0.80-0.89 with as few as 100 conserved non-AMR genes, with very major error rates ranging from 0.11-0.23 and major error rates ranging from 0.10-0.20. Models built from core genes have predictive power in cases where the primary AMR mechanisms result from SNPs or horizontal gene transfer. By randomly sampling non-overlapping sets of core genes, we show that F1 scores and error rates are stable and have little variance between replicates. Although these small core gene models have lower accuracies and higher error rates than models built from the corresponding assembled genomes, the results suggest that sufficient variation exists in the core non-AMR genes of a species for predicting AMR phenotypes.

Subjects

Conserved Sequence/genetics; Drug Resistance, Bacterial/genetics; Genome, Bacterial/genetics; Genomics/methods; Machine Learning; Algorithms; Anti-Bacterial Agents/pharmacology; Bacteria/drug effects; Bacteria/genetics; Phenotype
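
Note on the metrics in record 20: the abstract reports F1 scores together with "very major" and "major" error rates. The sketch below shows how these three quantities are commonly computed for a binary susceptible/resistant classifier. It is an illustration only, not the authors' code. It assumes the conventional definitions (very major error = resistant isolates predicted susceptible, as a fraction of all resistant isolates; major error = susceptible isolates predicted resistant, as a fraction of all susceptible isolates), an F1 score computed for the resistant class, and a hypothetical function name and label encoding (1 = resistant, 0 = susceptible).

```python
from typing import Dict, Sequence


def amr_error_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return F1, very major error (VME) rate and major error (ME) rate.

    Assumed label encoding (hypothetical): 1 = resistant, 0 = susceptible.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # resistant, predicted resistant
        elif truth == 1 and pred == 0:
            fn += 1  # resistant, predicted susceptible (very major error)
        elif truth == 0 and pred == 1:
            fp += 1  # susceptible, predicted resistant (major error)
        else:
            tn += 1  # susceptible, predicted susceptible

    # Guard each ratio against an empty denominator.
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    vme_rate = fn / (tp + fn) if (tp + fn) else 0.0
    me_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {"f1": f1, "vme_rate": vme_rate, "me_rate": me_rate}


if __name__ == "__main__":
    # Toy labels invented for illustration; not data from the study.
    truth = [1, 1, 1, 1, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 1]
    print(amr_error_metrics(truth, predicted))
```

The two error rates use different denominators (resistant versus susceptible isolates), so a model can show a low major error rate and a high very major error rate at the same time when the classes are imbalanced.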
