Results 1 - 20 of 46
1.
Viruses ; 16(3), 2024 03 11.
Article in English | MEDLINE | ID: mdl-38543795

ABSTRACT

Genomic sequencing of clinical samples to identify emerging variants of SARS-CoV-2 has been a key public health tool for curbing the spread of the virus. As a result, an unprecedented number of SARS-CoV-2 genomes were sequenced during the COVID-19 pandemic, which allowed for rapid identification of genetic variants, enabling the timely design and testing of therapies and deployment of new vaccine formulations to combat the new variants. However, despite the technological advances of deep sequencing, the analysis of the raw sequence data generated globally is neither standardized nor consistent, leading to vastly disparate sequences that may impact identification of variants. Here, we show that for both Illumina and Oxford Nanopore sequencing platforms, downstream bioinformatic protocols used by industry, government, and academic groups yielded different virus sequences from the same sample. These bioinformatic workflows produced consensus genomes that differed in single nucleotide polymorphisms and in the inclusion or exclusion of insertions and/or deletions, despite using the same raw sequence datasets as input. We compared and characterized these discrepancies and propose a specific suite of parameters and protocols that should be adopted across the field. Consistent results from bioinformatic workflows are fundamental to SARS-CoV-2 and future pathogen surveillance efforts, including pandemic preparation, to allow for a data-driven and timely public health response.


Subjects
COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/genetics, COVID-19/epidemiology, Pandemics, Workflow, Computational Biology
2.
Nucleic Acids Res ; 52(D1): D33-D43, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37994677

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, SciENcv, the NIH Comparative Genomics Resource (CGR), NCBI Virus, SRA, RefSeq, foreign contamination screening tools, Taxonomy, iCn3D, ClinVar, GTR, MedGen, dbSNP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.


Subjects
Genetic Databases, National Library of Medicine (U.S.), Biotechnology/instrumentation, Nucleic Acid Databases, Internet, United States
3.
Microb Genom ; 9(12), 2023 Dec.
Article in English | MEDLINE | ID: mdl-38085797

ABSTRACT

Fast, efficient public health actions require well-organized and coordinated systems that can supply timely and accurate knowledge. Public databases of pathogen genomic data, such as the International Nucleotide Sequence Database Collaboration (INSDC), have become essential tools for efficient public health decisions. However, these international resources began primarily for academic purposes, rather than for surveillance or interventions. Now, queries need not only to access the whole genomes of multiple pathogens but also to make connections using robust contextual metadata to identify issues of public health relevance. Databases that over time developed a patchwork of submission formats and requirements need to be consistently organized and coordinated internationally to allow effective searches. To help resolve these issues, we propose a common pathogen data structure called the Pathogen Data Object Model (DOM) that will formalize the minimum pieces of sequence data and contextual data necessary for general public health uses, while recognizing that submitters will likely withhold a wide range of non-public contextual data. Further, we propose that contributors use the Pathogen DOM for all pathogen submissions (bacterial, viral, fungal, and parasitic), which will simplify data submissions and provide a consistent and transparent data structure for downstream data analyses. We also highlight how improved submission tools can support the Pathogen DOM, offering users additional easy-to-use methods to ensure this structure is followed.


Subjects
Nucleotides, Public Health, Base Sequence, Genomics/methods, Nucleic Acid Databases
6.
Arch Virol ; 168(2): 74, 2023 Jan 23.
Article in English | MEDLINE | ID: mdl-36683075

ABSTRACT

This article summarises the activities of the Bacterial Viruses Subcommittee of the International Committee on Taxonomy of Viruses for the period of March 2021-March 2022. We provide an overview of the new taxa proposed in 2021, approved by the Executive Committee, and ratified by vote in 2022. Significant changes to the taxonomy of bacterial viruses were introduced: the paraphyletic morphological families Podoviridae, Siphoviridae, and Myoviridae as well as the order Caudovirales were abolished, and a binomial system of nomenclature for species was established. In addition, one order, 22 families, 30 subfamilies, 321 genera, and 862 species were newly created, promoted, or moved.


Subjects
Bacteriophages, Caudovirales, Siphoviridae, Viruses, Humans, Viruses/genetics, Myoviridae
7.
Nucleic Acids Res ; 51(D1): D29-D38, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36370100

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. New resources include the Comparative Genome Resource (CGR) and the BLAST ClusteredNR database. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, IgBLAST, GDV, RefSeq, NCBI Virus, GenBank type assemblies, iCn3D, ClinVar, GTR, dbGaP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.


Subjects
Genetic Databases, Nucleic Acid Databases, United States, National Library of Medicine (U.S.), Sequence Alignment, Biotechnology, Internet
8.
bioRxiv ; 2022 Nov 03.
Article in English | MEDLINE | ID: mdl-36380755

ABSTRACT

During the COVID-19 pandemic, SARS-CoV-2 surveillance efforts integrated genome sequencing of clinical samples to identify emergent viral variants and to support rapid experimental examination of genome-informed vaccine and therapeutic designs. Given the broad range of methods applied to generate new viral genomes, it is critical that consensus and variant calling tools yield consistent results across disparate pipelines. Here we examine the impact of sequencing technologies (Illumina and Oxford Nanopore) and 7 different downstream bioinformatic protocols on SARS-CoV-2 variant calling as part of the NIH Accelerating COVID-19 Therapeutic Interventions and Vaccines (ACTIV) Tracking Resistance and Coronavirus Evolution (TRACE) initiative, a public-private partnership established to address the COVID-19 outbreak. Our results indicate that bioinformatic workflows can yield consensus genomes with different single nucleotide polymorphisms, insertions, and/or deletions even when using the same raw sequence input datasets. We introduce the use of a specific suite of parameters and protocols that greatly improves the agreement among pipelines developed by diverse organizations. Such consistency among bioinformatic pipelines is fundamental to SARS-CoV-2 and future pathogen surveillance efforts. The application of analysis standards is necessary to more accurately document phylogenomic trends and support data-driven public health responses.

9.
PNAS Nexus ; 1(3): pgac105, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35899067

ABSTRACT

The COVID-19 pandemic has seen the persistent emergence of immune-evasive SARS-CoV-2 variants under the selection pressure of natural and vaccination-acquired immunity. However, it is currently challenging to quantify how immunologically distinct a new variant is compared to all the prior variants to which a population has been exposed. Here, we define "Distinctiveness" of SARS-CoV-2 sequences based on a proteome-wide comparison with all prior sequences from the same geographical region. We observe a correlation between Distinctiveness relative to contemporary sequences and future change in prevalence of a newly circulating lineage (Pearson r = 0.75), suggesting that the Distinctiveness of emergent SARS-CoV-2 lineages is associated with their epidemiological fitness. We further show that the average Distinctiveness of sequences belonging to a lineage, relative to the Distinctiveness of other sequences that occur at the same place and time (n = 944 location/time data points), is predictive of future increases in prevalence (Area Under the Curve, AUC = 0.88 [95% confidence interval 0.86 to 0.90]). By assessing the Delta variant in India versus Brazil, we show that the same lineage can have different Distinctiveness-contributing positions in different geographical regions depending on the other variants that previously circulated in those regions. Finally, we find that positions that constitute epitopes contribute disproportionately (20-fold higher than the average position) to Distinctiveness. Overall, this study suggests that real-time assessment of new SARS-CoV-2 variants in the context of prior regional herd exposure via Distinctiveness can augment genomic surveillance efforts.

10.
Nucleic Acids Res ; 50(D1): D387-D390, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850094

ABSTRACT

The Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra/) stores raw sequencing data and alignment information to enhance reproducibility and facilitate new discoveries through data analysis. Here we note changes in storage designed to increase access and highlight analyses that augment metadata with taxonomic insight to help users select data. In addition, we present three unanticipated applications of taxonomic analysis.


Subjects
Bacteria/genetics, Genetic Databases, Metadata/statistics & numerical data, Software, Viruses/genetics, Bacteria/classification, Base Sequence, High-Throughput Nucleotide Sequencing, Internet, Phylogeny, Reproducibility of Results, SARS-CoV-2/genetics, RNA Sequence Analysis, Viruses/classification
11.
Nucleic Acids Res ; 50(D1): D20-D26, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850941

ABSTRACT

The National Center for Biotechnology Information (NCBI) produces a variety of online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, RefSeq, SRA, Virus, dbSNP, dbVar, ClinicalTrials.gov, MMDB, iCn3D and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.


Subjects
Biotechnology/trends, Genetic Databases/trends, Chemical Databases, Nucleic Acid Databases, Protein Databases, Humans, Internet, National Library of Medicine (U.S.), PubMed, United States
12.
Curr Opin Virol ; 51: 207-215, 2021 12.
Article in English | MEDLINE | ID: mdl-34781105

ABSTRACT

Historically, virus taxonomy has been limited to describing viruses that were readily cultivated in the laboratory or emerging in natural biomes. Metagenomic analyses, single-particle sequencing, and database mining efforts have yielded new sequence data on an astounding number of previously unknown viruses. As metagenomes are relatively free of biases, these data provide an unprecedented insight into the vastness of the virosphere, but to properly value the extent of this diversity it is critical that the viruses are taxonomically classified. Inclusion of uncultivated viruses has already improved the process as well as the understanding of the taxa, viruses, and their evolutionary relationships. The continuous development and testing of computational tools will be required to maintain a dynamic virus taxonomy that can accommodate the new discoveries.


Subjects
Phylogeny, Viruses/classification, Animals, Molecular Evolution, Humans, Metagenomics, Viruses/genetics, Viruses/growth & development
13.
Genome Biol ; 22(1): 270, 2021 09 20.
Article in English | MEDLINE | ID: mdl-34544477

ABSTRACT

Sequence Read Archive submissions to the National Center for Biotechnology Information often lack useful metadata, which limits the utility of these submissions. We describe the Sequence Taxonomic Analysis Tool (STAT), a scalable k-mer-based tool for fast assessment of taxonomic diversity intrinsic to submissions, independent of metadata. We show that our MinHash-based k-mer tool is accurate and scalable, offering reliable criteria for efficient selection of data for further analysis by the scientific community, at once validating submissions while also augmenting sample metadata with reliable, searchable, taxonomic terms.
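
The idea behind a MinHash-based k-mer comparison can be illustrated with a minimal Python sketch. This is a generic bottom-s MinHash estimator of Jaccard similarity between two k-mer sets, not the STAT implementation; the function names, the choice of BLAKE2b as the hash, and the default values of `k` and `size` are all assumptions made for illustration.

```python
import hashlib


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_sketch(seq: str, k: int = 8, size: int = 64) -> list[int]:
    """Bottom-s sketch: the `size` smallest 64-bit hashes of the k-mers.

    k, size, and the hash function are illustrative choices only.
    """
    hashes = {
        int.from_bytes(
            hashlib.blake2b(km.encode(), digest_size=8).digest(), "big"
        )
        for km in kmers(seq, k)
    }
    return sorted(hashes)[:size]


def jaccard_estimate(a: list[int], b: list[int], size: int = 64) -> float:
    """Estimate Jaccard similarity from two bottom-s sketches.

    Take the `size` smallest hashes of the merged sketches and count
    the fraction that occur in both.
    """
    merged = sorted(set(a) | set(b))[:size]
    if not merged:
        return 0.0
    set_a, set_b = set(a), set(b)
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)
```

Comparing a sketch with itself gives an estimate of 1.0, and sequences that share no k-mers give 0.0; intermediate values approximate the fraction of shared k-mers without storing the full k-mer sets.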


Subjects
High-Throughput Nucleotide Sequencing/methods, Software, DNA Contamination, Humans, Metagenomics/methods, SARS-CoV-2/genetics
14.
Arch Virol ; 166(11): 3239-3244, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34417873

ABSTRACT

In this article, we, the Bacterial Viruses Subcommittee and the Archaeal Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV), summarise the results of our activities for the period March 2020 to March 2021. We report the division of the former Bacterial and Archaeal Viruses Subcommittee into two separate Subcommittees, welcome new members, a new Subcommittee Chair and Vice Chair, and give an overview of the new taxa that were proposed in 2020, approved by the Executive Committee and ratified by vote in 2021. In particular, a new realm, three orders, 15 families, 31 subfamilies, 734 genera and 1845 species were newly created or redefined (moved/promoted).


Subjects
Archaeal Viruses/classification, Bacteriophages/classification, Scientific Societies/organization & administration, Archaea/virology, Bacteria/virology
15.
Emerg Infect Dis ; 27(6): 1-9, 2021 06.
Article in English | MEDLINE | ID: mdl-34013862

ABSTRACT

Human respiratory syncytial virus (HRSV) is the leading viral cause of serious pediatric respiratory disease, and lifelong reinfections are common. Its 2 major subgroups, A and B, exhibit some antigenic variability, enabling HRSV to circulate annually. Globally, research has increased the number of HRSV genomic sequences available. To ensure accurate molecular epidemiology analyses, we propose a uniform nomenclature for HRSV-positive samples and isolates, and HRSV sequences, namely: HRSV/subgroup identifier/geographic identifier/unique sequence identifier/year of sampling. We also propose a template for submitting associated metadata. Universal nomenclature would help researchers retrieve and analyze sequence data to better understand the evolution of this virus.
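
The proposed five-field name can be sketched as a small Python helper that builds and parses such strings. This is an illustration only: the abstract gives the field order but not the allowed values, so the restriction of the subgroup to the letters `A` and `B`, the integer year, the class and function names, and the example identifier below are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HrsvName:
    subgroup: str     # assumed to be the literal "A" or "B"
    location: str     # geographic identifier (format assumed free text)
    sequence_id: str  # unique sequence identifier
    year: int         # year of sampling

    def __str__(self) -> str:
        # Field order follows the abstract:
        # HRSV/subgroup/geographic id/unique sequence id/year
        return (
            f"HRSV/{self.subgroup}/{self.location}/"
            f"{self.sequence_id}/{self.year}"
        )


def parse_hrsv_name(name: str) -> HrsvName:
    """Split a slash-delimited name into its five fields and validate them."""
    parts = name.split("/")
    if len(parts) != 5 or parts[0] != "HRSV":
        raise ValueError(f"not a five-field HRSV name: {name!r}")
    _, subgroup, location, sequence_id, year = parts
    if subgroup not in {"A", "B"}:
        raise ValueError(f"unknown subgroup: {subgroup!r}")
    return HrsvName(subgroup, location, sequence_id, int(year))
```

For example, `parse_hrsv_name("HRSV/A/USA/EX-001/2019")` (a made-up name) yields a record whose string form round-trips to the same text, which is the property that makes a uniform nomenclature easy to search and validate.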


Subjects
Respiratory Syncytial Virus Infections, Human Respiratory Syncytial Virus, Child, Genetic Variation, Genotype, Humans, Molecular Epidemiology, Phylogeny, Human Respiratory Syncytial Virus/genetics
16.
BMC Bioinformatics ; 21(1): 211, 2020 May 24.
Article in English | MEDLINE | ID: mdl-32448124

ABSTRACT

BACKGROUND: GenBank contains over 3 million viral sequences. The National Center for Biotechnology Information (NCBI) previously made available a tool for validating and annotating influenza virus sequences that is used to check submissions to GenBank. Before this project, there was no analogous tool in use for non-influenza viral sequence submissions. RESULTS: We developed a system called VADR (Viral Annotation DefineR) that validates and annotates viral sequences in GenBank submissions. The annotation system is based on the analysis of the input nucleotide sequence using models built from curated RefSeqs. Hidden Markov models are used to classify sequences by determining the RefSeq they are most similar to, and feature annotation from the RefSeq is mapped based on a nucleotide alignment of the full sequence to a covariance model. Predicted proteins encoded by the sequence are validated with nucleotide-to-protein alignments using BLAST. The system identifies 43 types of "alerts" that (unlike the previous BLAST-based system) provide deterministic and rigorous feedback to researchers who submit sequences with unexpected characteristics. VADR has been integrated into GenBank's submission processing pipeline allowing for viral submissions passing all tests to be accepted and annotated automatically, without the need for any human (GenBank indexer) intervention. Unlike the previous submission-checking system, VADR is freely available (https://github.com/nawrockie/vadr) for local installation and use. VADR has been used for Norovirus submissions since May 2018 and for Dengue virus submissions since January 2019. Since March 2020, VADR has also been used to check SARS-CoV-2 sequence submissions. Other viruses with high numbers of submissions will be added incrementally. CONCLUSION: VADR improves the speed with which non-flu virus submissions to GenBank can be checked and improves the content and quality of the GenBank annotations. 
The availability and portability of the software allow researchers to run the GenBank checks prior to submitting their viral sequences, and thereby gain confidence that their submissions will be accepted immediately without the need to correspond with GenBank staff. Reciprocally, the adoption of VADR frees GenBank staff to spend more time on services other than checking routine viral sequence submissions.


Subjects
Betacoronavirus, Coronavirus Infections, Nucleic Acid Databases, Molecular Sequence Annotation, Pandemics, Viral Pneumonia, Software, Betacoronavirus/genetics, COVID-19, Coronavirus Infections/genetics, DNA Viruses, Genomics, Humans, Molecular Sequence Annotation/standards, Viral Pneumonia/genetics, SARS-CoV-2, Viruses
17.
Nucleic Acids Res ; 48(D1): D9-D16, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31602479

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Custom implementations of the BLAST program provide sequence-based searching of many specialized datasets. New resources released in the past year include a new PubMed interface, a sequence database search and a gene orthologs page. Additional resources that were updated in the past year include PMC, Bookshelf, My Bibliography, Assembly, RefSeq, viral genomes, the prokaryotic genome annotation pipeline, Genome Workbench, dbSNP, BLAST, Primer-BLAST, IgBLAST and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.


Subjects
Computational Biology/methods, Computational Biology/organization & administration, Genetic Databases, National Library of Medicine (U.S.), Nucleic Acid Databases, Genomics/methods, Humans, PubMed, United States, Web Browser
18.
Syst Biol ; 69(1): 110-123, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31127947

ABSTRACT

Tailed bacteriophages are the most abundant and diverse viruses in the world, with genome sizes ranging from 10 kbp to over 500 kbp. Yet, due to historical reasons, all this diversity is confined to a single virus order, Caudovirales, composed of just four families: Myoviridae, Siphoviridae, Podoviridae, and the newly created Ackermannviridae family. In recent years, this morphology-based classification scheme has started to crumble under the constant flood of phage sequences, revealing that tailed phages are even more genetically diverse than once thought. This prompted us, the Bacterial and Archaeal Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV), to consider overall reorganization of phage taxonomy. In this study, we used a wide range of complementary methods (including comparative genomics, core genome analysis, and marker gene phylogenetics) to show that the group of Bacillus phage SPO1-related viruses, previously classified into the Spounavirinae subfamily, is clearly distinct from other members of the family Myoviridae and its diversity deserves the rank of an autonomous family. Thus, we removed this group from the Myoviridae family and created the family Herelleviridae, a new taxon of the same rank. In the process of the taxon evaluation, we explored the feasibility of different demarcation criteria and critically evaluated the usefulness of our methods for phage classification. The convergence of results, drawing a consistent and comprehensive picture of a new family with associated subfamilies, regardless of method, demonstrates that the tools applied here are particularly useful in phage taxonomy. We are convinced that creation of this novel family is a crucial milestone toward much-needed reclassification in the Caudovirales order.


Subjects
Caudovirales/classification, Phylogeny, Caudovirales/genetics, Classification, Viral Genome/genetics
19.
Nat Biotechnol ; 37(6): 632-639, 2019 06.
Article in English | MEDLINE | ID: mdl-31061483

ABSTRACT

Microbiomes from every environment contain a myriad of uncultivated archaeal and bacterial viruses, but studying these viruses is hampered by the lack of a universal, scalable taxonomic framework. We present vConTACT v.2.0, a network-based application utilizing whole genome gene-sharing profiles for virus taxonomy that integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions. We report near-identical (96%) replication of existing genus-level viral taxonomy assignments from the International Committee on Taxonomy of Viruses for National Center for Biotechnology Information virus RefSeq. Application of vConTACT v.2.0 to 1,364 previously unclassified viruses deposited in virus RefSeq as reference genomes produced automatic, high-confidence genus assignments for 820 of the 1,364. We applied vConTACT v.2.0 to analyze 15,280 Global Ocean Virome genome fragments and were able to provide taxonomic assignments for 31% of these data, which shows that our algorithm is scalable to very large metagenomic datasets. Our taxonomy tool can be automated and applied to metagenomes from any environment for virus classification.


Subjects
Gene Regulatory Networks/genetics, Viral Genome/genetics, Metagenomics, Viruses/genetics, Bacteriophages/genetics, Classification, Metagenome/genetics, Phylogeny, Prokaryotic Cells/virology, Viruses/classification
20.
Viruses ; 11(1), 2019 01 14.
Article in English | MEDLINE | ID: mdl-30646581

ABSTRACT

RNA viruses that contain single-stranded RNA genomes of positive sense make up the largest group of pathogens infecting honey bees. Sacbrood virus (SBV) is one of the most widely distributed honey bee viruses and infects the larvae of honey bees, resulting in failure to pupate and death. Among all of the viruses infecting honey bees, SBV has the greatest number of complete genomes isolated from both European honey bees Apis mellifera and Asian honey bees A. cerana worldwide. To enhance our understanding of the evolution and pathogenicity of SBV, in this study, we present the first report of whole genome sequences of two U.S. strains of SBV. The complete genome sequences of the two U.S. SBV strains were deposited in GenBank under accession numbers MG545286.1 and MG545287.1. Both SBV strains show the typical genomic features of the Iflaviridae family. The phylogenetic analysis of the single polyprotein coding region of the U.S. strains and other GenBank SBV submissions revealed that SBV strains split into two distinct lineages, possibly reflecting host affiliation. The phylogenetic analysis based on the 5'UTR revealed a monophyletic clade with the deep parts of the tree occupied by SBV strains from both A. cerana and A. mellifera, and the tips of branches of the tree occupied by SBV strains from A. mellifera. Our study of the effect of cold stress on the pathogenesis of SBV infection showed that cold stress could have profound effects on sacbrood disease severity, manifested by increased mortality of infected larvae. This result suggests that the high prevalence of sacbrood disease in early spring may be due to the fluctuating temperatures during the season. This study will contribute to a better understanding of the evolution and pathogenesis of SBV infection in honey bees, and has important epidemiological relevance.


Subjects
Bees/virology, Viral Genome, Insect Viruses/genetics, Phylogeny, RNA Viruses/pathogenicity, Animals, Cold-Shock Response, Genetic Variation, Insect Viruses/pathogenicity, RNA Virus Infections, RNA Viruses/genetics, United States, Whole Genome Sequencing