Results 1 - 20 of 46
1.
Nucleic Acids Res ; 52(D1): D33-D43, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37994677

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, SciENcv, the NIH Comparative Genomics Resource (CGR), NCBI Virus, SRA, RefSeq, foreign contamination screening tools, Taxonomy, iCn3D, ClinVar, GTR, MedGen, dbSNP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.
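The abstract names the E-utilities as the programming interface to these databases. As a rough illustration only, the sketch below builds a search URL for the E-utilities `esearch` endpoint without sending any request. The endpoint address and the `db`, `term`, `retmax`, and `retmode` parameters come from NCBI's public E-utilities documentation, not from this abstract; the helper name `build_esearch_url` and the example query are hypothetical.

```python
from urllib.parse import urlencode

# Base address of the E-utilities as documented by NCBI (not stated in the abstract).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an esearch URL for the given database and query term.

    Hypothetical helper: it only assembles the URL and performs no network call.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example query (illustrative): search PubMed for a free-text phrase.
url = build_esearch_url("pubmed", "GenBank viral annotation")
print(url)
```

Fetching the URL and handling rate limits or API keys is left out on purpose; those details depend on the caller's setup.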


Subject(s)
Databases, Genetic , National Library of Medicine (U.S.) , Biotechnology/instrumentation , Databases, Nucleic Acid , Internet , United States
2.
Nucleic Acids Res ; 51(D1): D29-D38, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36370100

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. New resources include the Comparative Genome Resource (CGR) and the BLAST ClusteredNR database. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, IgBLAST, GDV, RefSeq, NCBI Virus, GenBank type assemblies, iCn3D, ClinVar, GTR, dbGaP, ALFA, ClinicalTrials.gov, Pathogen Detection, antimicrobial resistance resources, and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.


Subject(s)
Databases, Genetic , Databases, Nucleic Acid , United States , National Library of Medicine (U.S.) , Sequence Alignment , Biotechnology , Internet
3.
Nucleic Acids Res ; 50(D1): D387-D390, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850094

ABSTRACT

The Sequence Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra/) stores raw sequencing data and alignment information to enhance reproducibility and facilitate new discoveries through data analysis. Here we note changes in storage designed to increase access and highlight analyses that augment metadata with taxonomic insight to help users select data. In addition, we present three unanticipated applications of taxonomic analysis.


Subject(s)
Bacteria/genetics , Databases, Genetic , Metadata/statistics & numerical data , Software , Viruses/genetics , Bacteria/classification , Base Sequence , High-Throughput Nucleotide Sequencing , Internet , Phylogeny , Reproducibility of Results , SARS-CoV-2/genetics , Sequence Analysis, RNA , Viruses/classification
4.
Nucleic Acids Res ; 50(D1): D20-D26, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850941

ABSTRACT

The National Center for Biotechnology Information (NCBI) produces a variety of online information resources for biology, including the GenBank® nucleic acid sequence database and the PubMed® database of citations and abstracts published in life science journals. NCBI provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for most of these databases. Resources receiving significant updates in the past year include PubMed, PMC, Bookshelf, RefSeq, SRA, Virus, dbSNP, dbVar, ClinicalTrials.gov, MMDB, iCn3D and PubChem. These resources can be accessed through the NCBI home page at https://www.ncbi.nlm.nih.gov.


Subject(s)
Biotechnology/trends , Databases, Genetic/trends , Databases, Chemical , Databases, Nucleic Acid , Databases, Protein , Humans , Internet , National Library of Medicine (U.S.) , PubMed , United States
5.
Arch Virol ; 168(2): 74, 2023 Jan 23.
Article in English | MEDLINE | ID: mdl-36683075

ABSTRACT

This article summarises the activities of the Bacterial Viruses Subcommittee of the International Committee on Taxonomy of Viruses for the period of March 2021-March 2022. We provide an overview of the new taxa proposed in 2021, approved by the Executive Committee, and ratified by vote in 2022. Significant changes to the taxonomy of bacterial viruses were introduced: the paraphyletic morphological families Podoviridae, Siphoviridae, and Myoviridae as well as the order Caudovirales were abolished, and a binomial system of nomenclature for species was established. In addition, one order, 22 families, 30 subfamilies, 321 genera, and 862 species were newly created, promoted, or moved.


Subject(s)
Bacteriophages , Caudovirales , Siphoviridae , Viruses , Humans , Viruses/genetics , Myoviridae
6.
Nucleic Acids Res ; 48(D1): D9-D16, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31602479

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 35 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Custom implementations of the BLAST program provide sequence-based searching of many specialized datasets. New resources released in the past year include a new PubMed interface, a sequence database search and a gene orthologs page. Additional resources that were updated in the past year include PMC, Bookshelf, My Bibliography, Assembly, RefSeq, viral genomes, the prokaryotic genome annotation pipeline, Genome Workbench, dbSNP, BLAST, Primer-BLAST, IgBLAST and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.


Subject(s)
Computational Biology/methods , Computational Biology/organization & administration , Databases, Genetic , National Library of Medicine (U.S.) , Databases, Nucleic Acid , Genomics/methods , Humans , PubMed , United States , Web Browser
7.
Emerg Infect Dis ; 27(6): 1-9, 2021 06.
Article in English | MEDLINE | ID: mdl-34013862

ABSTRACT

Human respiratory syncytial virus (HRSV) is the leading viral cause of serious pediatric respiratory disease, and lifelong reinfections are common. Its 2 major subgroups, A and B, exhibit some antigenic variability, enabling HRSV to circulate annually. Globally, research has increased the number of HRSV genomic sequences available. To ensure accurate molecular epidemiology analyses, we propose a uniform nomenclature for HRSV-positive samples and isolates, and HRSV sequences, namely: HRSV/subgroup identifier/geographic identifier/unique sequence identifier/year of sampling. We also propose a template for submitting associated metadata. Universal nomenclature would help researchers retrieve and analyze sequence data to better understand the evolution of this virus.
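The proposed naming pattern (HRSV/subgroup identifier/geographic identifier/unique sequence identifier/year of sampling) lends itself to a simple parser. The following is a minimal sketch, assuming the subgroup identifier is written as `A` or `B` and the year as four digits; neither convention is spelled out in the abstract, and the class name, function name, and sample name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HrsvName:
    subgroup: str        # assumed to be "A" or "B"
    geographic_id: str   # free-form geographic identifier
    sequence_id: str     # unique sequence identifier
    year: int            # year of sampling

    def format(self) -> str:
        """Render the five slash-separated fields in the proposed order."""
        return f"HRSV/{self.subgroup}/{self.geographic_id}/{self.sequence_id}/{self.year}"


def parse_hrsv_name(name: str) -> HrsvName:
    """Split a name into its fields, rejecting names that do not fit the pattern."""
    parts = name.split("/")
    if len(parts) != 5 or parts[0] != "HRSV":
        raise ValueError(f"expected 5 fields starting with 'HRSV': {name!r}")
    _, subgroup, geographic_id, sequence_id, year = parts
    if subgroup not in {"A", "B"}:  # assumption: subgroup is a single letter
        raise ValueError(f"unknown subgroup: {subgroup!r}")
    if not (year.isdigit() and len(year) == 4):  # assumption: four-digit year
        raise ValueError(f"year must be four digits: {year!r}")
    return HrsvName(subgroup, geographic_id, sequence_id, int(year))


# Hypothetical sample name, used only to exercise the parser.
parsed = parse_hrsv_name("HRSV/A/USA/EXAMPLE-001/2019")
print(parsed.format())
```

A parser like this could be run before submission to catch names with missing or reordered fields.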


Subject(s)
Respiratory Syncytial Virus Infections , Respiratory Syncytial Virus, Human , Child , Genetic Variation , Genotype , Humans , Molecular Epidemiology , Phylogeny , Respiratory Syncytial Virus, Human/genetics
8.
Syst Biol ; 69(1): 110-123, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31127947

ABSTRACT

Tailed bacteriophages are the most abundant and diverse viruses in the world, with genome sizes ranging from 10 kbp to over 500 kbp. Yet, due to historical reasons, all this diversity is confined to a single virus order - Caudovirales, composed of just four families: Myoviridae, Siphoviridae, Podoviridae, and the newly created Ackermannviridae family. In recent years, this morphology-based classification scheme has started to crumble under the constant flood of phage sequences, revealing that tailed phages are even more genetically diverse than once thought. This prompted us, the Bacterial and Archaeal Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV), to consider overall reorganization of phage taxonomy. In this study, we used a wide range of complementary methods - including comparative genomics, core genome analysis, and marker gene phylogenetics - to show that the group of Bacillus phage SPO1-related viruses previously classified into the Spounavirinae subfamily is clearly distinct from other members of the family Myoviridae and its diversity deserves the rank of an autonomous family. Thus, we removed this group from the Myoviridae family and created the family Herelleviridae - a new taxon of the same rank. In the process of the taxon evaluation, we explored the feasibility of different demarcation criteria and critically evaluated the usefulness of our methods for phage classification. The convergence of results, drawing a consistent and comprehensive picture of a new family with associated subfamilies, regardless of method, demonstrates that the tools applied here are particularly useful in phage taxonomy. We are convinced that creation of this novel family is a crucial milestone toward much-needed reclassification in the Caudovirales order.


Subject(s)
Caudovirales/classification , Phylogeny , Caudovirales/genetics , Classification , Genome, Viral/genetics
9.
Arch Virol ; 166(11): 3239-3244, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34417873

ABSTRACT

In this article, we - the Bacterial Viruses Subcommittee and the Archaeal Viruses Subcommittee of the International Committee on Taxonomy of Viruses (ICTV) - summarise the results of our activities for the period March 2020 - March 2021. We report the division of the former Bacterial and Archaeal Viruses Subcommittee in two separate Subcommittees, welcome new members, a new Subcommittee Chair and Vice Chair, and give an overview of the new taxa that were proposed in 2020, approved by the Executive Committee and ratified by vote in 2021. In particular, a new realm, three orders, 15 families, 31 subfamilies, 734 genera and 1845 species were newly created or redefined (moved/promoted).


Subject(s)
Archaeal Viruses/classification , Bacteriophages/classification , Societies, Scientific/organization & administration , Archaea/virology , Bacteria/virology
10.
Nucleic Acids Res ; 47(D1): D23-D28, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30395293

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 38 distinct databases. The E-utilities serve as the programming interface for the Entrez system. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. New resources released in the past year include PubMed Labs and a new sequence database search. Resources that were updated in the past year include PubMed, PMC, Bookshelf, genome data viewer, Assembly, prokaryotic genomes, Genome, BioProject, dbSNP, dbVar, BLAST databases, igBLAST, iCn3D and PubChem. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.


Subject(s)
Biotechnology/organization & administration , Databases, Genetic , Animals , Biotechnology/methods , Databases, Chemical , Humans , Software , United States/epidemiology , Web Browser
11.
BMC Bioinformatics ; 21(1): 211, 2020 May 24.
Article in English | MEDLINE | ID: mdl-32448124

ABSTRACT

BACKGROUND: GenBank contains over 3 million viral sequences. The National Center for Biotechnology Information (NCBI) previously made available a tool for validating and annotating influenza virus sequences that is used to check submissions to GenBank. Before this project, there was no analogous tool in use for non-influenza viral sequence submissions. RESULTS: We developed a system called VADR (Viral Annotation DefineR) that validates and annotates viral sequences in GenBank submissions. The annotation system is based on the analysis of the input nucleotide sequence using models built from curated RefSeqs. Hidden Markov models are used to classify sequences by determining the RefSeq they are most similar to, and feature annotation from the RefSeq is mapped based on a nucleotide alignment of the full sequence to a covariance model. Predicted proteins encoded by the sequence are validated with nucleotide-to-protein alignments using BLAST. The system identifies 43 types of "alerts" that (unlike the previous BLAST-based system) provide deterministic and rigorous feedback to researchers who submit sequences with unexpected characteristics. VADR has been integrated into GenBank's submission processing pipeline allowing for viral submissions passing all tests to be accepted and annotated automatically, without the need for any human (GenBank indexer) intervention. Unlike the previous submission-checking system, VADR is freely available (https://github.com/nawrockie/vadr) for local installation and use. VADR has been used for Norovirus submissions since May 2018 and for Dengue virus submissions since January 2019. Since March 2020, VADR has also been used to check SARS-CoV-2 sequence submissions. Other viruses with high numbers of submissions will be added incrementally. CONCLUSION: VADR improves the speed with which non-flu virus submissions to GenBank can be checked and improves the content and quality of the GenBank annotations. 
The availability and portability of the software allow researchers to run the GenBank checks prior to submitting their viral sequences, and thereby gain confidence that their submissions will be accepted immediately without the need to correspond with GenBank staff. Reciprocally, the adoption of VADR frees GenBank staff to spend more time on services other than checking routine viral sequence submissions.


Subject(s)
Betacoronavirus , Coronavirus Infections , Databases, Nucleic Acid , Molecular Sequence Annotation , Pandemics , Pneumonia, Viral , Software , Betacoronavirus/genetics , COVID-19 , Coronavirus Infections/genetics , DNA Viruses , Genomics , Humans , Molecular Sequence Annotation/standards , Pneumonia, Viral/genetics , SARS-CoV-2 , Viruses
12.
Syst Biol ; 68(5): 828-839, 2019 09 01.
Article in English | MEDLINE | ID: mdl-30597118

ABSTRACT

The International Committee on Taxonomy of Viruses (ICTV) is tasked with classifying viruses into taxa (phyla to species) and devising taxon names. Virus names and virus name abbreviations are currently not within the ICTV's official remit and are not regulated by an official entity. Many scientists, medical/veterinary professionals, and regulatory agencies do not address evolutionary questions nor are they concerned with the hierarchical organization of the viral world, and therefore, have limited use for ICTV-devised taxa. Instead, these professionals look to the ICTV as an expert point source that provides the most current taxonomic affiliations of viruses of interest to facilitate document writing. These needs are currently unmet as an ICTV-supported, easily searchable database that includes all published virus names and abbreviations linked to their taxa is not available. In addition, in stark contrast to other biological taxonomic frameworks, virus taxonomy currently permits individual species to have several members. Consequently, confusion emerges among those who are not aware of the difference between taxa and viruses, and because certain well-known viruses cannot be located in ICTV publications or be linked to their species. In addition, the number of duplicate names and abbreviations has increased dramatically in the literature. To solve this conundrum, the ICTV could mandate listing all viruses of established species and all reported unclassified viruses in forthcoming online ICTV Reports and create a searchable webpage using this information. The International Union of Microbiology Societies could also consider changing the mandate of the ICTV to include the nomenclature of all viruses in addition to taxon considerations. With such a mandate expansion, official virus names and virus name abbreviations could be catalogued and virus nomenclature could be standardized. 
As a result, the ICTV would become an even more useful resource for all stakeholders in virology.


Subject(s)
Classification/methods , Virology/methods , Viruses/classification , International Cooperation , Virology/standards , Virology/trends
13.
Nucleic Acids Res ; 45(D1): D482-D490, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899678

ABSTRACT

The Virus Variation Resource is a value-added viral sequence data resource hosted by the National Center for Biotechnology Information. The resource is located at http://www.ncbi.nlm.nih.gov/genome/viruses/variation/ and includes modules for seven viral groups: influenza virus, Dengue virus, West Nile virus, Ebolavirus, MERS coronavirus, Rotavirus A and Zika virus. Each module is supported by pipelines that scan newly released GenBank records, annotate genes and proteins and parse sample descriptors and then map them to controlled vocabulary. These processes in turn support a purpose-built search interface where users can select sequences based on standardized gene, protein and metadata terms. Once sequences are selected, a suite of tools for downloading data, multi-sequence alignment and tree building supports a variety of user directed activities. This manuscript describes a series of features and functionalities recently added to the Virus Variation Resource.


Subject(s)
Computational Biology/methods , Disease Outbreaks , Genetic Variation , Software , Virus Diseases/epidemiology , Virus Diseases/virology , Viruses/genetics , Databases, Genetic
14.
Nucleic Acids Res ; 44(D1): D733-45, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26553804

ABSTRACT

The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.


Subject(s)
Databases, Genetic , Genomics , Animals , Cattle , Gene Expression Profiling , Genome, Fungal , Genome, Human , Genome, Microbial , Genome, Plant , Genome, Viral , Genomics/standards , Humans , Invertebrates/genetics , Mice , Molecular Sequence Annotation , Nematoda/genetics , Phylogeny , RNA, Long Noncoding/genetics , Rats , Reference Standards , Sequence Analysis, Protein , Sequence Analysis, RNA , Vertebrates/genetics
15.
Nucleic Acids Res ; 43(Database issue): D571-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25428358

ABSTRACT

Recent technological innovations have ignited an explosion in virus genome sequencing that promises to fundamentally alter our understanding of viral biology and profoundly impact public health policy. Yet, any potential benefits from the billowing cloud of next generation sequence data hinge upon well implemented reference resources that facilitate the identification of sequences, aid in the assembly of sequence reads and provide reference annotation sources. The NCBI Viral Genomes Resource is a reference resource designed to bring order to this sequence shockwave and improve usability of viral sequence data. The resource can be accessed at http://www.ncbi.nlm.nih.gov/genome/viruses/ and catalogs all publicly available virus genome sequences and curates reference genome sequences. As the number of genome sequences has grown, so too have the difficulties in annotating and maintaining reference sequences. The rapid expansion of the viral sequence universe has forced a recalibration of the data model to better provide extant sequence representation and enhanced reference sequence products to serve the needs of the various viral communities. This, in turn, has placed increased emphasis on leveraging the knowledge of individual scientific communities to identify important viral sequences and develop well annotated reference virus genome sets.


Subject(s)
Databases, Nucleic Acid , Genome, Viral , High-Throughput Nucleotide Sequencing , Internet , Molecular Sequence Annotation , Software , Viruses/classification
16.
Nucleic Acids Res ; 43(Database issue): D566-70, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25378338

ABSTRACT

The 'Human Immunodeficiency Virus Type 1 (HIV-1), Human Interaction Database', available through the National Library of Medicine at http://www.ncbi.nlm.nih.gov/genome/viruses/retroviruses/hiv-1/interactions, serves the scientific community exploring the discovery of novel HIV vaccine candidates and therapeutic targets. Each HIV-1 human protein interaction can be retrieved without restriction by web-based downloads and ftp protocols and includes: Reference Sequence (RefSeq) protein accession numbers, National Center for Biotechnology Information Gene identification numbers, brief descriptions of the interactions, searchable keywords for interactions and PubMed identification numbers (PMIDs) of journal articles describing the interactions. In addition to specific HIV-1 protein-human protein interactions, included are interaction effects upon HIV-1 replication resulting when individual human gene expression is blocked using siRNA. A total of 3142 human genes are described participating in 12,786 protein-protein interactions, along with 1316 replication interactions described for each of 1250 human genes identified using small interfering RNA (siRNA). Together the data identifies 4006 human genes involved in 14,102 interactions. With the inclusion of siRNA interactions we introduce a redesigned web interface to enhance viewing, filtering and downloading of the combined data set.
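The totals quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the combined figure of 4006 human genes is the union of the 3142 genes with protein-protein interactions and the 1250 genes identified by siRNA; the abstract does not state this explicitly, so the derived overlap is an inference rather than a reported number.

```python
# Counts quoted in the abstract.
protein_interactions = 12_786
replication_interactions = 1_316
genes_with_protein_interactions = 3_142
genes_identified_by_sirna = 1_250
genes_combined = 4_006

# The two interaction types sum to the reported combined total.
total_interactions = protein_interactions + replication_interactions

# Assumption: genes_combined is the union of the two gene sets, so the
# overlap follows from inclusion-exclusion.
genes_in_both_sets = (
    genes_with_protein_interactions + genes_identified_by_sirna - genes_combined
)

print(total_interactions)   # 14102, matching the abstract
print(genes_in_both_sets)   # 386 under the union assumption
```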


Subject(s)
Databases, Genetic , HIV-1/metabolism , Human Immunodeficiency Virus Proteins/metabolism , HIV-1/genetics , HIV-1/physiology , Humans , Internet , Protein Interaction Mapping , RNA, Small Interfering/metabolism , Virus Replication
17.
Nucleic Acids Res ; 42(Database issue): D660-5, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24304891

ABSTRACT

Virus Variation (http://www.ncbi.nlm.nih.gov/genomes/VirusVariation/) is a comprehensive, web-based resource designed to support the retrieval and display of large virus sequence datasets. The resource includes a value added database, a specialized search interface and a suite of sequence data displays. Virus-specific sequence annotation and database loading pipelines produce consistent protein and gene annotation and capture sequence descriptors from sequence records then map these metadata to a controlled vocabulary. The database supports a metadata driven, web-based search interface where sequences can be selected using a variety of biological and clinical criteria. Retrieved sequences can then be downloaded in a variety of formats or analyzed using a suite of tools and displays. Over the past 2 years, the pre-existing influenza and Dengue virus resources have been combined into a single construct and West Nile virus added to the resultant resource. A number of improvements were incorporated into the sequence annotation and database loading pipelines, and the virus-specific search interfaces were updated to support more advanced functions. Several new features have also been added to the sequence download options, and a new multiple sequence alignment viewer has been incorporated into the resource tool set. Together these enhancements should support enhanced usability and the inclusion of new viruses in the future.


Subject(s)
Databases, Genetic , Viruses/genetics , Genes, Viral , Genome, Viral , Genomics , Internet , Molecular Sequence Annotation , Orthomyxoviridae/genetics , Sequence Alignment , Viral Proteins
18.
Viruses ; 16(3)2024 03 11.
Article in English | MEDLINE | ID: mdl-38543795

ABSTRACT

Genomic sequencing of clinical samples to identify emerging variants of SARS-CoV-2 has been a key public health tool for curbing the spread of the virus. As a result, an unprecedented number of SARS-CoV-2 genomes were sequenced during the COVID-19 pandemic, which allowed for rapid identification of genetic variants, enabling the timely design and testing of therapies and deployment of new vaccine formulations to combat the new variants. However, despite the technological advances of deep sequencing, the analysis of the raw sequence data generated globally is neither standardized nor consistent, leading to vastly disparate sequences that may impact identification of variants. Here, we show that for both Illumina and Oxford Nanopore sequencing platforms, downstream bioinformatic protocols used by industry, government, and academic groups resulted in different virus sequences from the same sample. These bioinformatic workflows produced consensus genomes with differences in single nucleotide polymorphisms, inclusion and exclusion of insertions, and/or deletions, despite using the same raw sequence data as input. Here, we compared and characterized such discrepancies and propose a specific suite of parameters and protocols that should be adopted across the field. Consistent results from bioinformatic workflows are fundamental to SARS-CoV-2 and future pathogen surveillance efforts, including pandemic preparation, to allow for a data-driven and timely public health response.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/epidemiology , Pandemics , Workflow , Computational Biology
19.
J Biol Chem ; 287(22): 18596-607, 2012 May 25.
Article in English | MEDLINE | ID: mdl-22427673

ABSTRACT

Efficient DNA replication involves coordinated interactions among DNA polymerase, multiple factors, and the DNA. From bacteriophage T4 to eukaryotes, these factors include a helicase to unwind the DNA ahead of the replication fork, a single-stranded binding protein (SSB) to bind to the ssDNA on the lagging strand, and a helicase loader that associates with the fork, helicase, and SSB. The previously reported structure of the helicase loader in the T4 system, gene product (gp)59, has revealed an N-terminal domain, which shares structural homology with the high mobility group (HMG) proteins from eukaryotic organisms. Modeling of this structure with fork DNA has suggested that the HMG-like domain could bind to the duplex DNA ahead of the fork, whereas the C-terminal portion of gp59 would provide the docking sites for helicase (T4 gp41), SSB (T4 gp32), and the ssDNA fork arms. To test this model, we have used random and targeted mutagenesis to generate mutations throughout gp59. We have assayed the ability of the mutant proteins to bind to fork, primed fork, and ssDNAs, to interact with SSB, to stimulate helicase activity, and to function in leading and lagging strand DNA synthesis. Our results provide strong biochemical support for the role of the N-terminal gp59 HMG motif in fork binding and the interaction of the C-terminal portion of gp59 with helicase and SSB. Our results also suggest that processive replication may involve the switching of gp59 between its interactions with helicase and SSB.


Subject(s)
Bacteriophage T4/genetics , DNA Helicases/genetics , DNA, Single-Stranded/genetics , DNA, Viral/metabolism , DNA-Binding Proteins/genetics , Viral Proteins/genetics , Amino Acid Sequence , Binding Sites , DNA-Binding Proteins/chemistry , Molecular Sequence Data , Sequence Homology, Amino Acid , Viral Proteins/chemistry
20.
Microb Genom ; 9(12)2023 Dec.
Article in English | MEDLINE | ID: mdl-38085797

ABSTRACT

Fast, efficient public health actions require well-organized and coordinated systems that can supply timely and accurate knowledge. Public databases of pathogen genomic data, such as the International Nucleotide Sequence Database Collaboration (INSDC), have become essential tools for efficient public health decisions. However, these international resources began primarily for academic purposes, rather than for surveillance or interventions. Now, queries need to access not only the whole genomes of multiple pathogens but also make connections using robust contextual metadata to identify issues of public health relevance. Databases that over time developed a patchwork of submission formats and requirements need to be consistently organized and coordinated internationally to allow effective searches. To help resolve these issues, we propose a common pathogen data structure called the Pathogen Data Object Model (DOM) that will formalize the minimum pieces of sequence data and contextual data necessary for general public health uses, while recognizing that submitters will likely withhold a wide range of non-public contextual data. Further, we propose contributors use the Pathogen DOM for all pathogen submissions (bacterial, viral, fungal, and parasites), which will simplify data submissions and provide a consistent and transparent data structure for downstream data analyses. We also highlight how improved submission tools can support the Pathogen DOM, offering users additional easy-to-use methods to ensure this structure is followed.


Subject(s)
Nucleotides , Public Health , Base Sequence , Genomics/methods , Databases, Nucleic Acid