Results 1 - 20 of 24
1.
Methods Mol Biol ; 2802: 547-571, 2024.
Article in English | MEDLINE | ID: mdl-38819571

ABSTRACT

As genomic and related data continue to expand, research biologists are often hampered by the computational hurdles required to analyze their data. The National Institute of Allergy and Infectious Diseases (NIAID) established the Bioinformatics Resource Centers (BRC) to assist researchers with their analysis of genome sequence and other omics-related data. Recently, the PAThosystems Resource Integration Center (PATRIC), the Influenza Research Database (IRD), and the Virus Pathogen Database and Analysis Resource (ViPR) BRCs merged to form the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) at https://www.bv-brc.org/. The combined BV-BRC leverages the functionality of the original resources for bacterial and viral research communities with a unified data model, enhanced web-based visualization and analysis tools, and bioinformatics services. Here we demonstrate how antimicrobial resistance data can be analyzed in the new resource.


Subject(s)
Bacteria , Computational Biology , Databases, Genetic , Drug Resistance, Bacterial , Genomics , Genomics/methods , Computational Biology/methods , Drug Resistance, Bacterial/genetics , Bacteria/genetics , Bacteria/drug effects , Humans , Software , Genome, Bacterial , Anti-Bacterial Agents/pharmacology , Web Browser , United States , National Institute of Allergy and Infectious Diseases (U.S.)
2.
Nucleic Acids Res ; 51(D1): D678-D689, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350631

ABSTRACT

The National Institute of Allergy and Infectious Diseases (NIAID) established the Bioinformatics Resource Center (BRC) program to assist researchers with analyzing the growing body of genome sequence and other omics-related data. In this report, we describe the merger of the PAThosystems Resource Integration Center (PATRIC), the Influenza Research Database (IRD) and the Virus Pathogen Database and Analysis Resource (ViPR) BRCs to form the Bacterial and Viral Bioinformatics Resource Center (BV-BRC) https://www.bv-brc.org/. The combined BV-BRC leverages the functionality of the bacterial and viral resources to provide a unified data model, enhanced web-based visualization and analysis tools, bioinformatics services, and a powerful suite of command line tools that benefit the bacterial and viral research communities.


Subject(s)
Genomics , Software , Viruses , Humans , Bacteria/genetics , Computational Biology , Databases, Genetic , Influenza, Human , Viruses/genetics
3.
Brief Bioinform ; 22(6), 2021 11 05.
Article in English | MEDLINE | ID: mdl-34379107

ABSTRACT

Antimicrobial resistance (AMR) is a major global health threat that affects millions of people each year. Funding agencies worldwide and the global research community have expended considerable capital and effort tracking the evolution and spread of AMR by isolating and sequencing bacterial strains and performing antimicrobial susceptibility testing (AST). For the last several years, we have been capturing these efforts by curating data from the literature and data resources and building a set of assembled bacterial genome sequences that are paired with laboratory-derived AST data. This collection currently contains AST data for over 67 000 genomes encompassing approximately 40 genera and over 100 species. In this paper, we describe the characteristics of this collection, highlighting areas where sampling is comparatively deep or shallow, and showing areas where attention is needed from the research community to improve sampling and tracking efforts. In addition to using the data to track the evolution and spread of AMR, it also serves as a useful starting point for building machine learning models for predicting AMR phenotypes. We demonstrate this by describing two machine learning models that are built from the entire dataset to show where the predictive power is comparatively high or low. This AMR metadata collection is freely available and maintained on the Bacterial and Viral Bioinformatics Center (BV-BRC) FTP site ftp://ftp.bvbrc.org/RELEASE_NOTES/PATRIC_genomes_AMR.txt.


Subject(s)
Computational Biology/methods , Databases, Genetic , Drug Resistance, Microbial , Genomics/methods , Microbial Sensitivity Tests , Artificial Intelligence , Bacteria/drug effects , Bacteria/genetics , Genome, Bacterial , Humans , Laboratories , Machine Learning , Phenotype
4.
PLoS One ; 16(4): e0250092, 2021.
Article in English | MEDLINE | ID: mdl-33857229

ABSTRACT

Large amounts of metagenomically-derived data are submitted to PATRIC for analysis. In the future, we expect even more jobs submitted to PATRIC will use metagenomic data. One in-demand use case is the extraction of near-complete draft genomes from assembled contigs of metagenomic origin. The PATRIC metagenome binning service utilizes the PATRIC database to furnish a large, diverse set of reference genomes. We provide a new service for supervised extraction and annotation of high-quality, near-complete genomes from metagenomically-derived contigs. Reference genomes are assigned to putative draft genome bins based on the presence of single-copy universal marker roles in the sample, and contigs are sorted into these bins by their similarity to reference genomes in PATRIC. Each set of binned contigs represents a draft genome that will be annotated by RASTtk in PATRIC. A structured-language binning report is provided containing quality measurements and taxonomic information about the contig bins. The PATRIC metagenome binning service emphasizes extraction of high-quality genomes for downstream analysis using other PATRIC tools and services. Due to its supervised nature, the binning service is not appropriate for mining novel or extremely low-coverage genomes from metagenomic samples.


Subject(s)
Metagenome , Metagenomics/methods , Cluster Analysis , Humans , Sequence Analysis, DNA/methods
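The supervised binning step described in this abstract (contigs sorted into bins by their similarity to reference genomes) can be illustrated with a minimal sketch. This is not the PATRIC service's implementation: the function names, the use of shared nucleotide k-mers as the similarity measure, and the `min_shared` threshold are all assumptions chosen for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def bin_contigs(contigs, references, k=4, min_shared=1):
    """Assign each contig to the reference sharing the most k-mers with it.

    contigs and references map identifiers to sequences (hypothetical inputs).
    Contigs sharing fewer than min_shared k-mers with every reference stay unbinned.
    """
    ref_kmers = {name: kmers(seq, k) for name, seq in references.items()}
    bins = {name: [] for name in references}
    unbinned = []
    for contig_id, seq in contigs.items():
        contig_kmers = kmers(seq, k)
        best, best_score = None, 0
        for name, rk in ref_kmers.items():
            score = len(contig_kmers & rk)
            if score > best_score:
                best, best_score = name, score
        if best is not None and best_score >= min_shared:
            bins[best].append(contig_id)
        else:
            unbinned.append(contig_id)
    return bins, unbinned


references = {"refA": "AAAACCCCAAAACCCC", "refB": "GGGGTTTTGGGGTTTT"}
contigs = {"c1": "AACCCCAA", "c2": "GGTTTTGG", "c3": "ACGTACGT"}
bins, unbinned = bin_contigs(contigs, references)
```

In this toy input, `c3` matches no reference and is left unbinned, which mirrors the abstract's caveat that a supervised approach is not suited to recovering novel genomes.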
5.
Nucleic Acids Res ; 48(D1): D606-D612, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31667520

ABSTRACT

The PathoSystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center funded by the National Institute of Allergy and Infectious Diseases (https://www.patricbrc.org). PATRIC supports bioinformatic analyses of all bacteria with a special emphasis on pathogens, offering a rich comparative analysis environment that provides users with access to over 250 000 uniformly annotated and publicly available genomes with curated metadata. PATRIC offers web-based visualization and comparative analysis tools, a private workspace in which users can analyze their own data in the context of the public collections, services that streamline complex bioinformatic workflows and command-line tools for bulk data analysis. Over the past several years, as genomic and other omics-related experiments have become more cost-effective and widespread, we have observed considerable growth in the usage of and demand for easy-to-use, publicly available bioinformatic tools and services. Here we report the recent updates to the PATRIC resource, including new web-based comparative analysis tools, eight new services and the release of a command-line interface to access, query and analyze data.


Subject(s)
Bacteria/genetics , Computational Biology/methods , Databases, Genetic , Algorithms , Animals , Caenorhabditis elegans/genetics , Chickens/genetics , Drosophila melanogaster/genetics , Host-Pathogen Interactions/genetics , Humans , Internet , Macaca mulatta/genetics , Metagenomics , Mice , National Institute of Allergy and Infectious Diseases (U.S.) , Phenotype , Phylogeny , Rats , Swine/genetics , United States , Zebrafish/genetics
6.
BMC Bioinformatics ; 20(1): 486, 2019 Oct 03.
Article in English | MEDLINE | ID: mdl-31581946

ABSTRACT

BACKGROUND: Recent advances in high-volume sequencing technology and mining of genomes from metagenomic samples call for rapid and reliable genome quality evaluation. The current release of the PATRIC database contains over 220,000 genomes, and current metagenomic technology supports assemblies of many draft-quality genomes from a single sample, most of which will be novel. DESCRIPTION: We have added two quality assessment tools to the PATRIC annotation pipeline. EvalCon uses supervised machine learning to calculate an annotation consistency score. EvalG implements a variant of the CheckM algorithm to estimate contamination and completeness of an annotated genome. We report on the performance of these tools and the potential utility of the consistency score. Additionally, we provide contamination, completeness, and consistency measures for all genomes in PATRIC and in a recent set of metagenomic assemblies. CONCLUSION: EvalG and EvalCon facilitate the rapid quality control and exploration of PATRIC-annotated draft genomes.


Subject(s)
Databases, Genetic , Genome, Archaeal , Genome, Bacterial , Machine Learning , Metagenomics/methods , Metagenomics/standards , Software
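The completeness and contamination estimates mentioned in this abstract can be illustrated with a simplified marker-counting sketch. It assumes a flat set of single-copy marker roles and ignores the collocated marker sets and lineage selection that CheckM uses; it is not EvalG's actual code, and the function name and inputs are hypothetical.

```python
from collections import Counter


def completeness_contamination(observed_roles, marker_roles):
    """Estimate completeness and contamination as percentages.

    observed_roles: list of functional roles found in a genome (repeats allowed).
    marker_roles: set of roles expected exactly once (an assumed marker set).
    Completeness counts markers seen at least once; contamination counts
    surplus copies of markers, both relative to the number of markers.
    """
    counts = Counter(role for role in observed_roles if role in marker_roles)
    n = len(marker_roles)
    found = sum(1 for role in marker_roles if counts[role] >= 1)
    extra = sum(counts[role] - 1 for role in marker_roles if counts[role] > 1)
    return 100.0 * found / n, 100.0 * extra / n


markers = {"A", "B", "C", "D"}
observed = ["A", "B", "B", "X"]  # "X" is not a marker and is ignored
completeness, contamination = completeness_contamination(observed, markers)
```

Here two of four markers are present (50% completeness) and one marker is duplicated (25% contamination).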
7.
Brief Bioinform ; 20(4): 1094-1102, 2019 07 19.
Article in English | MEDLINE | ID: mdl-28968762

ABSTRACT

The Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org) is designed to provide researchers with the tools and services that they need to perform genomic and other 'omic' data analyses. In response to mounting concern over antimicrobial resistance (AMR), the PATRIC team has been developing new tools that help researchers understand AMR and its genetic determinants. To support comparative analyses, we have added AMR phenotype data to over 15 000 genomes in the PATRIC database, often assembling genomes from reads in public archives and collecting their associated AMR panel data from the literature to augment the collection. We have also been using this collection of AMR metadata to build machine learning-based classifiers that can predict the AMR phenotypes and the genomic regions associated with resistance for genomes being submitted to the annotation service. Likewise, we have undertaken a large AMR protein annotation effort by manually curating data from the literature and public repositories. This collection of 7370 AMR reference proteins, which contains many protein annotations (functional roles) that are unique to PATRIC and RAST, has been manually curated so that it projects stably across genomes. The collection currently projects to 1 610 744 proteins in the PATRIC database. Finally, the PATRIC Web site has been expanded to enable AMR-based custom page views so that researchers can easily explore AMR data and design experiments based on whole genomes or individual genes.


Subject(s)
Computational Biology/methods , Databases, Genetic , Drug Resistance, Microbial/genetics , Systems Integration , Computational Biology/trends , Databases, Genetic/statistics & numerical data , Genome, Microbial , Humans , Internet , Molecular Sequence Annotation
8.
Methods Mol Biol ; 1681: 231-238, 2018.
Article in English | MEDLINE | ID: mdl-29134599

ABSTRACT

Phages are complex biomolecular machineries that have to survive in a bacterial world. Phage genomes show many adaptations to their lifestyle such as shorter genes, reduced capacity for redundant DNA sequences, and the inclusion of tRNAs in their genomes. In addition, phages are not free-living, they require a host for replication and survival. These unique adaptations provide challenges for the bioinformatics analysis of phage genomes. In particular, ORF calling, genome annotation, noncoding RNA (ncRNA) identification, and the identification of transposons and insertions are all complicated in phage genome analysis. We provide a road map through the phage genome annotation pipeline, and discuss the challenges and solutions for phage genome annotation as we have implemented in the rapid annotation using subsystems (RAST) pipeline.


Subject(s)
Bacteriophages/genetics , Computational Biology/methods , Genome, Viral , Molecular Sequence Annotation/methods , Base Sequence
9.
Methods Mol Biol ; 1704: 79-101, 2018.
Article in English | MEDLINE | ID: mdl-29277864

ABSTRACT

In the "big data" era, research biologists are faced with analyzing new types that usually require some level of computational expertise. A number of programs and pipelines exist, but acquiring the expertise to run them, and then understanding the output can be a challenge. The Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org) has created an end-to-end analysis platform that allows researchers to take their raw reads, assemble a genome, annotate it, and then use a suite of user-friendly tools to compare it to any public data that is available in the repository. With close to 113,000 bacterial and more than 1000 archaeal genomes, PATRIC creates a unique research experience with "virtual integration" of private and public data. PATRIC contains many diverse tools and functionalities to explore both genome-scale and gene expression data, but the main focus of this chapter is on assembly, annotation, and the downstream comparative analysis functionality that is freely available in the resource.


Subject(s)
Bacteria/genetics , Databases, Genetic , Genome, Bacterial , Genomics/methods , Molecular Sequence Annotation , Software , Computational Biology , Internet , Statistics as Topic
10.
Nucleic Acids Res ; 45(D1): D535-D542, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899627

ABSTRACT

The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by 'virtual integration' to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics.


Subject(s)
Bacteria/genetics , Computational Biology/methods , Databases, Genetic , Genome, Bacterial , Genomics/methods , Anti-Bacterial Agents/pharmacology , Bacteria/drug effects , Bacteria/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Drug Resistance, Bacterial , Molecular Sequence Annotation , Proteome , Proteomics/methods , Software , Web Browser
11.
Front Microbiol ; 7: 118, 2016.
Article in English | MEDLINE | ID: mdl-26903996

ABSTRACT

The ability to build accurate protein families is a fundamental operation in bioinformatics that influences comparative analyses, genome annotation, and metabolic modeling. For several years we have been maintaining protein families for all microbial genomes in the PATRIC database (Pathosystems Resource Integration Center, patricbrc.org) in order to drive many of the comparative analysis tools that are available through the PATRIC website. However, due to the burgeoning number of genomes, traditional approaches for generating protein families are becoming prohibitive. In this report, we describe a new approach for generating protein families, which we call PATtyFams. This method uses the k-mer-based function assignments available through RAST (Rapid Annotation using Subsystem Technology) to rapidly guide family formation, and then differentiates the function-based groups into families using a Markov Cluster algorithm (MCL). This new approach for generating protein families is rapid, scalable and has properties that are consistent with alignment-based methods.
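The Markov Cluster algorithm (MCL) named in this abstract alternates two matrix operations, expansion and inflation, until the matrix stops changing. The following is a minimal pure-Python sketch of generic MCL on a small similarity matrix; it is not the PATtyFams code, and the function name, default inflation value, and cluster-extraction threshold are assumptions.

```python
def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster nodes of a weighted graph with a basic Markov Cluster iteration.

    adjacency: square list-of-lists of non-negative similarities.
    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(adjacency)

    def normalize(mat):
        # Scale every column so that it sums to 1 (column-stochastic matrix).
        sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        return [[mat[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
                for i in range(n)]

    # Add self-loops, then normalize.
    m = normalize([[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
                    for j in range(n)] for i in range(n)])
    for _ in range(max_iter):
        # Expansion: matrix squaring spreads flow along longer paths.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: element-wise power strengthens strong flows, weakens weak ones.
        inflated = normalize([[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each non-empty row lists the nodes attracted to that row's node.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)


# Two disconnected triangles: nodes 0-2 and nodes 3-5.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
families = mcl(graph)
```

In the approach the abstract describes, such clustering would be applied within each function-based group rather than to all proteins at once, which keeps each matrix small.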

12.
Sci Rep ; 5: 8365, 2015 Feb 10.
Article in English | MEDLINE | ID: mdl-25666585

ABSTRACT

The RAST (Rapid Annotation using Subsystem Technology) annotation engine was built in 2008 to annotate bacterial and archaeal genomes. It works by offering a standard software pipeline for identifying genomic features (i.e., protein-encoding genes and RNA) and annotating their functions. Recently, in order to make RAST a more useful research tool and to keep pace with advancements in bioinformatics, it has become desirable to build a version of RAST that is both customizable and extensible. In this paper, we describe the RAST tool kit (RASTtk), a modular version of RAST that enables researchers to build custom annotation pipelines. RASTtk offers a choice of software for identifying and annotating genomic features as well as the ability to add custom features to an annotation job. RASTtk also accommodates the batch submission of genomes and the ability to customize annotation protocols for batch submissions. This is the first major software restructuring of RAST since its inception.


Subject(s)
Molecular Sequence Annotation/methods , Software
13.
3 Biotech ; 5(1): 101-105, 2015 Feb.
Article in English | MEDLINE | ID: mdl-28324362

ABSTRACT

For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this short report, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from trivial due to annotation inconsistencies. We propose a protocol for comparative analysis of metabolic models on closely related genomes, using fifteen strains of genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella, which were not originally identified by automated metabolic reconstructions. We are currently implementing this protocol for improving automated annotations within the SEED database and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step for the future creation of consistent annotation systems and high-quality model reconstructions that will support accurate prediction of phenotypes such as pathogenicity, media requirements or type of respiration.

14.
Nucleic Acids Res ; 42(Database issue): D206-14, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24293654

ABSTRACT

In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.


Subject(s)
Databases, Genetic , Genome, Archaeal , Genome, Bacterial , Molecular Sequence Annotation , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/physiology , Genomics , Internet , Software
15.
Nucleic Acids Res ; 42(Database issue): D581-91, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24225323

ABSTRACT

The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10,000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue.


Subject(s)
Databases, Genetic , Genome, Bacterial , Bacteria/classification , Bacteria/genetics , Bacterial Infections/microbiology , Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Bacterial Typing Techniques , Gene Expression Profiling , Genomics , Humans , Internet , Protein Conformation , Protein Interaction Mapping
16.
Bioinformatics ; 28(24): 3316-7, 2012 Dec 15.
Article in English | MEDLINE | ID: mdl-23047562

ABSTRACT

Annotation of metagenomes involves comparing the individual sequence reads with a database of known sequences and assigning a unique function to each read. This is a time-consuming task that is computationally intensive (though not computationally complex). Here we present a novel approach to annotate metagenomes using unique k-mer oligopeptide sequences from 7 to 12 amino acids long. We demonstrate that k-mer-based annotations are faster and approach the sensitivity and precision of blastx-based annotations without losing accuracy. A last-common ancestor approach was also developed to describe the members of the community.


Subject(s)
Metagenomics/methods , Molecular Sequence Annotation , Algorithms , Metagenome , Sequence Analysis, DNA
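The idea of annotating sequences with unique k-mer oligopeptides can be sketched as a dictionary lookup followed by a vote. The sketch below is an illustration under stated assumptions, not the published tool: it works on already-translated peptide strings, the function names and the `min_hits` threshold are hypothetical, and the toy example uses k = 5 for brevity where the abstract reports k-mers of 7 to 12 amino acids.

```python
from collections import Counter, defaultdict


def signature_kmers(reference, k):
    """Map each k-mer that occurs under exactly one function to that function.

    reference: dict of function name -> list of protein sequences (toy data).
    """
    seen = defaultdict(set)
    for function, sequences in reference.items():
        for seq in sequences:
            for i in range(len(seq) - k + 1):
                seen[seq[i:i + k]].add(function)
    return {kmer: next(iter(funcs)) for kmer, funcs in seen.items()
            if len(funcs) == 1}


def annotate(peptide, signatures, k, min_hits=2):
    """Return the function with the most signature k-mer hits, or None."""
    votes = Counter()
    for i in range(len(peptide) - k + 1):
        function = signatures.get(peptide[i:i + k])
        if function is not None:
            votes[function] += 1
    if not votes:
        return None
    function, hits = votes.most_common(1)[0]
    return function if hits >= min_hits else None


reference = {
    "kinase": ["MKTAYIAKQRQISFVK"],
    "transporter": ["MLLAVGIAKQRQLLPW"],
}
signatures = signature_kmers(reference, k=5)
call = annotate("TAYIAKQ", signatures, k=5)
ambiguous = annotate("IAKQRQ", signatures, k=5)
```

The k-mers `IAKQR` and `AKQRQ` occur under both functions, so they are excluded from the signature set and the second peptide receives no call. Because each lookup is a constant-time dictionary access, the cost grows linearly with read length, which is consistent with the speed advantage the abstract reports over alignment.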
17.
PLoS One ; 7(10): e48053, 2012.
Article in English | MEDLINE | ID: mdl-23110173

ABSTRACT

The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology resulted in a deluge of published microbial genomes. Yet, genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge, hence computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers (http://www.theseed.org/servers): four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts. 
Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential users.


Subject(s)
Computational Biology/methods , Databases, Factual/statistics & numerical data , Information Storage and Retrieval/methods , Software , Escherichia coli/genetics , Escherichia coli/metabolism , Genomics/methods , Genomics/statistics & numerical data , Internet , Metabolomics/methods , Metabolomics/statistics & numerical data , Molecular Sequence Annotation/methods , Molecular Sequence Annotation/statistics & numerical data , Reproducibility of Results
18.
BMC Genomics ; 12: 187, 2011 Apr 13.
Article in English | MEDLINE | ID: mdl-21489287

ABSTRACT

BACKGROUND: Staphylococcus aureus is associated with a spectrum of symbiotic relationships with its human host from carriage to sepsis and is frequently associated with nosocomial and community-acquired infections, thus the differential gene content among strains is of interest. RESULTS: We sequenced three clinical strains and combined these data with 13 publicly available human isolates and one bovine strain for comparative genomic analyses. All genomes were annotated using RAST, and then their gene similarities and differences were delineated. Gene clustering yielded 3,155 orthologous gene clusters, of which 2,266 were core, 755 were distributed, and 134 were unique. Individual genomes contained between 2,524 and 2,648 genes. Gene-content comparisons among all possible S. aureus strain pairs (n = 136) revealed a mean difference of 296 genes and a maximum difference of 476 genes. We developed a revised version of our finite supragenome model to estimate the size of the S. aureus supragenome (3,221 genes, with 2,245 core genes), and compared it with those of Haemophilus influenzae and Streptococcus pneumoniae. There was excellent agreement between RAST's annotations and our CDS clustering procedure providing for high fidelity metabolomic subsystem analyses to extend our comparative genomic characterization of these strains. CONCLUSIONS: Using a multi-species comparative supragenomic analysis enabled by an improved version of our finite supragenome model we provide data and an interpretation explaining the relatively larger core genome of S. aureus compared to other opportunistic nasopharyngeal pathogens. In addition, we provide independent validation for the efficiency and effectiveness of our orthologous gene clustering algorithm.


Subject(s)
Genome, Bacterial , Haemophilus influenzae/genetics , Staphylococcus aureus/genetics , Streptococcus pneumoniae/genetics , Algorithms , Animals , Cattle , Gene Expression Regulation, Bacterial , Haemophilus influenzae/isolation & purification , Humans , Models, Genetic , Multigene Family , Open Reading Frames , Staphylococcal Infections/microbiology , Staphylococcus aureus/isolation & purification , Streptococcus pneumoniae/isolation & purification
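The counts reported in this abstract are internally consistent, which a short check makes explicit. The strain total of 17 is inferred here from the three sequenced strains, 13 public human isolates, and one bovine strain; the variable names are illustrative only.

```python
from math import comb

# Strain count inferred from the abstract: 3 sequenced + 13 public human + 1 bovine.
strains = 3 + 13 + 1

# Number of unordered strain pairs, C(17, 2); the abstract reports n = 136.
pairs = comb(strains, 2)

# Cluster categories as reported; their sum should equal the 3,155 total clusters.
clusters = {"core": 2266, "distributed": 755, "unique": 134}
total_clusters = sum(clusters.values())

# Share of observed clusters that are core (present in every strain).
core_fraction = clusters["core"] / total_clusters
```

Both totals match the abstract (136 pairs, 3,155 clusters), and roughly 72% of the observed clusters are core.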
19.
BMC Genomics ; 9: 75, 2008 Feb 08.
Article in English | MEDLINE | ID: mdl-18261238

ABSTRACT

BACKGROUND: The number of prokaryotic genome sequences becoming available is growing steadily and is growing faster than our ability to accurately annotate them. DESCRIPTION: We describe a fully automated service for annotating bacterial and archaeal genomes. The service identifies protein-encoding, rRNA and tRNA genes, assigns functions to the genes, predicts which subsystems are represented in the genome, uses this information to reconstruct the metabolic network and makes the output easily downloadable for the user. In addition, the annotated genome can be browsed in an environment that supports comparative analysis with the annotated genomes maintained in the SEED environment. The service normally makes the annotated genome available within 12-24 hours of submission, but ultimately the quality of such a service will be judged in terms of accuracy, consistency, and completeness of the produced annotations. We summarize our attempts to address these issues and discuss plans for incrementally enhancing the service. CONCLUSION: By providing accurate, rapid annotation freely to the community we have created an important community resource. The service has now been utilized by over 120 external users annotating over 350 distinct genomes.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid , Genes, rRNA/genetics , Genome, Archaeal , Genome, Bacterial , Open Reading Frames/genetics , Phylogeny , Proteins/genetics , RNA, Transfer/genetics , Reproducibility of Results , Sensitivity and Specificity , Time Factors , User-Computer Interface
20.
Infect Immun ; 75(5): 2645-7, 2007 May.
Article in English | MEDLINE | ID: mdl-17283087

ABSTRACT

Vibrio cholerae NRT36S is a non-cholera toxin-producing, non-O1 strain that causes diarrhea in volunteers. The genome of NRT36S was sequenced to create a draft containing 174 contigs plus the superintegron region. Our analysis of the draft genome revealed several putative toxin genes and colonization factors. Besides confirming the existence of nonagglutinable heat-stable toxin, we also identified the genes for a type three secretion system, a putative exotoxin, two different RTX toxins, and four pilus systems.


Subject(s)
Bacterial Proteins/genetics , Genome, Bacterial , Vibrio cholerae O1/pathogenicity , Vibrio cholerae non-O1/pathogenicity , Diarrhea/microbiology , Diarrhea/pathology , Genetic Variation , Humans , Molecular Sequence Data , Sequence Analysis, DNA , Species Specificity , Vibrio Infections/microbiology , Vibrio Infections/pathology , Vibrio cholerae O1/classification , Vibrio cholerae O1/genetics , Vibrio cholerae non-O1/classification , Vibrio cholerae non-O1/genetics