Results 1 - 20 of 69
1.
J Clin Microbiol ; 61(7): e0019923, 2023 07 20.
Article in English | MEDLINE | ID: mdl-37338371

ABSTRACT

Escherichia coli sequence type 131 (ST131) is a globally dominant multidrug-resistant clone, although its clinical impact on patients with bloodstream infection (BSI) is incompletely understood. This study aims to further define the risk factors, clinical outcomes, and bacterial genetics associated with ST131 BSI. A prospectively enrolled cohort study of adult inpatients with E. coli BSI was conducted from 2002 to 2015. Whole-genome sequencing was performed with the E. coli isolates. Of the 227 patients with E. coli BSI in this study, 88 (39%) were infected with ST131. Patients with E. coli ST131 BSI and those with non-ST131 BSI did not differ with respect to in-hospital mortality (17/82 [20%] versus 26/145 [18%]; P = 0.73). However, in patients with BSI from a urinary tract source, ST131 was associated with a numerically higher in-hospital mortality than patients with non-ST131 BSI (8/42 [19%] versus 4/63 [6%]; P = 0.06) and increased mortality in an adjusted analysis (odds ratio of 5.85; 95% confidence interval of 1.44 to 29.49; P = 0.02). Genomic analyses showed that ST131 isolates primarily had an H4:O25 serotype, had a higher number of prophages, and were associated with 11 flexible genomic islands as well as virulence genes involved in adhesion (papA, kpsM, yfcV, and iha), iron acquisition (iucC and iutA), and toxin production (usp and sat). In patients with E. coli BSI from a urinary tract source, ST131 was associated with increased mortality in an adjusted analysis and contained a distinct repertoire of genes influencing pathogenesis. These genes could contribute to the higher mortality observed in patients with ST131 BSI.


Subject(s)
Escherichia coli Infections, Sepsis, Urinary Tract Infections, Urinary Tract, Adult, Humans, Escherichia coli/genetics, Cohort Studies, Escherichia coli Infections/microbiology, Urinary Tract Infections/microbiology, Anti-Bacterial Agents, beta-Lactamases/genetics
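
As a reading aid for the urinary-source comparison in the abstract above, the crude (unadjusted) odds ratio can be recomputed from the reported counts (8/42 versus 4/63). This is only a minimal sketch: the function name is illustrative, no covariates are included, and the result is therefore expected to differ from the adjusted odds ratio of 5.85 reported by the authors.

```python
def crude_odds_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted odds ratio of group A versus group B from 2x2 counts."""
    non_events_a = total_a - events_a
    non_events_b = total_b - events_b
    return (events_a * non_events_b) / (non_events_a * events_b)


# Counts reported in the abstract: in-hospital deaths among urinary-source BSI.
st131_or = crude_odds_ratio(8, 42, 4, 63)
print(round(st131_or, 2))  # 3.47
```

The gap between this crude value and the adjusted estimate illustrates why the abstract reports both the raw proportions and the adjusted analysis.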
2.
Nucleic Acids Res ; 47(D1): D564-D572, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30364992

ABSTRACT

Automatic annotation of protein function is routinely applied to newly sequenced genomes. While this provides a fine-grained view of an organism's functional protein repertoire, proteins more commonly function in a coordinated manner, such as in pathways or multimeric complexes. Genome Properties (GPs) define such functional entities as a series of steps, originally described by either TIGRFAMs or Pfam entries. To increase the scope of coverage, we have migrated GPs to function as a companion resource utilizing InterPro entries. Having introduced GPs-specific versioned releases, we provide software and data via a GitHub repository, and have developed a new web interface to GPs (available at https://www.ebi.ac.uk/interpro/genomeproperties). In addition to exploring each of the 1286 GPs, the website contains GPs pre-calculated for a representative set of proteomes; these results can be used to profile GPs phylogenetically via an interactive viewer. Users can upload novel data to the viewer for comparison with the pre-calculated results. Over the last year, we have added ∼700 new GPs, increasing the coverage of eukaryotic systems, as well as increasing general coverage through automatic generation of GPs from related resources. All data are freely available via the website and the GitHub repository.


Subject(s)
Protein Databases, Genome, Proteins/genetics, Microbial Genome, Metabolic Networks and Pathways/genetics, Multiprotein Complexes/genetics, Proteins/metabolism, Proteome
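
The idea in the abstract above, a functional entity defined as a series of steps that are each supported by database entries, can be pictured with a small data structure. The sketch below is an assumption-laden illustration, not the Genome Properties software: the three-state result (YES, PARTIAL, NO), the rule that only required steps count, and the accessions `IPR_A` to `IPR_D` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    evidence: frozenset  # accessions that can satisfy this step
    required: bool = True


def evaluate_property(steps, matched_entries):
    """Return 'YES', 'PARTIAL' or 'NO' for a property given matched accessions."""
    required = [s for s in steps if s.required]
    hits = [s for s in required if s.evidence & matched_entries]
    if required and len(hits) == len(required):
        return "YES"
    return "PARTIAL" if hits else "NO"


# Hypothetical three-step pathway; accessions are placeholders, not real entries.
pathway = [
    Step("step 1", frozenset({"IPR_A"})),
    Step("step 2", frozenset({"IPR_B", "IPR_C"})),
    Step("step 3", frozenset({"IPR_D"}), required=False),
]
proteome_matches = {"IPR_A", "IPR_C"}
result = evaluate_property(pathway, proteome_matches)
print(result)  # YES
```

Running the same function over many proteomes would give the kind of presence/absence profile that the abstract describes comparing phylogenetically.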
3.
Nucleic Acids Res ; 47(D1): D351-D360, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30398656

ABSTRACT

The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro, and provide greater flexibility in terms of data access. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB, and discuss how our evaluation of residue coverage may help guide future curation activities.


Subject(s)
Protein Databases, Molecular Sequence Annotation, Animals, Genetic Databases, Gene Ontology, Humans, Internet, Multigene Family, Protein Domains/genetics, Amino Acid Sequence Homology, Software, User-Computer Interface
4.
J Clin Microbiol ; 58(9), 2020 08 24.
Article in English | MEDLINE | ID: mdl-32493786

ABSTRACT

Enterobacter aerogenes was recently renamed Klebsiella aerogenes. This study aimed to identify differences in clinical characteristics, outcomes, and bacterial genetics among patients with K. aerogenes versus Enterobacter species bloodstream infections (BSI). We prospectively enrolled patients with K. aerogenes or Enterobacter cloacae complex (Ecc) BSI from 2002 to 2015. We performed whole-genome sequencing (WGS) and pan-genome analysis on all bacteria. Overall, 150 patients with K. aerogenes (46/150 [31%]) or Ecc (104/150 [69%]) BSI were enrolled. The two groups had similar baseline characteristics. Neither total in-hospital mortality (13/46 [28%] versus 22/104 [21%]; P = 0.3) nor attributable in-hospital mortality (9/46 [20%] versus 13/104 [12%]; P = 0.3) differed between patients with K. aerogenes versus Ecc BSI, respectively. However, poor clinical outcome (death before discharge, recurrent BSI, and/or BSI complication) was higher for K. aerogenes than Ecc BSI (32/46 [70%] versus 42/104 [40%]; P = 0.001). In a multivariable regression model, K. aerogenes BSI, relative to Ecc BSI, was predictive of poor clinical outcome (odds ratio 3.3; 95% confidence interval 1.4 to 8.1; P = 0.008). Pan-genome analysis revealed 983 genes in 323 genomic islands unique to K. aerogenes isolates, including putative virulence genes involved in iron acquisition (n = 67), fimbriae/pili/flagella production (n = 117), and metal homeostasis (n = 34). Antibiotic resistance was largely found in Ecc lineage 1, which had a higher rate of multidrug-resistant phenotype (23/54 [43%]) relative to all other bacterial isolates (23/96 [24%]; P = 0.03). K. aerogenes BSI was associated with poor clinical outcomes relative to Ecc BSI. Putative virulence factors in K. aerogenes may account for these differences.


Subject(s)
Bacteremia, Enterobacter aerogenes, Sepsis, Anti-Bacterial Agents/pharmacology, Anti-Bacterial Agents/therapeutic use, Bacteremia/drug therapy, Enterobacter, Enterobacter aerogenes/genetics, Humans, Sepsis/drug therapy
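
The group comparisons in the abstract above are reported as counts with P values. The abstract does not say which test was used, so the following is only one plausible way to examine such a 2x2 comparison: a two-sided Fisher exact test written with the standard library. The function name is illustrative, and the computed value need not match the published P value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is at least as unlikely as the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


# Poor clinical outcome: 32/46 K. aerogenes versus 42/104 Ecc (from the abstract).
p_value = fisher_exact_two_sided(32, 46 - 32, 42, 104 - 42)
print(f"{p_value:.4f}")
```

An unadjusted test like this ignores patient covariates, which is why the abstract also reports a multivariable regression model.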
5.
Bioinformatics ; 35(6): 1049-1050, 2019 03 15.
Article in English | MEDLINE | ID: mdl-30165579

ABSTRACT

SUMMARY: The JCVI pan-genome pipeline is a collection of programs to run PanOCT and tools that support and extend the capabilities of PanOCT. PanOCT (pan-genome ortholog clustering tool) is a tool for pan-genome analysis of closely related prokaryotic species or strains. The JCVI Pan-Genome Pipeline wrapper invokes command-line utilities that prepare input genomes, invoke third-party tools such as NCBI Blast+, run PanOCT, generate a consensus pan-genome, annotate features of the pan-genome, detect sets of genes of interest such as antimicrobial resistance (AMR) genes and generate figures, tables and html pages to visualize the results. The pipeline can run in a hierarchical mode, lowering the RAM and compute resources used. AVAILABILITY AND IMPLEMENTATION: Source code, demo data, and detailed documentation are freely available at https://github.com/JCVenterInstitute/PanGenomePipeline.


Subject(s)
Bacterial Genome, Microbial Genome, Cluster Analysis, Prokaryotic Cells, Software
6.
J Infect Dis ; 220(4): 666-676, 2019 07 19.
Article in English | MEDLINE | ID: mdl-31099835

ABSTRACT

Previously, by targeting penicillin-binding protein 3, Pseudomonas-derived cephalosporinase (PDC), and MurA with ceftazidime-avibactam-fosfomycin, antimicrobial susceptibility was restored among multidrug-resistant (MDR) Pseudomonas aeruginosa. Herein, ceftazidime-avibactam-fosfomycin combination therapy against MDR P. aeruginosa clinical isolate CL232 was further evaluated. Checkerboard susceptibility analysis revealed synergy between ceftazidime-avibactam and fosfomycin. Accordingly, the resistance elements present and expressed in P. aeruginosa were analyzed using whole-genome sequencing and transcriptome profiling. Mutations in genes that are known to contribute to β-lactam resistance were identified. Moreover, expression of blaPDC, the mexAB-oprM efflux pump, and murA was upregulated. When fosfomycin was administered alone, the frequency of mutations conferring resistance was high; however, coadministration of fosfomycin with ceftazidime-avibactam yielded a lower frequency of resistance mutations. In a murine infection model using a high bacterial burden, ceftazidime-avibactam-fosfomycin significantly reduced the P. aeruginosa colony-forming units (CFUs), by approximately 2 and 5 logs, compared with stasis and the vehicle-treated control, respectively. Administration of ceftazidime-avibactam and fosfomycin separately significantly increased CFUs, by approximately 3 logs and 1 log, respectively, compared with the number at stasis, and only reduced CFUs by approximately 1 log and 2 logs, respectively, compared with the number in the vehicle-treated control. Thus, the combination of ceftazidime-avibactam-fosfomycin was superior to either drug alone. By employing a "mechanism-based approach" to combination chemotherapy, we show that ceftazidime-avibactam-fosfomycin has the potential to offer infected patients with high bacterial burdens a therapeutic hope against infection with MDR P. aeruginosa that lack metallo-β-lactamases.


Subject(s)
Anti-Bacterial Agents/administration & dosage, Azabicyclo Compounds/administration & dosage, Ceftazidime/administration & dosage, Multiple Bacterial Drug Resistance, Fosfomycin/administration & dosage, Pseudomonas Infections/drug therapy, Pseudomonas aeruginosa/drug effects, Animals, Drug Combinations, Drug Synergism, Combination Drug Therapy, Female, Humans, Mice, Microbial Sensitivity Tests, Mutation, Pseudomonas Infections/microbiology, Stem Cells
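
Checkerboard results such as those mentioned in the abstract above are often summarized with a fractional inhibitory concentration (FIC) index. The abstract does not state how synergy was scored, so this is a generic sketch under that assumption: the MIC values are hypothetical, and the cut-offs (synergy at 0.5 or below, antagonism above 4) are commonly used conventions rather than values taken from the study.

```python
def fic_index(mic_a_alone, mic_a_combo, mic_b_alone, mic_b_combo):
    """Fractional inhibitory concentration index for a two-agent checkerboard."""
    return mic_a_combo / mic_a_alone + mic_b_combo / mic_b_alone


def interpret(fici):
    # Commonly used cut-offs; laboratories may apply different ones.
    if fici <= 0.5:
        return "synergy"
    if fici > 4.0:
        return "antagonism"
    return "no interaction"


# Hypothetical MICs in ug/ml, not values from the study.
fici = fic_index(mic_a_alone=64, mic_a_combo=8, mic_b_alone=128, mic_b_combo=32)
print(fici, interpret(fici))  # 0.375 synergy
```

In a full checkerboard, the index would be evaluated for every well showing no growth and the lowest value reported.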
7.
BMC Bioinformatics ; 20(1): 8, 2019 Jan 07.
Article in English | MEDLINE | ID: mdl-30612540

ABSTRACT

BACKGROUND: The development of high-throughput sequencing and analysis has accelerated multi-omics studies of thousands of microbial species, metagenomes, and infectious disease pathogens. Omics studies are enabling genotype-phenotype association studies which identify genetic determinants of pathogen virulence and drug resistance, as well as phylogenetic studies designed to track the origin and spread of disease outbreaks. These omics studies are complex and often employ multiple assay technologies including genomics, metagenomics, transcriptomics, proteomics, and metabolomics. To maximize the impact of omics studies, it is essential that data be accompanied by detailed contextual metadata (e.g., specimen, spatial-temporal, phenotypic characteristics) in clear, organized, and consistent formats. Over the years, many metadata standards have been developed by various standards initiatives, including the Genomic Standards Consortium's minimal information standards (MIxS) and the GSCID/BRC Project and Sample Application Standard. Some tools exist for tracking metadata, but they do not provide event-based capabilities to configure, collect, validate, and distribute metadata. To address this gap in the scientific community, an event-based data-driven application, OMeta, was created that allows users to quickly configure, collect, validate, distribute, and integrate metadata. RESULTS: A data-driven web application, OMeta, has been developed for use by researchers consisting of a browser-based interface, a command-line interface (CLI), and server-side components that provide an intuitive platform for configuring, capturing, viewing, and sharing metadata. Project and sample metadata can be set based on existing standards or based on project goals. Recorded information includes details on the biological samples, procedures, protocols, and experimental technologies. This information can be organized based on events, including sample collection, sample quantification, sequencing assay, and analysis results. OMeta enables configuration of fields in various presentation types (checkbox, file, drop-box, ontology), and fields can be configured to use the National Center for Biomedical Ontology (NCBO), a biomedical ontology server. Furthermore, OMeta maintains a complete audit trail of all changes made by users and allows metadata export in comma separated value (CSV) format for convenient deposition of data into public databases. CONCLUSIONS: We present OMeta, a web-based software application that is built on data-driven principles for configuring and customizing data standards, capturing, curating, and sharing metadata.


Subject(s)
Biological Ontologies, Metadata, Software, Factual Databases, Metagenomics, Phylogeny, User-Computer Interface, Whole Genome Sequencing
8.
Bioinformatics ; 34(17): 3032-3034, 2018 09 01.
Article in English | MEDLINE | ID: mdl-29668840

ABSTRACT

Motivation: The vast number of available sequenced bacterial genomes occasionally exceeds the facilities of comparative genomic methods or is dominated by a single outbreak strain, and thus a diverse and representative subset is required. Generation of the reduced subset currently requires a priori supervised clustering and sequence-only selection of medoid genomic sequences, independent of any additional genome metrics or strain attributes. Results: The Gaussian Genome Representative Selector with Prioritization (GGRaSP) R-package described below generates a reduced subset of genomes that prioritizes maintaining genomes of interest to the user as well as minimizing the loss of genetic variation. The package also allows for unsupervised clustering by modeling the genomic relationships using a Gaussian mixture model to select an appropriate cluster threshold. We demonstrate the capabilities of GGRaSP by generating a reduced list of 315 genomes from a genomic dataset of 4600 Escherichia coli genomes, prioritizing selection by type strain and by genome completeness. Availability and implementation: GGRaSP is available at https://github.com/JCVenterInstitute/ggrasp/. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome, Cluster Analysis, Genomics/methods, Normal Distribution, Software
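
The notion of choosing one representative genome per cluster, with user priorities taken into account, can be illustrated with a medoid selection over a distance matrix. The sketch below is in Python rather than R and is not the GGRaSP implementation: the genome names and distances are toy values, the clusters are assumed to come from an earlier clustering step, and the `preferred` set is a hypothetical stand-in for the package's prioritization.

```python
def pick_representative(members, dist, preferred=frozenset()):
    """Pick the preferred member (if any) closest to the rest of its cluster."""
    pool = [g for g in members if g in preferred] or members
    # The medoid minimizes the summed distance to every cluster member.
    return min(pool, key=lambda g: sum(dist[g][h] for h in members))


# Toy symmetric distances between four hypothetical genomes.
dist = {
    "g1": {"g1": 0.00, "g2": 0.02, "g3": 0.03, "g4": 0.40},
    "g2": {"g1": 0.02, "g2": 0.00, "g3": 0.02, "g4": 0.41},
    "g3": {"g1": 0.03, "g2": 0.02, "g3": 0.00, "g4": 0.39},
    "g4": {"g1": 0.40, "g2": 0.41, "g3": 0.39, "g4": 0.00},
}
clusters = [["g1", "g2", "g3"], ["g4"]]
representatives = [pick_representative(c, dist) for c in clusters]
print(representatives)  # ['g2', 'g4']
```

Passing `preferred={"g3"}` would return `g3` for the first cluster instead, which mirrors the idea of retaining genomes of interest even when they are not the medoid.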
9.
Nucleic Acids Res ; 45(D1): D190-D199, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899635

ABSTRACT

InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation and prediction of intrinsic disorder. These developments enrich the annotations provided by InterPro, increase the overall number of residues annotated and allow more specific functional inferences.


Subject(s)
Computational Biology/methods, Protein Databases, Protein Interaction Domains and Motifs, Software, Humans, Molecular Sequence Annotation, Phylogeny
10.
BMC Bioinformatics ; 19(1): 246, 2018 06 27.
Article in English | MEDLINE | ID: mdl-29945570

ABSTRACT

BACKGROUND: Bacterial pan-genomes, comprised of conserved and variable genes across multiple sequenced bacterial genomes, allow for identification of genomic regions that are phylogenetically discriminating or functionally important. Pan-genomes consist of large amounts of data, which can restrict researchers' ability to locate and analyze these regions. Multiple software packages are available to visualize pan-genomes, but currently their ability to address these concerns is limited by using only pre-computed data sets, prioritizing core over variable gene clusters, or by not accounting for pan-chromosome positioning in the viewer. RESULTS: We introduce PanACEA (Pan-genome Atlas with Chromosome Explorer and Analyzer), which utilizes locally-computed interactive web-pages to view ordered pan-genome data. It consists of multi-tiered, hierarchical display pages that extend from pan-chromosomes to both core and variable regions to single genes. Regions and genes are functionally annotated to allow for rapid searching and visual identification of regions of interest with the option that user-supplied genomic phylogenies and metadata can be incorporated. PanACEA's memory and time requirements are within the capacities of standard laptops. The capability of PanACEA as a research tool is demonstrated by highlighting a variable region important in differentiating strains of Enterobacter hormaechei. CONCLUSIONS: PanACEA can rapidly translate the results of pan-chromosome programs into an intuitive and interactive visual representation. It will empower researchers to visually explore and identify regions of the pan-chromosome that are most biologically interesting, and to obtain publication quality images of these regions.


Subject(s)
Chromosomes/genetics, Computational Biology/methods, Genomics/methods, Humans
11.
Article in English | MEDLINE | ID: mdl-30012762

ABSTRACT

Burkholderia multivorans is a member of the Burkholderia cepacia complex, a group of >20 related species of nosocomial pathogens that commonly infect individuals suffering from cystic fibrosis. β-Lactam antibiotics are recommended as therapy for infections due to B. multivorans, which possesses two β-lactamase genes, blapenA and blaAmpC. PenA is a carbapenemase with a substrate profile similar to that of the Klebsiella pneumoniae carbapenemase (KPC); in addition, expression of PenA is inducible by β-lactams in B. multivorans. Here, we characterize AmpC from B. multivorans ATCC 17616. AmpC possesses only 38 to 46% protein identity with non-Burkholderia AmpC proteins (e.g., PDC-1 and CMY-2). Among 49 clinical isolates of B. multivorans, we identified 27 different AmpC variants. Some variants possessed single amino acid substitutions within critical active-site motifs (Ω loop and R2 loop). Purified AmpC1 demonstrated minimal measurable catalytic activity toward β-lactams (i.e., nitrocefin and cephalothin). Moreover, avibactam was a poor inhibitor of AmpC1 (Ki app > 600 µM), and acyl-enzyme complex formation with AmpC1 was slow, likely due to lack of productive interactions with active-site residues. Interestingly, immunoblotting using a polyclonal anti-AmpC antibody revealed that protein expression of AmpC1 was inducible in B. multivorans ATCC 17616 after growth in subinhibitory concentrations of imipenem (1 µg/ml). AmpC is a unique inducible class C cephalosporinase that may play an ancillary role in B. multivorans compared to PenA, which is the dominant β-lactamase in B. multivorans ATCC 17616.


Subject(s)
Anti-Bacterial Agents/pharmacology, Bacterial Proteins/chemistry, Bacterial Proteins/metabolism, Burkholderia/drug effects, Burkholderia/enzymology, beta-Lactamases/chemistry, beta-Lactamases/metabolism, beta-Lactams/pharmacology, Amino Acid Sequence, Azabicyclo Compounds/pharmacology, Cephalosporinase/chemistry, Cephalosporinase/metabolism, Cephalosporins/pharmacology, Cephalothin/pharmacology, Imipenem/pharmacology, Microbial Sensitivity Tests, Secondary Protein Structure
12.
Bioinformatics ; 33(11): 1725-1726, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28130240

ABSTRACT

SUMMARY: LOCUST is a custom sequence locus typer tool for classifying microbial genomes. It provides a fully automated opportunity to customize the classification of genome-wide nucleotide variant data most relevant to biological research. AVAILABILITY AND IMPLEMENTATION: Source code, demo data, and detailed documentation are freely available at http://sourceforge.net/projects/locustyper . CONTACT: lbrinkac@jcvi.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Bacteria/classification, Bacterial Genome, Molecular Typing/methods, DNA Sequence Analysis/methods, Software, Bacteria/genetics, Genomics/methods
13.
Nature ; 486(7404): 527-31, 2012 Jun 28.
Article in English | MEDLINE | ID: mdl-22722832

ABSTRACT

Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other.


Subject(s)
Molecular Evolution, Genetic Variation/genetics, Human Genome/genetics, Genome/genetics, Pan paniscus/genetics, Pan troglodytes/genetics, Animals, DNA Transposable Elements/genetics, Gene Duplication/genetics, Genotype, Humans, Molecular Sequence Data, Phenotype, Phylogeny, Species Specificity
14.
Nature ; 468(7320): 60-6, 2010 Nov 04.
Article in English | MEDLINE | ID: mdl-21048761

ABSTRACT

The understanding of marine microbial ecology and metabolism has been hampered by the paucity of sequenced reference genomes. To this end, we report the sequencing of 137 diverse marine isolates collected from around the world. We analysed these sequences, along with previously published marine prokaryotic genomes, in the context of marine metagenomic data, to gain insights into the ecology of the surface ocean prokaryotic picoplankton (0.1-3.0 µm size range). The results suggest that the sequenced genomes define two microbial groups: one composed of only a few taxa that are nearly always abundant in picoplanktonic communities, and the other consisting of many microbial taxa that are rarely abundant. The genomic content of the second group suggests that these microbes are capable of slow growth and survival in energy-limited environments, and rapid growth in energy-rich environments. By contrast, the abundant and cosmopolitan picoplanktonic prokaryotes for which there is genomic representation have smaller genomes, are probably capable of only slow growth and seem to be relatively unable to sense or rapidly acclimate to energy-rich conditions. Their genomic features also lead us to propose that one method used to avoid predation by viruses and/or bacterivores is by means of slow growth and the maintenance of low biomass.


Subject(s)
Aquatic Organisms/genetics, Genomics, Metagenome, Plankton/genetics, Prokaryotic Cells/metabolism, Aquatic Organisms/classification, Aquatic Organisms/isolation & purification, Aquatic Organisms/virology, Biodiversity, Biomass, Protein Databases, Bacterial Genome/genetics, Biological Models, Oceans and Seas, Phylogeny, Plankton/growth & development, Plankton/isolation & purification, Plankton/metabolism, Prokaryotic Cells/classification, Prokaryotic Cells/virology, 16S Ribosomal RNA/genetics, Water Microbiology
15.
Nature ; 464(7288): 592-6, 2010 Mar 25.
Article in English | MEDLINE | ID: mdl-20228792

ABSTRACT

The freshwater cnidarian Hydra was first described in 1702 and has been the object of study for 300 years. Experimental studies of Hydra between 1736 and 1744 culminated in the discovery of asexual reproduction of an animal by budding, the first description of regeneration in an animal, and successful transplantation of tissue between animals. Today, Hydra is an important model for studies of axial patterning, stem cell biology and regeneration. Here we report the genome of Hydra magnipapillata and compare it to the genomes of the anthozoan Nematostella vectensis and other animals. The Hydra genome has been shaped by bursts of transposable element expansion, horizontal gene transfer, trans-splicing, and simplification of gene structure and gene content that parallel simplification of the Hydra life cycle. We also report the sequence of the genome of a novel bacterium stably associated with H. magnipapillata. Comparisons of the Hydra genome to the genomes of other animals shed light on the evolution of epithelia, contractile tissues, developmentally regulated transcription factors, the Spemann-Mangold organizer, pluripotency genes and the neuromuscular junction.


Subject(s)
Genome/genetics, Hydra/genetics, Animals, Anthozoa/genetics, Comamonadaceae/genetics, DNA Transposable Elements/genetics, Horizontal Gene Transfer/genetics, Bacterial Genome/genetics, Hydra/microbiology, Hydra/ultrastructure, Molecular Sequence Data, Neuromuscular Junction/ultrastructure
16.
Nucleic Acids Res ; 40(22): e172, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22904089

ABSTRACT

Pan-genome ortholog clustering tool (PanOCT) is a tool for pan-genomic analysis of closely related prokaryotic species or strains. PanOCT uses conserved gene neighborhood information to separate recently diverged paralogs into orthologous clusters where homology-only clustering methods cannot. The results from PanOCT and three commonly used graph-based ortholog-finding programs were compared using a set of four publicly available strains of the same bacterial species. All four methods agreed on ∼70% of the clusters and ∼86% of the proteins. The clusters that did not agree were inspected for evidence of correctness resulting in 85 high-confidence manually curated clusters that were used to compare all four methods.


Subject(s)
Bacterial Proteins/genetics, Bacterial Genes, Bacterial Genome, Software, Bacteria/classification, Bacterial Proteins/classification, Cluster Analysis, Genomics/methods
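
The abstract above describes using conserved gene neighborhood to tell recently diverged paralogs apart when homology alone cannot. The toy sketch below conveys only the general idea and is not PanOCT's scoring scheme: the genomes, gene identifiers, family labels, and the simple "count shared flanking families" score are all assumptions made for illustration.

```python
def neighborhood(genome, index, window=2):
    """Families of the genes flanking position `index` within `window` genes."""
    left = genome[max(0, index - window):index]
    right = genome[index + 1:index + 1 + window]
    return {family for _, family in left + right}


def best_match(query_genome, query_index, target_genome, window=2):
    """Pick the same-family target gene whose flanking genes agree best."""
    _, family = query_genome[query_index]
    query_ctx = neighborhood(query_genome, query_index, window)
    candidates = [i for i, (_, fam) in enumerate(target_genome) if fam == family]
    best = max(candidates,
               key=lambda i: len(query_ctx & neighborhood(target_genome, i, window)))
    return target_genome[best][0]


# Toy genomes as ordered (gene_id, homology_family) pairs; all names are made up.
genome_a = [("a1", "F1"), ("a2", "F2"), ("a3", "X"), ("a4", "F3"), ("a5", "F4")]
genome_b = [("b1", "F9"), ("b2", "X"), ("b3", "F8"),
            ("b4", "F1"), ("b5", "F2"), ("b6", "X"), ("b7", "F3")]
match = best_match(genome_a, 2, genome_b)
print(match)  # b6
```

Both `b2` and `b6` belong to family `X`, so homology alone cannot choose between them; `b6` wins because three of its flanking families also flank `a3`.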
17.
Nucleic Acids Res ; 40(Database issue): D237-41, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22140108

ABSTRACT

CharProtDB (http://www.jcvi.org/charprotdb/) is a curated database of biochemically characterized proteins. It provides a source of direct rather than transitive assignments of function, designed to support automated annotation pipelines. The initial data set in CharProtDB was collected through manual literature curation over the years by analysts at the J. Craig Venter Institute (JCVI) [formerly The Institute for Genomic Research (TIGR)] as part of their prokaryotic genome sequencing projects. The CharProtDB has been expanded by import of selected records from publicly available protein collections whose biocuration indicated direct rather than homology-based assignment of function. Annotations in CharProtDB include gene name, symbol and various controlled vocabulary terms, including Gene Ontology terms, Enzyme Commission number and TransportDB accession. Each annotation is referenced with the source; ideally a journal reference, or, if imported and lacking one, the original database source.


Subject(s)
Protein Databases, Molecular Sequence Annotation, Proteins/chemistry, Proteins/genetics, Proteins/physiology
18.
Nucleic Acids Res ; 38(Database issue): D340-5, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19892825

ABSTRACT

The Comprehensive Microbial Resource or CMR (http://cmr.jcvi.org) provides a web-based central resource for the display, search and analysis of the sequence and annotation for complete and publicly available bacterial and archaeal genomes. In addition to displaying the original annotation from GenBank, the CMR makes available secondary automated structural and functional annotation across all genomes to provide consistent data types necessary for effective mining of genomic data. Precomputed homology searches are stored to allow meaningful genome comparisons. The CMR supplies users with over 50 different tools to utilize the sequence and annotation data across one or more of the 571 currently available genomes. At the gene level users can view the gene annotation and underlying evidence. Genome level information includes whole genome graphical displays, biochemical pathway maps and genome summary data. Comparative tools display analysis between genomes with homology and genome alignment tools, and searches across the accessions, annotation, and evidence assigned to all genes/genomes are available. The data and tools on the CMR aid genomic research and analysis, and the CMR is included in over 200 scientific publications. The code underlying the CMR website and the CMR database are freely available for download with no license restrictions.


Subject(s)
Bacteria/genetics, Computational Biology/methods, Genetic Databases, Nucleic Acid Databases, Protein Databases, Bacterial Genes, Computational Biology/trends, Bacterial Genome, Information Storage and Retrieval/methods, Internet, Tertiary Protein Structure, Software
19.
Nucleic Acids Res ; 38(Database issue): D336-9, 2010 Jan.
Article in English | MEDLINE | ID: mdl-20007151

ABSTRACT

Generation of syntactically correct and unambiguous names for proteins is a challenging, yet vital task for functional annotation processes. Proteins are often named based on homology to known proteins, many of which have problematic names. To address the need to generate high-quality protein names, and capture our significant experience correcting protein names manually, we have developed the Protein Naming Utility (PNU, http://www.jcvi.org/pn-utility). The PNU is a web-based database for storing and applying naming rules to identify and correct syntactically incorrect protein names, or to replace synonyms with their preferred name. The PNU allows users to generate and manage collections of naming rules, optionally building upon the growing body of rules generated at the J. Craig Venter Institute (JCVI). Since communities often enforce disparate conventions for naming proteins, the PNU supports grouping rules into user-managed collections. Users can check their protein names against a selected PNU rule collection, generating both statistics and corrected names. The PNU can also be used to correct GenBank table files prior to submission to GenBank. Currently, the database features 3080 manual rules that have been entered by JCVI Bioinformatics Analysts as well as 7458 automatically imported names.


Subject(s)
Computational Biology/methods, Genetic Databases, Protein Databases, Proteins/chemistry, Terminology as Topic, Algorithms, Animals, Automation, Computational Biology/trends, Genome, Humans, Information Storage and Retrieval/methods, Internet, Software
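
The rule-based workflow described in the abstract above (correct syntactically incorrect names, replace synonyms with a preferred name, and report statistics) can be sketched with ordinary regular expressions. Everything here is hypothetical: the rules, the synonym table, and the function name are illustrative and are not drawn from the PNU database or its interface.

```python
import re

# Hypothetical rules for illustration; these are not taken from the PNU database.
SYNONYMS = {"hypothetical protein, putative": "hypothetical protein"}
PATTERN_RULES = [
    (re.compile(r"\s{2,}"), " "),                 # collapse repeated spaces
    (re.compile(r"\s*\(fragment\)$", re.I), ""),  # drop a trailing "(fragment)"
    (re.compile(r"^probable\b", re.I), "putative"),
]


def correct_name(name):
    """Apply pattern rules in order, then replace a known synonym."""
    fixed = name.strip()
    for pattern, replacement in PATTERN_RULES:
        fixed = pattern.sub(replacement, fixed)
    return SYNONYMS.get(fixed.lower(), fixed)


names = ["Probable  ABC transporter (Fragment)", "Hypothetical protein, putative"]
corrected = [correct_name(n) for n in names]
changed = sum(a != b for a, b in zip(names, corrected))
print(corrected, changed)
```

Keeping the rules as data rather than code loosely mirrors the abstract's point that different communities can maintain their own rule collections.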
20.
Nucleic Acids Res ; 38(Database issue): D408-14, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19843611

ABSTRACT

Pathema (http://pathema.jcvi.org) is one of the eight Bioinformatics Resource Centers (BRCs) funded by the National Institute of Allergy and Infectious Diseases (NIAID) designed to serve as a core resource for the bio-defense and infectious disease research community. Pathema strives to support basic research and accelerate scientific progress for understanding, detecting, diagnosing and treating an established set of six target NIAID Category A-C pathogens: the Category A priority pathogens Bacillus anthracis and Clostridium botulinum, and the Category B priority pathogens Burkholderia mallei, Burkholderia pseudomallei, Clostridium perfringens and Entamoeba histolytica. Each target pathogen is represented in one of four distinct clade-specific Pathema web resources and underlying databases developed to target the specific data and analysis needs of each scientific community. All publicly available complete genome projects of phylogenetically related organisms are also represented, providing a comprehensive collection of organisms for comparative analyses. Pathema facilitates the scientific exploration of genomic and related data through its integration with web-based analysis tools, customized to obtain, display, and compute results relevant to ongoing pathogen research. Pathema serves the bio-defense and infectious disease research community by disseminating data resulting from pathogen genome sequencing projects and providing access to the results of inter-genomic comparisons for these organisms.


Subject(s)
Bacterial Infections/microbiology, Communicable Diseases/microbiology, Computational Biology/methods, Genetic Databases, Amino Acid Sequence, Animals, Bacterial Infections/diagnosis, Computational Biology/trends, Bacterial Genome, Humans, Information Storage and Retrieval/methods, Internet, Molecular Sequence Data, National Institute of Allergy and Infectious Diseases (U.S.), Amino Acid Sequence Homology, Software, United States