1.
Nucleic Acids Res ; 40(Web Server issue): W186-92, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22669909

ABSTRACT

A gene prediction program, VIGOR (Viral Genome ORF Reader), was developed at the J. Craig Venter Institute in 2010 and has been successfully performing gene calling in coronavirus, influenza, rhinovirus and rotavirus for projects at the Genome Sequencing Center for Infectious Diseases. VIGOR uses sequence similarity searches against custom protein databases to identify protein coding regions, start and stop codons and other gene features. RNA editing and other features are accurately identified based on sequence similarity and signature residues. VIGOR produces four output files: a gene prediction file, a complementary DNA file, an alignment file, and a gene feature table file. The gene feature table can be used to create a GenBank submission. VIGOR takes a single input: viral genomic sequences in FASTA format. VIGOR has been extended to predict genes for 12 viruses: measles virus, mumps virus, rubella virus, respiratory syncytial virus, alphavirus and Venezuelan equine encephalitis virus, norovirus, metapneumovirus, yellow fever virus, Japanese encephalitis virus, parainfluenza virus and Sendai virus. VIGOR accurately detects complex gene features such as RNA editing, stop codon leakage and ribosomal shunting. Precise identification of mat_peptide cleavage sites for some viruses is a built-in feature of VIGOR. The gene predictions for these viruses have been evaluated by testing on 27 to 240 genomes from GenBank.
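
The abstract states that the only input is a set of viral genomic sequences in FASTA format. As a rough illustration of that format, and not of VIGOR's own code, the following minimal Python sketch reads FASTA text into a dictionary. The function name `read_fasta` and the example records are hypothetical and were chosen only for this illustration.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence}.

    Hypothetical helper for illustration; it is not part of VIGOR.
    """
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the ID.
            parts = line[1:].split()
            current_id = parts[0] if parts else f"record_{len(records) + 1}"
            records[current_id] = ""
        elif current_id is not None:
            # Sequence line: append to the current record, normalised to upper case.
            records[current_id] += line.upper()
    return records


# Made-up sequences, used only to show the input layout.
example = """>genome_1 hypothetical viral segment
atggcgtacc
ttgataa
>genome_2
ATGCCC
"""
genomes = read_fasta(example.splitlines())
```

Each record starts with a `>` header line, and the sequence may wrap over several lines, which the sketch joins back together.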


Subjects
Viral Genome, Software, Terminator Codon, Viral Genes, Internet, Molecular Sequence Annotation, RNA Editing, RNA Viruses/genetics, Viral Proteins/genetics
2.
Nucleic Acids Res ; 39(Database issue): D658-62, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21071397

ABSTRACT

The Hymenoptera Genome Database (HGD) is a comprehensive model organism database that caters to the needs of scientists working on insect species of the order Hymenoptera. This system implements open-source software and relational databases providing access to curated data contributed by an extensive, active research community. HGD contains data from 9 different species across ∼200 million years in the phylogeny of Hymenoptera, allowing researchers to leverage genetic, genome sequence and gene expression data, as well as the biological knowledge of related model organisms. The availability of resources across an order greatly facilitates comparative genomics and enhances our understanding of the biology of agriculturally important Hymenoptera species through genomics. Curated data at HGD includes predicted and annotated gene sets supported with evidence tracks such as ESTs/cDNAs, small RNA sequences and GC composition domains. Data at HGD can be queried using genome browsers and/or BLAST/PSI-BLAST servers, and it may also be downloaded to perform local searches. We encourage the public to access and contribute data to HGD at: http://HymenopteraGenome.org.


Subjects
Genetic Databases, Insect Genome, Hymenoptera/genetics, Animals, Genomics, Molecular Sequence Annotation, Software, Systems Integration
3.
Nucleic Acids Res ; 39(Database issue): D830-4, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21123190

ABSTRACT

The Bovine Genome Database (BGD; http://BovineGenome.org) strives to improve annotation of the bovine genome and to integrate the genome sequence with other genomics data. BGD includes GBrowse genome browsers, the Apollo Annotation Editor, a quantitative trait loci (QTL) viewer, BLAST databases and gene pages. Genome browsers, available for both scaffold and chromosome coordinate systems, display the bovine Official Gene Set (OGS), RefSeq and Ensembl gene models, non-coding RNA, repeats, pseudogenes, single-nucleotide polymorphisms, markers, QTL and alignments to complementary DNAs, ESTs and protein homologs. The Bovine QTL viewer is connected to the BGD Chromosome GBrowse, allowing for the identification of candidate genes underlying QTL. The Apollo Annotation Editor connects directly to the BGD Chado database to provide researchers with remote access to gene evidence in a graphical interface that allows editing and creating new gene models. Researchers may upload their annotations to the BGD server for review and integration into the subsequent release of the OGS. Gene pages display information for individual OGS gene models, including gene structure, transcript variants, functional descriptions, gene symbols, Gene Ontology terms, annotator comments and links to the National Center for Biotechnology Information and Ensembl. Each gene page is linked to a wiki page to allow input from the research community.


Subjects
Cattle/genetics, Genetic Databases, Genomics, Molecular Sequence Annotation, Animals, Genome, Genetic Models, Quantitative Trait Loci, Sequence Alignment, Software, Systems Integration
4.
Nucleic Acids Res ; 38(Database issue): D408-14, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19843611

ABSTRACT

Pathema (http://pathema.jcvi.org) is one of the eight Bioinformatics Resource Centers (BRCs) funded by the National Institute of Allergy and Infectious Diseases (NIAID) and is designed to serve as a core resource for the bio-defense and infectious disease research community. Pathema strives to support basic research and accelerate scientific progress for understanding, detecting, diagnosing and treating an established set of six target NIAID Category A-C pathogens: the Category A priority pathogens Bacillus anthracis and Clostridium botulinum, and the Category B priority pathogens Burkholderia mallei, Burkholderia pseudomallei, Clostridium perfringens and Entamoeba histolytica. Each target pathogen is represented in one of four distinct clade-specific Pathema web resources and underlying databases developed to target the specific data and analysis needs of each scientific community. All publicly available complete genome projects of phylogenetically related organisms are also represented, providing a comprehensive collection of organisms for comparative analyses. Pathema facilitates the scientific exploration of genomic and related data through its integration with web-based analysis tools, customized to obtain, display, and compute results relevant to ongoing pathogen research. Pathema serves the bio-defense and infectious disease research community by disseminating data resulting from pathogen genome sequencing projects and providing access to the results of inter-genomic comparisons for these organisms.


Subjects
Bacterial Infections/microbiology, Communicable Diseases/microbiology, Computational Biology/methods, Genetic Databases, Amino Acid Sequence, Animals, Bacterial Infections/diagnosis, Computational Biology/trends, Bacterial Genome, Humans, Information Storage and Retrieval/methods, Internet, Molecular Sequence Data, National Institute of Allergy and Infectious Diseases (U.S.), Amino Acid Sequence Homology, Software, United States
5.
Bioinformatics ; 26(12): 1488-92, 2010 Jun 15.
Article in English | MEDLINE | ID: mdl-20413634

ABSTRACT

MOTIVATION: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible for a wide class of target users. RESULTS: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user-friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. AVAILABILITY: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net.


Subjects
Computational Biology/methods, Internet, Software, Genetic Databases, Protein Databases, Workflow
6.
PLoS Genet ; 4(4): e1000046, 2008 Apr 11.
Article in English | MEDLINE | ID: mdl-18404212

ABSTRACT

We present the genome sequences of a new clinical isolate of the important human pathogen, Aspergillus fumigatus, A1163, and two closely related but rarely pathogenic species, Neosartorya fischeri NRRL181 and Aspergillus clavatus NRRL1. Comparative genomic analysis of A1163 with the recently sequenced A. fumigatus isolate Af293 has identified core, variable and up to 2% unique genes in each genome. While the core genes are 99.8% identical at the nucleotide level, identity for variable genes can be as low as 40%. The most divergent loci appear to contain heterokaryon incompatibility (het) genes associated with fungal programmed cell death, such as the developmental regulator rosA. Cross-species comparison has revealed that 8.5%, 13.5% and 12.6%, respectively, of A. fumigatus, N. fischeri and A. clavatus genes are species-specific. These genes are significantly smaller in size than core genes, contain fewer exons and exhibit a subtelomeric bias. Most of them cluster together in 13 chromosomal islands, which are enriched for pseudogenes, transposons and other repetitive elements. At least 20% of A. fumigatus-specific genes appear to be functional and involved in carbohydrate and chitin catabolism, transport, detoxification, secondary metabolism and other functions that may facilitate the adaptation to heterogeneous environments such as soil or a mammalian host. Contrary to what was suggested previously, their origin cannot be attributed to horizontal gene transfer (HGT), but instead is likely to involve duplication, diversification and differential gene loss (DDL). The role of duplication in the origin of lineage-specific genes is further underlined by the discovery of genomic islands that seem to function as designated "gene dumps" and, perhaps, simultaneously, as "gene factories".


Subjects
Aspergillus fumigatus/genetics, Genomic Islands, Allergens/genetics, Aspergillus/classification, Aspergillus/genetics, Aspergillus/physiology, Aspergillus fumigatus/classification, Aspergillus fumigatus/pathogenicity, Aspergillus fumigatus/physiology, Fungal Chromosomes/genetics, Eurotiales/classification, Eurotiales/genetics, Eurotiales/physiology, Molecular Evolution, Fungal Proteins/genetics, Fungal Proteins/immunology, Fungal Genome, Humans, Phylogeny, Species Specificity, Virulence/genetics
7.
BMC Bioinformatics ; 11: 451, 2010 Sep 07.
Article in English | MEDLINE | ID: mdl-20822531

ABSTRACT

BACKGROUND: The decrease in sequencing cost and improvements in technology have made the re-sequencing of large genomes, as well as the parallel sequencing of small genomes, easier and more common. It is possible to completely sequence a small genome within days, and this increases the number of publicly available genomes. Among the genomes being rapidly sequenced are those of the microbes and viruses responsible for infectious diseases. However, accurate gene prediction remains a challenge when decoding a newly sequenced genome. Therefore, accurate and efficient gene prediction programs are highly desired for rapid and cost-effective surveillance of RNA viruses through full genome sequencing. RESULTS: We have developed VIGOR (Viral Genome ORF Reader), a web application tool for gene prediction in influenza virus, rotavirus, rhinovirus and coronavirus subtypes. VIGOR detects protein coding regions based on sequence similarity searches, can accurately detect genome-specific features such as frame shifts, overlapping genes and embedded genes, and can predict mature peptides within the context of a single polypeptide open reading frame. Genotyping capability for influenza and rotavirus is built into the program. We compared VIGOR to the previously described gene prediction programs ZCURVE_V, GeneMarkS and FLAN. The specificity and sensitivity of VIGOR are greater than 99% for the RNA viral genomes tested. CONCLUSIONS: VIGOR is a user-friendly web-based genome annotation program for five different viral agents: influenza, rotavirus, rhinovirus, coronavirus and SARS coronavirus. This is the first publicly accessible gene prediction program for rotavirus and rhinovirus. VIGOR is able to accurately predict protein coding genes for the above five viral types and has the capability to assign function to the predicted open reading frames and to genotype influenza virus. The prediction software was designed for performing high-throughput annotation and closure validation in a post-sequencing production pipeline.
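
The abstract reports specificity and sensitivity above 99% but does not give the definitions used. As a hedged illustration only, the Python sketch below applies the standard confusion-matrix definitions: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). Some gene-prediction evaluations instead report TP / (TP + FP) under the name specificity, so the paper's own measure may differ. The function names and counts are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real positives that were predicted (TP / (TP + FN))."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of real negatives that were left unpredicted (TN / (TN + FP))."""
    return true_neg / (true_neg + false_pos)


# Hypothetical per-nucleotide counts, invented for this example:
# coding bases correctly called, coding bases missed,
# non-coding bases correctly left uncalled, non-coding bases wrongly called.
tp, fn, tn, fp = 995, 5, 1990, 10

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Whichever definition is used, both measures have to be computed against a trusted reference annotation.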


Subjects
Viral Genome/genetics, Molecular Sequence Annotation/methods, Software, Initiator Codon, Terminator Codon, Computational Biology/methods, Genetic Databases, Open Reading Frames, Orthomyxoviridae/genetics
8.
BMC Genomics ; 11: 645, 2010 Nov 19.
Article in English | MEDLINE | ID: mdl-21092105

ABSTRACT

BACKGROUND: A goal of the Bovine Genome Database (BGD; http://BovineGenome.org) has been to support the Bovine Genome Sequencing and Analysis Consortium (BGSAC) in the annotation and analysis of the bovine genome. We were faced with several challenges, including the need to maintain consistent quality despite diversity in annotation expertise in the research community, the need to maintain consistent data formats, and the need to minimize the potential duplication of annotation effort. With new sequencing technologies allowing many more eukaryotic genomes to be sequenced, the demand for collaborative annotation is likely to increase. Here we present our approach, challenges and solutions facilitating a large distributed annotation project. RESULTS AND DISCUSSION: BGD has provided annotation tools that supported 147 members of the BGSAC in contributing 3,871 gene models over a fifteen-week period, and these annotations have been integrated into the bovine Official Gene Set. Our approach has been to provide an annotation system, which includes a BLAST site, multiple genome browsers, an annotation portal, and the Apollo Annotation Editor configured to connect directly to our Chado database. In addition to implementing and integrating components of the annotation system, we have performed computational analyses to create gene evidence tracks and a consensus gene set, which can be viewed on individual gene pages at BGD. CONCLUSIONS: We have provided annotation tools that alleviate challenges associated with distributed annotation. Our system provides a consistent set of data to all annotators and eliminates the need for annotators to format data. Involving the bovine research community in genome annotation has allowed us to leverage expertise in various areas of bovine biology to provide biological insight into the genome sequence.


Subjects
Cattle/genetics, Genetic Databases, Genome/genetics, Molecular Sequence Annotation, Animals, Internet, Statistics as Topic
9.
Proc Natl Acad Sci U S A ; 102(39): 13950-5, 2005 Sep 27.
Article in English | MEDLINE | ID: mdl-16172379

ABSTRACT

The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for approximately 80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes.
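
The pan-genome described above divides a species' genes into a core genome shared by all isolates and a dispensable genome of partially shared and strain-specific genes. The Python sketch below shows that partition using plain set operations. It assumes genes have already been grouped into comparable identifiers across strains; the function name, strain names and gene labels are hypothetical toy data, and this is not the authors' analysis method.

```python
from typing import Dict, Set, Tuple


def partition_pan_genome(
    strains: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str], Dict[str, Set[str]]]:
    """Split per-strain gene sets into pan, core, dispensable and strain-specific genes."""
    if not strains:
        raise ValueError("at least one strain is required")
    gene_sets = list(strains.values())
    pan = set().union(*gene_sets)          # every gene seen in any strain
    core = set.intersection(*gene_sets)    # genes present in all strains
    dispensable = pan - core               # partially shared + strain-specific
    strain_specific = {
        name: genes - set().union(
            *(other for other_name, other in strains.items() if other_name != name)
        )
        for name, genes in strains.items()
    }
    return pan, core, dispensable, strain_specific


# Toy data invented for illustration (not S. agalactiae genes).
toy_strains = {
    "strain_A": {"g1", "g2", "g3", "g4"},
    "strain_B": {"g1", "g2", "g3", "g5"},
    "strain_C": {"g1", "g2", "g4", "g6"},
}
pan, core, dispensable, specific = partition_pan_genome(toy_strains)
```

In this toy example the core shrinks, or stays the same, with each added strain, while the pan-genome can only grow. This is the behaviour that the abstract's extrapolation to hundreds of genomes relies on.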


Subjects
Bacterial Genome, Streptococcus agalactiae/classification, Streptococcus agalactiae/genetics, Amino Acid Sequence, Bacterial Capsules/genetics, Base Sequence, Gene Expression, Bacterial Genes, Genetic Variation, Molecular Sequence Data, Phylogeny, Sequence Alignment, DNA Sequence Analysis, Streptococcus agalactiae/pathogenicity, Virulence/genetics