Results 1 - 6 of 6
1.
BMC Bioinformatics ; 12: 418, 2011 Oct 27.
Article in English | MEDLINE | ID: mdl-22032770

ABSTRACT

BACKGROUND: Ontologies are widely used to represent knowledge in biomedicine. Systematic approaches for detecting errors and disagreements are needed for large ontologies with hundreds or thousands of terms and semantic relationships. A recent approach of defining terms using logical definitions is now increasingly being adopted as a method for quality control as well as for facilitating interoperability and data integration. RESULTS: We show how automated reasoning over logical definitions of ontology terms can be used to improve ontology structure. We provide the Java software package GULO (Getting an Understanding of LOgical definitions), which allows fast and easy evaluation for any kind of logically decomposed ontology by generating a composite OWL ontology from appropriate subsets of the referenced ontologies and comparing the inferred relationships with the relationships asserted in the target ontology. As a case study we show how to use GULO to evaluate the logical definitions that have been developed for the Mammalian Phenotype Ontology (MPO). CONCLUSIONS: Logical definitions of terms from biomedical ontologies represent an important resource for error and disagreement detection. GULO gives ontology curators a fast and simple tool for validation of their work.
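The comparison step described in this abstract, inferred relationships checked against asserted ones, can be illustrated with a minimal sketch. It is not GULO and uses no OWL reasoner: it assumes a toy model in which each logical definition is a set of requirements and a term is a subclass of another when its requirements strictly include the other's. All term names, relation names, and function names here are hypothetical.

```python
from itertools import permutations

# Hypothetical toy definitions (not taken from MPO or GULO): each term is
# defined by a set of (relation, filler) requirements.
definitions = {
    "abnormal ear": {("inheres_in", "ear")},
    "small ear": {("inheres_in", "ear"), ("has_quality", "small")},
    "small outer ear": {
        ("inheres_in", "ear"),
        ("has_quality", "small"),
        ("part", "outer"),
    },
}

# Hypothetical asserted is-a links, written as (child, parent).
asserted = {
    ("small ear", "abnormal ear"),
    ("small outer ear", "abnormal ear"),
}


def infer_subclass(defs):
    """Toy subsumption: a is-a b when b's requirements are a strict subset of a's."""
    return {(a, b) for a, b in permutations(defs, 2) if defs[b] < defs[a]}


def transitive_closure(edges):
    """Add every (a, d) that follows from chaining (a, b) and (b, d)."""
    closure = set(edges)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


def compare(inferred, asserted_edges):
    """Return (inferred but not asserted, asserted but not inferred)."""
    asserted_closure = transitive_closure(asserted_edges)
    return inferred - asserted_closure, asserted_closure - inferred


missing, unsupported = compare(infer_subclass(definitions), asserted)
print("Inferred but not asserted:", sorted(missing))
print("Asserted but not inferred:", sorted(unsupported))
```

In this toy data the only discrepancy is the link from "small outer ear" to "small ear", which the definitions imply but the asserted hierarchy lacks. A curator would review each such difference, since it may indicate either a missing link or a faulty definition.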


Subject(s)
Semantics , Software , Vocabulary, Controlled , Animals , Humans , Knowledge , Logic , Phenotype
2.
PLoS Comput Biol ; 2(3): e15, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16518452

ABSTRACT

We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales--from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library"). Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.


Subject(s)
Computational Biology/methods , Evolution, Molecular , Genome , Animals , Anopheles/genetics , Caenorhabditis elegans/genetics , Ciona intestinalis/genetics , Drosophila melanogaster/genetics , Humans , Mice , Phylogeny , Proteomics/methods , Species Specificity
3.
OMICS ; 10(2): 185-98, 2006.
Article in English | MEDLINE | ID: mdl-16901225

ABSTRACT

The National Center for Biomedical Ontology is a consortium that comprises leading informaticians, biologists, clinicians, and ontologists, funded by the National Institutes of Health (NIH) Roadmap, to develop innovative technology and methods that allow scientists to record, manage, and disseminate biomedical information and knowledge in machine-processable form. The goals of the Center are (1) to help unify the divergent and isolated efforts in ontology development by promoting high quality open-source, standards-based tools to create, manage, and use ontologies, (2) to create new software tools so that scientists can use ontologies to annotate and analyze biomedical data, (3) to provide a national resource for the ongoing evaluation, integration, and evolution of biomedical ontologies and associated tools and theories in the context of driving biomedical projects (DBPs), and (4) to disseminate the tools and resources of the Center and to identify, evaluate, and communicate best practices of ontology development to the biomedical community. Through the research activities within the Center, collaborations with the DBPs, and interactions with the biomedical community, our goal is to help scientists to work more effectively in the e-science paradigm, enhancing experiment design, experiment execution, data analysis, information synthesis, hypothesis generation and testing, and the understanding of human disease.


Subject(s)
Biomedical Research/standards , National Institutes of Health (U.S.) , Software , Internet , Semantics , United States
4.
Genome Res ; 12(10): 1611-8, 2002 Oct.
Article in English | MEDLINE | ID: mdl-12368254

ABSTRACT

The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 yr into the most comprehensive library of Perl modules available for managing and manipulating life-science information. Bioperl provides an easy-to-use, stable, and consistent programming interface for bioinformatics application programmers. The Bioperl modules have been successfully and repeatedly used to reduce otherwise complex tasks to only a few lines of code. The Bioperl object model has been proven to be flexible enough to support enterprise-level applications such as EnsEMBL, while maintaining an easy learning curve for novice Perl programmers. Bioperl is capable of executing analyses and processing results from programs such as BLAST, ClustalW, or the EMBOSS suite. Interoperation with modules written in Python and Java is supported through the evolving BioCORBA bridge. Bioperl provides access to data stores such as GenBank and SwissProt via a flexible series of sequence input/output modules, and to the emerging common sequence data storage format of the Open Bioinformatics Database Access project. This study describes the overall architecture of the toolkit, the problem domains that it addresses, and gives specific examples of how the toolkit can be used to solve common life-sciences problems. We conclude with a discussion of how the open-source nature of the project has contributed to the development effort.


Subject(s)
Biological Science Disciplines/methods , Computational Biology/methods , Algorithms , Animals , Biological Science Disciplines/trends , Computational Biology/trends , Computer Graphics , Database Management Systems , Databases, Genetic , Humans , Internet , Online Systems , Software , Software Design , Systems Integration
5.
Genome Biol ; 3(12): RESEARCH0085, 2002.
Article in English | MEDLINE | ID: mdl-12537574

ABSTRACT

BACKGROUND: Most eukaryotic genomes include a substantial repeat-rich fraction termed heterochromatin, which is concentrated in centric and telomeric regions. The repetitive nature of heterochromatic sequence makes it difficult to assemble and analyze. To better understand the heterochromatic component of the Drosophila melanogaster genome, we characterized and annotated portions of a whole-genome shotgun sequence assembly. RESULTS: WGS3, an improved whole-genome shotgun assembly, includes 20.7 Mb of draft-quality sequence not represented in the Release 3 sequence spanning the euchromatin. We annotated this sequence using the methods employed in the re-annotation of the Release 3 euchromatic sequence. This analysis predicted 297 protein-coding genes and six non-protein-coding genes, including known heterochromatic genes, and regions of similarity to known transposable elements. Bacterial artificial chromosome (BAC)-based fluorescence in situ hybridization analysis was used to correlate the genomic sequence with the cytogenetic map in order to refine the genomic definition of the centric heterochromatin; on the basis of our cytological definition, the annotated Release 3 euchromatic sequence extends into the centric heterochromatin on each chromosome arm. CONCLUSIONS: Whole-genome shotgun assembly produced a reliable draft-quality sequence of a significant part of the Drosophila heterochromatin. Annotation of this sequence defined the intron-exon structures of 30 known protein-coding genes and 267 protein-coding gene models. The cytogenetic mapping suggests that an additional 150 predicted genes are located in heterochromatin at the base of the Release 3 euchromatic sequence. Our analysis suggests strategies for improving the sequence and annotation of the heterochromatic portions of the Drosophila and other complex genomes.


Subject(s)
Drosophila melanogaster/genetics , Genome , Heterochromatin/genetics , Sequence Analysis, DNA/methods , Algorithms , Animals , Contig Mapping , DNA Transposable Elements/genetics , Databases, Genetic , Software
6.
Genome Biol ; 3(12): RESEARCH0086, 2002.
Article in English | MEDLINE | ID: mdl-12537575

ABSTRACT

BACKGROUND: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined. RESULTS: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences. CONCLUSIONS: Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone.


Subject(s)
Computational Biology/methods , Drosophila/genetics , Genome , Animals , Conserved Sequence/genetics , Databases, Genetic , Drosophila melanogaster/genetics , Evolution, Molecular , Forecasting , Gene Rearrangement , Genes, Insect , Genetic Variation , RNA, Messenger/analysis , Sequence Analysis, DNA/methods , Species Specificity , Untranslated Regions/analysis