Search | BVS Violência e Saúde

FlyTED: the Drosophila Testis Gene Expression Database.

Zhao, Jun; Klyne, Graham; Benson, Elizabeth; Gudmannsdottir, Elin; White-Cooper, Helen; Shotton, David.

Nucleic Acids Res ; 38(Database issue): D710-5, 2010 Jan.

Article in English | MEDLINE | ID: mdl-19934263

ABSTRACT

FlyTED, the Drosophila Testis Gene Expression Database, is a biological research database for gene expression images from the testis of the fruit fly Drosophila melanogaster. It currently contains 2762 mRNA in situ hybridization images and ancillary metadata revealing the patterns of gene expression of 817 Drosophila genes in testes of wild type flies and of seven meiotic arrest mutant strains in which spermatogenesis is defective. This database has been built by adapting a widely used digital library repository software system, EPrints (http://eprints.org/software/), and provides both web-based search and browse interfaces, and programmatic access via an SQL dump, OAI-PMH and SPARQL. FlyTED is available at http://www.fly-ted.org/.

Subjects

Computational Biology/methods , Genetic Databases , Nucleic Acid Databases , Drosophila melanogaster/metabolism , Gene Expression Regulation , Testis/metabolism , Animals , Computational Biology/trends , Protein Databases , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Insect Genes , Information Storage and Retrieval/methods , Internet , Male , Meiosis , Software

Linked data and provenance in biological data webs.

Zhao, Jun; Miles, Alistair; Klyne, Graham; Shotton, David.

Brief Bioinform ; 10(2): 139-52, 2009 Mar.

Article in English | MEDLINE | ID: mdl-19060306

ABSTRACT

The Web is now being used as a platform for publishing and linking life science data. The Web's linking architecture can be exploited to join heterogeneous data from multiple sources. However, as data are frequently being updated in a decentralized environment, provenance information becomes critical to providing reliable and trustworthy services to scientists. This article presents design patterns for representing and querying provenance information relating to mapping links between heterogeneous data from sources in the domain of functional genomics. We illustrate the use of named resource description framework (RDF) graphs at different levels of granularity to make provenance assertions about linked data, and demonstrate that these assertions are sufficient to support requirements including data currency, integrity, evidential support and historical queries.

Subjects

Biology , Data Collection/methods , Database Management Systems , Information Storage and Retrieval/methods , Internet , Semantics , Algorithms , Animals , Biology/methods , Factual Databases , Humans , Information Dissemination , Knowledge Bases , Software , User-Computer Interface , Controlled Vocabulary

Adventures in semantic publishing: exemplar semantic enhancements of a research article.

Shotton, David; Portwin, Katie; Klyne, Graham; Miles, Alistair.

PLoS Comput Biol ; 5(4): e1000361, 2009 Apr.

Article in English | MEDLINE | ID: mdl-19381256

ABSTRACT

Scientific innovation depends on finding, integrating, and re-using the products of previous research. Here we explore how recent developments in Web technology, particularly those related to the publication of data and metadata, might assist that process by providing semantic enhancements to journal articles within the mainstream process of scholarly journal publishing. We exemplify this by describing semantic enhancements we have made to a recent biomedical research article taken from PLoS Neglected Tropical Diseases, providing enrichment to its content and increased access to datasets within it. These semantic enhancements include provision of live DOIs and hyperlinks; semantic markup of textual terms, with links to relevant third-party information resources; interactive figures; a re-orderable reference list; a document summary containing a study summary, a tag cloud, and a citation analysis; and two novel types of semantic enrichment: the first, a Supporting Claims Tooltip to permit "Citations in Context", and the second, Tag Trees that bring together semantically related terms. In addition, we have published downloadable spreadsheets containing data from within tables and figures, have enriched these with provenance information, and have demonstrated various types of data fusion (mashups) with results from other research articles and with Google Maps. We have also published machine-readable RDF metadata both about the article and about the references it cites, for which we developed a Citation Typing Ontology, CiTO (http://purl.org/net/cito/). The enhanced article, which is available at http://dx.doi.org/10.1371/journal.pntd.0000228.x001, presents a compelling existence proof of the possibilities of semantic publication. 
We hope the showcase of examples and ideas it contains, described in this paper, will excite the imaginations of researchers and publishers, stimulating them to explore the possibilities of semantic publishing for their own research articles, and thereby break down present barriers to the discovery and re-use of information within traditional modes of scholarly communication.

Subjects

Information Dissemination/methods , Natural Language Processing , Periodicals as Topic , Publishing , Research Design , Semantics , Writing

OpenFlyData: an exemplar data web integrating gene expression data on the fruit fly Drosophila melanogaster.

Miles, Alistair; Zhao, Jun; Klyne, Graham; White-Cooper, Helen; Shotton, David.

J Biomed Inform ; 43(5): 752-61, 2010 Oct.

Article in English | MEDLINE | ID: mdl-20382263

ABSTRACT

MOTIVATION: Integrating heterogeneous data across distributed sources is a major requirement for in silico bioinformatics supporting translational research. For example, genome-scale data on patterns of gene expression in the fruit fly Drosophila melanogaster are widely used in functional genomic studies in many organisms to inform candidate gene selection and validate experimental results. However, current data integration solutions tend to be heavy weight, and require significant initial and ongoing investment of effort. Development of a common Web-based data integration infrastructure (a.k.a. data web), using Semantic Web standards, promises to alleviate these difficulties, but little is known about the feasibility, costs, risks or practical means of migrating to such an infrastructure. RESULTS: We describe the development of OpenFlyData, a proof-of-concept system integrating gene expression data on D. melanogaster, combining Semantic Web standards with light-weight approaches to Web programming based on Web 2.0 design patterns. To support researchers designing and validating functional genomic studies, OpenFlyData includes user-facing search applications providing intuitive access to and comparison of gene expression data from FlyAtlas, the BDGP in situ database, and FlyTED, using data from FlyBase to expand and disambiguate gene names. OpenFlyData's services are also openly accessible, and are available for reuse by other bioinformaticians and application developers. Semi-automated methods and tools were developed to support labour- and knowledge-intensive tasks involved in deploying SPARQL services. These include methods for generating ontologies and relational-to-RDF mappings for relational databases, which we illustrate using the FlyBase Chado database schema; and methods for mapping gene identifiers between databases. The advantages of using Semantic Web standards for biomedical data integration are discussed, as are open issues. 
In particular, although the performance of open source SPARQL implementations is sufficient to query gene expression data directly from user-facing applications such as Web-based data fusions (a.k.a. mashups), we found open SPARQL endpoints to be vulnerable to denial-of-service-type problems, which must be mitigated to ensure reliability of services based on this standard. These results are relevant to data integration activities in translational bioinformatics. AVAILABILITY: The gene expression search applications and SPARQL endpoints developed for OpenFlyData are deployed at http://openflydata.org. FlyUI, a library of JavaScript widgets providing re-usable user-interface components for Drosophila gene expression data, is available at http://flyui.googlecode.com. Software and ontologies to support transformation of data from FlyBase, FlyAtlas, BDGP and FlyTED to RDF are available at http://openflydata.googlecode.com. SPARQLite, an implementation of the SPARQL protocol, is available at http://sparqlite.googlecode.com. All software is provided under the GPL version 3 open source license.

Subjects

Database Management Systems , Genetic Databases , Drosophila melanogaster/genetics , Internet , Animals , Drosophila melanogaster/physiology , Gene Expression , Gene Expression Profiling , Nucleic Acid Hybridization , Oligonucleotide Array Sequence Analysis , User-Computer Interface

Structuring research methods and data with the research object model: genomics workflows as a case study.

Hettne, Kristina M; Dharuri, Harish; Zhao, Jun; Wolstencroft, Katherine; Belhajjame, Khalid; Soiland-Reyes, Stian; Mina, Eleni; Thompson, Mark; Cruickshank, Don; Verdes-Montenegro, Lourdes; Garrido, Julian; de Roure, David; Corcho, Oscar; Klyne, Graham; van Schouwen, Reinout; 't Hoen, Peter A C; Bechhofer, Sean; Goble, Carole; Roos, Marco.

J Biomed Semantics ; 5(1): 41, 2014.

Article in English | MEDLINE | ID: mdl-25276335

ABSTRACT

BACKGROUND: One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide necessary meta-data for a scientist to understand and recreate the results of an experiment. To support this we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, text, etc. We applied this model to a case study where we analysed human metabolite variation by workflows. RESULTS: We present the application of the workflow-centric RO model for our bioinformatics case study. Three workflows were produced following recently defined Best Practices for workflow design. By modelling the experiment as an RO, we were able to automatically query the experiment and answer questions such as "which particular data was input to a particular workflow to test a particular hypothesis?", and "which particular conclusions were drawn from a particular workflow?". CONCLUSIONS: Applying a workflow-centric RO model to aggregate and annotate the resources used in a bioinformatics experiment allowed us to retrieve the conclusions of the experiment in the context of the driving hypothesis, the executed workflows and their input data. The RO model is an extendable reference model that can be used by other systems as well. AVAILABILITY: The Research Object is available at http://www.myexperiment.org/packs/428. The Wf4Ever Research Object Model is available at http://wf4ever.github.io/ro.
