Search | BVS Violência e Saúde

1.

WikiPathways 2024: next generation pathway database.

Agrawal, Ayushi; Balci, Hasan; Hanspers, Kristina; Coort, Susan L; Martens, Marvin; Slenter, Denise N; Ehrhart, Friederike; Digles, Daniela; Waagmeester, Andra; Wassink, Isabel; Abbassi-Daloii, Tooba; Lopes, Elisson N; Iyer, Aishwarya; Acosta, Javier Millán; Willighagen, Lars G; Nishida, Kozo; Riutta, Anders; Basaric, Helena; Evelo, Chris T; Willighagen, Egon L; Kutmon, Martina; Pico, Alexander R.

Nucleic Acids Res ; 52(D1): D679-D689, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-37941138

ABSTRACT

WikiPathways (wikipathways.org) is an open-source biological pathway database. Collaboration and open science are pivotal to the success of WikiPathways. Here we highlight the continuing efforts supporting WikiPathways, content growth and collaboration among pathway researchers. As an evolving database, there is a growing need for WikiPathways to address and overcome technical challenges. In this direction, WikiPathways has undergone major restructuring, enabling a renewed approach for sharing and curating pathway knowledge, thus providing stability for the future of community pathway curation. The website has been redesigned to improve and enhance user experience. This next generation of WikiPathways continues to support existing features while improving maintainability of the database and facilitating community input by providing new functionality and leveraging automation.

Subjects

Databases, Factual

2.

Complex Portal 2022: new curation frontiers.

Meldal, Birgit H M; Perfetto, Livia; Combe, Colin; Lubiana, Tiago; Ferreira Cavalcante, João Vitor; Bye-A-Jee, Hema; Waagmeester, Andra; Del-Toro, Noemi; Shrivastava, Anjali; Barrera, Elisabeth; Wong, Edith; Mlecnik, Bernhard; Bindea, Gabriela; Panneerselvam, Kalpana; Willighagen, Egon; Rappsilber, Juri; Porras, Pablo; Hermjakob, Henning; Orchard, Sandra.

Nucleic Acids Res ; 50(D1): D578-D586, 2022 Jan 07.

Article in English | MEDLINE | ID: mdl-34718729

ABSTRACT

The Complex Portal (www.ebi.ac.uk/complexportal) is a manually curated, encyclopaedic database of macromolecular complexes with known function from a range of model organisms. It summarizes complex composition, topology and function along with links to a large range of domain-specific resources (e.g. wwPDB, EMDB and Reactome). Since the last update in 2019, we have produced a first draft complexome for Escherichia coli, maintained and updated that of Saccharomyces cerevisiae, added over 40 coronavirus complexes and increased the human complexome to over 1100 complexes that include approximately 200 complexes that act as targets for viral proteins or are part of the immune system. The display of protein features in ComplexViewer has been improved and the participant table is now colour-coordinated with the nodes in ComplexViewer. Community collaboration has expanded, for example by contributing to an analysis of putative transcription cofactors and providing data accessible to semantic web tools through Wikidata which is now populated with manually curated Complex Portal content through a new bot. Our data license is now CC0 to encourage data reuse. Users are encouraged to get in touch, provide us with feedback and send curation requests through the 'Support' link.

Subjects

Data Curation/methods; Databases, Protein; Multiprotein Complexes/chemistry; Coronavirus/chemistry; Data Visualization; Databases, Chemical; Enzymes/chemistry; Enzymes/metabolism; Escherichia coli/chemistry; Humans; International Cooperation; Molecular Sequence Annotation; Multiprotein Complexes/metabolism; User-Computer Interface

3.

WikiPathways: connecting communities.

Martens, Marvin; Ammar, Ammar; Riutta, Anders; Waagmeester, Andra; Slenter, Denise N; Hanspers, Kristina; A Miller, Ryan; Digles, Daniela; Lopes, Elisson N; Ehrhart, Friederike; Dupuis, Lauren J; Winckers, Laurent A; Coort, Susan L; Willighagen, Egon L; Evelo, Chris T; Pico, Alexander R; Kutmon, Martina.

Nucleic Acids Res ; 49(D1): D613-D621, 2021 Jan 08.

Article in English | MEDLINE | ID: mdl-33211851

ABSTRACT

WikiPathways (https://www.wikipathways.org) is a biological pathway database known for its collaborative nature and open science approaches. With the core idea of the scientific community developing and curating biological knowledge in pathway models, WikiPathways lowers all barriers for accessing and using its content. Increasingly more content creators, initiatives, projects and tools have started using WikiPathways. Central in this growth and increased use of WikiPathways are the various communities that focus on particular subsets of molecular pathways such as for rare diseases and lipid metabolism. Knowledge from published pathway figures helps prioritize pathway development, using optical character and named entity recognition. We show the growth of WikiPathways over the last three years, highlight the new communities and collaborations of pathway authors and curators, and describe various technologies to connect to external resources and initiatives. The road toward a sustainable, community-driven pathway database goes through integration with other resources such as Wikidata and allowing more use, curation and redistribution of WikiPathways content.

Subjects

Databases, Factual; COVID-19/pathology; Data Curation; Humans; Publications; User-Computer Interface

4.

A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses.

Waagmeester, Andra; Willighagen, Egon L; Su, Andrew I; Kutmon, Martina; Gayo, Jose Emilio Labra; Fernández-Álvarez, Daniel; Groom, Quentin; Schaap, Peter J; Verhagen, Lisa M; Koehorst, Jasper J.

BMC Biol ; 19(1): 12, 2021 Jan 22.

Article in English | MEDLINE | ID: mdl-33482803

ABSTRACT

BACKGROUND: Pandemics, even more than other medical problems, require swift integration of knowledge. When caused by a new virus, understanding the underlying biology may help in finding solutions. In a setting where there are a large number of loosely related projects and initiatives, we need common ground, also known as a "commons." Wikidata, a public knowledge graph aligned with Wikipedia, is such a commons and uses unique identifiers to link knowledge in other knowledge bases. However, Wikidata may not always have the right schema for the urgent questions. In this paper, we address this problem by showing how a data schema required for the integration can be modeled with entity schemas represented by Shape Expressions. RESULTS: As a telling example, we describe the process of aligning resources on the genomes and proteomes of the SARS-CoV-2 virus and related viruses as well as how Shape Expressions can be defined for Wikidata to model the knowledge, helping others studying the SARS-CoV-2 pandemic. How this model can be used to make data between various resources interoperable is demonstrated by integrating data from NCBI (National Center for Biotechnology Information) Taxonomy, NCBI Genes, UniProt, and WikiPathways. Based on that model, a set of automated applications or bots were written for regular updates of these sources in Wikidata and added to a platform for automatically running these updates. CONCLUSIONS: Although this workflow is developed and applied in the context of the COVID-19 pandemic, to demonstrate its broader applicability it was also applied to other human coronaviruses (MERS, SARS, human coronavirus NL63, human coronavirus 229E, human coronavirus HKU1, human coronavirus OC43).

Subjects

COVID-19/pathology; Genomics/methods; Knowledge Bases; Proteomics/methods; SARS-CoV-2/physiology; COVID-19/metabolism; COVID-19/virology; Coronavirus/genetics; Coronavirus/physiology; Coronavirus Infections/metabolism; Coronavirus Infections/pathology; Coronavirus Infections/virology; Genome, Viral; Humans; Internet; Pandemics; SARS-CoV-2/genetics; Viral Proteins/genetics; Viral Proteins/metabolism; Workflow

5.

WikiPathways: a multifaceted pathway database bridging metabolomics to other omics research.

Slenter, Denise N; Kutmon, Martina; Hanspers, Kristina; Riutta, Anders; Windsor, Jacob; Nunes, Nuno; Mélius, Jonathan; Cirillo, Elisa; Coort, Susan L; Digles, Daniela; Ehrhart, Friederike; Giesbertz, Pieter; Kalafati, Marianthi; Martens, Marvin; Miller, Ryan; Nishida, Kozo; Rieswijk, Linda; Waagmeester, Andra; Eijssen, Lars M T; Evelo, Chris T; Pico, Alexander R; Willighagen, Egon L.

Nucleic Acids Res ; 46(D1): D661-D667, 2018 Jan 04.

Article in English | MEDLINE | ID: mdl-29136241

ABSTRACT

WikiPathways (wikipathways.org) captures the collective knowledge represented in biological pathways. By providing a database in a curated, machine readable way, omics data analysis and visualization is enabled. WikiPathways and other pathway databases are used to analyze experimental data by research groups in many fields. Due to the open and collaborative nature of the WikiPathways platform, our content keeps growing and is getting more accurate, making WikiPathways a reliable and rich pathway database. Previously, however, the focus was primarily on genes and proteins, leaving many metabolites with only limited annotation. Recent curation efforts focused on improving the annotation of metabolism and metabolic pathways by associating unmapped metabolites with database identifiers and providing more detailed interaction knowledge. Here, we report the outcomes of the continued growth and curation efforts, such as a doubling of the number of annotated metabolite nodes in WikiPathways. Furthermore, we introduce an OpenAPI documentation of our web services and the FAIR (Findable, Accessible, Interoperable and Reusable) annotation of resources to increase the interoperability of the knowledge encoded in these pathways and experimental omics data. New search options, monthly downloads, more links to metabolite databases, and new portals make pathway knowledge more effortlessly accessible to individual researchers and research communities.

Subjects

Databases, Chemical; Metabolomics; Animals; Data Curation; Data Mining; Databases, Chemical/standards; Databases, Genetic; Humans; Metabolic Networks and Pathways; Quality Control; Search Engine; Software

6.

Ten quick tips for editing Wikidata.

Shafee, Thomas; Mietchen, Daniel; Lubiana, Tiago; Jemielniak, Dariusz; Waagmeester, Andra.

PLoS Comput Biol ; 19(7): e1011235, 2023 Jul.

Article in English | MEDLINE | ID: mdl-37471307

7.

Author Correction: A protocol for adding knowledge to Wikidata: aligning resources on human coronaviruses.

Waagmeester, Andra; Willighagen, Egon L; Su, Andrew I; Kutmon, Martina; Gayo, Jose Emilio Labra; Fernández-Álvarez, Daniel; Groom, Quentin; Schaap, Peter J; Verhagen, Lisa M; Koehorst, Jasper J.

BMC Biol ; 21(1): 261, 2023 Nov 16.

Article in English | MEDLINE | ID: mdl-37974169

8.

WikiPathways: capturing the full diversity of pathway knowledge.

Kutmon, Martina; Riutta, Anders; Nunes, Nuno; Hanspers, Kristina; Willighagen, Egon L; Bohler, Anwesha; Mélius, Jonathan; Waagmeester, Andra; Sinha, Sravanthi R; Miller, Ryan; Coort, Susan L; Cirillo, Elisa; Smeets, Bart; Evelo, Chris T; Pico, Alexander R.

Nucleic Acids Res ; 44(D1): D488-94, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26481357

ABSTRACT

WikiPathways (http://www.wikipathways.org) is an open, collaborative platform for capturing and disseminating models of biological pathways for data visualization and analysis. Since our last NAR update, 4 years ago, WikiPathways has experienced massive growth in content, which continues to be contributed by hundreds of individuals each year. New aspects of the diversity and depth of the collected pathways are described from the perspective of researchers interested in using pathway information in their studies. We provide updates on extensions and services to support pathway analysis and visualization via popular standalone tools, i.e. PathVisio and Cytoscape, web applications and common programming environments. We introduce the Quick Edit feature for pathway authors and curators, in addition to new means of publishing pathways and maintaining custom pathway collections to serve specific research topics and communities. In addition to the latest milestones in our pathway collection and curation effort, we also highlight the latest means to access the content as publishable figures, as standard data files, and as linked data, including bulk and programmatic access.

Subjects

Databases, Chemical; Models, Biological; Gene Expression Profiling; Genes; Humans; Metabolomics

9.

Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources.

Waagmeester, Andra; Kutmon, Martina; Riutta, Anders; Miller, Ryan; Willighagen, Egon L; Evelo, Chris T; Pico, Alexander R.

PLoS Comput Biol ; 12(6): e1004989, 2016 Jun.

Article in English | MEDLINE | ID: mdl-27336457

ABSTRACT

The diversity of online resources storing biological data in different formats provides a challenge for bioinformaticians to integrate and analyse their biological data. The semantic web provides a standard to facilitate knowledge integration using statements built as triples describing a relation between two objects. WikiPathways, an online collaborative pathway resource, is now available in the semantic web through a SPARQL endpoint at http://sparql.wikipathways.org. Having biological pathways in the semantic web allows rapid integration with data from other resources that contain information about elements present in pathways using SPARQL queries. In order to convert WikiPathways content into meaningful triples we developed two new vocabularies that capture the graphical representation and the pathway logic, respectively. Each gene, protein, and metabolite in a given pathway is defined with a standard set of identifiers to support linking to several other biological resources in the semantic web. WikiPathways triples were loaded into the Open PHACTS discovery platform and are available through its Web API (https://dev.openphacts.org/docs) to be used in various tools for drug development. We combined various semantic web resources with the newly converted WikiPathways content using a variety of SPARQL query types and third-party resources, such as the Open PHACTS API. The ability to use pathway information to form new links across diverse biological data highlights the utility of integrating WikiPathways in the semantic web.

Subjects

Biological Ontologies; Computational Biology/methods; Information Storage and Retrieval/methods; Internet; Semantics; Biomedical Research; Humans

10.

Ten simple rules for creating reusable pathway models for computational analysis and visualization.

Hanspers, Kristina; Kutmon, Martina; Coort, Susan L; Digles, Daniela; Dupuis, Lauren J; Ehrhart, Friederike; Hu, Finterly; Lopes, Elisson N; Martens, Marvin; Pham, Nhung; Shin, Woosub; Slenter, Denise N; Waagmeester, Andra; Willighagen, Egon L; Winckers, Laurent A; Evelo, Chris T; Pico, Alexander R.

PLoS Comput Biol ; 17(8): e1009226, 2021 Aug.

Article in English | MEDLINE | ID: mdl-34411100

Subjects

Computational Biology/methods; Metabolic Networks and Pathways; Models, Biological; Animals; Computer Graphics; Databases, Factual; Humans; Terminology as Topic

11.

Understanding signaling and metabolic paths using semantified and harmonized information about biological interactions.

Miller, Ryan A; Kutmon, Martina; Bohler, Anwesha; Waagmeester, Andra; Evelo, Chris T; Willighagen, Egon L.

PLoS One ; 17(4): e0263057, 2022.

Article in English | MEDLINE | ID: mdl-35436299

ABSTRACT

To grasp the complexity of biological processes, the biological knowledge is often translated into schematic diagrams of, for example, signalling and metabolic pathways. These pathway diagrams describe relevant connections between biological entities and incorporate domain knowledge in a visual format making it easier for humans to interpret. Still, these diagrams can be represented in machine readable formats, as done in the KEGG, Reactome, and WikiPathways databases. However, while humans are good at interpreting the message of the creators of diagrams, algorithms struggle when the diversity in drawing approaches increases. WikiPathways supports multiple drawing styles which need harmonizing to offer semantically enriched access. Particularly challenging, here, are the interactions between the biological entities that underlie the biological causality. These interactions provide information about the biological process (metabolic conversion, inhibition, etc.), the direction, and the participating entities. Availability of the interactions in a semantic and harmonized format is essential for searching the full network of biological interactions. We here study how the graphically-modelled biological knowledge in diagrams can be semantified and harmonized, and exemplify how the resulting data is used to programmatically answer biological questions. We find that we can translate graphically modelled knowledge to a sufficient degree into a semantic model and discuss some of the current limitations. We then use this to show that reproducible notebooks can be used to explore up- and downstream targets of MECP2 and to analyse the sphingolipid metabolism. Our results demonstrate that most of the graphical biological knowledge from WikiPathways is modelled into the semantic layer with the semantic information intact and connectivity information preserved. 
Being able to evaluate how biological elements affect each other is useful and allows, for example, the identification of up or downstream targets that will have a similar effect when modified.

Subjects

Biological Phenomena; Signal Transduction; Algorithms; Databases, Factual; Humans; Metabolic Networks and Pathways; Signal Transduction/physiology

12.

Nature Europe site should highlight most productive countries.

Evelo, Chris T; Waagmeester, Andra.

Nature ; 465(7299): 685, 2010 Jun 10.

Article in English | MEDLINE | ID: mdl-20535178

Subjects

Internet; Publishing/statistics & numerical data; Research/statistics & numerical data; Europe; Research/economics; Research/standards

13.

Wikidata as a knowledge graph for the life sciences.

Waagmeester, Andra; Stupp, Gregory; Burgstaller-Muehlbacher, Sebastian; Good, Benjamin M; Griffith, Malachi; Griffith, Obi L; Hanspers, Kristina; Hermjakob, Henning; Hudson, Toby S; Hybiske, Kevin; Keating, Sarah M; Manske, Magnus; Mayers, Michael; Mietchen, Daniel; Mitraka, Elvira; Pico, Alexander R; Putman, Timothy; Riutta, Anders; Queralt-Rosinach, Nuria; Schriml, Lynn M; Shafee, Thomas; Slenter, Denise; Stephan, Ralf; Thornton, Katherine; Tsueng, Ginger; Tu, Roger; Ul-Hasan, Sabah; Willighagen, Egon; Wu, Chunlei; Su, Andrew I.

Elife ; 9, 2020 Mar 17.

Article in English | MEDLINE | ID: mdl-32180547

ABSTRACT

Wikidata is a community-maintained knowledge base that has been assembled from repositories in the fields of genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases, and that adheres to the FAIR principles of findability, accessibility, interoperability and reusability. Here we describe the breadth and depth of the biomedical knowledge contained within Wikidata, and discuss the open-source tools we have built to add information to Wikidata and to synchronize it with source databases. We also demonstrate several use cases for Wikidata, including the crowdsourced curation of biomedical ontologies, phenotype-based diagnosis of disease, and drug repurposing.

Subjects

Biological Science Disciplines; Computational Biology; Databases, Factual; Genomics; Proteomics; Humans; Pattern Recognition, Automated

14.

The public road to high-quality curated biological pathways.

Adriaens, Michiel E; Jaillard, Magali; Waagmeester, Andra; Coort, Susan L M; Pico, Alex R; Evelo, Chris T A.

Drug Discov Today ; 13(19-20): 856-62, 2008 Oct.

Article in English | MEDLINE | ID: mdl-18652912

ABSTRACT

Biological pathways are abstract and functional visual representations of existing biological knowledge. By mapping high-throughput data on these representations, changes and patterns in biological systems on the genetic, metabolic and protein level are instantly assessable. Many public domain repositories exist for storing biological pathways, each applying its own conventions and storage format. A pathway-based content review of these repositories reveals that none of them are comprehensive. To address this issue, we apply a general workflow to create curated biological pathways, in which we combine three content sources: public domain databases, literature and experts. In this workflow all content of a particular biological pathway is manually retrieved from biological pathway databases and literature, after which this content is compared, combined and subsequently curated by experts. From the curated content, new biological pathways can be created for a pathway analysis tool of choice and distributed among its user base. We applied this procedure to construct high-quality curated biological pathways involved in human fatty acid metabolism.

Subjects

Artificial Intelligence; Biological Science Disciplines/standards; Biological Science Disciplines/trends; Animals; Databases, Factual; Fatty Acids/metabolism; Humans

15.

Explicit interaction information from WikiPathways in RDF facilitates drug discovery in the Open PHACTS Discovery Platform.

Miller, Ryan A; Woollard, Peter; Willighagen, Egon L; Digles, Daniela; Kutmon, Martina; Loizou, Antonis; Waagmeester, Andra; Senger, Stefan; Evelo, Chris T.

F1000Res ; 7: 75, 2018.

Article in English | MEDLINE | ID: mdl-30416713

ABSTRACT

Open PHACTS is a pre-competitive project to answer scientific questions developed recently by the pharmaceutical industry. Having high quality biological interaction information in the Open PHACTS Discovery Platform is needed to answer multiple pathway related questions. To address this, updated WikiPathways data has been added to the platform. This data includes information about biological interactions, such as stimulation and inhibition. The platform's Application Programming Interface (API) was extended with appropriate calls to reference these interactions. These new methods of the Open PHACTS API are available now.

Subjects

Antineoplastic Agents/pharmacology; Biomedical Research; Computational Biology/methods; Drug Discovery; Information Storage and Retrieval/methods; Signal Transduction; Software; Drug Industry; Humans; Hypertrophy/drug therapy; Hypertrophy/pathology; Myocytes, Cardiac/cytology; Myocytes, Cardiac/drug effects; Neoplasms/drug therapy; Neoplasms/pathology

16.

WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata.

Putman, Tim E; Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian; Waagmeester, Andra; Diesh, Colin; Dunn, Nathan; Munoz-Torres, Monica; Stupp, Gregory S; Wu, Chunlei; Su, Andrew I; Good, Benjamin M.

Database (Oxford) ; 2017(1), 2017 Jan 01.

Article in English | MEDLINE | ID: mdl-28365742

ABSTRACT

With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources don't exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction. Database URL: www.wikigenomes.org.

Subjects

Databases, Nucleic Acid; Genome; Internet; Molecular Sequence Annotation/methods; Molecular Sequence Annotation/standards

17.

Identifying sigma factors in Mycobacterium smegmatis by comparative genomic analysis.

Waagmeester, Andra; Thompson, Julie; Reyrat, Jean-Marc.

Trends Microbiol ; 13(11): 505-9, 2005 Nov.

Article in English | MEDLINE | ID: mdl-16140533

ABSTRACT

Mycobacterium smegmatis is a saprophytic species that has been used for 15 years as a model to perform heterologous regulation and virulence studies of Mycobacterium tuberculosis. Members of the extracytoplasmic sigma factors family, which are required for adaptive responses to various environmental stresses, are responsible for some of the virulence traits of M. tuberculosis. A bioinformatic search on the genome of M. smegmatis has predicted the existence of 26 sigma factors, which is twice the number that are present in M. tuberculosis. A phylogenetic analysis has shown that despite this high number of sigma factors the orthologs of the genes sigC, sigI and sigK of M. tuberculosis are absent in the M. smegmatis genome. Several sigma factors are specific for M. smegmatis, with a special enrichment in the sigH and, to a lesser extent, in the sigJ and sigL subfamily, pinpointing the potential variability of the repertoire of adaptive response in this saprophytic species.

Subjects

Genome, Bacterial; Mycobacterium smegmatis/genetics; Sigma Factor/genetics; Adaptation, Physiological/genetics; Computational Biology; Genomics; Mycobacterium tuberculosis/genetics; Phylogeny; Virulence Factors/genetics

18.

Centralizing content and distributing labor: a community model for curating the very long tail of microbial genomes.

Putman, Tim E; Burgstaller-Muehlbacher, Sebastian; Waagmeester, Andra; Wu, Chunlei; Su, Andrew I; Good, Benjamin M.

Database (Oxford) ; 2016, 2016.

Article in English | MEDLINE | ID: mdl-27022157

ABSTRACT

The last 20 years of advancement in sequencing technologies have led to sequencing thousands of microbial genomes, creating mountains of genetic data. While efficiency in generating the data improves almost daily, applying meaningful relationships between taxonomic and genetic entities on this scale requires a structured and integrative approach. Currently, knowledge is distributed across a fragmented landscape of resources from government-funded institutions such as National Center for Biotechnology Information (NCBI) and UniProt to topic-focused databases like the ODB3 database of prokaryotic operons, to the supplemental table of a primary publication. A major drawback to large scale, expert-curated databases is the expense of maintaining and extending them over time. No entity apart from a major institution with stable long-term funding can consider this, and their scope is limited considering the magnitude of microbial data being generated daily. Wikidata is an openly editable, semantic web compatible framework for knowledge representation. It is a project of the Wikimedia Foundation and offers knowledge integration capabilities ideally suited to the challenge of representing the exploding body of information about microbial genomics. We are developing a microbial specific data model, based on Wikidata's semantic web compatibility, which represents bacterial species, strains and the gene and gene products that define them. Currently, we have loaded 43,694 gene and 37,966 protein items for 21 species of bacteria, including the human pathogenic bacteria Chlamydia trachomatis. Using this pathogen as an example, we explore complex interactions between the pathogen, its host, associated genes, other microbes, disease and drugs using the Wikidata SPARQL endpoint. In our next phase of development, we will add another 99 bacterial genomes and their gene and gene products, totaling ~900,000 additional entities.
This aggregation of knowledge will be a platform for community-driven collaboration, allowing the networking of microbial genetic data through the sharing of knowledge by both the data and domain expert.

Subjects

Data Curation; Genome, Microbial; Models, Theoretical; Female; Gene Ontology; Genes, Bacterial; Humans; Molecular Sequence Annotation; Operon/genetics; Search Engine

19.

Wikidata as a semantic framework for the Gene Wiki initiative.

Burgstaller-Muehlbacher, Sebastian; Waagmeester, Andra; Mitraka, Elvira; Turner, Julia; Putman, Tim; Leong, Justin; Naik, Chinmay; Pavlidis, Paul; Schriml, Lynn; Good, Benjamin M; Su, Andrew I.

Database (Oxford) ; 2016, 2016.

Article in English | MEDLINE | ID: mdl-26989148

ABSTRACT

Open biological data are distributed over many resources making them challenging to integrate, to update and to disseminate quickly. Wikidata is a growing, open community database which can serve this purpose and also provides tight integration with Wikipedia. In order to improve the state of biological data, facilitate data management and dissemination, we imported all human and mouse genes, and all human and mouse proteins into Wikidata. In total, 59,721 human genes and 73,355 mouse genes have been imported from NCBI and 27,306 human proteins and 16,728 mouse proteins have been imported from the Swissprot subset of UniProt. As Wikidata is open and can be edited by anybody, our corpus of imported data serves as the starting point for integration of further data by scientists, the Wikidata community and citizen scientists alike. The first use case for these data is to populate Wikipedia Gene Wiki infoboxes directly from Wikidata with the data integrated above. This enables immediate updates of the Gene Wiki infoboxes as soon as the data in Wikidata are modified. Although Gene Wiki pages are currently only on the English language version of Wikipedia, the multilingual nature of Wikidata allows for usage of the data we imported in all 280 different language Wikipedias. Apart from the Gene Wiki infobox use case, a SPARQL endpoint and exporting functionality to several standard formats (e.g. JSON, XML) enable use of the data by scientists. In summary, we created a fully open and extensible data resource for human and mouse molecular biology and biochemistry data. This resource enriches all the Wikipedias with structured information and serves as a new linking hub for the biological semantic web. Database URL: https://www.wikidata.org/.

Subjects

Databases, Nucleic Acid; Semantics; Animals; Humans; Mice; Models, Theoretical; Search Engine

20.

The FAIR Guiding Principles for scientific data management and stewardship.

Wilkinson, Mark D; Dumontier, Michel; Aalbersberg, I Jsbrand Jan; Appleton, Gabrielle; Axton, Myles; Baak, Arie; Blomberg, Niklas; Boiten, Jan-Willem; da Silva Santos, Luiz Bonino; Bourne, Philip E; Bouwman, Jildau; Brookes, Anthony J; Clark, Tim; Crosas, Mercè; Dillo, Ingrid; Dumon, Olivier; Edmunds, Scott; Evelo, Chris T; Finkers, Richard; Gonzalez-Beltran, Alejandra; Gray, Alasdair J G; Groth, Paul; Goble, Carole; Grethe, Jeffrey S; Heringa, Jaap; 't Hoen, Peter A C; Hooft, Rob; Kuhn, Tobias; Kok, Ruben; Kok, Joost; Lusher, Scott J; Martone, Maryann E; Mons, Albert; Packer, Abel L; Persson, Bengt; Rocca-Serra, Philippe; Roos, Marco; van Schaik, Rene; Sansone, Susanna-Assunta; Schultes, Erik; Sengstag, Thierry; Slater, Ted; Strawn, George; Swertz, Morris A; Thompson, Mark; van der Lei, Johan; van Mulligen, Erik; Velterop, Jan; Waagmeester, Andra; Wittenburg, Peter.

Sci Data ; 3: 160018, 2016 Mar 15.

Article in English | MEDLINE | ID: mdl-26978244

ABSTRACT

There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.

Subjects

Data Collection; Data Curation; Research Design; Database Management Systems; Guidelines as Topic; Reproducibility of Results
