Results 1 - 17 of 17
1.
Plant Biotechnol J ; 19(8): 1670-1678, 2021 08.
Article in English | MEDLINE | ID: mdl-33750020

ABSTRACT

The generation of new ideas and scientific hypotheses is often the result of extensive literature and database searches, but, with the growing wealth of public and private knowledge, the process of searching diverse and interconnected data to generate new insights into genes, gene networks, traits and diseases is becoming both more complex and more time-consuming. To guide this technically challenging data integration task and to make gene discovery and hypothesis generation easier for researchers, we have developed a comprehensive software package called KnetMiner, which is open-source and containerized for easy use. KnetMiner is an integrated, intelligent, interactive gene and gene network discovery platform that helps scientists explore and understand the biological stories of complex traits and diseases across species. It features fast algorithms for generating rich interactive gene networks and prioritizing candidate genes based on knowledge mining approaches. KnetMiner is used in many plant science institutions and has been adopted by several plant breeding organizations to accelerate gene discovery. The software is generic and customizable and can therefore be readily applied to new species and data types; for example, it has been applied to pest insects and fungal pathogens, and has most recently been repurposed to support COVID-19 research. Here, we give an overview of the main approaches behind KnetMiner and we report plant-centric case studies for identifying genes, gene networks and trait relationships in Triticum aestivum (bread wheat), as well as an evidence-based approach to rank candidate genes under a large Arabidopsis thaliana QTL. KnetMiner is available at: https://knetminer.org.


Subject(s)
COVID-19 , Multifactorial Inheritance , Genetic Association Studies , Humans , Plant Breeding , SARS-CoV-2
2.
J Integr Bioinform ; 15(3)2018 Aug 07.
Article in English | MEDLINE | ID: mdl-30085931

ABSTRACT

The speed and accuracy of new scientific discoveries, be it by humans or artificial intelligence, depend on the quality of the underlying data and on the technology to connect, search and share the data efficiently. In recent years, we have seen the rise of graph databases and semi-formal data models such as knowledge graphs to facilitate software approaches to scientific discovery. These approaches extend work based on formalised models, such as the Semantic Web. In this paper, we present our developments to connect, search and share data about genome-scale knowledge networks (GSKN). We have developed a simple application ontology based on OWL/RDF with mappings to standard schemas. We are employing the ontology to power data access services like resolvable URIs, SPARQL endpoints, JSON-LD web APIs and Neo4j-based knowledge graphs. We demonstrate how the proposed ontology and graph databases considerably improve search and access to interoperable and reusable biological knowledge (i.e. the FAIR data principles).


Subject(s)
Computational Biology/methods , Computer Graphics , Gene Regulatory Networks , Genome, Human , Software , Databases, Factual , Genome-Wide Association Study , Humans , Knowledge
3.
BMC Med Inform Decis Mak ; 17(1): 30, 2017 03 23.
Article in English | MEDLINE | ID: mdl-28330491

ABSTRACT

BACKGROUND: Translational researchers need robust IT solutions to access a range of data types, varying from public data sets to pseudonymised patient information with restricted access, provided on a case-by-case basis. The reason for this complication is that managing access policies to sensitive human data must consider issues of data confidentiality, identifiability, extent of consent, and data usage agreements. All these ethical, social and legal aspects must be incorporated into a differential management of restricted access to sensitive data. METHODS: In this paper we present a pilot system that uses several common open source software components in a novel combination to coordinate access to heterogeneous biomedical data repositories containing open data (open access) as well as sensitive data (restricted access) in the domain of biobanking and biosample research. Our approach is based on a digital identity federation and software to manage resource access entitlements. RESULTS: Open source software components were assembled and configured in such a way that they allow for different ways of restricted access according to the protection needs of the data. We have tested the resulting pilot infrastructure and assessed its performance, feasibility and reproducibility. CONCLUSIONS: Common open source software components are sufficient to allow for the creation of a secure system for differential access to sensitive data. The implementation of this system can serve as an example for researchers facing similar requirements for restricted access data. Here we report the experience and lessons learnt from our pilot implementation, which may be useful for similar use cases. Furthermore, we discuss possible extensions for more complex scenarios.


Subject(s)
Biological Specimen Banks/standards , Biomedical Research/standards , Computer Security/standards , Datasets as Topic , Translational Research, Biomedical/standards , Humans , Pilot Projects
4.
Nucleic Acids Res ; 43(Database issue): D1113-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25361974

ABSTRACT

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42,000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Genomics , High-Throughput Nucleotide Sequencing , Internet , Software
5.
Bioinformatics ; 30(9): 1338-9, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24413672

ABSTRACT

MOTIVATION: The Resource Description Framework (RDF) is an emerging technology for describing, publishing and linking life science data. As a major provider of bioinformatics data and services, the European Bioinformatics Institute (EBI) is committed to making data readily accessible to the community in ways that meet existing demand. The EBI RDF platform has been developed to meet an increasing demand to coordinate RDF activities across the institute and provides a new entry point to querying and exploring integrated resources available at the EBI.


Subject(s)
Computational Biology/methods , Databases, Genetic , Academies and Institutes , Biomedical Research , Internet
6.
Nucleic Acids Res ; 42(Database issue): D50-2, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24265224

ABSTRACT

The BioSamples database at the EBI (http://www.ebi.ac.uk/biosamples) provides an integration point for BioSamples information between technology-specific databases at the EBI, projects such as ENCODE and reference collections such as cell lines. The database delivers a unified query interface and API to query sample information across EBI's databases and provides links back to assay databases. Sample groups are used to manage related samples, e.g. those from an experimental submission, or a single reference collection. Infrastructural improvements include a new user interface with ontological and keyword queries, a new query API, a new data submission API, complete RDF data download and a supporting SPARQL endpoint, accessioning at the point of submission to the European Nucleotide Archive and European Genotype Phenotype Archives and improved query response times.


Subject(s)
Databases, Genetic , Cell Line , Europe , Humans , Internet , Neoplasms/genetics , Systems Integration
7.
Nucleic Acids Res ; 41(Database issue): D987-90, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193272

ABSTRACT

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.


Subject(s)
Databases, Genetic , Genomics , Microarray Analysis , Databases, Genetic/statistics & numerical data , Databases, Genetic/trends , High-Throughput Nucleotide Sequencing , Internet , Software , User-Computer Interface
8.
Bioinformatics ; 28(12): 1665-7, 2012 Jun 15.
Article in English | MEDLINE | ID: mdl-22556367

ABSTRACT

MOTIVATION: Spreadsheet-like tabular formats are ever more popular in the biomedical field as a means of experimental reporting. The problem of converting the graph of an experimental workflow into a table-based representation occurs in many such formats and is not easy to solve. RESULTS: We describe graph2tab, a library that implements methods to realise such a conversion in a size-optimised way. Our solution is generic and can be adapted to specific cases of data exporters or data converters that need to be implemented. AVAILABILITY AND IMPLEMENTATION: The library source code and documentation are available at http://github.com/ISA-tools/graph2tab.


Subject(s)
Computer Graphics , Programming Languages , Workflow , Computational Biology/methods , Databases, Factual , Oligonucleotide Array Sequence Analysis
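
As a rough illustration of the problem described in entry 8, the sketch below flattens a small workflow graph into table rows by listing every source-to-sink path. This is a naive enumeration, not the size-optimised method that graph2tab implements; the function name `workflow_to_rows` and the sample, extract and assay labels are hypothetical.

```python
from typing import Dict, List


def workflow_to_rows(edges: Dict[str, List[str]]) -> List[List[str]]:
    """Return one table row per source-to-sink path of a directed acyclic graph.

    `edges` maps each node to the list of nodes it feeds into.
    Assumes the graph has no cycles (otherwise the recursion would not end).
    """
    targets = {t for outs in edges.values() for t in outs}
    nodes = set(edges) | targets
    # Sources are nodes that nothing points to; sorted for a stable row order.
    sources = sorted(n for n in nodes if n not in targets)

    rows: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        path = path + [node]
        successors = edges.get(node, [])
        if not successors:
            # A node without successors ends the path: emit it as a row.
            rows.append(path)
            return
        for nxt in successors:
            walk(nxt, path)

    for source in sources:
        walk(source, [])
    return rows


# Hypothetical workflow: two samples pooled into one extract, run on two assays.
example = {
    "sample1": ["extract1"],
    "sample2": ["extract1"],
    "extract1": ["assay1", "assay2"],
}
rows = workflow_to_rows(example)
for row in rows:
    print("\t".join(row))
```

On this toy graph the naive approach yields four rows, although two rows (for instance sample1, extract1, assay1 and sample2, extract1, assay2) already cover every edge. That redundancy is presumably the kind of thing a size-optimised conversion aims to avoid.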
9.
Nucleic Acids Res ; 40(Database issue): D64-70, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22096232

ABSTRACT

The BioSample Database (http://www.ebi.ac.uk/biosamples) is a new database at EBI that stores information about biological samples used in molecular experiments, such as sequencing, gene expression or proteomics. The goals of the BioSample Database include: (i) recording and linking of sample information consistently within EBI databases such as ENA, ArrayExpress and PRIDE; (ii) minimizing data entry efforts for EBI database submitters by allowing sample descriptions to be submitted once and referenced later in data submissions to assay databases; and (iii) supporting cross-database queries by sample characteristics. Each sample in the database is assigned an accession number. The database includes a growing set of reference samples, such as cell lines, which are repeatedly used in experiments and can be easily referenced from any database by their accession numbers. Accession numbers for the reference samples will be exchanged with a similar database at NCBI. The samples in the database can be queried by their attributes, such as sample types, disease names or sample providers. A simple tab-delimited format facilitates submissions of sample information to the database, initially via email to biosamples@ebi.ac.uk.


Subject(s)
Databases, Genetic , Cell Line , Gene Expression , Genomics , Proteomics , Sequence Analysis , Systems Integration , User-Computer Interface
10.
Brief Bioinform ; 12(6): 562-75, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21969471

ABSTRACT

Biomedical research relies increasingly on large collections of data sets and knowledge whose generation, representation and analysis often require large collaborative and interdisciplinary efforts. This dimension of 'big data' research calls for the development of computational tools to manage such a vast amount of data, as well as tools that can improve communication and access to information from collaborating researchers and from the wider community. Whenever research projects have a defined temporal scope, an additional issue of data management arises, namely how the knowledge generated within the project can be made available beyond its boundaries and life-time. DC-THERA is a European 'Network of Excellence' (NoE) that spawned a very large collaborative and interdisciplinary research community, focusing on the development of novel immunotherapies derived from fundamental research in dendritic cell immunobiology. In this article we introduce the DC-THERA Directory, which is an information system designed to support knowledge management for this research community and beyond. We present how the use of metadata and Semantic Web technologies can effectively help to organize the knowledge generated by modern collaborative research, how these technologies can enable effective data management solutions during and beyond the project lifecycle, and how resources such as the DC-THERA Directory fit into the larger context of e-science.


Subject(s)
Information Dissemination/methods , Information Storage and Retrieval/methods , Semantics , Translational Research, Biomedical , Database Management Systems , Internet
11.
Nucleic Acids Res ; 39(Database issue): D1002-4, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21071405

ABSTRACT

The ArrayExpress Archive (http://www.ebi.ac.uk/arrayexpress) is one of the three international public repositories of functional genomics data supporting publications. It includes data generated by sequencing or array-based technologies. Data are submitted by users and imported directly from the NCBI Gene Expression Omnibus. The ArrayExpress Archive is closely integrated with the Gene Expression Atlas and the sequence databases at the European Bioinformatics Institute. Advanced queries provided via ontology-enabled interfaces include queries based on technology and sample attributes such as disease, cell types and anatomy.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Genomics , High-Throughput Nucleotide Sequencing , Oligonucleotide Array Sequence Analysis , Gene Expression
12.
Immunome Res ; 6: 10, 2010 Nov 19.
Article in English | MEDLINE | ID: mdl-21092113

ABSTRACT

BACKGROUND: The advent of Systems Biology has been accompanied by the blooming of pathway databases. Currently pathways are defined generically with respect to the organ or cell type where a reaction takes place. The cell type specificity of the reactions is the foundation of immunological research, and capturing this specificity is of paramount importance when using pathway-based analyses to decipher complex immunological datasets. Here, we present DC-ATLAS, a novel and versatile resource for the interpretation of high-throughput data generated by perturbing the signaling network of dendritic cells (DCs). RESULTS: Pathways are annotated using a novel data model, the Biological Connection Markup Language (BCML), an SBGN-compliant data format developed to store the large amount of information collected. The application of DC-ATLAS to pathway-based analysis of the transcriptional program of DCs stimulated with agonists of the toll-like receptor family allows an integrated description of the flow of information from the cellular sensors to the functional outcome, capturing the temporal series of activation events by grouping sets of reactions that occur at different time points in well-defined functional modules. CONCLUSIONS: The initiative significantly improves our understanding of DC biology and regulatory networks. Developing a systems biology approach for the immune system holds the promise of translating knowledge on the immune system into more successful immunotherapy strategies.

13.
Bioinformatics ; 26(18): 2354-6, 2010 Sep 15.
Article in English | MEDLINE | ID: mdl-20679334

ABSTRACT

SUMMARY: The first open source software suite for experimentalists and curators that (i) assists in the annotation and local management of experimental metadata from high-throughput studies employing one or a combination of omics and other technologies; (ii) empowers users to take up community-defined checklists and ontologies; and (iii) facilitates submission to international public repositories. AVAILABILITY AND IMPLEMENTATION: Software, documentation, case studies and implementations at http://www.isa-tools.org.


Subject(s)
Software , Checklist , Documentation
14.
Nucleic Acids Res ; 37(Database issue): D868-72, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19015125

ABSTRACT

ArrayExpress (http://www.ebi.ac.uk/arrayexpress) consists of three components: the ArrayExpress Repository, a public archive of functional genomics experiments and supporting data; the ArrayExpress Warehouse, a database of gene expression profiles and other bio-measurements; and the ArrayExpress Atlas, a new summary database and meta-analytical tool of ranked gene expression across multiple experiments and different biological conditions. The Repository contains data from over 6000 experiments comprising approximately 200,000 assays, and the database doubles in size every 15 months. The majority of the data are array based, but other data types are included, most recently ultra-high-throughput sequencing transcriptomics and epigenetic data. The Warehouse and Atlas allow users to query for differentially expressed genes by gene names and properties, experimental conditions and sample properties, or a combination of both. In this update, we describe the ArrayExpress developments over the last two years.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Genomics
15.
Summit Transl Bioinform ; 2009: 112-5, 2009 Mar 01.
Article in English | MEDLINE | ID: mdl-21347181

ABSTRACT

BACKGROUND: As the size and complexity of scientific datasets and the corresponding information stores grow, standards for collecting, describing, formatting, submitting and exchanging information are playing an increasingly active role. Several initiatives occupy strategic positions in the international scenario, both within and across domains. However, the job of harmonising reporting standards is still very much a work in progress; both software interoperability and the data integration remain challenging as things stand. RESULTS: The status quo with respect to standardization initiatives is summarized here, with particular emphasis on the motivation for, and the challenges of, ongoing synergistic activities amongst the academic community focused on the creation of truly interoperable standards. CONCLUSIONS: Groups generating standards should engage with ongoing cross-domain activities to simplify the integration of heterogeneous data sets to the greatest possible extent.

16.
OMICS ; 12(2): 143-9, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18447634

ABSTRACT

This article summarizes the motivation for, and the proceedings of, the first ISA-TAB workshop held December 6-8, 2007, at the EBI, Cambridge, UK. This exploratory workshop, organized by members of the Microarray Gene Expression Data (MGED) Society's Reporting Structure for Biological Investigations (RSBI) working group, brought together a group of developers of a range of collaborative systems to discuss the use of a common format to address the pressing need of reporting and communicating data and metadata from biological, biomedical, and environmental studies employing combinations of genomics, transcriptomics, proteomics, and metabolomics technologies along with more conventional methodologies. The expertise of the participants comprised database development, data management, and hands-on experience in the development of data communication standards. The workshop's outcomes are set to help formalize the proposed Investigation, Study, Assay (ISA)-TAB tab-delimited format for representing and communicating experimental metadata. This article is part of the special issue of OMICS on the activities of the Genomics Standards Consortium (GSC).


Subject(s)
Computational Biology , Database Management Systems , Education , Genomics , Proteomics , RNA, Messenger/genetics , United Kingdom
17.
BMC Bioinformatics ; 8 Suppl 1: S21, 2007 Mar 08.
Article in English | MEDLINE | ID: mdl-17430566

ABSTRACT

BACKGROUND: Gene expression databases are key resources for microarray data management and analysis and the importance of a proper annotation of their content is well understood. Public repositories as well as microarray database systems that can be implemented by single laboratories exist. However, there is not yet a tool that can easily support a collaborative environment where different users with different rights of access to data can interact to define a common highly coherent content. The aim of the Genopolis database is to provide a resource that allows different groups performing microarray experiments related to a common subject to create a common coherent knowledge base and to analyse it. The Genopolis database has been implemented as a dedicated system for the scientific community studying dendritic cell and macrophage functions and host-parasite interactions. RESULTS: The Genopolis Database system allows the community to build an object-based, MIAME-compliant annotation of their experiments and to store images, raw and processed data from the Affymetrix GeneChip platform. It supports dynamic definition of controlled vocabularies and provides automated and supervised steps to control the coherence of data and annotations. It allows precise control of the visibility of the database content to different subgroups in the community and facilitates exports of its content to public repositories. It provides an interactive user interface for data analysis: this allows users to visualize data matrices based on functional lists and sample characterization, and to navigate to other data matrices defined by similarity of expression values as well as functional characterizations of genes involved. A collaborative environment is also provided for the definition and sharing of functional annotation by users. CONCLUSION: The Genopolis Database supports a community in building a common coherent knowledge base and analysing it. This fills a gap between a local database and a public repository, where the development of a common coherent annotation is important. In its current implementation, it provides a uniform coherently annotated dataset on dendritic cells and macrophage differentiation.


Subject(s)
Database Management Systems , Databases, Genetic , Information Storage and Retrieval/methods , Internet , Oligonucleotide Array Sequence Analysis/methods , User-Computer Interface , Gene Expression Profiling