Results 1 - 20 of 49
1.
EMBO J ; 42(23): e115008, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37964598

ABSTRACT

The main goals and challenges for the life science communities in the Open Science framework are to increase reuse and sustainability of data resources, software tools, and workflows, especially in large-scale data-driven research and computational analyses. Here, we present key findings, procedures, effective measures and recommendations for generating and establishing sustainable life science resources based on the collaborative, cross-disciplinary work done within the EOSC-Life (European Open Science Cloud for Life Sciences) consortium. Bringing together 13 European life science research infrastructures, it has laid the foundation for an open, digital space to support biological and medical research. Using lessons learned from 27 selected projects, we describe the organisational, technical, financial and legal/ethical challenges that represent the main barriers to sustainability in the life sciences. We show how EOSC-Life provides a model for sustainable data management according to FAIR (findability, accessibility, interoperability, and reusability) principles, including solutions for sensitive- and industry-related resources, by means of cross-disciplinary training and best practices sharing. Finally, we illustrate how data harmonisation and collaborative work facilitate interoperability of tools, data, solutions and lead to a better understanding of concepts, semantics and functionalities in the life sciences.


Subject(s)
Biological Science Disciplines , Biomedical Research , Software , Workflow
2.
Histochem Cell Biol ; 160(3): 211-221, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37537341

ABSTRACT

Biological imaging is one of the primary tools by which we understand living systems across scales from atoms to organisms. Rapid advances in imaging technology have increased both the spatial and temporal resolutions at which we examine those systems, as well as enabling visualisation of larger tissue volumes. These advances have huge potential but also generate ever increasing amounts of imaging data that must be stored and analysed. Public image repositories provide a critical scientific service through open data provision, supporting reproducibility of scientific results, access to reference imaging datasets and reuse of data for new scientific discovery and acceleration of image analysis methods development. The scale and scope of imaging data provides both challenges and opportunities for open sharing of image data. In this article, we provide a perspective influenced by decades of provision of open data resources for biological information, suggesting areas to focus on and a path towards global interoperability.


Subject(s)
Image Processing, Computer-Assisted , Reproducibility of Results
3.
Nucleic Acids Res ; 49(D1): D1502-D1506, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33211879

ABSTRACT

ArrayExpress (https://www.ebi.ac.uk/arrayexpress) is an archive of functional genomics data at EMBL-EBI. It was established in 2002, initially as an archive for publication-related microarray data, and was later extended to accept sequencing-based data. Over the last decade an increasing share of biological experiments has involved multiple technologies assaying different biological modalities, such as epigenetics, and RNA and protein expression, and thus the BioStudies database (https://www.ebi.ac.uk/biostudies) was established to deal with such multimodal data. Its central concept is a study, which typically is associated with a publication. BioStudies stores metadata describing the study, provides links to the relevant databases, such as the European Nucleotide Archive (ENA), and hosts the types of data for which specialized databases do not exist. With BioStudies now fully functional, we are able to further harmonize the archival data infrastructure at EMBL-EBI, and ArrayExpress is being migrated to BioStudies. In future, all functional genomics data will be archived at BioStudies. The process will be seamless for the users, who will continue to submit data using the online tool Annotare and will be able to query and download data largely in the same manner as before. Nevertheless, some technical aspects, particularly programmatic access, will change. This update guides the users through these changes.


Subject(s)
Databases, Genetic , Epigenesis, Genetic , Genomics/methods , High-Throughput Nucleotide Sequencing/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Animals , Cell Line , DNA Methylation , Gene Expression Profiling , Humans , Internet , Metadata , Organ Specificity , Plants/genetics , Single-Cell Analysis , Software
4.
Eur Respir J ; 60(2)2022 08.
Article in English | MEDLINE | ID: mdl-35086829

ABSTRACT

The Human Cell Atlas (HCA) consortium aims to establish an atlas of all organs in the healthy human body at single-cell resolution to increase our understanding of basic biological processes that govern development, physiology and anatomy, and to accelerate diagnosis and treatment of disease. The Lung Biological Network of the HCA aims to generate the Human Lung Cell Atlas as a reference for the cellular repertoire, molecular cell states and phenotypes, and cell-cell interactions that characterise normal lung homeostasis in healthy lung tissue. Such a reference atlas of the healthy human lung will facilitate mapping the changes in the cellular landscape in disease. The discovAIR project is one of six pilot actions for the HCA funded by the European Commission in the context of the H2020 framework programme. discovAIR aims to establish the first draft of an integrated Human Lung Cell Atlas, combining single-cell transcriptional and epigenetic profiling with spatially resolving techniques on matched tissue samples, as well as including a number of chronic and infectious diseases of the lung. The integrated Human Lung Cell Atlas will be available as a resource for the wider respiratory community, including basic and translational scientists, clinical medicine, and the private sector, as well as for patients with lung disease and the interested lay public. We anticipate that the Human Lung Cell Atlas will be the founding stone for a more detailed understanding of the pathogenesis of lung diseases, guiding the design of novel diagnostics and preventive or curative interventions.


Subject(s)
Lung Diseases , Lung , Humans , Proteomics , Thorax
5.
Nat Methods ; 15(11): 984, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30287931

ABSTRACT

This paper was originally published under standard Nature America Inc. copyright. As of the date of this correction, the Resource is available online as an open-access paper with a CC-BY license. No other part of the paper has been changed.

6.
Nucleic Acids Res ; 47(D1): D711-D715, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30357387

ABSTRACT

ArrayExpress (https://www.ebi.ac.uk/arrayexpress) is an archive of functional genomics data from a variety of technologies assaying functional modalities of a genome, such as gene expression or promoter occupancy. The number of experiments based on sequencing technologies, in particular RNA-seq experiments, has been increasing over the last few years and submissions of sequencing data have overtaken microarray experiments in the last 12 months. Additionally, there is a significant increase in experiments investigating single cells, rather than bulk samples, known as single-cell RNA-seq. To accommodate these trends, we have substantially changed our submission tool Annotare which, along with raw and processed data, collects all metadata necessary to interpret these experiments. Selected datasets are re-processed and loaded into our sister resource, the value-added Expression Atlas (and its component Single Cell Expression Atlas), which not only enables users to interpret the data easily but also serves as a test for data quality. With an increasing number of studies that combine different assay modalities (multi-omics experiments), a new more general archival resource the BioStudies Database has been developed, which will eventually supersede ArrayExpress. Data submissions will continue unchanged; all existing ArrayExpress data will be incorporated into BioStudies and the existing accession numbers and application programming interfaces will be maintained.


Subject(s)
Oligonucleotide Array Sequence Analysis/methods , Single-Cell Analysis/methods , Software , Databases, Genetic , RNA-Seq/methods
7.
Nat Methods ; 14(8): 775-781, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28775673

ABSTRACT

Access to primary research data is vital for the advancement of science. To extend the data types supported by community repositories, we built a prototype Image Data Resource (IDR) that collects and integrates imaging data acquired across many different imaging modalities. IDR links data from several imaging modalities, including high-content screening, super-resolution and time-lapse microscopy, digital pathology, public genetic or chemical databases, and cell and tissue phenotypes expressed using controlled ontologies. Using this integration, IDR facilitates the analysis of gene networks and reveals functional interactions that are inaccessible to individual studies. To enable re-analysis, we also established a computational resource based on Jupyter notebooks that allows remote access to the entire IDR. IDR is also an open source platform that others can use to publish their own image data. Thus IDR provides both a novel on-line resource and a software infrastructure that promotes and extends publication and re-analysis of scientific image data.


Subject(s)
Database Management Systems , Databases, Factual , Image Interpretation, Computer-Assisted/methods , Information Dissemination/methods , Software , User-Computer Interface , Algorithms , Publishing , Systems Integration
8.
Arch Toxicol ; 94(7): 2435-2461, 2020 07.
Article in English | MEDLINE | ID: mdl-32632539

ABSTRACT

Hazard assessment, based on new approach methods (NAM), requires the use of batteries of assays, where individual tests may be contributed by different laboratories. A unified strategy for such collaborative testing is presented. It details all procedures required to allow test information to be usable for integrated hazard assessment, strategic project decisions and/or for regulatory purposes. The EU-ToxRisk project developed a strategy to provide regulatorily valid data, and exemplified this using a panel of > 20 assays (with > 50 individual endpoints), each exposed to 19 well-known test compounds (e.g. rotenone, colchicine, mercury, paracetamol, rifampicine, paraquat, taxol). Examples of strategy implementation are provided for all aspects required to ensure data validity: (i) documentation of test methods in a publicly accessible database; (ii) deposition of standard operating procedures (SOP) at the European Union DB-ALM repository; (iii) test readiness scoring according to defined criteria; (iv) disclosure of the pipeline for data processing; (v) link of uncertainty measures and metadata to the data; (vi) definition of test chemicals, their handling and their behavior in test media; (vii) specification of the test purpose and overall evaluation plans. Moreover, data generation was exemplified by providing results from 25 reporter assays. A complete evaluation of the entire test battery will be described elsewhere. A major learning from the retrospective analysis of this large testing project was the need for thorough definitions of the above strategy aspects, ideally in the form of a study pre-registration, to allow adequate interpretation of the data and to ensure overall scientific/toxicological validity.


Subject(s)
Documentation , Electronic Data Processing/legislation & jurisprudence , Government Regulation , Toxicity Tests , Toxicology/legislation & jurisprudence , Animals , Cells, Cultured , Europe , Humans , Policy Making , Reproducibility of Results , Retrospective Studies , Risk Assessment , Terminology as Topic , Zebrafish/embryology
9.
Nucleic Acids Res ; 46(D1): D1266-D1270, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29069414

ABSTRACT

BioStudies (www.ebi.ac.uk/biostudies) is a new public database that organizes data from biological studies. Typically, but not exclusively, a study is associated with a publication. BioStudies offers a simple way to describe the study structure, and provides flexible data deposition tools and data access interfaces. The actual data can be stored either in BioStudies or remotely, or both. BioStudies imports supplementary data from Europe PMC, and is a resource for authors and publishers for packaging data during the manuscript preparation process. It also can support data management needs of collaborative projects. The growth in multiomics experiments and other multi-faceted approaches to life sciences research mean that studies result in a diversity of data outputs in multiple locations. BioStudies presents a solution to ensuring that all these data and the associated publication(s) can be found coherently in the longer term.


Subject(s)
Biological Science Disciplines , Databases, Factual , Animals , Humans , Internet , Software
12.
Nucleic Acids Res ; 43(Database issue): D1113-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25361974

ABSTRACT

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42,000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Genomics , High-Throughput Nucleotide Sequencing , Internet , Software
13.
BMC Med Inform Decis Mak ; 17(1): 30, 2017 03 23.
Article in English | MEDLINE | ID: mdl-28330491

ABSTRACT

BACKGROUND: Translational researchers need robust IT solutions to access a range of data types, varying from public data sets to pseudonymised patient information with restricted access, provided on a case by case basis. The reason for this complication is that managing access policies to sensitive human data must consider issues of data confidentiality, identifiability, extent of consent, and data usage agreements. All these ethical, social and legal aspects must be incorporated into a differential management of restricted access to sensitive data. METHODS: In this paper we present a pilot system that uses several common open source software components in a novel combination to coordinate access to heterogeneous biomedical data repositories containing open data (open access) as well as sensitive data (restricted access) in the domain of biobanking and biosample research. Our approach is based on a digital identity federation and software to manage resource access entitlements. RESULTS: Open source software components were assembled and configured in such a way that they allow for different ways of restricted access according to the protection needs of the data. We have tested the resulting pilot infrastructure and assessed its performance, feasibility and reproducibility. CONCLUSIONS: Common open source software components are sufficient to allow for the creation of a secure system for differential access to sensitive data. The implementation of this system is exemplary for researchers facing similar requirements for restricted access data. Here we report experience and lessons learnt of our pilot implementation, which may be useful for similar use cases. Furthermore, we discuss possible extensions for more complex scenarios.


Subject(s)
Biological Specimen Banks/standards , Biomedical Research/standards , Computer Security/standards , Datasets as Topic , Translational Research, Biomedical/standards , Humans , Pilot Projects
15.
Bioinformatics ; 31(16): 2736-40, 2015 Aug 15.
Article in English | MEDLINE | ID: mdl-25861964

ABSTRACT

MOTIVATION: The Cellular Phenotype Database (CPD) is a repository for data derived from high-throughput systems microscopy studies. The aims of this resource are: (i) to provide easy access to cellular phenotype and molecular localization data for the broader research community; (ii) to facilitate integration of independent phenotypic studies by means of data aggregation techniques, including use of an ontology and (iii) to facilitate development of analytical methods in this field. RESULTS: In this article we present CPD, its data structure and user interface, propose a minimal set of information describing RNA interference experiments, and suggest a generic schema for management and aggregation of outputs from phenotypic or molecular localization experiments. The database has a flexible structure for management of data from heterogeneous sources of systems microscopy experimental outputs generated by a variety of protocols and technologies and can be queried by gene, reagent, gene attribute, study keywords, phenotype or ontology terms. AVAILABILITY AND IMPLEMENTATION: CPD is developed as part of the Systems Microscopy Network of Excellence and is accessible at http://www.ebi.ac.uk/fg/sym. CONTACT: jes@ebi.ac.uk or ugis@ebi.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Cells/cytology , Databases as Topic , Microscopy/methods , Phenotype , Statistics as Topic , User-Computer Interface
16.
Bioinformatics ; 31(9): 1505-7, 2015 May 01.
Article in English | MEDLINE | ID: mdl-25505093

ABSTRACT

MOTIVATION: The field of toxicogenomics (the application of '-omics' technologies to risk assessment of compound toxicities) has expanded in the last decade, partly driven by new legislation, aimed at reducing animal testing in chemical risk assessment but mainly as a result of a paradigm change in toxicology towards the use and integration of genome wide data. Many research groups worldwide have generated large amounts of such toxicogenomics data. However, there is no centralized repository for archiving and making these data and associated tools for their analysis easily available. RESULTS: The Data Infrastructure for Chemical Safety Assessment (diXa) is a robust and sustainable infrastructure storing toxicogenomics data. A central data warehouse is connected to a portal with links to chemical information and molecular and phenotype data. diXa is publicly available through a user-friendly web interface. New data can be readily deposited into diXa using guidelines and templates available online. Analysis descriptions and tools for interrogating the data are available via the diXa portal. AVAILABILITY AND IMPLEMENTATION: http://www.dixa-fp7.eu CONTACT: d.hendrickx@maastrichtuniversity.nl; info@dixa-fp7.eu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Databases, Chemical , Toxicogenetics , Animals , Gene Expression Profiling , Humans , Metabolomics , Proteomics , Rats
17.
Nucleic Acids Res ; 42(Database issue): D50-2, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24265224

ABSTRACT

The BioSamples database at the EBI (http://www.ebi.ac.uk/biosamples) provides an integration point for BioSamples information between technology specific databases at the EBI, projects such as ENCODE and reference collections such as cell lines. The database delivers a unified query interface and API to query sample information across EBI's databases and provides links back to assay databases. Sample groups are used to manage related samples, e.g. those from an experimental submission, or a single reference collection. Infrastructural improvements include a new user interface with ontological and key word queries, a new query API, a new data submission API, complete RDF data download and a supporting SPARQL endpoint, accessioning at the point of submission to the European Nucleotide Archive and European Genotype Phenotype Archives and improved query response times.


Subject(s)
Databases, Genetic , Cell Line , Europe , Humans , Internet , Neoplasms/genetics , Systems Integration
18.
Nucleic Acids Res ; 41(Database issue): D987-90, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193272

ABSTRACT

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.


Subject(s)
Databases, Genetic , Genomics , Microarray Analysis , Databases, Genetic/statistics & numerical data , Databases, Genetic/trends , High-Throughput Nucleotide Sequencing , Internet , Software , User-Computer Interface
19.
Nucleic Acids Res ; 40(Database issue): D64-70, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22096232

ABSTRACT

The BioSample Database (http://www.ebi.ac.uk/biosamples) is a new database at EBI that stores information about biological samples used in molecular experiments, such as sequencing, gene expression or proteomics. The goals of the BioSample Database include: (i) recording and linking of sample information consistently within EBI databases such as ENA, ArrayExpress and PRIDE; (ii) minimizing data entry efforts for EBI database submitters by enabling submitting sample descriptions once and referencing them later in data submissions to assay databases and (iii) supporting cross database queries by sample characteristics. Each sample in the database is assigned an accession number. The database includes a growing set of reference samples, such as cell lines, which are repeatedly used in experiments and can be easily referenced from any database by their accession numbers. Accession numbers for the reference samples will be exchanged with a similar database at NCBI. The samples in the database can be queried by their attributes, such as sample types, disease names or sample providers. A simple tab-delimited format facilitates submissions of sample information to the database, initially via email to biosamples@ebi.ac.uk.


Subject(s)
Databases, Genetic , Cell Line , Gene Expression , Genomics , Proteomics , Sequence Analysis , Systems Integration , User-Computer Interface
20.
PLoS Genet ; 7(9): e1002270, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21931564

ABSTRACT

We have performed a metabolite quantitative trait locus (mQTL) study of the ¹H nuclear magnetic resonance spectroscopy (¹H NMR) metabolome in humans, building on recent targeted knowledge of genetic drivers of metabolic regulation. Urine and plasma samples were collected from two cohorts of individuals of European descent, with one cohort comprised of female twins donating samples longitudinally. Sample metabolite concentrations were quantified by ¹H NMR and tested for association with genome-wide single-nucleotide polymorphisms (SNPs). Four metabolites' concentrations exhibited significant, replicable association with SNP variation (8.6×10⁻¹¹

Subject(s)
Genome-Wide Association Study , Metabolic Networks and Pathways/genetics , Metabolome/genetics , Quantitative Trait Loci/genetics , Selection, Genetic , Acetyltransferases/genetics , Acetyltransferases/metabolism , Dimethylamines/blood , Dimethylamines/metabolism , Female , Haplotypes , Humans , Isobutyrates/metabolism , Isobutyrates/urine , Magnetic Resonance Spectroscopy , Methylamines/metabolism , Methylamines/urine , Polymorphism, Single Nucleotide