Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 49
Filtrar
1.
EMBO J ; 42(23): e115008, 2023 Dec 01.
Artigo em Inglês | MEDLINE | ID: mdl-37964598

RESUMO

The main goals and challenges for the life science communities in the Open Science framework are to increase reuse and sustainability of data resources, software tools, and workflows, especially in large-scale data-driven research and computational analyses. Here, we present key findings, procedures, effective measures and recommendations for generating and establishing sustainable life science resources based on the collaborative, cross-disciplinary work done within the EOSC-Life (European Open Science Cloud for Life Sciences) consortium. Bringing together 13 European life science research infrastructures, it has laid the foundation for an open, digital space to support biological and medical research. Using lessons learned from 27 selected projects, we describe the organisational, technical, financial and legal/ethical challenges that represent the main barriers to sustainability in the life sciences. We show how EOSC-Life provides a model for sustainable data management according to FAIR (findability, accessibility, interoperability, and reusability) principles, including solutions for sensitive- and industry-related resources, by means of cross-disciplinary training and best practices sharing. Finally, we illustrate how data harmonisation and collaborative work facilitate interoperability of tools, data, solutions and lead to a better understanding of concepts, semantics and functionalities in the life sciences.


Assuntos
Disciplinas das Ciências Biológicas , Pesquisa Biomédica , Software , Fluxo de Trabalho
2.
Histochem Cell Biol ; 160(3): 211-221, 2023 Sep.
Artigo em Inglês | MEDLINE | ID: mdl-37537341

RESUMO

Biological imaging is one of the primary tools by which we understand living systems across scales from atoms to organisms. Rapid advances in imaging technology have increased both the spatial and temporal resolutions at which we examine those systems, as well as enabling visualisation of larger tissue volumes. These advances have huge potential but also generate ever increasing amounts of imaging data that must be stored and analysed. Public image repositories provide a critical scientific service through open data provision, supporting reproducibility of scientific results, access to reference imaging datasets and reuse of data for new scientific discovery and acceleration of image analysis methods development. The scale and scope of imaging data provides both challenges and opportunities for open sharing of image data. In this article, we provide a perspective influenced by decades of provision of open data resources for biological information, suggesting areas to focus on and a path towards global interoperability.


Assuntos
Processamento de Imagem Assistida por Computador , Reprodutibilidade dos Testes
3.
Nucleic Acids Res ; 49(D1): D1502-D1506, 2021 01 08.
Artigo em Inglês | MEDLINE | ID: mdl-33211879

RESUMO

ArrayExpress (https://www.ebi.ac.uk/arrayexpress) is an archive of functional genomics data at EMBL-EBI, established in 2002, initially as an archive for publication-related microarray data and was later extended to accept sequencing-based data. Over the last decade an increasing share of biological experiments involve multiple technologies assaying different biological modalities, such as epigenetics, and RNA and protein expression, and thus the BioStudies database (https://www.ebi.ac.uk/biostudies) was established to deal with such multimodal data. Its central concept is a study, which typically is associated with a publication. BioStudies stores metadata describing the study, provides links to the relevant databases, such as European Nucleotide Archive (ENA), as well as hosts the types of data for which specialized databases do not exist. With BioStudies now fully functional, we are able to further harmonize the archival data infrastructure at EMBL-EBI, and ArrayExpress is being migrated to BioStudies. In future, all functional genomics data will be archived at BioStudies. The process will be seamless for the users, who will continue to submit data using the online tool Annotare and will be able to query and download data largely in the same manner as before. Nevertheless, some technical aspects, particularly programmatic access, will change. This update guides the users through these changes.


Assuntos
Bases de Dados Genéticas , Epigênese Genética , Genômica/métodos , Sequenciamento de Nucleotídeos em Larga Escala/estatística & dados numéricos , Análise de Sequência com Séries de Oligonucleotídeos/estatística & dados numéricos , Animais , Linhagem Celular , Metilação de DNA , Perfilação da Expressão Gênica , Humanos , Internet , Metadados , Especificidade de Órgãos , Plantas/genética , Análise de Célula Única , Software
4.
Eur Respir J ; 60(2)2022 08.
Artigo em Inglês | MEDLINE | ID: mdl-35086829

RESUMO

The Human Cell Atlas (HCA) consortium aims to establish an atlas of all organs in the healthy human body at single-cell resolution to increase our understanding of basic biological processes that govern development, physiology and anatomy, and to accelerate diagnosis and treatment of disease. The Lung Biological Network of the HCA aims to generate the Human Lung Cell Atlas as a reference for the cellular repertoire, molecular cell states and phenotypes, and cell-cell interactions that characterise normal lung homeostasis in healthy lung tissue. Such a reference atlas of the healthy human lung will facilitate mapping the changes in the cellular landscape in disease. The discovAIR project is one of six pilot actions for the HCA funded by the European Commission in the context of the H2020 framework programme. discovAIR aims to establish the first draft of an integrated Human Lung Cell Atlas, combining single-cell transcriptional and epigenetic profiling with spatially resolving techniques on matched tissue samples, as well as including a number of chronic and infectious diseases of the lung. The integrated Human Lung Cell Atlas will be available as a resource for the wider respiratory community, including basic and translational scientists, clinical medicine, and the private sector, as well as for patients with lung disease and the interested lay public. We anticipate that the Human Lung Cell Atlas will be the founding stone for a more detailed understanding of the pathogenesis of lung diseases, guiding the design of novel diagnostics and preventive or curative interventions.


Assuntos
Pneumopatias , Pulmão , Humanos , Proteômica , Tórax
5.
Nat Methods ; 15(11): 984, 2018 Nov.
Artigo em Inglês | MEDLINE | ID: mdl-30287931

RESUMO

This paper was originally published under standard Nature America Inc. copyright. As of the date of this correction, the Resource is available online as an open-access paper with a CC-BY license. No other part of the paper has been changed.

6.
Nucleic Acids Res ; 47(D1): D711-D715, 2019 01 08.
Artigo em Inglês | MEDLINE | ID: mdl-30357387

RESUMO

ArrayExpress (https://www.ebi.ac.uk/arrayexpress) is an archive of functional genomics data from a variety of technologies assaying functional modalities of a genome, such as gene expression or promoter occupancy. The number of experiments based on sequencing technologies, in particular RNA-seq experiments, has been increasing over the last few years and submissions of sequencing data have overtaken microarray experiments in the last 12 months. Additionally, there is a significant increase in experiments investigating single cells, rather than bulk samples, known as single-cell RNA-seq. To accommodate these trends, we have substantially changed our submission tool Annotare which, along with raw and processed data, collects all metadata necessary to interpret these experiments. Selected datasets are re-processed and loaded into our sister resource, the value-added Expression Atlas (and its component Single Cell Expression Atlas), which not only enables users to interpret the data easily but also serves as a test for data quality. With an increasing number of studies that combine different assay modalities (multi-omics experiments), a new more general archival resource the BioStudies Database has been developed, which will eventually supersede ArrayExpress. Data submissions will continue unchanged; all existing ArrayExpress data will be incorporated into BioStudies and the existing accession numbers and application programming interfaces will be maintained.


Assuntos
Análise de Sequência com Séries de Oligonucleotídeos/métodos , Análise de Célula Única/métodos , Software , Bases de Dados Genéticas , RNA-Seq/métodos
7.
Nat Methods ; 14(8): 775-781, 2017 Aug.
Artigo em Inglês | MEDLINE | ID: mdl-28775673

RESUMO

Access to primary research data is vital for the advancement of science. To extend the data types supported by community repositories, we built a prototype Image Data Resource (IDR) that collects and integrates imaging data acquired across many different imaging modalities. IDR links data from several imaging modalities, including high-content screening, super-resolution and time-lapse microscopy, digital pathology, public genetic or chemical databases, and cell and tissue phenotypes expressed using controlled ontologies. Using this integration, IDR facilitates the analysis of gene networks and reveals functional interactions that are inaccessible to individual studies. To enable re-analysis, we also established a computational resource based on Jupyter notebooks that allows remote access to the entire IDR. IDR is also an open source platform that others can use to publish their own image data. Thus IDR provides both a novel on-line resource and a software infrastructure that promotes and extends publication and re-analysis of scientific image data.


Assuntos
Sistemas de Gerenciamento de Base de Dados , Bases de Dados Factuais , Interpretação de Imagem Assistida por Computador/métodos , Disseminação de Informação/métodos , Software , Interface Usuário-Computador , Algoritmos , Editoração , Integração de Sistemas
8.
Arch Toxicol ; 94(7): 2435-2461, 2020 07.
Artigo em Inglês | MEDLINE | ID: mdl-32632539

RESUMO

Hazard assessment, based on new approach methods (NAM), requires the use of batteries of assays, where individual tests may be contributed by different laboratories. A unified strategy for such collaborative testing is presented. It details all procedures required to allow test information to be usable for integrated hazard assessment, strategic project decisions and/or for regulatory purposes. The EU-ToxRisk project developed a strategy to provide regulatorily valid data, and exemplified this using a panel of > 20 assays (with > 50 individual endpoints), each exposed to 19 well-known test compounds (e.g. rotenone, colchicine, mercury, paracetamol, rifampicine, paraquat, taxol). Examples of strategy implementation are provided for all aspects required to ensure data validity: (i) documentation of test methods in a publicly accessible database; (ii) deposition of standard operating procedures (SOP) at the European Union DB-ALM repository; (iii) test readiness scoring accoding to defined criteria; (iv) disclosure of the pipeline for data processing; (v) link of uncertainty measures and metadata to the data; (vi) definition of test chemicals, their handling and their behavior in test media; (vii) specification of the test purpose and overall evaluation plans. Moreover, data generation was exemplified by providing results from 25 reporter assays. A complete evaluation of the entire test battery will be described elsewhere. A major learning from the retrospective analysis of this large testing project was the need for thorough definitions of the above strategy aspects, ideally in form of a study pre-registration, to allow adequate interpretation of the data and to ensure overall scientific/toxicological validity.


Assuntos
Documentação , Processamento Eletrônico de Dados/legislação & jurisprudência , Regulamentação Governamental , Testes de Toxicidade , Toxicologia/legislação & jurisprudência , Animais , Células Cultivadas , Europa (Continente) , Humanos , Formulação de Políticas , Reprodutibilidade dos Testes , Estudos Retrospectivos , Medição de Risco , Terminologia como Assunto , Peixe-Zebra/embriologia
9.
Nucleic Acids Res ; 46(D1): D1266-D1270, 2018 01 04.
Artigo em Inglês | MEDLINE | ID: mdl-29069414

RESUMO

BioStudies (www.ebi.ac.uk/biostudies) is a new public database that organizes data from biological studies. Typically, but not exclusively, a study is associated with a publication. BioStudies offers a simple way to describe the study structure, and provides flexible data deposition tools and data access interfaces. The actual data can be stored either in BioStudies or remotely, or both. BioStudies imports supplementary data from Europe PMC, and is a resource for authors and publishers for packaging data during the manuscript preparation process. It also can support data management needs of collaborative projects. The growth in multiomics experiments and other multi-faceted approaches to life sciences research mean that studies result in a diversity of data outputs in multiple locations. BioStudies presents a solution to ensuring that all these data and the associated publication(s) can be found coherently in the longer term.


Assuntos
Disciplinas das Ciências Biológicas , Bases de Dados Factuais , Animais , Humanos , Internet , Software
12.
Nucleic Acids Res ; 43(Database issue): D1113-6, 2015 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-25361974

RESUMO

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42,000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.


Assuntos
Bases de Dados Genéticas , Perfilação da Expressão Gênica , Análise de Sequência com Séries de Oligonucleotídeos , Genômica , Sequenciamento de Nucleotídeos em Larga Escala , Internet , Software
13.
BMC Med Inform Decis Mak ; 17(1): 30, 2017 03 23.
Artigo em Inglês | MEDLINE | ID: mdl-28330491

RESUMO

BACKGROUND: Translational researchers need robust IT solutions to access a range of data types, varying from public data sets to pseudonymised patient information with restricted access, provided on a case by case basis. The reason for this complication is that managing access policies to sensitive human data must consider issues of data confidentiality, identifiability, extent of consent, and data usage agreements. All these ethical, social and legal aspects must be incorporated into a differential management of restricted access to sensitive data. METHODS: In this paper we present a pilot system that uses several common open source software components in a novel combination to coordinate access to heterogeneous biomedical data repositories containing open data (open access) as well as sensitive data (restricted access) in the domain of biobanking and biosample research. Our approach is based on a digital identity federation and software to manage resource access entitlements. RESULTS: Open source software components were assembled and configured in such a way that they allow for different ways of restricted access according to the protection needs of the data. We have tested the resulting pilot infrastructure and assessed its performance, feasibility and reproducibility. CONCLUSIONS: Common open source software components are sufficient to allow for the creation of a secure system for differential access to sensitive data. The implementation of this system is exemplary for researchers facing similar requirements for restricted access data. Here we report experience and lessons learnt of our pilot implementation, which may be useful for similar use cases. Furthermore, we discuss possible extensions for more complex scenarios.


Assuntos
Bancos de Espécimes Biológicos/normas , Pesquisa Biomédica/normas , Segurança Computacional/normas , Conjuntos de Dados como Assunto , Pesquisa Translacional Biomédica/normas , Humanos , Projetos Piloto
15.
Bioinformatics ; 31(16): 2736-40, 2015 Aug 15.
Artigo em Inglês | MEDLINE | ID: mdl-25861964

RESUMO

MOTIVATION: The Cellular Phenotype Database (CPD) is a repository for data derived from high-throughput systems microscopy studies. The aims of this resource are: (i) to provide easy access to cellular phenotype and molecular localization data for the broader research community; (ii) to facilitate integration of independent phenotypic studies by means of data aggregation techniques, including use of an ontology and (iii) to facilitate development of analytical methods in this field. RESULTS: In this article we present CPD, its data structure and user interface, propose a minimal set of information describing RNA interference experiments, and suggest a generic schema for management and aggregation of outputs from phenotypic or molecular localization experiments. The database has a flexible structure for management of data from heterogeneous sources of systems microscopy experimental outputs generated by a variety of protocols and technologies and can be queried by gene, reagent, gene attribute, study keywords, phenotype or ontology terms. AVAILABILITY AND IMPLEMENTATION: CPD is developed as part of the Systems Microscopy Network of Excellence and is accessible at http://www.ebi.ac.uk/fg/sym. CONTACT: jes@ebi.ac.uk or ugis@ebi.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Células/citologia , Bases de Dados como Assunto , Microscopia/métodos , Fenótipo , Estatística como Assunto , Interface Usuário-Computador
16.
Bioinformatics ; 31(9): 1505-7, 2015 May 01.
Artigo em Inglês | MEDLINE | ID: mdl-25505093

RESUMO

MOTIVATION: The field of toxicogenomics (the application of '-omics' technologies to risk assessment of compound toxicities) has expanded in the last decade, partly driven by new legislation, aimed at reducing animal testing in chemical risk assessment but mainly as a result of a paradigm change in toxicology towards the use and integration of genome wide data. Many research groups worldwide have generated large amounts of such toxicogenomics data. However, there is no centralized repository for archiving and making these data and associated tools for their analysis easily available. RESULTS: The Data Infrastructure for Chemical Safety Assessment (diXa) is a robust and sustainable infrastructure storing toxicogenomics data. A central data warehouse is connected to a portal with links to chemical information and molecular and phenotype data. diXa is publicly available through a user-friendly web interface. New data can be readily deposited into diXa using guidelines and templates available online. Analysis descriptions and tools for interrogating the data are available via the diXa portal. AVAILABILITY AND IMPLEMENTATION: http://www.dixa-fp7.eu CONTACT: d.hendrickx@maastrichtuniversity.nl; info@dixa-fp7.eu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Bases de Dados de Compostos Químicos , Toxicogenética , Animais , Perfilação da Expressão Gênica , Humanos , Metabolômica , Proteômica , Ratos
17.
Nucleic Acids Res ; 42(Database issue): D50-2, 2014 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-24265224

RESUMO

The BioSamples database at the EBI (http://www.ebi.ac.uk/biosamples) provides an integration point for BioSamples information between technology specific databases at the EBI, projects such as ENCODE and reference collections such as cell lines. The database delivers a unified query interface and API to query sample information across EBI's databases and provides links back to assay databases. Sample groups are used to manage related samples, e.g. those from an experimental submission, or a single reference collection. Infrastructural improvements include a new user interface with ontological and key word queries, a new query API, a new data submission API, complete RDF data download and a supporting SPARQL endpoint, accessioning at the point of submission to the European Nucleotide Archive and European Genotype Phenotype Archives and improved query response times.


Assuntos
Bases de Dados Genéticas , Linhagem Celular , Europa (Continente) , Humanos , Internet , Neoplasias/genética , Integração de Sistemas
18.
Nucleic Acids Res ; 41(Database issue): D987-90, 2013 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-23193272

RESUMO

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.


Assuntos
Bases de Dados Genéticas , Genômica , Análise em Microsséries , Bases de Dados Genéticas/estatística & dados numéricos , Bases de Dados Genéticas/tendências , Sequenciamento de Nucleotídeos em Larga Escala , Internet , Software , Interface Usuário-Computador
19.
Nucleic Acids Res ; 40(Database issue): D64-70, 2012 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-22096232

RESUMO

The BioSample Database (http://www.ebi.ac.uk/biosamples) is a new database at EBI that stores information about biological samples used in molecular experiments, such as sequencing, gene expression or proteomics. The goals of the BioSample Database include: (i) recording and linking of sample information consistently within EBI databases such as ENA, ArrayExpress and PRIDE; (ii) minimizing data entry efforts for EBI database submitters by enabling submitting sample descriptions once and referencing them later in data submissions to assay databases and (iii) supporting cross database queries by sample characteristics. Each sample in the database is assigned an accession number. The database includes a growing set of reference samples, such as cell lines, which are repeatedly used in experiments and can be easily referenced from any database by their accession numbers. Accession numbers for the reference samples will be exchanged with a similar database at NCBI. The samples in the database can be queried by their attributes, such as sample types, disease names or sample providers. A simple tab-delimited format facilitates submissions of sample information to the database, initially via email to biosamples@ebi.ac.uk.


Assuntos
Bases de Dados Genéticas , Linhagem Celular , Expressão Gênica , Genômica , Proteômica , Análise de Sequência , Integração de Sistemas , Interface Usuário-Computador
20.
PLoS Genet ; 7(9): e1002270, 2011 Sep.
Artigo em Inglês | MEDLINE | ID: mdl-21931564

RESUMO

We have performed a metabolite quantitative trait locus (mQTL) study of the (1)H nuclear magnetic resonance spectroscopy ((1)H NMR) metabolome in humans, building on recent targeted knowledge of genetic drivers of metabolic regulation. Urine and plasma samples were collected from two cohorts of individuals of European descent, with one cohort comprised of female twins donating samples longitudinally. Sample metabolite concentrations were quantified by (1)H NMR and tested for association with genome-wide single-nucleotide polymorphisms (SNPs). Four metabolites' concentrations exhibited significant, replicable association with SNP variation (8.6×10(-11)

Assuntos
Estudo de Associação Genômica Ampla , Redes e Vias Metabólicas/genética , Metaboloma/genética , Locos de Características Quantitativas/genética , Seleção Genética , Acetiltransferases/genética , Acetiltransferases/metabolismo , Dimetilaminas/sangue , Dimetilaminas/metabolismo , Feminino , Haplótipos , Humanos , Isobutiratos/metabolismo , Isobutiratos/urina , Espectroscopia de Ressonância Magnética , Metilaminas/metabolismo , Metilaminas/urina , Polimorfismo de Nucleotídeo Único
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA