1.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38836701

ABSTRACT

Biomedical data are generated and collected from various sources, including medical imaging, laboratory tests and genome sequencing. Sharing these data for research can help address unmet health needs, contribute to scientific breakthroughs, accelerate the development of more effective treatments and inform public health policy. Due to the potential sensitivity of such data, however, privacy concerns have led to policies that restrict data sharing. In addition, sharing sensitive data requires a secure and robust infrastructure with appropriate storage solutions. Here, we examine and compare the centralized and federated data sharing models through the prism of five large-scale and real-world use cases of strategic significance within the European data sharing landscape: the French Health Data Hub, the BBMRI-ERIC Colorectal Cancer Cohort, the federated European Genome-phenome Archive, the Observational Medical Outcomes Partnership/OHDSI network and the EBRAINS Medical Informatics Platform. Our analysis indicates that centralized models facilitate data linkage, harmonization and interoperability, while federated models facilitate scaling up and legal compliance, as the data typically reside on the data generator's premises, allowing for better control of how data are shared. This comparative study thus offers guidance on the selection of the most appropriate sharing strategy for sensitive datasets and provides key insights for informed decision-making in data sharing efforts.
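The contrast the authors draw can be made concrete with a small sketch: in a federated setup, analysis code travels to the data and only aggregates leave each site, whereas a centralized setup pools the raw records first. The example below is illustrative only and not part of the cited study; the site names and the mean-aggregation query are invented.

```python
# Illustrative sketch (not from the cited paper): centralized pooling versus a
# federated query in which only per-site aggregates are shared.
from statistics import mean

# Hypothetical per-site cohorts; in a real federation these never leave the site.
site_records = {
    "hospital_a": [52, 61, 47, 58],
    "hospital_b": [63, 55, 60],
    "hospital_c": [49, 71, 66, 54, 59],
}

# Centralized model: raw records are copied to one repository, then analysed.
pooled = [v for records in site_records.values() for v in records]
central_mean = mean(pooled)

# Federated model: each site computes a local summary; only (sum, count) is shared.
local_summaries = [(sum(r), len(r)) for r in site_records.values()]
federated_mean = sum(s for s, _ in local_summaries) / sum(n for _, n in local_summaries)

assert abs(central_mean - federated_mean) < 1e-9  # same answer, different data flows
```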


Subject(s)
Biological Science Disciplines , Information Dissemination , Humans , Medical Informatics/methods
2.
EMBO J ; 42(23): e115008, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37964598

ABSTRACT

The main goals and challenges for the life science communities in the Open Science framework are to increase the reuse and sustainability of data resources, software tools and workflows, especially in large-scale, data-driven research and computational analyses. Here, we present key findings, procedures, effective measures and recommendations for generating and establishing sustainable life science resources, based on the collaborative, cross-disciplinary work done within the EOSC-Life (European Open Science Cloud for Life Sciences) consortium. Bringing together 13 European life science research infrastructures, EOSC-Life has laid the foundation for an open, digital space to support biological and medical research. Using lessons learned from 27 selected projects, we describe the organisational, technical, financial and legal/ethical challenges that represent the main barriers to sustainability in the life sciences. We show how EOSC-Life provides a model for sustainable data management according to the FAIR (findability, accessibility, interoperability and reusability) principles, including solutions for sensitive and industry-related resources, by means of cross-disciplinary training and the sharing of best practices. Finally, we illustrate how data harmonisation and collaborative work facilitate the interoperability of tools, data and solutions, and lead to a better understanding of concepts, semantics and functionalities in the life sciences.


Subject(s)
Biological Science Disciplines , Biomedical Research , Software , Workflow
3.
Sci Data ; 10(1): 291, 2023 May 19.
Article in English | MEDLINE | ID: mdl-37208349

ABSTRACT

The COVID-19 pandemic has highlighted the need for FAIR (Findable, Accessible, Interoperable, and Reusable) data more than any other scientific challenge to date. We developed a flexible, multi-level, domain-agnostic FAIRification framework that provides practical guidance for improving the FAIRness of both existing and future clinical and molecular datasets. We validated the framework in collaboration with several major public-private partnership projects, demonstrating and delivering improvements across all aspects of FAIR and across a variety of datasets and their contexts. We thereby established the reproducibility and far-reaching applicability of our approach to FAIRification tasks.
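As a loose illustration of what a single FAIRification step might look like in practice (not taken from the paper), the sketch below attaches minimal machine-readable metadata, a persistent identifier, a licence and a schema reference, to a dataset record; all field names and the identifier are hypothetical placeholders.

```python
import json

# Hypothetical dataset record before and after a minimal FAIRification pass.
raw_dataset = {"file": "cohort_measurements.csv", "owner": "example-lab"}

fair_record = {
    "identifier": "https://doi.org/10.xxxx/example",   # placeholder persistent ID
    "title": "Example cohort measurements",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "conformsTo": "https://schema.org/Dataset",         # schema the metadata follows
    "accessConditions": "controlled",                    # sensitive data: access via a DAC
    "distribution": raw_dataset["file"],
}

# Publishing the metadata record (not necessarily the data itself) makes the
# dataset findable and its reuse conditions explicit to both humans and machines.
print(json.dumps(fair_record, indent=2))
```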


Subject(s)
COVID-19 , Datasets as Topic , Humans , Pandemics , Public-Private Sector Partnerships , Reproducibility of Results
4.
Sci Rep ; 12(1): 20989, 2022 12 05.
Article in English | MEDLINE | ID: mdl-36470968

ABSTRACT

For life science infrastructures, sensitive data generate an additional layer of complexity. Cross-domain categorisation and discovery of digital resources related to sensitive data present major interoperability challenges. To support this FAIRification process, a toolbox demonstrator has been developed to aid discovery of digital objects related to sensitive data (e.g., regulations, guidelines, best practices, tools). The toolbox is based upon a categorisation system developed and harmonised across a cluster of six life science research infrastructures. Three versions were built and tested in successive pilot studies, finally leading to a system with seven main categories (sensitive data type, resource type, research field, data type, stage in the data sharing life cycle, geographical scope, specific topics). The 109 resources tagged in pilot study 3 were used as the initial content for the toolbox demonstrator, a software tool that allows searching of digital objects linked to sensitive data, with filtering based upon the categorisation system. Important next steps are a broad evaluation of the usability and user-friendliness of the toolbox, extension to more resources, broader adoption by different life-science communities, and a long-term vision for maintenance and sustainability.
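A minimal sketch of how category-based filtering over such a system might work is shown below; the category values and resource entries are invented for illustration and are not the toolbox's actual content.

```python
# Hypothetical resource records tagged along the seven category dimensions
# described above; all values are invented for illustration.
resources = [
    {"title": "GDPR guidance for genomic data",
     "sensitive_data_type": "genomic", "resource_type": "guideline",
     "research_field": "genomics", "data_type": "sequence",
     "lifecycle_stage": "sharing", "geographical_scope": "EU",
     "specific_topics": ["consent"]},
    {"title": "Imaging anonymisation tool",
     "sensitive_data_type": "imaging", "resource_type": "tool",
     "research_field": "radiology", "data_type": "image",
     "lifecycle_stage": "processing", "geographical_scope": "global",
     "specific_topics": ["anonymisation"]},
]

def filter_resources(items, **criteria):
    """Return resources whose tags match every given category criterion."""
    return [r for r in items
            if all(r.get(category) == value for category, value in criteria.items())]

# Example query: all EU-scoped guidelines.
for hit in filter_resources(resources, resource_type="guideline", geographical_scope="EU"):
    print(hit["title"])
```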


Subject(s)
Biological Science Disciplines , Software , Pilot Projects
5.
Open Res Eur ; 2: 80, 2022.
Article in English | MEDLINE | ID: mdl-37767227

ABSTRACT

Large European research consortia in the health sciences face challenges regarding the governance of personal data collected, generated and/or shared during their collective research. A controller in the sense of the GDPR is the entity that decides on the purposes and means of the data processing. Case law of the Court of Justice of the European Union (CJEU) and Guidelines of the European Data Protection Board (EDPB) indicate that all partners in a consortium would be joint controllers. This paper summarises the case law, the Guidelines and the literature on joint controllership, and gives a brief account of a webinar on the issue organised by Lygature and the MLC Foundation. A large majority of webinar participants agreed that treating all partners in a consortium as joint controllers would be an extreme outcome; there was less agreement on how to distinguish partners who are controllers of a study from those who are not. To disentangle responsibilities, we propose a funnel model in which consecutive steps act as sieves. It differentiates between two types of partners: those involved in shaping the project as a whole versus those more closely involved in a specific sub-study following from the DoA or from the use of the data Platform. If a partner's role is comparable to that of an outside advisor, that partner would not be a data controller even though it is part of the consortium. We propose further nuances for this step-wise disentanglement. Uncertainty about formal controllership under the GDPR can stifle collaboration in consortia owing to concerns over (shared) responsibility and liability, and can also affect data subjects' ability to exercise their rights. The funnel model proposes a way out of this conundrum.
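To make the funnel idea concrete, the sketch below encodes consecutive sieve-like questions as a simple decision function. The specific questions and their ordering are a hypothetical reading of the model for illustration only, not the paper's authoritative formulation, and certainly not legal advice.

```python
def classify_partner(advisory_role_only: bool,
                     shapes_project_purposes: bool,
                     decides_means_of_processing: bool,
                     involved_in_substudy: bool) -> str:
    """Hypothetical funnel: each question acts as a sieve that narrows down
    which partners count as (joint) controllers for a given study."""
    if advisory_role_only:
        return "not a controller (role comparable to an outside advisor)"
    if shapes_project_purposes and decides_means_of_processing:
        return "joint controller for the project as a whole"
    if involved_in_substudy:
        return "controller for that sub-study only"
    return "not a controller for this study"

print(classify_partner(advisory_role_only=False,
                       shapes_project_purposes=False,
                       decides_means_of_processing=False,
                       involved_in_substudy=True))
```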

7.
F1000Res ; 6, 2017.
Article in English | MEDLINE | ID: mdl-29123641

ABSTRACT

The availability of high-throughput molecular profiling techniques has provided more accurate and informative data for routine clinical studies. Nevertheless, complex computational workflows are required to interpret these data. Over the past years, data volumes have grown explosively, requiring robust management of human data to organise and integrate the data efficiently. For this reason, we set up an ELIXIR implementation study, together with the Translational research IT (TraIT) programme, to design a data ecosystem able to link raw and interpreted data. In this project, data from the TraIT Cell Line Use Case (TraIT-CLUC) are used as a test case for the system. Within this ecosystem, we use the European Genome-phenome Archive (EGA) to store raw molecular profiling data; tranSMART to collect interpreted molecular profiling data and clinical data for the corresponding samples; and Galaxy to store, run and manage the computational workflows. We integrate these data by linking their repositories systematically. To showcase our design, we have structured the TraIT-CLUC data, which contain a variety of molecular profiling data types, for storage in both tranSMART and EGA. The metadata provided allows cross-referencing between tranSMART and EGA, closing the cycle of data submission and discovery; we have also designed a data flow from EGA to Galaxy, enabling reanalysis of the raw data in Galaxy. In this way, users can select patient cohorts in tranSMART, trace them back to the raw data and perform (re)analysis in Galaxy. Our conclusion is that most metadata does not need to be stored (redundantly) in both databases; instead, FAIR persistent identifiers should be available for well-defined data ontology levels: study, data access committee, physical sample, data sample and raw data file. This approach will pave the way for the stable linkage and reuse of data.
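The proposed linkage rests on persistent identifiers at a handful of well-defined levels. A minimal sketch of such a linkage record is given below; the accession strings are placeholders written in the EGA accession style, and the field names and helper function are assumptions for illustration, not the project's actual metadata schema.

```python
# Hypothetical linkage record tying together the ontology levels named above.
# All accession strings are placeholders, not real accessions.
linkage_records = [{
    "study": "EGAS00001000000",
    "data_access_committee": "EGAC00001000000",
    "physical_sample": "SAMPLE-0001",
    "data_sample": "EGAN00001000000",
    "raw_data_file": "EGAF00001000000",
}]

def trace_cohort_to_raw_files(cohort_sample_ids, records):
    """Given sample IDs selected in tranSMART, resolve the raw EGA files
    that a Galaxy (re)analysis would need to request access to."""
    return [r["raw_data_file"] for r in records
            if r["data_sample"] in cohort_sample_ids
            or r["physical_sample"] in cohort_sample_ids]

print(trace_cohort_to_raw_files({"EGAN00001000000"}, linkage_records))
```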

8.
Sci Data ; 3: 160018, 2016 Mar 15.
Article in English | MEDLINE | ID: mdl-26978244

ABSTRACT

There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them and some exemplar implementations in the community.


Subject(s)
Data Collection , Data Curation , Research Design , Database Management Systems , Guidelines as Topic , Reproducibility of Results
9.
F1000Res ; 5, 2016.
Article in English | MEDLINE | ID: mdl-28232859

ABSTRACT

High-throughput molecular profiling techniques routinely generate vast amounts of data for translational medicine studies. Secure, access-controlled systems are needed to manage, store, transfer and distribute these data because of their personally identifiable nature. The European Genome-phenome Archive (EGA) was created to provide long-term archival of bio-molecular data and to manage access to it. Each data provider is responsible for ensuring that a Data Access Committee is in place to grant access to data stored in the EGA, and data transfer during upload and download is encrypted. ELIXIR, a European research infrastructure for life-science data, initiated a project (the 2016 Human Data Implementation Study) to understand and document the ELIXIR requirements for secure management of controlled-access data. As part of this project, a full ecosystem was designed to connect archived raw experimental molecular profiling data with interpreted data and the computational workflows, using the CTMM Translational Research IT (CTMM-TraIT) infrastructure (http://www.ctmm-trait.nl) as an example. Here we present the first outcomes of this project: a framework that enables the download of EGA data to a Galaxy server in a secure way. Galaxy provides an intuitive user interface for molecular biologists and bioinformaticians to design and run data analysis workflows. More specifically, we developed a tool, ega_download_streamer, that can securely download data from the EGA into a Galaxy server, where the data can subsequently be processed further. This tool allows a user, within the browser, to run an entire analysis involving sensitive data from the EGA and to make that analysis available to other researchers in a reproducible manner, as shown in a proof-of-concept study. The tool ega_download_streamer is available in the Galaxy tool shed: https://toolshed.g2.bx.psu.edu/view/yhoogstrate/ega_download_streamer.
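The general pattern the tool implements, authenticate, transfer a controlled-access file securely, then hand it to a workflow step, can be outlined as below. This is an illustrative outline only: the function names, credential handling and file paths are hypothetical and do not reflect the actual interface of ega_download_streamer or of the EGA API; consult the tool's documentation for the real usage.

```python
# Illustrative outline of the download-then-analyse pattern described above.
# Names and endpoints are hypothetical placeholders, not the real interfaces.
import os

def fetch_controlled_access_file(accession: str, token: str, dest_dir: str) -> str:
    """Stand-in for a download step: in reality the transfer is authenticated
    and encrypted, and access must have been granted by the relevant DAC."""
    dest = os.path.join(dest_dir, f"{accession}.bam")
    # ... authenticated, encrypted transfer would happen here ...
    return dest

def run_workflow_on(path: str) -> None:
    """Placeholder for a Galaxy workflow step consuming the downloaded file."""
    print(f"analysing {path}")

local_path = fetch_controlled_access_file("EGAF00000000000",
                                           token="<access-token>",
                                           dest_dir="/tmp")
run_workflow_on(local_path)
```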

10.
J Chem Inf Model ; 52(6): 1438-49, 2012 Jun 25.
Article in English | MEDLINE | ID: mdl-22640375

ABSTRACT

Drug discovery teams continuously have to decide which compounds to progress and which experiments to perform next, but the data required to make informed decisions are often scattered, inaccessible or inconsistent. In particular, data tend to be stored and represented in a compound-centric or assay-centric manner, rather than the project-centric manner that drug discovery teams often need for effective use. The Integrated Project Views (IPV) system was created to fill this gap; it integrates and consolidates data from various sources in a project-oriented manner. Its automatic gathering and updating of project data not only ensures that the information is comprehensive and available on a timely basis, but also improves data consistency. The lack of suitable off-the-shelf solutions prompted us to develop custom functionality and algorithms geared specifically to our drug discovery decision-making process. In ten years of usage, the resulting IPV application has become well accepted and appreciated, which is perhaps best evidenced by the observation that standalone Excel spreadsheets have largely been eliminated from project team meetings.
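The core idea, re-keying compound-centric and assay-centric records into a project-centric view, can be sketched with a small join. The table layout and values below are invented for illustration and are not the IPV schema.

```python
# Hypothetical compound-centric and assay-centric tables, joined into a
# project-centric view of the kind the IPV system automates.
compounds = [
    {"compound_id": "CPD-1", "project": "kinase-X", "structure": "c1ccccc1"},
    {"compound_id": "CPD-2", "project": "kinase-X", "structure": "CCO"},
]
assay_results = [
    {"compound_id": "CPD-1", "assay": "IC50 (nM)", "value": 35.0},
    {"compound_id": "CPD-1", "assay": "solubility (uM)", "value": 120.0},
    {"compound_id": "CPD-2", "assay": "IC50 (nM)", "value": 410.0},
]

def project_view(project_name):
    """Collect, per compound in the project, all available assay results."""
    view = {}
    for c in compounds:
        if c["project"] == project_name:
            results = {r["assay"]: r["value"] for r in assay_results
                       if r["compound_id"] == c["compound_id"]}
            view[c["compound_id"]] = {"structure": c["structure"], **results}
    return view

print(project_view("kinase-X"))
```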


Subject(s)
Decision Support Techniques , Drug Discovery , Group Processes , Algorithms
11.
Drug Discov Today ; 16(13-14): 555-68, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21605698

ABSTRACT

The difference between biologically active molecules and drugs is that the latter balance an array of related and unrelated properties required for administration to patients. Inevitably, during optimization, some of these multiple factors will conflict. Although informatics has a crucial role in addressing the challenges of modern compound optimization, it is arguably still undervalued and underutilized. We present here some of the basic requirements of multi-parameter drug design, the crucial role of informatics, and examples of favorable practice. The most crucial of these best practices are the need for informaticians to align their technologies and insights directly to discovery projects and for all scientists in drug discovery to become more proficient in the use of in silico methods.
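One common in silico way to balance conflicting properties is a weighted desirability score. The sketch below is a generic example of that approach, not the specific methodology of the cited article; the property ranges and weights are invented.

```python
# Generic multi-parameter scoring sketch: map each property onto a 0-1
# desirability and combine them, so conflicting objectives can be traded off.
def desirability(value, low, high):
    """1.0 inside the preferred range, falling off linearly outside it."""
    if low <= value <= high:
        return 1.0
    span = high - low
    distance = (low - value) if value < low else (value - high)
    return max(0.0, 1.0 - distance / span)

# Invented property targets and weights, for illustration only.
targets = {"potency_pIC50": (7.0, 10.0), "logP": (1.0, 3.0), "mol_weight": (200.0, 450.0)}
weights = {"potency_pIC50": 0.5, "logP": 0.25, "mol_weight": 0.25}

def compound_score(properties):
    """Weighted sum of per-property desirabilities for one compound."""
    return sum(weights[p] * desirability(v, *targets[p]) for p, v in properties.items())

print(compound_score({"potency_pIC50": 7.8, "logP": 3.6, "mol_weight": 410.0}))
```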


Subject(s)
Computational Biology/methods , Computer Simulation , Drug Design , Drug Discovery/methods , Humans , Models, Molecular