RESUMEN
Rare disease patients are more likely to receive a rapid molecular diagnosis nowadays thanks to the wide adoption of next-generation sequencing. However, many cases remain undiagnosed even after exome or genome analysis, because the methods used missed the molecular cause in a known gene, or a novel causative gene could not be identified and/or confirmed. To address these challenges, the RD-Connect Genome-Phenome Analysis Platform (GPAP) facilitates the collation, discovery, sharing, and analysis of standardized genome-phenome data within a collaborative environment. Authorized clinicians and researchers submit pseudonymised phenotypic profiles encoded using the Human Phenotype Ontology, and raw genomic data which is processed through a standardized pipeline. After an optional embargo period, the data are shared with other platform users, with the objective that similar cases in the system and queries from peers may help diagnose the case. Additionally, the platform enables bidirectional discovery of similar cases in other databases from the Matchmaker Exchange network. To facilitate genome-phenome analysis and interpretation by clinical researchers, the RD-Connect GPAP provides a powerful user-friendly interface and leverages tens of information sources. As a result, the resource has already helped diagnose hundreds of rare disease patients and discover new disease causing genes.
Asunto(s)
Genómica , Enfermedades Raras , Exoma , Estudios de Asociación Genética , Genómica/métodos , Humanos , Fenotipo , Enfermedades Raras/diagnóstico , Enfermedades Raras/genéticaRESUMEN
The Solve-RD project brings together clinicians, scientists, and patient representatives from 51 institutes spanning 15 countries to collaborate on genetically diagnosing ("solving") rare diseases (RDs). The project aims to significantly increase the diagnostic success rate by co-analyzing data from thousands of RD cases, including phenotypes, pedigrees, exome/genome sequencing, and multiomics data. Here we report on the data infrastructure devised and created to support this co-analysis. This infrastructure enables users to store, find, connect, and analyze data and metadata in a collaborative manner. Pseudonymized phenotypic and raw experimental data are submitted to the RD-Connect Genome-Phenome Analysis Platform and processed through standardized pipelines. Resulting files and novel produced omics data are sent to the European Genome-Phenome Archive, which adds unique file identifiers and provides long-term storage and controlled access services. MOLGENIS "RD3" and Café Variome "Discovery Nexus" connect data and metadata and offer discovery services, and secure cloud-based "Sandboxes" support multiparty data analysis. This successfully deployed and useful infrastructure design provides a blueprint for other projects that need to analyze large amounts of heterogeneous data.
Asunto(s)
Enfermedades Raras , Enfermedades Raras/genética , Humanos , Bases de Datos Genéticas , Fenotipo , Metadatos , Biología Computacional/métodos , Genómica/métodosRESUMEN
The Solve-RD project objectives include solving undiagnosed rare diseases (RD) through collaborative research on shared genome-phenome datasets. The RD-Connect Genome-Phenome Analysis Platform (GPAP), for data collation and analysis, and the European Genome-Phenome Archive (EGA), for file storage, are two key components of the Solve-RD infrastructure. Clinical researchers can identify candidate genetic variants within the RD-Connect GPAP and, thanks to the developments presented here as part of joint ELIXIR activities, are able to remotely visualize the corresponding alignments stored at the EGA. The Global Alliance for Genomics and Health (GA4GH) htsget streaming application programming interface (API) is used to retrieve alignment slices, which are rendered by an integrated genome viewer (IGV) instance embedded in the GPAP. As a result, it is no longer necessary for over 11,000 datasets to download large alignment files to visualize them locally. This work highlights the advantages, from both the user and infrastructure perspectives, of implementing interoperability standards for establishing federated genomics data networks.
RESUMEN
Associations between human genetic variation and clinical phenotypes have become a foundation of biomedical research. Most repositories of these data seek to be disease-agnostic and therefore lack disease-focused views. The Type 2 Diabetes Knowledge Portal (T2DKP) is a public resource of genetic datasets and genomic annotations dedicated to type 2 diabetes (T2D) and related traits. Here, we seek to make the T2DKP more accessible to prospective users and more useful to existing users. First, we evaluate the T2DKP's comprehensiveness by comparing its datasets with those of other repositories. Second, we describe how researchers unfamiliar with human genetic data can begin using and correctly interpreting them via the T2DKP. Third, we describe how existing users can extend their current workflows to use the full suite of tools offered by the T2DKP. We finally discuss the lessons offered by the T2DKP toward the goal of democratizing access to complex disease genetic results.
Asunto(s)
Diabetes Mellitus Tipo 2 , Humanos , Diabetes Mellitus Tipo 2/genética , Acceso a la Información , Estudios Prospectivos , Genómica/métodos , FenotipoRESUMEN
For the first time in Europe hundreds of rare disease (RD) experts team up to actively share and jointly analyse existing patient's data. Solve-RD is a Horizon 2020-supported EU flagship project bringing together >300 clinicians, scientists, and patient representatives of 51 sites from 15 countries. Solve-RD is built upon a core group of four European Reference Networks (ERNs; ERN-ITHACA, ERN-RND, ERN-Euro NMD, ERN-GENTURIS) which annually see more than 270,000 RD patients with respective pathologies. The main ambition is to solve unsolved rare diseases for which a molecular cause is not yet known. This is achieved through an innovative clinical research environment that introduces novel ways to organise expertise and data. Two major approaches are being pursued (i) massive data re-analysis of >19,000 unsolved rare disease patients and (ii) novel combined -omics approaches. The minimum requirement to be eligible for the analysis activities is an inconclusive exome that can be shared with controlled access. The first preliminary data re-analysis has already diagnosed 255 cases form 8393 exomes/genome datasets. This unprecedented degree of collaboration focused on sharing of data and expertise shall identify many new disease genes and enable diagnosis of many so far undiagnosed patients from all over Europe.
Asunto(s)
Enfermedades Genéticas Congénitas/genética , Difusión de la Información , Colaboración Intersectorial , Enfermedades Raras/genética , Conferencias de Consenso como Asunto , Europa (Continente) , Enfermedades Genéticas Congénitas/diagnóstico , Pruebas Genéticas/métodos , Humanos , Enfermedades Raras/diagnóstico , Secuenciación del Exoma/métodosRESUMEN
The volume of genomics and health data is growing rapidly, driven by sequencing for both research and clinical use. However, under current practices, the data is fragmented into many distinct datasets, and researchers must go through a separate application process for each dataset. This is time-consuming both for the researchers and the data stewards, and it reduces the velocity of research and new discoveries that could improve human health. We propose to simplify this process, by introducing a standard Library Card that identifies and authenticates researchers across all participating datasets. Each researcher would only need to apply once to establish their bona fides as a qualified researcher, and could then use the Library Card to access a wide range of datasets that use a compatible data access policy and authentication protocol.
RESUMEN
OBJECTIVES: We examined major issues associated with sharing of individual clinical trial data and developed a consensus document on providing access to individual participant data from clinical trials, using a broad interdisciplinary approach. DESIGN AND METHODS: This was a consensus-building process among the members of a multistakeholder task force, involving a wide range of experts (researchers, patient representatives, methodologists, information technology experts, and representatives from funders, infrastructures and standards development organisations). An independent facilitator supported the process using the nominal group technique. The consensus was reached in a series of three workshops held over 1 year, supported by exchange of documents and teleconferences within focused subgroups when needed. This work was set within the Horizon 2020-funded project CORBEL (Coordinated Research Infrastructures Building Enduring Life-science Services) and coordinated by the European Clinical Research Infrastructure Network. Thus, the focus was on non-commercial trials and the perspective mainly European. OUTCOME: We developed principles and practical recommendations on how to share data from clinical trials. RESULTS: The task force reached consensus on 10 principles and 50 recommendations, representing the fundamental requirements of any framework used for the sharing of clinical trials data. The document covers the following main areas: making data sharing a reality (eg, cultural change, academic incentives, funding), consent for data sharing, protection of trial participants (eg, de-identification), data standards, rights, types and management of access (eg, data request and access models), data management and repositories, discoverability, and metadata. CONCLUSIONS: The adoption of the recommendations in this document would help to promote and support data sharing and reuse among researchers, adequately inform trial participants and protect their rights, and provide effective and efficient systems for preparing, storing and accessing data. The recommendations now need to be implemented and tested in practice. Further work needs to be done to integrate these proposals with those from other geographical areas and other academic domains.
Asunto(s)
Investigación Biomédica/normas , Ensayos Clínicos como Asunto , Consenso , Difusión de la Información/métodos , Comités Consultivos , HumanosRESUMEN
High-throughput molecular profiling techniques are routinely generating vast amounts of data for translational medicine studies. Secure access controlled systems are needed to manage, store, transfer and distribute these data due to its personally identifiable nature. The European Genome-phenome Archive (EGA) was created to facilitate access and management to long-term archival of bio-molecular data. Each data provider is responsible for ensuring a Data Access Committee is in place to grant access to data stored in the EGA. Moreover, the transfer of data during upload and download is encrypted. ELIXIR, a European research infrastructure for life-science data, initiated a project (2016 Human Data Implementation Study) to understand and document the ELIXIR requirements for secure management of controlled-access data. As part of this project, a full ecosystem was designed to connect archived raw experimental molecular profiling data with interpreted data and the computational workflows, using the CTMM Translational Research IT (CTMM-TraIT) infrastructure http://www.ctmm-trait.nl as an example. Here we present the first outcomes of this project, a framework to enable the download of EGA data to a Galaxy server in a secure way. Galaxy provides an intuitive user interface for molecular biologists and bioinformaticians to run and design data analysis workflows. More specifically, we developed a tool -- ega_download_streamer - that can download data securely from EGA into a Galaxy server, which can subsequently be further processed. This tool will allow a user within the browser to run an entire analysis containing sensitive data from EGA, and to make this analysis available for other researchers in a reproducible manner, as shown with a proof of concept study. The tool ega_download_streamer is available in the Galaxy tool shed: https://toolshed.g2.bx.psu.edu/view/yhoogstrate/ega_download_streamer.