ABSTRACT
BACKGROUND: It is necessary to harmonize and standardize data variables used in case report forms (CRFs) of clinical studies to facilitate the merging and sharing of the collected patient data across several clinical studies. This is particularly true for clinical studies that focus on infectious diseases. Public health may be highly dependent on the findings of such studies. Hence, there is an elevated urgency to generate meaningful, reliable insights, ideally based on a high sample number and quality data. The implementation of core data elements and the incorporation of interoperability standards can facilitate the creation of harmonized clinical data sets. OBJECTIVE: This study's objective was to compare, harmonize, and standardize variables focused on diagnostic tests used as part of CRFs in 6 international clinical studies of infectious diseases, with the ultimate aim of making the pan-study common data elements (CDEs) available for ongoing and future studies to foster interoperability and comparability of collected data across trials. METHODS: We reviewed and compared the metadata that comprised the CRFs used for data collection in and across all 6 infectious disease studies under consideration in order to identify CDEs. We examined the availability of international semantic standard codes within the Systematized Nomenclature of Medicine - Clinical Terms, the National Cancer Institute Thesaurus, and the Logical Observation Identifiers Names and Codes system for the unambiguous representation of diagnostic testing information that makes up the CDEs. We then proposed 2 data models that incorporate semantic and syntactic standards for the identified CDEs.
RESULTS: Of 216 variables that were considered in the scope of the analysis, we identified 11 CDEs to describe diagnostic tests (in particular, serology and sequencing) for infectious diseases: viral lineage/clade; test date, type, performer, and manufacturer; target gene; quantitative and qualitative results; and specimen identifier, type, and collection date. CONCLUSIONS: The identification of CDEs for infectious diseases is the first step in facilitating the exchange and possible merging of a subset of data across clinical studies (and with that, large research projects) for possible shared analysis to increase the power of findings. The path to harmonization and standardization of clinical study data in the interest of interoperability can be paved in 2 ways. First, a map to standard terminologies ensures that each data element's (variable's) definition is unambiguous and that it has a single, unique interpretation across studies. Second, the exchange of these data is assisted by "wrapping" them in a standard exchange format, such as Fast Healthcare Interoperability Resources or the Clinical Data Interchange Standards Consortium's Clinical Data Acquisition Standards Harmonization Model.
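As a rough illustration of the two steps named in the conclusions (mapping to a standard terminology, then wrapping in an exchange format), the sketch below builds a FHIR-style Observation as a plain Python dictionary for a qualitative test result. It is not one of the 2 data models proposed in the study. The function name, the specimen identifier, and the choice of fields are assumptions made for the example, and the LOINC and SNOMED CT codes are illustrative and should be checked against current terminology releases before any real use.

```python
import json


def make_observation(specimen_id, test_date, test_code, test_display,
                     result_code, result_display):
    """Wrap one qualitative test result in a FHIR-style Observation dict.

    Hypothetical helper: the selection of fields is an assumption for
    illustration, not the data model proposed in the study.
    """
    return {
        "resourceType": "Observation",
        "status": "final",
        # Test type, expressed with a LOINC code (semantic standard).
        "code": {"coding": [{"system": "http://loinc.org",
                             "code": test_code,
                             "display": test_display}]},
        # Test date.
        "effectiveDateTime": test_date,
        # Link to the specimen identifier.
        "specimen": {"reference": f"Specimen/{specimen_id}"},
        # Qualitative result, expressed with a SNOMED CT code.
        "valueCodeableConcept": {"coding": [{"system": "http://snomed.info/sct",
                                             "code": result_code,
                                             "display": result_display}]},
    }


# Illustrative values only; the specimen identifier is made up.
observation = make_observation(
    specimen_id="example-specimen-1",
    test_date="2021-03-15",
    test_code="94500-6",
    test_display="SARS-CoV-2 (COVID-19) RNA [Presence] in Respiratory specimen by NAA with probe detection",
    result_code="260373001",
    result_display="Detected",
)
print(json.dumps(observation, indent=2))
```

Because every study would populate the same fields with codes from the same terminologies, records produced this way could in principle be pooled without per-study reinterpretation of variable names.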
Subjects
Communicable Diseases; Semantics; Humans; Communicable Diseases/diagnosis; Common Data Elements
ABSTRACT
The COVID-19 pandemic has led to tremendous investment in clinical studies to generate much-needed knowledge on the prevention, diagnosis, treatment and long-term effects of the disease. Case report forms, composed of questions and answers (variables), are commonly used to collect data in clinical trials. Maximizing the value of study data depends on data quality and on the ability to easily pool and share data from several sources. ISARIC, in collaboration with the WHO, has created a case report form that is available for use by the scientific community to collect COVID-19 trial data. One research initiative collecting and analyzing multi-country and multi-cohort COVID-19 study data is the Horizon 2020 project ORCHESTRA. Following the ISO/TS 21564:2019 standard, a mapping between five ORCHESTRA studies' variables and the ISARIC Freestanding Follow-Up Survey elements was created. Each element of the ISARIC FUP case report form (CRF), which was considered the source code system, was assigned a measure of correspondence of shared semantic domain relative to the target code system (the ORCHESTRA study variables): 0 (perfect match), 1 (fully inclusive match), 2 (partial match), 4 (transformation required) or 4* (not present in ORCHESTRA). Of the ISARIC FUP CRF's variables, around 34% were found to show an exact match with corresponding variables in ORCHESTRA studies and about 33% showed a non-inclusive overlap. Matching variables provided information on patient demographics, COVID-19 testing, hospital admission and symptoms. ORCHESTRA variables cover treatment and comorbidities in greater depth. ORCHESTRA's Long-Term Sequelae and Fragile population studies' CRFs include 32 and 27 variables, respectively, which were evaluated as a perfect match to variables in the ISARIC FUP CRF.
Our study serves as an example of the kind of maps between case report form variables from different research projects needed to link ongoing COVID-19 research efforts and facilitate collaboration and data sharing. To enable data aggregation across two data systems, the information they contain needs to be connected through a map to determine compatibility and transformation needs. Combining data from various clinical studies can increase the power of analytical insights.
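A map of this kind can be summarized by counting how many source elements fall into each correspondence category. The following is a minimal sketch under stated assumptions: the category labels are taken from the abstract, while the element names and their assigned degrees are invented for the example and do not reproduce the published mapping.

```python
from collections import Counter

# Correspondence categories as listed in the abstract.
EQUIVALENCE_LABELS = {
    "0": "perfect match",
    "1": "fully inclusive match",
    "2": "partial match",
    "4": "transformation required",
    "4*": "not present in target",
}


def summarize_mapping(mapping):
    """Return the percentage of source elements per correspondence degree.

    `mapping` maps a source element name to its assigned degree.
    """
    if not mapping:
        raise ValueError("mapping is empty")
    unknown = set(mapping.values()) - set(EQUIVALENCE_LABELS)
    if unknown:
        raise ValueError(f"unknown degrees: {sorted(unknown)}")
    counts = Counter(mapping.values())
    total = len(mapping)
    return {degree: round(100 * counts.get(degree, 0) / total, 1)
            for degree in EQUIVALENCE_LABELS}


# Hypothetical source elements and degrees, for illustration only.
example_mapping = {
    "fup_fever": "0",
    "fup_fatigue": "0",
    "fup_vaccination": "2",
    "fup_occupation": "4*",
}
summary = summarize_mapping(example_mapping)
for degree, share in summary.items():
    print(f"{degree:>2} ({EQUIVALENCE_LABELS[degree]}): {share}%")
```

Elements rated as requiring transformation or as absent from the target are the ones that need attention before data from the two systems can be aggregated.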
Subjects
COVID-19 Testing; COVID-19; Humans; Follow-Up Studies; Pandemics; Semantics; COVID-19/epidemiology; Fatigue
ABSTRACT
PURPOSE: Treatment outcomes for hepatoblastoma have improved markedly in the contemporary treatment era, principally due to therapy intensification, with overall survival increasing from 35% in the 1970s to 90% at present. Unfortunately, these advancements are accompanied by an increased incidence of toxicities. A detailed analysis of age as a prognostic factor may support individualized risk-based therapy stratification. METHODS: We evaluated 1605 patients with hepatoblastoma included in the CHIC database to assess the relationship between event-free survival (EFS) and age at diagnosis. Further analysis included the age distribution of additional risk factors and the interaction of age with other known prognostic factors. RESULTS: Risk for an event increases progressively with increasing age at diagnosis. This pattern could not be attributed to the differential distribution of other known risk factors across age. Newborns and infants are not at increased risk of treatment failure. The interaction between age and other adverse risk factors demonstrates an attenuation of prognostic relevance with increasing age in the following categories: metastatic disease, AFP < 100 ng/mL, and tumor rupture. CONCLUSION: Risk for an event increased with advancing age at diagnosis. Increased age attenuates the prognostic influence of metastatic disease, low AFP, and tumor rupture. Age could be used to modify recommended chemotherapy intensity.
Subjects
Databases, Factual; Hepatoblastoma; Liver Neoplasms; Adolescent; Age of Onset; Child; Child, Preschool; Disease-Free Survival; Female; Hepatoblastoma/diagnosis; Hepatoblastoma/mortality; Hepatoblastoma/pathology; Hepatoblastoma/therapy; Humans; Incidence; Infant; Infant, Newborn; Liver Neoplasms/diagnosis; Liver Neoplasms/mortality; Liver Neoplasms/pathology; Liver Neoplasms/therapy; Male; Neoplasm Metastasis; Prospective Studies; Risk Factors; Survival Rate
ABSTRACT
BACKGROUND: The current COVID-19 pandemic has led to a surge of research activity. While this research provides important insights, the multitude of studies results in an increasing fragmentation of information. To ensure comparability across projects and institutions, standard datasets are needed. Here, we introduce the "German Corona Consensus Dataset" (GECCO), a uniform dataset that uses international terminologies and health IT standards to improve interoperability of COVID-19 data, in particular for university medicine. METHODS: Based on previous work (e.g., the ISARIC-WHO COVID-19 case report form) and in coordination with experts from university hospitals, professional associations and research initiatives, data elements relevant for COVID-19 research were collected, prioritized and consolidated into a compact core dataset. The dataset was mapped to international terminologies, and the Fast Healthcare Interoperability Resources (FHIR) standard was used to define interoperable, machine-readable data formats. RESULTS: A core dataset consisting of 81 data elements with 281 response options was defined, including information about, for example, demography, medical history, symptoms, therapy, medications or laboratory values of COVID-19 patients. Data elements and response options were mapped to SNOMED CT, LOINC, UCUM, ICD-10-GM and ATC, and FHIR profiles for interoperable data exchange were defined. CONCLUSION: GECCO provides a compact, interoperable dataset that can help to make COVID-19 research data more comparable across studies and institutions. The dataset will be further refined in the future by adding domain-specific extension modules for more specialized use cases.
Subjects
Biomedical Research; COVID-19; Datasets as Topic; Medicine; Consensus; Humans; Pandemics
ABSTRACT
BACKGROUND: Comparative assessment of treatment results in paediatric hepatoblastoma trials has been hampered by small patient numbers and the use of multiple disparate staging systems by the four major trial groups. To address this challenge, we formed a global coalition, the Children's Hepatic tumors International Collaboration (CHIC), with the aim of creating a common approach to staging and risk stratification in this rare cancer. METHODS: The CHIC steering committee, consisting of leadership from the four major cooperative trial groups (the International Childhood Liver Tumours Strategy Group, Children's Oncology Group, the German Society for Paediatric Oncology and Haematology, and the Japanese Study Group for Paediatric Liver Tumours), created a shared international database that includes comprehensive data from 1605 children treated in eight multicentre hepatoblastoma trials over 25 years. Diagnostic factors found to be most prognostic on initial analysis were PRETreatment EXTent of disease (PRETEXT) group; age younger than 3 years, 3-7 years, and 8 years or older; α fetoprotein (AFP) concentration of 100 ng/mL or lower and 101-1000 ng/mL; and the PRETEXT annotation factors metastatic disease (M), macrovascular involvement of all hepatic veins (V) or portal bifurcation (P), contiguous extrahepatic tumour (E), multifocal tumour (F), and spontaneous rupture (R). We defined five clinically relevant backbone groups on the basis of established prognostic factors: PRETEXT I/II, PRETEXT III, PRETEXT IV, metastatic disease, and AFP concentration of 100 ng/mL or lower at diagnosis. We then carried the additional factors into a hierarchical backwards elimination multivariable analysis and used the results to create a new international staging system. RESULTS: Within each backbone group, we identified constellations of factors that were most predictive of outcome in that group. The robustness of candidate models was then interrogated using the bootstrapping procedure.
Using the clinically established PRETEXT groups I, II, III, and IV as our stems, we created risk stratification trees based on 5 year event-free survival and clinical applicability. We defined and adopted four risk groups: very low, low, intermediate, and high. INTERPRETATION: We have created a unified global approach to risk stratification in children with hepatoblastoma on the basis of rigorous statistical interrogation of what is, to the best of our knowledge, the largest dataset ever assembled for this rare paediatric tumour. This achievement provides the structural framework for further collaboration and prospective international cooperative study, such as the Paediatric Hepatic International Tumour Trial (PHITT). FUNDING: European Network for Cancer Research in Children and Adolescents, funded through the Framework Program 7 of the European Commission (grant number 261474); Children's Oncology Group CureSearch grant contributed by the Hepatoblastoma Foundation; Practical Research for Innovative Cancer Control and Project Promoting Clinical Trials for Development of New Drugs and Medical Devices, Japan Agency for Medical Research; and Swiss Cancer Research grant.
Subjects
Hepatoblastoma/secondary; Liver Neoplasms/pathology; Neoplasm Staging/standards; Adolescent; Child; Child, Preschool; Combined Modality Therapy; Cooperative Behavior; Databases, Factual; Female; Follow-Up Studies; Hepatoblastoma/therapy; Humans; Infant; Infant, Newborn; International Agencies; Japan; Liver Neoplasms/therapy; Lymphatic Metastasis; Male; Prognosis; Prospective Studies; Risk Factors; Survival Rate; alpha-Fetoproteins/metabolism
ABSTRACT
The motivation behind this research is to perform a privacy-preserving analysis of data located at remote sites and in different jurisdictions with no possibility of sharing individual-level information. Here, we present key findings from requirements analysis and a resulting federated data analysis workflow built using open-source research software, where patient-level information is securely stored and never exposed during the analysis process. We present additional improvements to further strengthen the security of the workflow. We emphasize and showcase the use of data harmonization in the analysis. The data analysis is done using the R language for statistical computing and DataSHIELD libraries for non-disclosive analysis of sensitive data. The workflow was validated against two data analysis scenarios, confirming the results obtained with a centralized analysis approach. The clinical datasets are part of the large Pan-European SARS-CoV-2 cohort, collected and managed by the ORCHESTRA project. We demonstrate the viability of establishing a cross-border federated data analysis framework and conducting an analysis without exposing patient-level information, achieving results equivalent to centralized non-secure analysis. However, it is vital to ensure requirements associated with data harmonization, anonymization and IT infrastructure to maintain availability, usability and data security.
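The principle that federated results can match centralized ones without moving patient-level data can be shown with a small sketch. This is not the DataSHIELD API and not the workflow described in the abstract, which uses R. It is a plain Python illustration in which each site returns only aggregate sums, and the minimum cell size of 5 is an assumed disclosure-control threshold chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSummary:
    """Aggregates a site is willing to release; no individual values."""
    n: int
    total: float
    total_sq: float


def summarize_site(values, min_cell_size=5):
    """Run locally at each site. Refuses to answer for very small groups.

    The threshold is an assumption illustrating disclosure control.
    """
    if len(values) < min_cell_size:
        raise ValueError("too few records to release a summary")
    return SiteSummary(len(values), sum(values), sum(v * v for v in values))


def pooled_mean_sd(summaries):
    """Run centrally. Combines site aggregates into a pooled mean and
    sample standard deviation, identical to the centralized result."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, math.sqrt(max(variance, 0.0))


# Made-up measurements held at two sites.
site_a = [1.0, 2.0, 3.0, 4.0, 5.0]
site_b = [6.0, 7.0, 8.0, 9.0, 10.0]
mean, sd = pooled_mean_sd([summarize_site(site_a), summarize_site(site_b)])
print(f"pooled mean = {mean:.2f}, pooled sd = {sd:.4f}")
```

The sketch only works if both sites record the variable in the same unit and coding, which is why data harmonization is a precondition for this kind of analysis.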
Subjects
Computer Security; Workflow; Humans; COVID-19/prevention & control; Confidentiality; Software; SARS-CoV-2; Electronic Health Records
ABSTRACT
Background: The ORCHESTRA project, funded by the European Commission, aims to create a pan-European cohort built on existing and new large-scale population cohorts to help rapidly advance the knowledge related to the prevention of the SARS-CoV-2 infection and the management of COVID-19 and its long-term sequelae. The integration and analysis of the very heterogeneous health data pose the challenge of building an innovative technological infrastructure as the foundation of a dedicated framework for data management that should address regulatory requirements such as the General Data Protection Regulation (GDPR). Methods: The three participating European supercomputing centres (CINECA - Italy, CINES - France and HLRS - Germany) designed and deployed a dedicated infrastructure to fulfil the functional requirements for data management to ensure sensitive biomedical data confidentiality/privacy, integrity, and security. Besides the technological issues, many methodological aspects have been considered: Berlin Institute of Health (BIH), Charité provided its expertise in data protection, information security, and data harmonisation/standardisation. Results: The resulting infrastructure is based on a multi-layer approach that integrates several security measures to ensure data protection. A centralised Data Collection Platform has been established in the Italian National Hub while, for the use cases in which data sharing is not possible due to privacy restrictions, a distributed approach for Federated Analysis has been considered. A Data Portal is available as a centralised point of access for non-sensitive data and results, according to findability, accessibility, interoperability, and reusability (FAIR) data principles. This technological infrastructure has been used to support significant data exchange between population cohorts and to publish important scientific results related to SARS-CoV-2.
Conclusions: Considering the increasing demand for data usage in accordance with the requirements of the GDPR regulations, the experience gained in the project and the infrastructure released for the ORCHESTRA project can act as a model to manage future public health threats. Other projects could benefit from the results achieved by ORCHESTRA by building upon the available standardisation of variables, design of the architecture, and process used for GDPR compliance.
ABSTRACT
Automation of surveillance of infectious diseases, in which algorithms are applied to routine care data to replace manual decisions, likely reduces workload and improves quality of surveillance. However, various barriers limit large-scale implementation of automated surveillance (AS). Current implementation strategies for AS in surveillance networks include central implementation (i.e. collecting all data centrally, and central algorithm application for case ascertainment) or local implementation (i.e. local algorithm application and sharing surveillance results with the network coordinating center). In this perspective, we explore whether current challenges can be solved by federated AS. In federated AS, scripts for analyses are developed centrally and applied locally. We focus on the potential of federated AS in the context of healthcare associated infections (AS-HAI) and of severe acute respiratory illness (AS-SARI). AS-HAI and AS-SARI have common and specific requirements, but both would benefit from decreased local surveillance burden, alignment of AS and increased central and local oversight, and improved access to data while preserving privacy. Federated AS combines some benefits of a centrally implemented system, such as standardization and alignment of an easily scalable methodology, with some of the benefits of a locally implemented system including (near) real-time access to data and flexibility in algorithms, meeting different information needs and improving sustainability, and allowance of a broader range of clinically relevant case-definitions. From a global perspective, it can promote the development of automated surveillance where it is not currently possible and foster international collaboration. The necessary transformation of source data likely will place a significant burden on healthcare facilities.
However, this may be outweighed by the potential benefits: improved comparability of surveillance results, flexibility and reuse of data for multiple purposes. Governance and stakeholder agreement on accuracy, accountability, transparency, digital literacy, and data protection warrant clear attention to create acceptance of the methodology. In conclusion, federated automated surveillance appears to be a potential solution to current barriers to large-scale implementation of AS-HAI and AS-SARI. Prerequisites for successful implementation include validation of results and evaluation requirements of network participants to govern understanding and acceptance of the methodology.
Subjects
Algorithms; Humans; Cross Infection/prevention & control; Automation; Epidemiological Monitoring; Respiratory Tract Infections/epidemiology; Respiratory Tract Infections/prevention & control
ABSTRACT
Within the HORIZON 2020 project ORCHESTRA, patient data from numerous clinical studies in Europe related to COVID-19 were harmonized to create new knowledge on the disease. In this article, we describe the ecosystem that was established for the management of data collected and contributed by project partners. Study protocol elements were mapped to interoperability standards to establish a common terminology, which served as the basis for identifying common concepts used across several studies. Harmonized data were analyzed directly on a central database and also through federated analysis when data were not permitted to leave the local server(s). This ecosystem facilitates the answering of research questions and the generation of new knowledge available to the scientific community.
Subjects
Data Management; Humans; Databases, Factual; Europe
ABSTRACT
The COVID-19 pandemic has made it clear: sharing and exchanging data among research institutions is crucial in order to efficiently respond to global health threats. This can be facilitated by defining health data models based on interoperability standards. In Germany, a national effort is in progress to create common data models using international healthcare IT standards. In this context, collaborative work on a data set module for microbiology is of particular importance as the WHO has declared antimicrobial resistance one of the top global public health threats that humanity is facing. In this article, we describe how we developed a common model for microbiology data in an interdisciplinary collaborative effort and how we make use of the standard HL7 FHIR and terminologies such as SNOMED CT or LOINC to ensure syntactic and semantic interoperability. The use of international healthcare standards qualifies our data model to be adopted beyond the environment where it was first developed and used at an international level.
Subjects
COVID-19; Humans; Pandemics; Germany; Health Facilities; Humanities
ABSTRACT
ORCHESTRA ("Connecting European Cohorts to Increase Common and Effective Response To SARS-CoV-2 Pandemic") is an EU-funded project which aims to help rapidly advance the knowledge related to the prevention of the SARS-CoV-2 infection and the management of COVID-19 and its long-term sequelae. Here, we describe the early results of this project, focusing on the strengths of multiple, international, historical and prospective cohort studies and highlighting those results which are of potential relevance for vaccination strategies, such as the necessity of a vaccine booster dose after a primary vaccination course in hematologic cancer patients and in solid organ transplant recipients to elicit a higher antibody titer, and the protective effect of vaccination on severe COVID-19 clinical manifestation and on the emergence of post-COVID-19 conditions. Valuable data regarding epidemiological variations, risk factors of SARS-CoV-2 infection and its sequelae, and vaccination efficacy in different subpopulations can support further defining public health vaccination policies.
ABSTRACT
The European project ORCHESTRA intends to create a new pan-European cohort to rapidly advance the knowledge of the effects and treatment of COVID-19. Establishing processes that facilitate the merging of heterogeneous clusters of retrospective data was an essential challenge. In addition, data from new ORCHESTRA prospective studies have to be compatible with earlier collected information to be efficiently combined. In this article, we describe how we utilized and contributed to existing standard terminologies to create consistent semantic representation of over 2500 COVID-19-related variables taken from three ORCHESTRA studies. The goal is to enable the semantic interoperability of data within the existing project studies and to create a common basis of standardized elements available for the design of new COVID-19 studies. We also identified 743 variables that were commonly used in two of the three prospective ORCHESTRA studies and can therefore be directly combined for analysis purposes. Additionally, we actively contributed to global interoperability by submitting new concept requests to the terminology Standards Development Organizations.
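Once each study variable carries a standard concept code, finding variables that can be combined across studies reduces to grouping by code. The sketch below illustrates this under stated assumptions: the study names, variable names, and concept codes are all placeholders invented for the example, not entries from the ORCHESTRA mapping or from any real terminology.

```python
from collections import defaultdict


def shared_concepts(studies, min_studies=2):
    """Return concept codes used in at least `min_studies` studies.

    `studies` maps a study name to a dict of {variable name: concept code}.
    The result maps each shared code to the sorted list of studies using it.
    """
    seen = defaultdict(set)
    for study, variables in studies.items():
        for code in variables.values():
            seen[code].add(study)
    return {code: sorted(names)
            for code, names in seen.items()
            if len(names) >= min_studies}


# Placeholder annotations; codes are hypothetical, not real terminology codes.
annotated = {
    "study_a": {"fever_yn": "EX:0001", "sao2_pct": "EX:0002"},
    "study_b": {"has_fever": "EX:0001", "weight_kg": "EX:0003"},
    "study_c": {"oxygen_sat": "EX:0002", "smoker": "EX:0004"},
}
common = shared_concepts(annotated)
print(common)
```

Note that `fever_yn` and `has_fever` differ in name but share a code, which is exactly the situation semantic annotation is meant to resolve.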
ABSTRACT
HiGHmed is a German consortium in which eight university hospitals have agreed to cross-institutional data exchange through novel medical informatics solutions. The HiGHmed Use Case Infection Control group has modelled a set of infection-related data in the openEHR format. In order to establish interoperability with the other German consortia belonging to the same national initiative, we mapped the openEHR information to the Fast Healthcare Interoperability Resources (FHIR) format recommended within the initiative. FHIR enables fast exchange of data thanks to the discrete and independent data elements into which information is organized. Furthermore, to explore the possibility of maximizing analysis capabilities for our data set, we subsequently mapped the FHIR elements to the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM). The OMOP data model is designed to support the conduct of research to identify and evaluate associations between interventions and outcomes caused by these interventions. Mapping across standards makes it possible to exploit the particular strengths of each while establishing and/or maintaining interoperability. This article provides an overview of our experience in mapping infection control related data across three different standards: openEHR, FHIR and OMOP CDM.
Subjects
Medical Informatics; Electronic Health Records; Hospitals, University; Humans
ABSTRACT
Infectious diseases due to microbial resistance pose a worldwide threat that calls for data sharing and the rapid reuse of medical data from health care to research. The integration of pathogen-related data from different hospitals can yield intelligent infection control systems that detect potentially dangerous germs as early as possible. Within the use case Infection Control of the German HiGHmed Project, eight university hospitals have agreed to share their data to enable analysis of various data sources. Data sharing among different hospitals requires interoperability standards that define the structure and the terminology of the information to be exchanged. This article presents the work performed at the University Hospital Charité and Berlin Institute of Health towards a standard model to exchange microbiology data. Fast Healthcare Interoperability Resources (FHIR) is a standard for fast information exchange that models healthcare information as packets called resources, which can be customized into so-called profiles to match use case-specific needs. We show how we created the specific profiles for microbiology data. The model was implemented using FHIR for the structure definition, and the international standards SNOMED CT and LOINC for the terminology services.
Subjects
Logical Observation Identifiers Names and Codes; Systematized Nomenclature of Medicine; Academies and Institutes; Delivery of Health Care; Humans; Information Dissemination
ABSTRACT
INTRODUCTION: Contemporary state-of-the-art management of cancer is increasingly defined by individualized treatment strategies. For very rare tumors, like hepatoblastoma, the development of biologic markers, and the identification of reliable prognostic risk factors for tailoring treatment, remains very challenging. The Children's Hepatic tumors International Collaboration (CHIC) is a novel international response to this challenge. METHODS: Four multicenter trial groups in the world, who have performed prospective controlled studies of hepatoblastoma over the past two decades (COG; SIOPEL; GPOH; and JPLT), joined forces to form the CHIC consortium. With the support of the data management group CINECA, CHIC developed a centralized online platform where data from eight completed hepatoblastoma trials were merged to form a database of 1605 hepatoblastoma cases treated between 1988 and 2008. The resulting dataset is described and the relationships between selected patient and tumor characteristics, and risk for adverse disease outcome (event-free survival; EFS) are examined. RESULTS: Significantly increased risk for EFS-event was noted for advanced PRETEXT group, macrovascular venous or portal involvement, contiguous extrahepatic disease, primary tumor multifocality and tumor rupture at enrollment. Higher age (≥ 8 years), low AFP (<100 ng/ml) and metastatic disease were associated with the worst outcome. CONCLUSION: We have identified novel prognostic factors for hepatoblastoma, as well as confirmed established factors, that will be used to develop a future common global risk stratification system. The mechanics of developing the globally accessible web-based portal, building and refining the database, and performing this first statistical analysis has laid the foundation for future collaborative efforts. 
This is an important step in refining risk-based grouping and the approach to future treatment stratification. We therefore think that our collaboration offers a template for others to follow in the study of rare tumors and diseases.
Subjects
Cooperative Behavior; Databases, Factual; Hepatoblastoma; International Cooperation; Liver Neoplasms; Adolescent; Age Factors; Child; Child, Preschool; Databases, Factual/statistics & numerical data; Disease-Free Survival; Female; Hepatoblastoma/diagnosis; Hepatoblastoma/mortality; Hepatoblastoma/therapy; Humans; Infant; Infant, Newborn; Liver Neoplasms/diagnosis; Liver Neoplasms/mortality; Liver Neoplasms/therapy; Male; Risk Assessment; Risk Factors; Survival Analysis; Time Factors; Treatment Outcome; Young Adult
ABSTRACT
Data that have been collected in the course of clinical trials are potentially valuable for additional scientific research questions in so-called secondary use scenarios. This is of particular importance in rare disease areas like paediatric oncology. If data from several research projects need to be connected, so-called Core Datasets can be used to define which information needs to be extracted from every involved source system. In this work, the utility of the Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM) as a format for Core Datasets was evaluated, and a web tool was developed which received Source ODM XML files and, via Extensible Stylesheet Language Transformation (XSLT), generated standardized Core Dataset ODM XML files. Using this tool, data from different source systems were extracted and pooled for joint analysis in a proof-of-concept study, facilitating both basic syntactic and semantic interoperability.
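The extraction step can be pictured as filtering a source ODM file down to the items that belong to the Core Dataset. The sketch below is not the web tool from the abstract and does not use XSLT. It swaps in Python's standard-library ElementTree to show the same idea. The item identifiers, the core item list, and the heavily simplified ODM document (which omits attributes a complete ODM file would carry) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"
# Keep the ODM namespace as the default namespace when serializing.
ET.register_namespace("", ODM_NS)


def extract_core(source_xml, core_item_oids):
    """Return ODM XML that keeps only ItemData whose ItemOID is in the core set."""
    root = ET.fromstring(source_xml)
    group_tag = f"{{{ODM_NS}}}ItemGroupData"
    item_tag = f"{{{ODM_NS}}}ItemData"
    # Materialize the groups first so the tree is not modified while iterating.
    for group in list(root.iter(group_tag)):
        for item in list(group.findall(item_tag)):
            if item.get("ItemOID") not in core_item_oids:
                group.remove(item)
    return ET.tostring(root, encoding="unicode")


# Simplified, hypothetical source file with one core and one local item.
SOURCE_ODM = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="S.DEMO" MetaDataVersionOID="v1">
    <SubjectData SubjectKey="001">
      <StudyEventData StudyEventOID="SE.BASELINE">
        <FormData FormOID="F.DIAG">
          <ItemGroupData ItemGroupOID="IG.DIAG">
            <ItemData ItemOID="I.AGE_MONTHS" Value="14"/>
            <ItemData ItemOID="I.LOCAL_NOTE" Value="site-specific"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>"""

core_xml = extract_core(SOURCE_ODM, core_item_oids={"I.AGE_MONTHS"})
print(core_xml)
```

A real transformation would also need to rename source item identifiers to the Core Dataset's identifiers and convert values where coding differs, which this sketch leaves out.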