Results 1 - 20 of 21
1.
Stud Health Technol Inform ; 316: 1472-1476, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176482

ABSTRACT

This study advances the utility of synthetic study data in hematology, particularly for Acute Myeloid Leukemia (AML), by facilitating its integration into healthcare systems and research platforms through standardization into the Observational Medical Outcomes Partnership (OMOP) and Fast Healthcare Interoperability Resources (FHIR) formats. In our previous work, we addressed the need for high-quality patient data and used CTAB-GAN+ and Normalizing Flow (NFlow) to synthesize data from 1606 patients across four multicenter AML clinical trials. We published the generated synthetic cohorts, which accurately replicate the distributions of key demographic, laboratory, molecular, and cytogenetic variables, alongside patient outcomes, demonstrating high fidelity and usability. The conversion to the OMOP format opens avenues for comparative observational multi-center research by enabling seamless combination with related OMOP datasets, thereby broadening the scope of AML research. Similarly, standardization into FHIR facilitates the further development of applications, e.g., via the SMART-on-FHIR platform, offering realistic test data. This effort aims to foster a more collaborative research environment and facilitate the development of innovative tools and applications in AML care and research.
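
To make the standardization concrete, the sketch below (not taken from the study) maps one hypothetical synthetic record to a minimal FHIR Patient resource and an OMOP-style PERSON row; field names beyond the core FHIR and OMOP attributes, and the example values, are assumptions.

```python
import json

# A hypothetical synthetic AML study record, as it might appear before standardization.
synthetic_record = {
    "patient_id": "SYN-0001",
    "sex": "female",
    "birth_year": 1962,
    "wbc_count_10e9_per_l": 34.2,   # white blood cell count at diagnosis
}

# Minimal FHIR Patient resource (core fields only; real resources carry identifiers,
# meta/profile references, and extensions for trial-specific attributes).
fhir_patient = {
    "resourceType": "Patient",
    "id": synthetic_record["patient_id"],
    "gender": synthetic_record["sex"],
    "birthDate": str(synthetic_record["birth_year"]),
}

# Minimal OMOP CDM PERSON row; 8532 and 8507 are the OMOP standard gender concepts
# for "female" and "male".
omop_person = {
    "person_id": 1,
    "gender_concept_id": 8532 if synthetic_record["sex"] == "female" else 8507,
    "year_of_birth": synthetic_record["birth_year"],
    "person_source_value": synthetic_record["patient_id"],
}

print(json.dumps(fhir_patient, indent=2))
print(omop_person)
```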


Subject(s)
Leukemia, Myeloid, Acute , Humans , Hematology , Health Information Interoperability , Electronic Health Records , Outcome Assessment, Health Care
2.
Stud Health Technol Inform ; 316: 1555-1559, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176503

ABSTRACT

Predicting resource utilization can help to optimize the distribution of limited resources in the healthcare system. This requires climatic and medical data from different sources, which can lead to interoperability problems. In this paper, we describe which data are needed for the prediction and how these data can be made interoperable using the OMOP CDM.
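
As a rough illustration of the kind of OMOP-oriented representation the paper discusses, the following sketch places a daily climate reading next to a heat-related clinical event so the two can be joined by date; concept IDs and field choices are placeholders, not the authors' actual mapping.

```python
from datetime import date

# Placeholder concept IDs; real ones would come from the OMOP standardized vocabularies,
# and the paper's actual modelling choices may differ.
CONCEPT_OUTDOOR_TEMPERATURE = 0
CONCEPT_HEAT_EXHAUSTION = 0

# Daily climate reading kept in an OBSERVATION-like structure so it can be joined
# with clinical events by date during the prediction workflow.
climate_observation = {
    "observation_concept_id": CONCEPT_OUTDOOR_TEMPERATURE,
    "observation_date": date(2024, 7, 15),
    "value_as_number": 36.4,   # daily maximum temperature in degrees Celsius
}

# A heat-related admission from the hospital information system, as an OMOP-style
# CONDITION_OCCURRENCE row.
condition_occurrence = {
    "condition_occurrence_id": 1,
    "person_id": 42,
    "condition_concept_id": CONCEPT_HEAT_EXHAUSTION,
    "condition_start_date": date(2024, 7, 15),
}

# Joining the two sources by date yields the feature/outcome pairs needed for
# resource-utilization forecasting.
same_day = climate_observation["observation_date"] == condition_occurrence["condition_start_date"]
print(same_day)   # True
```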


Subject(s)
Forecasting , Humans , Electronic Health Records , Hot Temperature , Health Information Interoperability
3.
Stud Health Technol Inform ; 316: 1754-1758, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176555

ABSTRACT

Clinical decision support systems (CDSS) can efficiently support doctors in coping with ever-increasing amounts of data by providing evidence-based recommendations for medical decisions. To integrate the systems into the medical workflow and provide patient-specific recommendations for action in the context of personalized medicine, it is essential to tailor the systems to the context of use. This study aims to present an overview of factors influencing medical decision-making that CDSS must consider. Our approach involves the systematic identification and categorization of contextual factors relevant to medical decision-making. Through extensive literature research and a structured card-sorting workshop, we systematized 774 context factors and mapped them into a model. This model includes six primary entities: the treating physician, the patient, the patient's family, disease treatment, the physician's institution, and professional colleagues, each with their relevant context categories. The developed model could serve as a foundation for communication between developers and physicians, supporting the creation of more context-sensitive CDSS in the future. Ultimately, this could enhance the utilization of CDSS and improve patient care.
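
A minimal sketch of how the six primary entities of the context model could be represented in software; the concrete context categories listed here are invented examples, since the abstract does not enumerate them.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContextEntity:
    """One primary entity of the context model with its relevant context categories."""
    name: str
    context_categories: List[str] = field(default_factory=list)

# Entity names follow the abstract; the categories are illustrative placeholders.
context_model = [
    ContextEntity("treating physician", ["experience", "workload"]),
    ContextEntity("patient", ["comorbidities", "preferences"]),
    ContextEntity("patient's family", ["involvement in decisions"]),
    ContextEntity("disease treatment", ["guideline availability", "urgency"]),
    ContextEntity("physician's institution", ["available resources"]),
    ContextEntity("professional colleagues", ["second opinions"]),
]

# A CDSS could check which context categories are populated for the current encounter
# before tailoring its recommendations.
for entity in context_model:
    print(entity.name, "->", entity.context_categories)
```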


Subject(s)
Clinical Decision-Making , Decision Support Systems, Clinical , Humans
4.
Stud Health Technol Inform ; 316: 1161-1162, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176586

ABSTRACT

The evaluation of clinical utility is essential for the successful adoption of new technology in clinical practice. An approach to evaluating clinical utility is presented here using the example of digitized measurement instruments.


Subject(s)
Patient Care , Humans , Patient Care/standards
5.
Stud Health Technol Inform ; 316: 570-574, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176806

ABSTRACT

This paper reports lessons learned during the early phases of the user-centered design process for an explanation user interface for an AI-based clinical decision support system for the intensive care unit. It focuses on identifying and verifying physicians' explanation needs in a multi-center, multi-country project. The explanation needs identified through context analysis and user requirements prioritization in an initial center differed from those identified through questionnaire responses from N = 9 physicians after a multi-center project workshop. These results highlight the caution that should be taken when eliciting explanation needs during the user-centered design process.


Subject(s)
Artificial Intelligence , Decision Support Systems, Clinical , User-Computer Interface , User-Centered Design , Humans , Intensive Care Units
6.
Stud Health Technol Inform ; 316: 643-644, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176823

ABSTRACT

The integration of artificial intelligence (AI) algorithms into clinical practice holds immense potential to improve patient care, but widespread adoption still faces significant challenges, including interoperability issues. We propose a concept for the agile development of an IT platform to integrate AI-based applications into clinical workflows for a use case in ophthalmology.


Subject(s)
Artificial Intelligence , Systems Integration , Ophthalmology , Decision Support Systems, Clinical/organization & administration , Humans , Electronic Health Records , Algorithms , Workflow
7.
Orphanet J Rare Dis ; 19(1): 298, 2024 Aug 14.
Article in English | MEDLINE | ID: mdl-39143600

ABSTRACT

BACKGROUND: Given the geographical sparsity of Rare Diseases (RDs), assembling a cohort is often a challenging task. Common data models (CDM) can harmonize disparate sources of data that can be the basis of decision support systems and artificial intelligence-based studies, leading to new insights in the field. This work seeks to support the design of large-scale multi-center studies for rare diseases. METHODS: In an interdisciplinary group, we derived a list of RD data elements in three medical domains (endocrinology, gastroenterology, and pneumonology) according to specialist knowledge and clinical guidelines in an iterative process. We then defined an RD data structure that matched all our data elements and built Extract, Transform, Load (ETL) processes to transfer the structure to a joint CDM. To ensure interoperability of our developed CDM and its subsequent usage for further RD domains, we ultimately mapped it to the Observational Medical Outcomes Partnership (OMOP) CDM. We then included a fourth domain, hematology, as a proof of concept and mapped an acute myeloid leukemia (AML) dataset to the developed CDM. RESULTS: We have developed an OMOP-based rare diseases common data model (RD-CDM) using data elements from the three domains (endocrinology, gastroenterology, and pneumonology) and tested the CDM using data from the hematology domain. The total study cohort included 61,697 patients. After aligning our modules with those of the Medical Informatics Initiative (MII) Core Dataset (CDS), we leveraged its ETL process. This facilitated the seamless transfer of the demographic information, diagnosis, procedure, laboratory result, and medication modules from our RD-CDM to OMOP. For the phenotypes and genotypes, we developed a second ETL process. We finally derived lessons learned for customizing our RD-CDM for different RDs. DISCUSSION: This work can serve as a blueprint for other domains, as its modularized structure could be extended towards novel data types. An interdisciplinary group of stakeholders that actively supports the project's progress is necessary to reach a comprehensive CDM. CONCLUSION: The customized data structure related to our RD-CDM can be used to perform multi-center studies to test data-driven hypotheses on a larger scale and to take advantage of the analytical tools offered by the OHDSI community.
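
The following sketch illustrates the general shape of such an ETL step (not the project's actual code): a source diagnosis is semantically mapped to an OMOP-style CONDITION_OCCURRENCE row. The source field names, the mapping table, and the concept_id are placeholders.

```python
from datetime import date

# Hypothetical source record from a rare-disease module (field names are assumptions).
source_diagnosis = {
    "pseudonym": "RD-0815",
    "icd10_code": "C92.0",          # acute myeloid leukaemia
    "diagnosis_date": "2021-03-02",
}

# Simplified semantic mapping: source code -> OMOP standard concept_id (placeholder value).
# In a real ETL this lookup is driven by the OMOP vocabulary tables, not a hard-coded dict.
ICD10_TO_OMOP = {"C92.0": 0}

# Pseudonym -> person_id, resolved in an earlier step of the pipeline.
person_ids = {"RD-0815": 7}

def transform(record: dict) -> dict:
    """Transform one source diagnosis into an OMOP-style CONDITION_OCCURRENCE row."""
    return {
        "person_id": person_ids[record["pseudonym"]],
        "condition_concept_id": ICD10_TO_OMOP[record["icd10_code"]],
        "condition_start_date": date.fromisoformat(record["diagnosis_date"]),
        "condition_source_value": record["icd10_code"],
    }

print(transform(source_diagnosis))
```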


Subject(s)
Rare Diseases , Humans
8.
Dtsch Arztebl Int ; (Forthcoming), 2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39109409

ABSTRACT

BACKGROUND: We studied whether an individualized digital decision aid can improve decision-making quality for or against knee arthroplasty. METHODS: An app-based decision aid (EKIT tool) was developed and studied in a stepped-wedge, cluster-randomized trial. Consecutive patients with knee osteoarthritis who were candidates for knee replacement were included in 10 centers in Germany. All subjects were asked via app on a tablet about their symptoms, prior treatments, and preferences and goals for treatment. For the subjects in the intervention group, the EKIT tool was used in the doctor-patient discussion to visualize the individual disease burden and degree of fulfillment of the indication criteria, and structured information on knee arthroplasty was provided. In the control group, the discussion was conducted without the EKIT tool in accordance with the local standard in each participating center. The primary endpoint was the quality of the patient's decision on the basis of the discussion of indications, as measured with the Hip and Knee Quality Decision Instrument (HK-DQI). (Registration number: ClinicalTrials.gov:NCT04837053). RESULTS: 1092 patients were included, and data from 1055 patients were analyzed (616 in the intervention group and 439 in the control group). Good decision quality, as rated by the HK-DQI, was achieved by 86.0% of patients in the intervention group and 67.4% of patients in the control group (relative risk, 1.24; 95 % confidence interval, [1.15; 1.33]). CONCLUSION: A digital decision aid significantly improved the quality of decision-making for or against knee replacement surgery. The widespread use of this instrument may have an even larger effect, as this trial was conducted mainly in hospitals with high case numbers.

9.
Digit Health ; 10: 20552076241265219, 2024.
Article in English | MEDLINE | ID: mdl-39130526

ABSTRACT

Objective: Unlocking the potential of routine medical data for clinical research requires the analysis of data from multiple healthcare institutions. However, according to German data protection regulations, data can often not leave the individual institutions, and decentralized approaches are needed. Decentralized studies face challenges regarding coordination, technical infrastructure, interoperability, and regulatory compliance. Rare diseases are an important prototype research focus for decentralized data analyses, as patients are rare by definition and adequate cohort sizes can only be reached if data from multiple sites are combined. Methods: Within the project "Collaboration on Rare Diseases", decentralized studies focusing on four rare diseases (cystic fibrosis, phenylketonuria, Kawasaki disease, multisystem inflammatory syndrome in children) were conducted at 17 German university hospitals. To this end, a data management process for decentralized studies was developed by an interdisciplinary team of experts from medicine, public health, and data science. Along the process, lessons learned were formulated and discussed. Results: The process consists of eight steps and includes sub-processes for the definition of medical use cases, script development, and data management. The lessons learned concern, on the one hand, the organization and administration of the studies (collaboration of experts, use of standardized forms, and publication of project information) and, on the other hand, the development of scripts and analyses (dependency on the database, use of standards and open-source tools, feedback loops, anonymization). Conclusions: This work captures central challenges and describes possible solutions; it can hence serve as a solid basis for the implementation and conduct of similar decentralized studies.

10.
Sci Rep ; 14(1): 16239, 2024 07 14.
Article in English | MEDLINE | ID: mdl-39004643

ABSTRACT

Aiming to apply automatic arousal detection to support sleep laboratories, we evaluated an optimized, state-of-the-art approach using data from daily work in our university hospital sleep laboratory. To this end, a machine learning algorithm was trained and evaluated on 3423 polysomnograms of people with various sleep disorders. The model architecture is a U-Net that accepts 50 Hz signals as input. We compared this algorithm with models trained on publicly available datasets and evaluated these models using our clinical dataset, particularly with regard to the effects of different sleep disorders. To evaluate clinical relevance, we designed a metric based on the error of the predicted arousal index. Our models achieve an area under the precision-recall curve (AUPRC) of up to 0.83 and F1 scores of up to 0.81. The model trained on our data showed no age or gender bias and no significant negative effect of sleep disorders on model performance compared to healthy sleep. In contrast, models trained on public datasets showed a small to moderate negative effect (calculated using Cohen's d) of sleep disorders on model performance. Therefore, we conclude that state-of-the-art arousal detection on our clinical data is possible with our model architecture. Thus, our results support the general recommendation to use a clinical dataset for training if the model is to be applied to clinical data.
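
A small sketch of how the reported evaluation metrics could be computed on toy data; the exact definition of the arousal-index error used in the study is an assumption here.

```python
import numpy as np
from sklearn.metrics import average_precision_score, f1_score

rng = np.random.default_rng(0)

# Toy stand-ins for per-segment arousal labels and model scores (the real model scores
# 50 Hz signal segments from full polysomnograms).
y_true = rng.integers(0, 2, size=1000)
y_score = np.clip(y_true * 0.6 + rng.normal(0.3, 0.2, size=1000), 0, 1)
y_pred = (y_score >= 0.5).astype(int)

auprc = average_precision_score(y_true, y_score)   # area under the precision-recall curve
f1 = f1_score(y_true, y_pred)

# A clinically motivated metric in the spirit of the abstract: error of the predicted
# arousal index (arousals per hour of sleep); the exact formula is assumed.
hours_of_sleep = 8.0
arousal_index_error = abs(y_pred.sum() - y_true.sum()) / hours_of_sleep

print(f"AUPRC={auprc:.2f}  F1={f1:.2f}  arousal-index error={arousal_index_error:.1f}/h")
```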


Subject(s)
Arousal , Machine Learning , Polysomnography , Sleep Wake Disorders , Sleep , Humans , Arousal/physiology , Polysomnography/methods , Female , Male , Middle Aged , Sleep Wake Disorders/diagnosis , Sleep Wake Disorders/physiopathology , Adult , Sleep/physiology , Algorithms , Aged
11.
Health Informatics J ; 30(2): 14604582241259322, 2024.
Article in English | MEDLINE | ID: mdl-38855877

ABSTRACT

Patients with rare diseases commonly suffer from severe symptoms as well as chronic and sometimes life-threatening effects. Not only the rarity of the diseases but also their poor documentation often leads to an immense delay in diagnosis. One of the main problems here is inadequate coding with common classifications such as the International Statistical Classification of Diseases and Related Health Problems. By contrast, the ORPHAcode enables precise naming of the diseases. So far, only a few approaches report in detail how the ORPHAcode is technically implemented in clinical practice and for research. We present a concept and implementation for storing and mapping ORPHAcodes. The Transition Database for Rare Diseases contains all the information of the Orphanet catalog and serves as the basis for documentation in the clinical information system as well as for monitoring Key Performance Indicators for rare diseases at the hospital. The five-step process for setting up the Transition Database (notably using open-source tools and the Data Vault 2.0 logic) allows the approach to be adapted to local conditions as well as extended to additional terminologies and ontologies.
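
As an illustration of why ORPHAcode mapping matters, the sketch below shows a simple lookup from ICD-10 codes to ORPHAcodes; the mapping entries are examples only, and the real Transition Database is built from the full Orphanet catalog rather than a hard-coded dictionary.

```python
# Illustrative lookup from ICD-10 codes to ORPHAcodes (example entries only).
ICD10_TO_ORPHA = {
    "E84.0": ["586"],   # cystic fibrosis
    "E70.0": ["716"],   # classical phenylketonuria
}

def orphacodes_for(icd10_code: str) -> list[str]:
    """Return candidate ORPHAcodes for an ICD-10 code; empty if no precise mapping exists."""
    return ICD10_TO_ORPHA.get(icd10_code, [])

# The lookup makes the limitation of ICD-10 visible: common classifications often cannot
# name a rare disease precisely, whereas the ORPHAcode can.
print(orphacodes_for("E84.0"))   # ['586']
print(orphacodes_for("I10"))     # [] - not a rare disease
```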


Subject(s)
Databases, Factual , Documentation , Rare Diseases , Rare Diseases/classification , Rare Diseases/diagnosis , Humans , Documentation/methods , Documentation/standards , International Classification of Diseases/trends , International Classification of Diseases/standards
12.
Front Med (Lausanne) ; 11: 1377209, 2024.
Article in English | MEDLINE | ID: mdl-38903818

ABSTRACT

Introduction: Obtaining real-world data from routine clinical care is of growing interest for scientific research and personalized medicine. Despite the abundance of medical data across various facilities - including hospitals, outpatient clinics, and physician practices - the intersectoral exchange of information remains largely hindered by differences in data structure, content, and adherence to data protection regulations. In response to this challenge, the Medical Informatics Initiative (MII) was launched in Germany, focusing initially on university hospitals to foster the exchange and utilization of real-world data through the development of standardized methods and tools, including the creation of a common core dataset. Our aim, as part of the Medical Informatics Research Hub in Saxony (MiHUBx), is to extend the MII concepts to non-university healthcare providers in a more seamless manner to enable the exchange of real-world data among intersectoral medical sites. Methods: We investigated which services are needed to facilitate the provision of harmonized real-world data for cross-site research. On this basis, we designed a Service Platform Prototype that hosts services for data harmonization, adhering to the globally recognized Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) international standard communication format and the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). Leveraging these standards, we implemented additional services facilitating data utilization, exchange, and analysis. Throughout the development phase, we collaborated with an interdisciplinary team of experts from the fields of system administration, software engineering, and technology acceptance to ensure that the solution is sustainable and reusable in the long term. Results: We have developed the pre-built packages "ResearchData-to-FHIR," "FHIR-to-OMOP," and "Addons," which provide services for data harmonization and for the provision of project-related real-world data in both the FHIR MII Core Dataset (CDS) format and the OMOP CDM format, as well as for data utilization, together with a Service Platform Prototype to streamline data management and use. Conclusion: Our development shows a possible approach to extending the MII concepts to non-university healthcare providers to enable cross-site research on real-world data. Our Service Platform Prototype can thus pave the way for intersectoral data sharing, federated analysis, and the provision of SMART-on-FHIR applications to support clinical decision making.
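
A minimal sketch of the kind of transformation a FHIR-to-OMOP service performs, using a single laboratory Observation; it is not the published package, and the OMOP concept_id passed in is a placeholder.

```python
# A minimal FHIR R4 Observation (laboratory value), as a harmonization step might receive it.
fhir_observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org", "code": "718-7", "display": "Hemoglobin"}]},
    "subject": {"reference": "Patient/7"},
    "effectiveDateTime": "2023-05-04",
    "valueQuantity": {"value": 13.2, "unit": "g/dL"},
}

def fhir_to_omop_measurement(obs: dict, concept_lookup: dict) -> dict:
    """Map one FHIR Observation to an OMOP-style MEASUREMENT row (simplified)."""
    loinc = obs["code"]["coding"][0]["code"]
    return {
        "person_id": int(obs["subject"]["reference"].split("/")[1]),
        "measurement_concept_id": concept_lookup.get(loinc, 0),
        "measurement_date": obs["effectiveDateTime"],
        "value_as_number": obs["valueQuantity"]["value"],
        "measurement_source_value": loinc,
    }

# The concept_id below is a placeholder; a real service resolves it via the OMOP vocabularies.
print(fhir_to_omop_measurement(fhir_observation, {"718-7": 0}))
```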

14.
Article in German | MEDLINE | ID: mdl-38750239

ABSTRACT

Health data are extremely important in today's data-driven world. Through automation, healthcare processes can be optimized and clinical decisions can be supported. For any reuse of data, their quality, validity, and trustworthiness are essential; only then can it be guaranteed that data are reused sensibly. Specific requirements for the description and coding of reusable data are defined in the FAIR (Findable, Accessible, Interoperable, Reusable) guiding principles for data stewardship. Various national research associations and infrastructure projects in the German healthcare sector have already clearly positioned themselves on the FAIR principles: both the infrastructures of the Medical Informatics Initiative and the University Medicine Network operate explicitly on the basis of the FAIR principles, as do the National Research Data Infrastructure for Personal Health Data and the German Center for Diabetes Research. To ensure that a resource complies with the FAIR principles, the degree of FAIRness should first be determined (so-called FAIR assessment), followed by the prioritization of improvement steps (so-called FAIRification). Since 2016, a set of tools and guidelines has been developed for both steps, based on the different, domain-specific interpretations of the FAIR principles. Neighboring European countries have also invested in the development of national frameworks for semantic interoperability in the context of the FAIR principles. Concepts for comprehensive data enrichment were developed to simplify data analysis, for example, in the European Health Data Space or via the Observational Health Data Sciences and Informatics network. With the support of, among others, the European Open Science Cloud, structured FAIRification measures have already been taken for German health datasets.


Subject(s)
Electronic Health Records , Humans , Germany , Internationality , National Health Programs
15.
Article in German | MEDLINE | ID: mdl-38753021

ABSTRACT

The digital health progress hubs pilot the extensibility of the concepts and solutions of the Medical Informatics Initiative to improve regional healthcare and research. The six funded projects address different diseases, areas of regional healthcare, and methods of cross-institutional data linking and use. Despite the diversity of the scenarios and regional conditions, the technical, regulatory, and organizational challenges and barriers that the progress hubs encounter in the actual implementation of the solutions are often similar. This results in some common approaches to solutions, but also in political demands that go beyond the Health Data Utilization Act, which the progress hubs consider a welcome improvement. In this article, we present the digital progress hubs and discuss achievements, challenges, and approaches to solutions that enable the shared use of data from university hospitals and non-academic institutions in the healthcare system and can make a sustainable contribution to improving medical care and research.


Subject(s)
Hospitals, University , Hospitals, University/organization & administration , Germany , Humans , Medical Record Linkage/methods , Electronic Health Records/trends , Models, Organizational , National Health Programs/trends , National Health Programs/organization & administration , Medical Informatics/organization & administration , Medical Informatics/trends , Digital Health
16.
NPJ Digit Med ; 7(1): 76, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38509224

ABSTRACT

Clinical research relies on high-quality patient data; however, obtaining big data sets is costly, and access to existing data is often hindered by privacy and regulatory concerns. Synthetic data generation holds the promise of effectively bypassing these boundaries, allowing for simplified data accessibility and the prospect of synthetic control cohorts. We employed two different methodologies of generative artificial intelligence - CTAB-GAN+ and normalizing flows (NFlow) - to synthesize patient data derived from 1606 patients with acute myeloid leukemia, a heterogeneous hematological malignancy, who were treated within four multicenter clinical trials. Both generative models accurately captured distributions of demographic, laboratory, molecular, and cytogenetic variables, as well as patient outcomes, yielding high performance scores regarding fidelity and usability of both synthetic cohorts (n = 1606 each). Survival analysis demonstrated close resemblance of survival curves between original and synthetic cohorts. Inter-variable relationships were preserved in univariable outcome analysis, enabling explorative analysis in our synthetic data. Additionally, training-sample privacy is safeguarded, mitigating possible patient re-identification, which we quantified using Hamming distances. We provide not only a proof of concept for synthetic data generation in multimodal clinical data for rare diseases, but also full public access to the synthetic data sets to foster further research.
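
A toy sketch of the privacy check described above: the minimum Hamming distance of each synthetic record to the real cohort, computed over discretized attributes. The data and feature encoding are invented; only the distance idea comes from the abstract.

```python
import numpy as np

rng = np.random.default_rng(1)
real = rng.integers(0, 4, size=(200, 12))        # 200 real patients, 12 categorical features
synthetic = rng.integers(0, 4, size=(200, 12))   # 200 synthetic patients

def min_hamming_distances(synth: np.ndarray, ref: np.ndarray) -> np.ndarray:
    """Minimum Hamming distance of each synthetic row to any row of the reference cohort."""
    # Broadcasting: (n_synth, 1, n_feat) != (1, n_ref, n_feat) -> count mismatches per pair.
    mismatches = (synth[:, None, :] != ref[None, :, :]).sum(axis=2)
    return mismatches.min(axis=1)

# Larger minimum distances make re-identification of training samples less likely.
d = min_hamming_distances(synthetic, real)
print(f"median minimum Hamming distance: {np.median(d):.0f} of 12 features")
```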

17.
JMIR Med Inform ; 12: e52967, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38354027

ABSTRACT

BACKGROUND: Multisite clinical studies are increasingly using real-world data to gain real-world evidence. However, due to the heterogeneity of source data, it is difficult to analyze such data in a unified way across clinics. Therefore, the implementation of Extract-Transform-Load (ETL) or Extract-Load-Transform (ELT) processes for harmonizing local health data is necessary in order to guarantee the data quality for research. However, the development of such processes is time-consuming and unsustainable. A promising way to ease this is the generalization of ETL/ELT processes. OBJECTIVE: In this work, we investigate existing possibilities for the development of generic ETL/ELT processes. In particular, we focus on approaches with low development complexity that use descriptive and structural metadata. METHODS: We conducted a literature review following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We used 4 publication databases (PubMed, IEEE Xplore, Web of Science, and BioMed Central) to search for relevant publications from 2012 to 2022. The PRISMA flow was then visualized using an R-based tool (Evidence Synthesis Hackathon). All relevant contents of the publications were extracted into a spreadsheet for further analysis and visualization. RESULTS: Following the PRISMA guidelines, we included 33 publications in this literature review. All included publications were categorized into 7 different focus groups (medicine, data warehouse, big data, industry, geoinformatics, archaeology, and military). Based on the extracted data, ontology-based and rule-based approaches were the 2 most frequently used approaches across the thematic categories. Different approaches and tools were chosen to achieve different purposes within the use cases. CONCLUSIONS: Our literature review shows that using metadata-driven (MDD) approaches to develop an ETL/ELT process can serve different purposes in different thematic categories. The results show that it is promising to implement an ETL/ELT process by applying an MDD approach to automate the data transformation from Fast Healthcare Interoperability Resources to the Observational Medical Outcomes Partnership Common Data Model. However, determining an appropriate MDD approach and tool to implement such an ETL/ELT process remains a challenge. This is due to the lack of comprehensive insight into the characteristics of the MDD approaches presented in this study. Therefore, our next step is to evaluate the MDD approaches presented here, determine the most appropriate ones, and establish how to integrate them into the ETL/ELT process. This could verify the ability of MDD approaches to generalize the ETL process for harmonizing medical data.
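
To illustrate what a metadata-driven (MDD) transformation looks like in principle, the sketch below drives a generic transform function from declarative mapping rules; the rules and field names are invented, and this is not one of the reviewed tools.

```python
# Mapping rules expressed as metadata: each rule declares a source field, a target field,
# and a type cast. The transformation logic itself stays generic.
MAPPING_RULES = [
    {"source_field": "lab_value", "target_field": "value_as_number", "cast": float},
    {"source_field": "lab_date",  "target_field": "measurement_date", "cast": str},
    {"source_field": "loinc",     "target_field": "measurement_source_value", "cast": str},
]

def generic_transform(record: dict, rules: list[dict]) -> dict:
    """Apply declarative mapping rules to one source record."""
    out = {}
    for rule in rules:
        if rule["source_field"] in record:
            out[rule["target_field"]] = rule["cast"](record[rule["source_field"]])
    return out

source = {"lab_value": "13.2", "lab_date": "2023-05-04", "loinc": "718-7"}
print(generic_transform(source, MAPPING_RULES))
# Adding a new source field only requires a new rule (metadata), not new transformation code.
```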

18.
BMC Med Inform Decis Mak ; 24(1): 58, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38408983

ABSTRACT

BACKGROUND: To gain insight into the real-life care of patients in the healthcare system, data from hospital information systems and insurance systems are required. Consequently, linking clinical data with claims data is necessary. To ensure their syntactic and semantic interoperability, the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) from the Observational Health Data Sciences and Informatics (OHDSI) community was chosen. However, there is no detailed guide that would allow researchers to follow a generic process for data harmonization, i.e., the transformation of local source data into the standardized OMOP CDM format. Thus, the aim of this paper is to conceptualize a generic data harmonization process for OMOP CDM. METHODS: For this purpose, we conducted a literature review focusing on publications that address the harmonization of clinical or claims data in OMOP CDM. Subsequently, the process steps used and their chronological order, as well as the applied OHDSI tools, were extracted for each included publication. The results were then compared to derive a generic sequence of the process steps. RESULTS: From the 23 included publications, a generic data harmonization process for OMOP CDM was conceptualized, consisting of nine process steps: dataset specification, data profiling, vocabulary identification, coverage analysis of vocabularies, semantic mapping, structural mapping, extract-transform-load process, and qualitative and quantitative data quality analysis. Furthermore, we identified seven OHDSI tools that supported five of the process steps. CONCLUSIONS: The generic data harmonization process can be used as a step-by-step guide to assist other researchers in harmonizing source data in OMOP CDM.
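
The nine process steps named in the abstract, encoded as an ordered checklist that a harmonization project might track per data source; the status values are illustrative.

```python
# The nine generic process steps, in the order derived by the review.
HARMONIZATION_STEPS = [
    "dataset specification",
    "data profiling",
    "vocabulary identification",
    "coverage analysis of vocabularies",
    "semantic mapping",
    "structural mapping",
    "extract-transform-load process",
    "qualitative data quality analysis",
    "quantitative data quality analysis",
]

# Track progress per data source; the statuses below are examples.
progress = {step: "open" for step in HARMONIZATION_STEPS}
progress["dataset specification"] = "done"
progress["data profiling"] = "in progress"

for i, step in enumerate(HARMONIZATION_STEPS, start=1):
    print(f"{i}. {step}: {progress[step]}")
```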


Subject(s)
Medical Informatics , Vocabulary , Humans , Databases, Factual , Data Science , Semantics , Electronic Health Records
19.
PLoS One ; 19(1): e0297039, 2024.
Article in English | MEDLINE | ID: mdl-38295046

ABSTRACT

BACKGROUND: The COVID-19 pandemic revealed a need for better collaboration among research, care, and management in Germany as well as globally. Initially, there was a high demand for broad data collection across Germany, but as the pandemic evolved, localized data became increasingly necessary. Customized dashboards and tools were rapidly developed to provide timely and accurate information. In Saxony, the DISPENSE project was created to predict short-term hospital bed capacity demands, and while it was successful, continuous adjustments and the initially monolithic system architecture of the application made it difficult to customize and scale. METHODS: To analyze the current state of the DISPENSE tool, we conducted an in-depth analysis of the data processing steps and identified the data flows underlying users' metrics and dashboards. We also conducted a workshop to understand the different views and constraints of specific user groups, and brought together and clustered the information according to content-related service areas to determine functionality-related service groups. Based on this analysis, we developed a concept for the system architecture, modularized the main services by assigning specialized applications, and integrated them into the existing system, allowing for self-service reporting and evaluation of the expert groups' needs. RESULTS: We analyzed the application's data flow and identified specific user groups. The functionalities of the monolithic application were divided into specific service groups for data processing, data storage, predictions, content visualization, and user management. After composition and implementation, we evaluated the new system architecture against the initial requirements by enabling self-service reporting for the users. DISCUSSION: Modularizing the monolithic application and creating a more flexible system addressed the challenges of rapidly changing requirements, the growing need for information, and high administrative effort. CONCLUSION: We demonstrated improved adaptation to the needs of various user groups, increased efficiency, and a reduced burden on administrators, while also enabling self-service functionalities and the specialization of single applications on individual service groups.


Subject(s)
Information Storage and Retrieval , Pandemics , Humans , Data Collection , Germany
20.
Sci Rep ; 14(1): 2287, 2024 01 27.
Article in English | MEDLINE | ID: mdl-38280887

ABSTRACT

The emergence of collaborations that standardize and combine multiple clinical databases across different regions provides a rich source of data, which is fundamental for clinical prediction models such as patient-level prediction. With the aid of such large data pools, researchers are able to develop clinical prediction models for improved disease classification, risk assessment, and beyond. To fully utilize this potential, Machine Learning (ML) methods are commonly required to process these large amounts of data on disease-specific patient cohorts. As a consequence, the Observational Health Data Sciences and Informatics (OHDSI) collaborative has developed a framework to facilitate the application of ML models to these standardized patient datasets by using the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). In this study, we compare the feasibility of current web-based OHDSI approaches, namely ATLAS and "Patient-Level Prediction" (PLP), against a native, R-based solution for conducting such ML-based patient-level prediction analyses in OMOP. This will enable potential users to select the most suitable approach for their investigation. Each of the applied ML solutions was individually utilized to solve the same patient-level prediction task. Both approaches went through an exemplary benchmarking analysis to assess the weaknesses and strengths of the PLP R package. Its performance was subsequently compared with that of the commonly used native R package Machine Learning in R 3 (mlr3) and its sub-packages. The approaches were evaluated on performance, execution time, and ease of model implementation. The results show that the PLP package has shorter execution times, which indicates great scalability, as well as intuitive code implementation and numerous possibilities for visualization. However, limitations in comparison to native packages became apparent in the implementation of specific ML classifiers (e.g., Lasso), which may result in decreased performance for real-world prediction problems. The findings here contribute to the overall effort of developing ML-based prediction models on a clinical scale and provide a snapshot for future studies that explicitly aim to develop patient-level prediction models in OMOP CDM.
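
A miniature analogue of the benchmarking design on synthetic data, comparing two classifiers on the same prediction task by AUPRC and training time; it stands in for neither the PLP package nor mlr3, and the data set is artificial.

```python
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Imbalanced toy cohort standing in for a patient-level prediction task.
X, y = make_classification(n_samples=5000, n_features=30, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

# Train both models on the same task and compare discrimination and execution time.
for name, model in [("logistic regression", LogisticRegression(max_iter=1000)),
                    ("gradient boosting", GradientBoostingClassifier())]:
    start = time.perf_counter()
    model.fit(X_tr, y_tr)
    runtime = time.perf_counter() - start
    auprc = average_precision_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUPRC={auprc:.2f}, training time={runtime:.2f}s")
```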


Subject(s)
Machine Learning , Medical Informatics , Humans , Databases, Factual , Electronic Health Records