Results 1 - 20 of 102
1.
Stud Health Technol Inform ; 316: 1465-1466, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176480

ABSTRACT

Key Research Areas (KRAs) were identified to establish a semantic interoperability framework for intensive medicine data in Europe. These include assessing common data model value, ensuring smooth data interoperability, supporting data standardization for efficient dataset use, and defining anonymization requirements to balance data protection and innovation.


Subject(s)
Electronic Health Records , Europe , Humans , Health Information Interoperability , Critical Care , Computer Security , Semantics
2.
Stud Health Technol Inform ; 316: 1577-1581, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176509

ABSTRACT

Hospital laboratory results are a significant data source in Clinical Data Warehouses (CDW). To ensure comparability across healthcare organizations and for use in research studies, the results need to be interoperable. The LOINC (Logical Observation Identifiers, Names, and Codes) terminology provides a unique identifier for local codes for lab tests, enabling interoperability. However, in the real world, events occur over time and can disrupt the distribution of lab result values. For example, new equipment may be added to the analysis pipeline, a machine may be replaced, formulas may evolve due to new scientific knowledge, and legacy terminologies may be adopted. This article proposes a pipeline for creating an automated dashboard to monitor these events and data quality. We used automatic change point detection methods such as PELT for event detection in lab results. For a given LOINC code, we create a dashboard that summarizes the number of local codes mapped, and the number of patients (by sex, age, and hospital service) associated with the code. Finally, the dashboard enables the visualization of time events that disrupt the signal distribution. The biologists were able to explain to us the changes for several biological assays.
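
Editorial illustration: the abstract names PELT for change point detection but does not state the cost function or penalty. The sketch below is a minimal pure-Python PELT that assumes a squared-error (mean shift) cost; the function name `pelt_mean_shift` and the `penalty` value are hypothetical choices, not the authors' configuration.

```python
from typing import List


def pelt_mean_shift(signal: List[float], penalty: float) -> List[int]:
    """Return the indices where a new segment starts (assumed squared-error cost)."""
    n = len(signal)
    # Prefix sums give the cost of any segment in O(1).
    s1, s2 = [0.0], [0.0]
    for x in signal:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(a: int, b: int) -> float:
        # Sum of squared deviations from the mean over signal[a:b].
        seg_sum = s1[b] - s1[a]
        return (s2[b] - s2[a]) - seg_sum * seg_sum / (b - a)

    best = [0.0] * (n + 1)   # best[t]: minimal penalized cost of signal[:t]
    best[0] = -penalty
    prev = [0] * (n + 1)     # prev[t]: start of the last segment ending at t
    candidates = [0]
    for t in range(1, n + 1):
        totals = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], prev[t] = min(totals)
        # PELT pruning: drop starts that can never be optimal again.
        candidates = [s for total, s in totals
                      if total - penalty <= best[t] + 1e-12]
        candidates.append(t)

    # Backtrack through the stored segment starts.
    change_points = []
    t = n
    while t > 0:
        t = prev[t]
        if t > 0:
            change_points.append(t)
    return sorted(change_points)
```

On a series of lab values, a returned index would mark the first measurement after a suspected event (for example an analyzer replacement); a larger `penalty` yields fewer detected events.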


Subject(s)
Data Warehousing , Humans , Logical Observation Identifiers Names and Codes , Clinical Laboratory Information Systems , Electronic Health Records , User-Computer Interface
3.
Stud Health Technol Inform ; 316: 1584-1588, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176511

ABSTRACT

This study assesses the effectiveness of the Observational Medical Outcomes Partnership common data model (OMOP CDM) in standardising Continuous Renal Replacement Therapy (CRRT) data from intensive care units (ICU) of two French university hospitals. Our objective was to extract and standardise data from various sources, enabling the development of predictive models for CRRT weaning that are agnostic to the data's origin. Data for 1,696 ICU stays from the two data sources were extracted, transformed, and loaded into the OMOP format after semantic alignment of 46 CRRT standard concepts. Although the OMOP CDM demonstrated potential in harmonising CRRT data, we encountered challenges related to data variability and the lack of standard concepts. Despite these challenges, our study supports the promise of the OMOP CDM for ICU data standardization, suggesting that further refinement and adaptation could significantly improve clinical decision making and patient outcomes in critical care settings.


Subject(s)
Intensive Care Units , Humans , France , Intensive Care Units/standards , Continuous Renal Replacement Therapy , Data Accuracy , Critical Care/standards , Renal Replacement Therapy/standards
4.
Stud Health Technol Inform ; 316: 1605-1606, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176517

ABSTRACT

This paper presents the development of a visualization dashboard for quality indicators in intensive care units (ICUs), using the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). The dashboard enables the user to visualize quality indicator data using histograms, pie charts and tables. Our project uses the OMOP CDM, ensuring a seamless implementation of our dashboard across various hospitals. Future directions for our research include expanding the dashboard to incorporate additional quality indicators and evaluating clinicians' feedback on its effectiveness.


Subject(s)
Intensive Care Units , Quality Indicators, Health Care , Intensive Care Units/standards , Critical Care/standards , Humans , User-Computer Interface , Outcome Assessment, Health Care , Benchmarking
5.
Stud Health Technol Inform ; 316: 1679-1683, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176533

ABSTRACT

The Ouest Data Hub (ODH), a project led by GCS HUGO, a cooperation group of university hospitals in the French Grand Ouest region, represents a groundbreaking initiative in this territory, advancing health data sharing and reuse to support research driven by real-world health data. Central to its structure are the Clinical Data Warehouses (CDWs) and Clinical Data Centers (CDCs), essential for analytics and as the linchpin of the ODH's status as an interregional Learning Health System. Aimed at fostering innovation and research, the ODH's collaborative and multi-institutional model effectively utilizes both local and shared resources. Yet, the path is not without challenges, especially in data quality and interoperability, where ongoing harmonization and standard adherence are critical. In 2023, this facilitated access to extensive health data from over 9.3 million patient records, demonstrating the ODH's capacity for both monocentric and multicentric research across various clinical fields, in close collaboration with physicians. The integration of healthcare professionals is crucial, ensuring the data's clinical relevance and guiding accurate interpretations. Future expansions of the ODH to new hospitals and data types promise to enhance its model further, already inspiring similar frameworks across France. This scalable model for health data ecosystems showcases the ODH's potential as a foundation for national and supranational data sharing efforts.


Subject(s)
Information Dissemination , France , Humans , Electronic Health Records , Data Warehousing , Biomedical Research
6.
Stud Health Technol Inform ; 316: 1739-1743, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176549

ABSTRACT

Continuous unfractionated heparin is widely used in intensive care, yet its complex pharmacokinetic properties complicate the determination of appropriate doses. To address this challenge, we developed machine learning models to predict over- and under-dosing, based on anti-Xa results, using a monocentric retrospective dataset. The random forest model achieved a mean AUROC of 0.80 [0.77-0.83], while the XGB model reached a mean AUROC of 0.80 [0.76-0.83]. Feature importance was employed to enhance the interpretability of the model, a critical factor for clinician acceptance. After prospective validation, machine learning models such as those developed in this study could be implemented within a computerized physician order entry (CPOE) as a clinical decision support system (CDSS).


Subject(s)
Anticoagulants , Decision Support Systems, Clinical , Heparin , Intensive Care Units , Machine Learning , Heparin/therapeutic use , Humans , Anticoagulants/therapeutic use , Medical Order Entry Systems , Retrospective Studies
7.
Stud Health Technol Inform ; 316: 1373-1377, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176636

ABSTRACT

The ONCO-FAIR project's initial experimentation aims to enhance data interoperability in oncology chemotherapy treatments, adhering to the FAIR principles. This study focuses on integrating the HL7 FHIR standard to address interoperability challenges within chemotherapy data exchange. Collaborating with healthcare institutions in Rennes, the research team assessed the limitations of current standards such as PN13, mCODE, and OSIRIS, leading to the customization of twelve FHIR resources complemented by two chemotherapy-specific extensions. The methodological approach follows the Integrating the Healthcare Enterprise (IHE) framework, organizing the process into four key stages to ensure the effectiveness and relevance of health data reuse for research. This framework facilitated the identification of chemotherapy-specific needs, the evaluation of existing standards, and data modeling through a FHIR implementation guide. The article underscores the importance of upstream interoperability for aligning chemotherapy software with clinical data warehouse infrastructure, showcasing the proposed solution's capability to overcome interoperability barriers and promote data reuse in line with FAIR principles. Furthermore, it discusses future directions, including extending this approach to other oncology data categories and enhancing downstream interoperability with health data sharing platforms.


Subject(s)
Health Information Interoperability , Humans , Health Information Interoperability/standards , Antineoplastic Agents/therapeutic use , Medical Oncology/standards , Health Level Seven/standards , Electronic Health Records , Neoplasms/drug therapy , Data Warehousing
8.
Stud Health Technol Inform ; 316: 221-225, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176713

ABSTRACT

This paper introduces a novel approach aimed at enhancing the accessibility of clinical data warehouses (CDWs) for external users, particularly researchers and biomedical companies interested in developing and testing their solutions. The primary focus is on proposing a clinical data catalogue designed to elucidate the contents of CDWs, facilitating biomedical project launch and completion. The catalogue is designed to address three fundamental inquiries that external users may have regarding CDWs: "What data is available, how much data is present, and how was it generated?" Additionally, the paper showcases a prototype of the catalogue through a visualization example, utilizing data from the CDW of Rennes University Hospital.


Subject(s)
Data Warehousing , Electronic Health Records , Humans
9.
Stud Health Technol Inform ; 316: 611-615, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176816

ABSTRACT

Secure extraction of Personally Identifiable Information (PII) from Electronic Health Records (EHRs) presents significant privacy and security challenges. This study explores the application of Federated Learning (FL) to overcome these challenges within the context of French EHRs. By utilizing a multilingual BERT model in an FL simulation involving 20 hospitals, each represented by a unique medical department or pole, we compared the performance of two setups: individual models, where each hospital uses only its own training and validation data without engaging in the FL process, and federated models, where multiple hospitals collaborate to train a global FL model. Our findings demonstrate that FL models not only preserve data confidentiality but also outperform the individual models. In fact, the global FL model achieved an F1 score of 75.7%, comparable to that of the centralized approach at 78.5%. This research underscores the potential of FL in extracting PII from EHRs, encouraging its broader adoption in health data analysis.
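
Editorial illustration: the abstract does not say how the hospitals' models are combined into the global FL model. A common choice is federated averaging (FedAvg), assumed here purely for illustration; `federated_average` and the flat weight lists are hypothetical simplifications of real BERT parameters.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameters, weighting each client by its sample count."""
    if len(client_weights) != len(client_sizes):
        raise ValueError("one sample count is required per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    # Each global parameter is the size-weighted mean of the local parameters.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two simulated hospitals; the second holds three times as much training data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only parameters leave each site in this scheme; the clinical documents used for local training stay where they are, which is the confidentiality property the abstract emphasizes.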


Subject(s)
Computer Security , Confidentiality , Electronic Health Records , Machine Learning , France , Humans , Health Records, Personal
10.
Stud Health Technol Inform ; 316: 1979-1983, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176881

ABSTRACT

Electronic health data concerning implantable medical devices (IMD) open opportunities for dynamic real-world monitoring to assess risks related to implanted materials. Due to population ageing and expanding demands, total hip, knee, and shoulder arthroplasties are increasing. Automating the collection and analysis of orthopedic device features could benefit physicians and public health policies, enabling early issue detection, IMD monitoring and patient safety assessment. A machine learning tool using natural language processing (NLP) was developed for the automated extraction of operation information from medical reports in orthopedics. A corpus of 959 orthopedic operative reports from 5 centres was manually annotated using the Prodigy® software, with a strong inter-annotator agreement of 0.80. The data to extract comprised key clinical and procedure information (n = 9) selected by a multidisciplinary group based on the French health authority checklist. The NLP model achieved strong overall precision and recall of 97.0 and 96.0, respectively, with an F1-score of 96.3. Systematic monitoring of orthopedic devices could be ensured by an automated tool, leveraging clinical data warehouses. Traceability of medical devices with implantation modalities will allow detection of implant factors leading to complications. The evidence from real-world data could provide concrete and dynamic insights to surgeons and infectious disease specialists concerning implant follow-up, guiding therapeutic decision-making, and informing public health policymakers. The tool will be applied to clinical data warehouses to automate information extraction and presentation, providing feedback on mandatory information completion and contents of operative reports to support improvements, and thereafter implant research projects.


Subject(s)
Electronic Health Records , Machine Learning , Natural Language Processing , France , Humans , Orthopedic Procedures
11.
Stud Health Technol Inform ; 316: 813-817, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176916

ABSTRACT

The application of machine learning algorithms in clinical decision support systems (CDSS) holds great promise for advancing patient care, yet practical implementation faces significant evaluation challenges. Through a scoping review, we investigate the common definitions of ground truth used to collect clinically relevant reference values, as well as the typical metrics and combinations employed for assessing trueness. Our analysis reveals that the definition of ground truth mostly does not meet the standard ISO expectation and that the combinations of metrics used usually do not cover all aspects of CDSS trueness, in particular neglecting the negative class perspective.
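
Editorial illustration: the abstract notes that commonly reported metric combinations tend to neglect the negative class. The sketch below, using made-up counts, shows how precision and recall alone say nothing about true negatives, whereas specificity and negative predictive value do. The function name `binary_metrics` is hypothetical and is not taken from the review.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Positive-class and negative-class metrics from a 2x2 confusion matrix."""
    return {
        "recall": tp / (tp + fn),        # sensitivity, positive class
        "precision": tp / (tp + fp),     # positive predictive value
        "specificity": tn / (tn + fp),   # negative class counterpart of recall
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts: precision and recall are identical in both cases,
# but the negative-class metrics differ because only tn changes.
rare_negatives = binary_metrics(tp=8, fp=2, fn=2, tn=2)
many_negatives = binary_metrics(tp=8, fp=2, fn=2, tn=88)
```

Reporting at least one metric from each class makes the two situations distinguishable.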


Subject(s)
Decision Support Systems, Clinical , Machine Learning , Humans
12.
BMC Med Inform Decis Mak ; 24(1): 54, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38365677

ABSTRACT

BACKGROUND: Electronic health records (EHRs) contain valuable information for clinical research; however, the sensitive nature of healthcare data presents security and confidentiality challenges. De-identification is therefore essential to protect personal data in EHRs and comply with government regulations. Named entity recognition (NER) methods have been proposed to remove personal identifiers, with deep learning-based models achieving better performance. However, manual annotation of training data is time-consuming and expensive. The aim of this study was to develop an automatic de-identification pipeline for all kinds of clinical documents based on a distant supervised method to significantly reduce the cost of manual annotations and to facilitate the transfer of the de-identification pipeline to other clinical centers. METHODS: We proposed an automated annotation process for French clinical de-identification, exploiting data from the eHOP clinical data warehouse (CDW) of the CHU de Rennes and national knowledge bases, as well as other features. In addition, this paper proposes an assisted data annotation solution using the Prodigy annotation tool. This approach aims to reduce the cost required to create a reference corpus for the evaluation of state-of-the-art NER models. Finally, we evaluated and compared the effectiveness of different NER methods. RESULTS: A French de-identification dataset was developed in this work, based on EHRs provided by the eHOP CDW at Rennes University Hospital, France. The dataset was rich in terms of personal information, and the distribution of entities was quite similar in the training and test datasets. We evaluated a Bi-LSTM + CRF sequence labeling architecture, combined with Flair + FastText word embeddings, on a test set of manually annotated clinical reports. 
The model outperformed the other tested models with a significant F1 score of 96.96%, demonstrating the effectiveness of our automatic approach for de-identifying sensitive information. CONCLUSIONS: This study provides an automatic de-identification pipeline for clinical notes, which can facilitate the reuse of EHRs for secondary purposes such as clinical research. Our study highlights the importance of using advanced NLP techniques for effective de-identification, as well as the need for innovative solutions such as distant supervision to overcome the challenge of limited annotated data in the medical domain.


Subject(s)
Deep Learning , Humans , Data Anonymization , Electronic Health Records , Cost-Benefit Analysis , Confidentiality , Natural Language Processing
13.
Eur Heart J Open ; 4(1): oead133, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38196848

ABSTRACT

Aims: Patients presenting symptoms of heart failure with preserved ejection fraction (HFpEF) are not a homogenous population. Different phenotypes can differ in prognosis and optimal management strategies. We sought to identify phenotypes of HFpEF by using the medical information database from a large university hospital centre using machine learning. Methods and results: We explored the use of clinical variables from electronic health records in addition to echocardiography to identify different phenotypes of patients with HFpEF. The proposed methodology identifies four phenotypic clusters based on both clinical and echocardiographic characteristics, which have differing prognoses (death and cardiovascular hospitalization). Conclusion: This work demonstrated that artificial intelligence-derived phenotypes could be used as a tool for physicians to assess risk and to target therapies that may improve outcomes.

14.
Stud Health Technol Inform ; 302: 342-343, 2023 May 18.
Article in English | MEDLINE | ID: mdl-37203675

ABSTRACT

In France and in other countries, we observed a significant growth in human polyvalent immunoglobulins (PvIg) usage. PvIg is manufactured from plasma collected from numerous donors, and its production is complex. Supply tensions have been observed for several years, and it is necessary to limit its consumption. Therefore, the French Health Authority (FHA) provided guidelines in June 2018 to restrict its usage. This research aims to assess the impact of the FHA guidelines on the use of PvIg. We analyzed data from Rennes University Hospital (RUH), where all PvIg prescriptions are reported electronically with quantity, rhythm, and indication. From the clinical data warehouse of RUH, we extracted comorbidities and lab results to evaluate the more complex guidelines. Overall, we noticed a reduction in the consumption of PvIg after the guidelines. Compliance with the recommended quantities and rhythms has also been observed. By combining two sources of data, we have been able to show an impact of the FHA's guidelines on the consumption of PvIg.


Subject(s)
Data Warehousing , Immunoglobulins , Humans , Drug Prescriptions , Comorbidity , France
15.
Health Informatics J ; 29(1): 14604582221146709, 2023.
Article in English | MEDLINE | ID: mdl-36964666

ABSTRACT

Defining profiles of patients that could benefit from relevant anti-cancer treatments is essential. An increasing number of specific criteria must be met to be eligible for specific anti-cancer therapies. This study aimed to develop an automated algorithm able to detect patient and tumor characteristics to reduce the time-consuming prescreening for trial inclusions without delay. Hence, 640 anonymized multidisciplinary team meeting (MTM) reports concerning lung cancers from one French teaching hospital data warehouse between 2018 and 2020 were annotated. To automate the extraction of eight major eligibility criteria, corresponding to 52 classes, regular expressions were implemented. The evaluation of the regular expressions gave an average F1-score of 93%, a positive predictive value (precision) of 98% and a sensitivity (recall) of 92%. However, in MTM reports, the variability in fill rates among patient and tumor information remained substantial (from 31% to 100%). Genetic mutations and rearrangement test results were the least reported characteristics and also the hardest to automatically extract. To ease prescreening in clinical trials, the PreScIOUs study demonstrated the additional value of rule-based and machine learning-based methods applied to lung cancer MTM reports.
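
Editorial illustration: the abstract reports that eligibility criteria were extracted with regular expressions but does not list the patterns. The sketch below shows the general shape of such a rule-based extractor. The three criteria, the patterns, and the English sample report are all hypothetical; the study's reports were French and covered eight criteria in 52 classes.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns, one per criterion; a real rule set would be far larger.
PATTERNS = {
    "performance_status": re.compile(
        r"\b(?:PS|ECOG)\s*[:=]?\s*([0-4])\b", re.IGNORECASE),
    "stage": re.compile(
        r"\bstage\s+(IV|III|II|I)[ABC]?\b", re.IGNORECASE),
    "egfr_status": re.compile(
        r"\bEGFR\s*[:=]?\s*(mutated|wild[- ]type|not tested)\b", re.IGNORECASE),
}


def extract_criteria(report: str) -> Dict[str, Optional[str]]:
    """Return the first match per criterion, or None when it is not reported."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(report)
        found[name] = match.group(1) if match else None
    return found


sample = "Adenocarcinoma, stage IV. ECOG: 1. EGFR: wild-type."
criteria = extract_criteria(sample)
```

A `None` value corresponds to an unfilled field, which is how fill rates such as those quoted in the abstract could be computed over a corpus.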


Subject(s)
Lung Neoplasms , Natural Language Processing , Humans , Lung Neoplasms/therapy , Electronic Health Records , Algorithms , Patient Care Team
16.
JMIR Public Health Surveill ; 9: e34982, 2023 01 31.
Article in English | MEDLINE | ID: mdl-36719726

ABSTRACT

BACKGROUND: Disease surveillance systems capable of producing accurate real-time and short-term forecasts can help public health officials design timely public health interventions to mitigate the effects of disease outbreaks in affected populations. In France, existing clinic-based disease surveillance systems produce gastroenteritis activity information that lags real time by 1 to 3 weeks. This temporal data gap prevents public health officials from having a timely epidemiological characterization of this disease at any point in time and thus leads to the design of interventions that do not take into consideration the most recent changes in dynamics. OBJECTIVE: The goal of this study was to evaluate the feasibility of using internet search query trends and electronic health records to predict acute gastroenteritis (AG) incidence rates in near real time, at the national and regional scales, and for long-term forecasts (up to 10 weeks). METHODS: We present 2 different approaches (linear and nonlinear) that produce real-time estimates, short-term forecasts, and long-term forecasts of AG activity at 2 different spatial scales in France (national and regional). Both approaches leverage disparate data sources that include disease-related internet search activity, electronic health record data, and historical disease activity. RESULTS: Our results suggest that all data sources contribute to improving gastroenteritis surveillance for long-term forecasts with the prominent predictive power of historical data owing to the strong seasonal dynamics of this disease. CONCLUSIONS: The methods we developed could help reduce the impact of the AG peak by making it possible to anticipate increased activity by up to 10 weeks.


Subject(s)
Disease Outbreaks , Electronic Health Records , Humans , Public Health/methods , Internet , France/epidemiology
17.
JMIR Public Health Surveill ; 8(12): e37122, 2022 12 22.
Article in English | MEDLINE | ID: mdl-36548023

ABSTRACT

BACKGROUND: Traditionally, dengue prevention and control rely on vector control programs and reporting of symptomatic cases to a central health agency. However, case reporting is often delayed, and the true burden of dengue disease is often underestimated. Moreover, some countries do not have routine control measures for vector control. Therefore, researchers are constantly assessing novel data sources to improve traditional surveillance systems. These studies are mostly carried out in big territories and rarely in smaller endemic regions, such as Martinique and the Lesser Antilles. OBJECTIVE: The aim of this study was to determine whether heterogeneous real-world data sources could help reduce reporting delays and improve dengue monitoring in Martinique island, a small endemic region. METHODS: Heterogeneous data sources (hospitalization data, entomological data, and Google Trends) and dengue surveillance reports for the last 14 years (January 2007 to February 2021) were analyzed to identify associations with dengue outbreaks and their time lags. RESULTS: The dengue hospitalization rate was the variable most strongly correlated with the increase in dengue positivity rate by real-time reverse transcription polymerase chain reaction (Pearson correlation coefficient=0.70) with a time lag of -3 weeks. Weekly entomological interventions were also correlated with the increase in dengue positivity rate by real-time reverse transcription polymerase chain reaction (Pearson correlation coefficient=0.59) with a time lag of -2 weeks. The most correlated query from Google Trends was the "Dengue" topic restricted to the Martinique region (Pearson correlation coefficient=0.637) with a time lag of -3 weeks. CONCLUSIONS: Real-world data are valuable data sources for dengue surveillance in smaller territories. 
Many of these sources precede the increase in dengue cases by several weeks, and therefore can help to improve the ability of traditional surveillance systems to provide an early response in dengue outbreaks. All these sources should be better integrated to improve the early response to dengue outbreaks and vector-borne diseases in smaller endemic territories.
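
Editorial illustration: the abstract reports Pearson correlations at time lags of a few weeks. The sketch below shows one way such a lagged correlation could be computed on weekly series. The sign convention (a positive `lead` means the candidate source precedes the reference series, which the abstract writes as a negative lag), the function names, and the toy data are all assumptions.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


def lagged_correlation(source: Sequence[float], reference: Sequence[float],
                       lead: int) -> float:
    """Correlate source[t] with reference[t + lead], for lead >= 0 weeks."""
    if lead < 0:
        raise ValueError("lead must be non-negative")
    if lead == 0:
        return pearson(source, reference)
    return pearson(source[:-lead], reference[lead:])


def best_lead(source: Sequence[float], reference: Sequence[float],
              max_lead: int) -> int:
    """Lead (in weeks) at which the correlation is highest."""
    return max(range(max_lead + 1),
               key=lambda k: lagged_correlation(source, reference, k))


# Toy weekly series: the reference repeats the source three weeks later.
source = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12, 10, 11, 14, 13, 15]
reference = [0, 0, 0] + source[:-3]
```

With real surveillance data, the series would first need to be aligned on the same weekly calendar and checked for missing weeks.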


Subject(s)
Disease Outbreaks , Humans , Retrospective Studies , Martinique/epidemiology
18.
JMIR Med Inform ; 10(11): e36711, 2022 Nov 01.
Article in English | MEDLINE | ID: mdl-36318244

ABSTRACT

BACKGROUND: Often missing from or uncertain in a biomedical data warehouse (BDW), vital status after discharge is central to the value of a BDW in medical research. The French National Mortality Database (FNMD) offers open-source nominative records of every death. Matching large-scale BDWs records with the FNMD combines multiple challenges: absence of unique common identifiers between the 2 databases, names changing over life, clerical errors, and the exponential growth of the number of comparisons to compute. OBJECTIVE: We aimed to develop a new algorithm for matching BDW records to the FNMD and evaluated its performance. METHODS: We developed a deterministic algorithm based on advanced data cleaning and knowledge of the naming system and the Damerau-Levenshtein distance (DLD). The algorithm's performance was independently assessed using BDW data of 3 university hospitals: Lille, Nantes, and Rennes. Specificity was evaluated with living patients on January 1, 2016 (ie, patients with at least 1 hospital encounter before and after this date). Sensitivity was evaluated with patients recorded as deceased between January 1, 2001, and December 31, 2020. The DLD-based algorithm was compared to a direct matching algorithm with minimal data cleaning as a reference. RESULTS: All centers combined, sensitivity was 11% higher for the DLD-based algorithm (93.3%, 95% CI 92.8-93.9) than for the direct algorithm (82.7%, 95% CI 81.8-83.6; P<.001). Sensitivity was superior for men at 2 centers (Nantes: 87%, 95% CI 85.1-89 vs 83.6%, 95% CI 81.4-85.8; P=.006; Rennes: 98.6%, 95% CI 98.1-99.2 vs 96%, 95% CI 94.9-97.1; P<.001) and for patients born in France at all centers (Nantes: 85.8%, 95% CI 84.3-87.3 vs 74.9%, 95% CI 72.8-77.0; P<.001). The DLD-based algorithm revealed significant differences in sensitivity among centers (Nantes, 85.3% vs Lille and Rennes, 97.3%, P<.001). Specificity was >98% in all subgroups. 
Our algorithm matched tens of millions of death records from BDWs, with parallel computing capabilities and low RAM requirements. We used the Inseehop open-source R script for this measurement. CONCLUSIONS: Overall, sensitivity/recall was 11% higher using the DLD-based algorithm than that using the direct algorithm. This shows the importance of advanced data cleaning and knowledge of a naming system through DLD use. Statistically significant differences in sensitivity between groups could be found and must be considered when performing an analysis to avoid differential biases. Our algorithm, originally conceived for linking a BDW with the FNMD, can be used to match any large-scale databases. While matching operations using names are considered sensitive computational operations, the Inseehop package released here is easy to run on premises, thereby facilitating compliance with cybersecurity local framework. The use of an advanced deterministic matching algorithm such as the DLD-based algorithm is an insightful example of combining open-source external data to improve the usage value of BDWs.
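
Editorial illustration: the abstract describes a deterministic matching algorithm built on data cleaning and the Damerau-Levenshtein distance (DLD). The sketch below is not the Inseehop implementation. It uses the restricted "optimal string alignment" variant of DLD, and the normalization step, the function names, and the threshold of 1 are hypothetical.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Uppercase, strip accents, and collapse whitespace (assumed cleaning step)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.upper().split())


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            # Transposition of two adjacent characters counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def names_match(name_a: str, name_b: str, max_distance: int = 1) -> bool:
    """Hypothetical rule: names match if within max_distance after cleaning."""
    return osa_distance(normalize_name(name_a), normalize_name(name_b)) <= max_distance
```

In a full linkage, a name comparison like this would be combined with exact agreement on other fields such as date of birth, so that the number of pairwise comparisons stays manageable.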

19.
JMIR Med Inform ; 10(10): e38936, 2022 Oct 17.
Article in English | MEDLINE | ID: mdl-36251369

ABSTRACT

BACKGROUND: Despite the many opportunities data reuse offers, its implementation presents many difficulties, and raw data cannot be reused directly. Information is not always directly available in the source database and needs to be computed afterwards with raw data for defining an algorithm. OBJECTIVE: The main purpose of this article is to present a standardized description of the steps and transformations required during the feature extraction process when conducting retrospective observational studies. A secondary objective is to identify how the features could be stored in the schema of a data warehouse. METHODS: This study involved the following 3 main steps: (1) the collection of relevant study cases related to feature extraction and based on the automatic and secondary use of data; (2) the standardized description of raw data, steps, and transformations, which were common to the study cases; and (3) the identification of an appropriate table to store the features in the Observational Medical Outcomes Partnership (OMOP) common data model (CDM). RESULTS: We interviewed 10 researchers from 3 French university hospitals and a national institution, who were involved in 8 retrospective and observational studies. Based on these studies, 2 states (track and feature) and 2 transformations (track definition and track aggregation) emerged. "Track" is a time-dependent signal or period of interest, defined by a statistical unit, a value, and 2 milestones (a start event and an end event). "Feature" is time-independent high-level information with dimensionality identical to the statistical unit of the study, defined by a label and a value. The time dimension has become implicit in the value or name of the variable. We propose the 2 tables "TRACK" and "FEATURE" to store variables obtained in feature extraction and extend the OMOP CDM. CONCLUSIONS: We propose a standardized description of the feature extraction process. 
The process combined the 2 steps of track definition and track aggregation. By dividing the feature extraction into these 2 steps, difficulty was managed during track definition. The standardization of tracks requires great expertise with regard to the data, but allows the application of an infinite number of complex transformations. On the contrary, track aggregation is a very simple operation with a finite number of possibilities. A complete description of these steps could enhance the reproducibility of retrospective studies.
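
Editorial illustration: the abstract defines a track (statistical unit, value, start and end milestones) and a feature (label and value per statistical unit), with track aggregation turning the former into the latter. The sketch below mirrors those definitions with two record types. The field names, the `aggregate_tracks` helper, and the creatinine example are hypothetical and do not reproduce the proposed TRACK and FEATURE table schemas.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List


@dataclass
class Track:
    unit_id: int        # statistical unit, e.g. a hospital stay
    value: float        # time-dependent value over the period
    start: datetime     # start milestone
    end: datetime       # end milestone


@dataclass
class Feature:
    unit_id: int        # same statistical unit as the study
    label: str          # name that carries the implicit time dimension
    value: float


def aggregate_tracks(tracks: List[Track], label: str,
                     reducer: Callable[[List[float]], float]) -> List[Feature]:
    """Track aggregation: reduce all tracks of a unit to a single feature."""
    by_unit: Dict[int, List[float]] = {}
    for track in tracks:
        by_unit.setdefault(track.unit_id, []).append(track.value)
    return [Feature(unit, label, reducer(values))
            for unit, values in sorted(by_unit.items())]


# Hypothetical creatinine tracks for two stays.
tracks = [
    Track(1, 1.2, datetime(2022, 1, 1, 8), datetime(2022, 1, 1, 20)),
    Track(1, 1.8, datetime(2022, 1, 1, 20), datetime(2022, 1, 2, 8)),
    Track(2, 0.9, datetime(2022, 3, 5, 9), datetime(2022, 3, 5, 21)),
]
features = aggregate_tracks(tracks, "max_creatinine", max)
```

The reducer is deliberately a plain function such as `max`, `min`, or a mean, reflecting the abstract's point that aggregation is a simple operation once tracks are well defined.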

20.
Pharmaceutics ; 14(7)2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35890305

ABSTRACT

Direct oral anticoagulants and vitamin K antagonists are considered potentially inappropriate medications (PIM) in several situations according to the Beers Criteria. Drug-drug interactions (DDI) occurring specifically with these oral anticoagulants considered PIM (PIM-DDI) are an issue since they could enhance their inappropriate character and lead to adverse drug events, such as bleeding events. The aim of this study was (1) to describe the prevalence of oral anticoagulants as PIM, DDI and PIM-DDI in elderly patients in primary care and during hospitalization and (2) to evaluate their potential impact on the clinical outcomes by predicting hospitalization for bleeding events using machine learning methods. This retrospective study based on the linkage between a primary care database and a hospital data warehouse allowed us to display the oral anticoagulant treatment pathway. The prevalence of PIM was similar between primary care and hospital setting (22.9% and 20.9%), whereas the prevalence of DDI and PIM-DDI was slightly higher during hospitalization (47.2% vs. 58.9% and 19.5% vs. 23.5%). Concerning mechanisms, combined with CYP3A4-P-gp interactions as PIM-DDI, were among the most prevalent in patients with bleeding events. Although PIM, DDI and PIM-DDI did not appear as major predictors of bleeding events, they should be considered since they are the only factors that can be optimized by pharmacists and clinicians.
