Results 1 - 20 of 289
1.
J Am Med Inform Assoc ; 31(6): 1280-1290, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38573195

ABSTRACT

OBJECTIVE: To develop and validate a natural language processing (NLP) pipeline that detects 18 conditions in French clinical notes, including 16 comorbidities of the Charlson index, while exploring a collaborative and privacy-enhancing workflow. MATERIALS AND METHODS: The detection pipeline relied on rule-based and machine learning algorithms for named entity recognition and entity qualification, respectively. We used a large language model pre-trained on millions of clinical notes along with annotated clinical notes in the context of 3 cohort studies related to oncology, cardiology, and rheumatology. The overall workflow was conceived to foster collaboration between studies while respecting the privacy constraints of the data warehouse. We estimated the added values of the advanced technologies and of the collaborative setting. RESULTS: The pipeline reached macro-averaged F1-score, positive predictive value, sensitivity, and specificity of 95.7 (95%CI 94.5-96.3), 95.4 (95%CI 94.0-96.3), 96.0 (95%CI 94.0-96.7), and 99.2 (95%CI 99.0-99.4), respectively. F1-scores were superior to those observed using alternative technologies or non-collaborative settings. The models were shared through a secured registry. CONCLUSIONS: We demonstrated that a community of investigators working on a common clinical data warehouse could efficiently and securely collaborate to develop, validate and use sensitive artificial intelligence models. In particular, we provided an efficient and robust NLP pipeline that detects conditions mentioned in clinical notes.
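
For readers unfamiliar with macro-averaging, the following is a minimal sketch of how the four reported metrics can be computed per condition and then averaged with equal weight per condition. It is not the authors' code: the `Counts` class, the function names, and the confusion counts in `example_counts` are all hypothetical, chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts for one condition (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def ppv(c: Counts) -> float:
    """Positive predictive value (precision)."""
    return c.tp / (c.tp + c.fp)


def sensitivity(c: Counts) -> float:
    """Sensitivity (recall)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    return c.tn / (c.tn + c.fp)


def f1(c: Counts) -> float:
    """Harmonic mean of PPV and sensitivity."""
    p, r = ppv(c), sensitivity(c)
    return 2 * p * r / (p + r)


def macro_average(metric, per_condition: dict) -> float:
    """Unweighted mean of a metric over conditions: each condition counts
    equally, regardless of how many notes mention it."""
    values = [metric(c) for c in per_condition.values()]
    return sum(values) / len(values)


# Hypothetical counts for two conditions; NOT data from the study.
example_counts = {
    "diabetes": Counts(tp=90, fp=10, fn=10, tn=890),
    "dementia": Counts(tp=40, fp=0, fn=10, tn=950),
}

macro_f1 = macro_average(f1, example_counts)
macro_ppv = macro_average(ppv, example_counts)
macro_sensitivity = macro_average(sensitivity, example_counts)
macro_specificity = macro_average(specificity, example_counts)
```

Because the average is taken over conditions rather than over notes, a rare condition with poor detection lowers the macro-averaged score as much as a common one would.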


Subject(s)
Electronic Health Records , Machine Learning , Natural Language Processing , Workflow , Humans , Data Warehousing , Algorithms , France , Confidentiality
2.
JCO Clin Cancer Inform ; 8: e2300193, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38621193

ABSTRACT

PURPOSE: In the United States, a comprehensive national breast cancer registry (CR) does not exist. Thus, care and coverage decisions are based on data from population subsets, other countries, or models. We report a prototype real-world research data mart to assess mortality, morbidity, and costs for breast cancer diagnosis and treatment. METHODS: With institutional review board approval and Health Insurance Portability and Accountability Act (HIPAA) compliance, a multidisciplinary clinical and research data warehouse (RDW) expert group curated demographic, risk, imaging, pathology, treatment, and outcome data from the electronic health records (EHR), radiology (RIS), and CR for patients having breast imaging and/or a diagnosis of breast cancer in our institution from January 1, 2004, to December 31, 2020. Domains were defined by prebuilt views to extract data denormalized according to requirements from the existing RDW using an export, transform, load pattern. Data dictionaries were included. Structured query language was used for data cleaning. RESULTS: Five-hundred eighty-nine elements (EHR 311, RIS 211, and CR 67) were mapped to 27 domains; all, except one containing CR elements, had cancer and noncancer cohort views, resulting in a total of 53 views (average 12 elements/view; range, 4-67). EHR and RIS queries returned 497,218 patients with 2,967,364 imaging examinations and associated visit details. Cancer biology, treatment, and outcome details for 15,619 breast cancer cases were imported from the CR of our primary breast care facility for this prototype mart. CONCLUSION: Institutional real-world data marts enable comprehensive understanding of care outcomes within an organization.
As clinical data sources become increasingly structured, such marts may be an important source for future interinstitution analysis and potentially an opportunity to create robust real-world results that could be used to support evidence-based national policy and care decisions for breast cancer.


Subject(s)
Breast Neoplasms , Humans , United States/epidemiology , Female , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/therapy , Data Warehousing , Electronic Health Records , Registries , Diagnostic Imaging
3.
Stud Health Technol Inform ; 313: 198-202, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38682530

ABSTRACT

Secondary use of clinical health data implies a prior integration of mostly heterogeneous and multidimensional data sets. A clinical data warehouse addresses the technological and organizational framework conditions required for this, by making any data available for analysis. However, users of a data warehouse often do not have a comprehensive overview of all available data and only know about their own data in their own systems - a situation which is also referred to as 'data siloed state'. This problem can be addressed and ultimately solved by implementation of a data catalog. Its core function is a search engine, which allows for searching the metadata collected from different data sources and thereby accessing all available data. With this in mind, we conducted an explorative online market survey followed by vendor comparison as a prerequisite for system selection of a data catalog. Assessment of vendor performance was based on seven predetermined and weighted selection criteria. Although three vendors achieved the highest score, the results lay close together. Detailed investigations and test installations are needed to narrow down the selection further.


Subject(s)
Data Warehousing , Electronic Health Records , Search Engine , Humans , Information Storage and Retrieval/methods , Metadata
4.
BMC Med Imaging ; 24(1): 67, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38504179

ABSTRACT

BACKGROUND: Clinical data warehouses provide access to massive amounts of medical images, but these images are often heterogeneous. They can for instance include images acquired with or without the injection of a gadolinium-based contrast agent. Harmonizing such data sets is thus fundamental to guarantee unbiased results, for example when performing differential diagnosis. Furthermore, classical neuroimaging software tools for feature extraction are typically applied only to images without gadolinium. The objective of this work is to evaluate how image translation can be useful to exploit a highly heterogeneous data set containing both contrast-enhanced and non-contrast-enhanced images from a clinical data warehouse. METHODS: We propose and compare different 3D U-Net and conditional GAN models to convert contrast-enhanced T1-weighted (T1ce) into non-contrast-enhanced (T1nce) brain MRI. These models were trained using 230 image pairs and tested on 77 image pairs from the clinical data warehouse of the Greater Paris area. RESULTS: Validation using standard image similarity measures demonstrated that the similarity between real and synthetic T1nce images was higher than between real T1nce and T1ce images for all the models compared. The best performing models were further validated on a segmentation task. We showed that tissue volumes extracted from synthetic T1nce images were closer to those of real T1nce images than volumes extracted from T1ce images. CONCLUSION: We showed that deep learning models initially developed with research quality data could synthesize T1nce from T1ce images of clinical quality and that reliable features could be extracted from the synthetic images, thus demonstrating the ability of such methods to help exploit a data set coming from a clinical data warehouse.


Subject(s)
Data Warehousing , Gadolinium , Humans , Brain/diagnostic imaging , Magnetic Resonance Imaging/methods , Neuroimaging/methods , Image Processing, Computer-Assisted/methods
5.
Med Image Anal ; 93: 103073, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38176355

ABSTRACT

Containing the medical data of millions of patients, clinical data warehouses (CDWs) represent a great opportunity to develop computational tools. Magnetic resonance images (MRIs) are particularly sensitive to patient movements during image acquisition, which will result in artefacts (blurring, ghosting and ringing) in the reconstructed image. As a result, a significant number of MRIs in CDWs are corrupted by these artefacts and may be unusable. Since their manual detection is impossible due to the large number of scans, it is necessary to develop tools to automatically exclude (or at least identify) images with motion in order to fully exploit CDWs. In this paper, we propose a novel transfer learning method from research to clinical data for the automatic detection of motion in 3D T1-weighted brain MRI. The method consists of two steps: a pre-training on research data using synthetic motion, followed by a fine-tuning step to generalise our pre-trained model to clinical data, relying on the labelling of 4045 images. The objectives were (1) to exclude images with severe motion and (2) to detect mild motion artefacts. Our approach achieved excellent accuracy for the first objective, with a balanced accuracy close to that of the annotators (balanced accuracy>80 %). However, for the second objective, the performance was weaker and substantially lower than that of human raters. Overall, our framework will be useful to take advantage of CDWs in medical imaging and highlights the importance of a clinical validation of models trained on research data.


Subject(s)
Artifacts , Data Warehousing , Humans , Motion , Brain/diagnostic imaging , Magnetic Resonance Imaging
6.
Stud Health Technol Inform ; 310: 1400-1401, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269666

ABSTRACT

In Japan, oversights of imaging or pathology examination results and diagnoses provided to patients have become a major problem because they affect patient prognosis. We have jointly developed and used the "Anti-Impact Information Leakage Prevention System (AiR)" since December 2019. This system works effectively because its introduction, which uses a data warehouse, has increased versatility and considerably improved the situation of confirmation and communication. We believe this system is working effectively.


Subject(s)
Communication , Data Warehousing , Humans , Japan
7.
Stud Health Technol Inform ; 310: 33-37, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269760

ABSTRACT

In digital healthcare, data heterogeneity is a recurring issue caused by proprietary source systems. It is often overcome by utilizing ETL processes resulting in data warehouses, which ensure common data models for interoperability. Unfortunately, the achieved interoperability is usually limited to an institutional level. The broad solution space to achieve interoperability with different health data standards is part of the problem, resulting in different standards used at various institutions. For cross-institutional use cases like federated feasibility queries, the issue of heterogeneity is reintroduced. This work showcases how the existing German infrastructure for federated feasibility queries based on HL7 FHIR can be extended to support openEHR without further data transformation, by utilizing an intermediate query format that can be translated to FHIR Search, CQL, and AQL.


Subject(s)
Data Warehousing , Health Facilities , Humans , Feasibility Studies
8.
JAMA Netw Open ; 7(1): e2353094, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38265797

ABSTRACT

Importance: The US Food and Drug Administration approved eteplirsen for Duchenne muscular dystrophy (DMD) in 2016 based on a controversial pivotal study that demonstrated a limited effect on the surrogate measure of dystrophin production. Other DMD treatments in the same class followed. Objective: To assess how patients receiving novel DMD treatments in postapproval clinical settings compare with patients in the clinical trials. Design, Setting, and Participants: This cross-sectional study collected data on patients who initiated 1 of 4 novel DMD treatments (eteplirsen, golodirsen, viltolarsen, and casimersen) using national claims databases of commercially insured (Merative MarketScan and Optum's Clinformatics Data Mart Database [CDM]) and Medicaid patients between September 19, 2016, and March 31, 2022. Patients were followed for 1 year after the date of first use of any novel DMD treatment. In addition, patients in pivotal DMD drug trials were identified for comparison. Exposures: Age, sex, race and ethnicity, region, and DMD stage of patients receiving novel DMD treatment. Main Outcome and Measures: The main outcome was health care costs and drug discontinuation as measured using descriptive statistics. Results: A total of 223 routine care patients initiating novel DMD drugs (58 in MarketScan, 35 in CDM, and 130 in Medicaid) were identified. Among the 106 patients in the pivotal trials, the mean (SD) age was 8.5 (2.0) years (range, 4.0-13.0 years), which was younger than the mean age of patients in routine care (MarketScan: 13.7 [7.0] years [range, 1.8-33.3 years; P < .001]; CDM: 11.9 [5.7] years [range, 0.6-23.6 years; P < .001]; Medicaid: 13.4 [6.5] years [range, 1.8-46.1 years; P < .001]). The proportion of female patients identified in postapproval clinical settings was 2.9% (n = 1) in CDM (vs 34 male patients [97.1%]) and 1.5% (n = 2) in Medicaid (vs 128 male patients [98.5%]), which was not different from the pivotal trials. 
While nearly all patients in the pivotal trials had DMD disease stage 1 or 2 when initiating the DMD treatments (103 [97.2%]), in the postapproval clinical setting, slightly more than one-third of patients were in disease stage 3 or 4 (MarketScan, 17 [36.2%; P < .001]; CDM, 13 [41.9%; P < .001]; Medicaid, 54 [47.0%; P < .001]). The payer's cost for novel DMD treatments varied across the databases, with a mean (SD) of $634 764 ($607 101) in MarketScan, $482 749 ($582 350) in CDM, and $384 023 ($1 165 730) in Medicaid. Approximately one-third of routine care patients discontinued the treatments after approximately 7 months (mean [SD], 6.1 [4.4], 6.9 [3.9], and 7.2 [4.3] months in MarketScan, CDM, and Medicaid, respectively). Conclusions and Relevance: These findings raise questions about the translation of DMD drug trial findings to routine care settings, with patients in routine care discontinuing the treatment within 1 year and payers incurring substantial expenses for these medications. More data are needed on whether these high costs are accompanied by corresponding clinical benefits.


Subject(s)
Muscular Dystrophy, Duchenne , United States , Humans , Female , Male , Infant , Child, Preschool , Child , Adolescent , Young Adult , Adult , Cross-Sectional Studies , Data Warehousing , Behavior Therapy , Databases, Factual
9.
J Neuroophthalmol ; 44(1): 10-15, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-37505911

ABSTRACT

BACKGROUND: Although significant progress has been made in improving the rate of survival for pediatric optic pathway gliomas (OPGs), data describing the methods of diagnosis and treatment for OPGs are limited in the modern era. This retrospective study aims to provide an epidemiological overview in the pediatric population and an update on eye care resource utilization in OPG patients using big data analysis. METHODS: Using the OptumLabs Data Warehouse, 9-11 million children from 2016 to 2021 were assessed for the presence of an OPG claim. This data set was analyzed for demographic distribution data and clinical data including average ages for computed tomography (CT), MRI, strabismus, and related treatment (surgery, chemotherapy, and radiation), as well as yearly rates for optical coherence tomography (OCT) and visual field (VF) examinations. RESULTS: Five hundred fifty-one unique patients ranging in age from 0 to 17 years had an OPG claim, with an estimated prevalence of 4.6-6.1 per 100k. Among the 476 OPG patients with at least 6 months of follow-up, 88.9% had at least one MRI and 15.3% had at least one CT. Annual rates for OCT and VF testing were similar (1.26 vs 1.35 per year), although OCT was ordered for younger patients (mean age = 9.2 vs 11.7 years, respectively). During the study period, 14.1% of OPG patients had chemotherapy, 6.1% had either surgery or radiation, and 81.7% had no treatment. CONCLUSIONS: This study updates OPG demographics for the modern era and characterizes the burden of the treatment course for pediatric OPG patients using big data analysis of a commercial claims database. OPGs had a prevalence of about 0.005% occurring equally in boys and girls. Most did not receive treatment, and the average child had at least one claim for OCT or VF per year for clinical monitoring. This study is limited to only commercially insured children, who represent approximately half of the general child population.


Subject(s)
Neurofibromatosis 1 , Optic Nerve Glioma , Male , Female , Child , Humans , Infant, Newborn , Infant , Child, Preschool , Adolescent , Retrospective Studies , Prevalence , Data Warehousing , Optic Nerve Glioma/diagnosis , Optic Nerve Glioma/epidemiology , Optic Nerve Glioma/therapy , Visual Fields , Neurofibromatosis 1/diagnosis
10.
Korean J Anesthesiol ; 77(1): 58-65, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37935575

ABSTRACT

BACKGROUND: To enhance perioperative outcomes, a perioperative registry that integrates high-quality real-world data throughout the perioperative period is essential. Singapore General Hospital established the Perioperative and Anesthesia Subject Area Registry (PASAR) to unify data from the preoperative, intraoperative, and postoperative stages. This study presents the methodology employed to create this database. METHODS: Since 2016, data from surgical patients have been collected from the hospital electronic medical record systems, de-identified, and stored securely in compliance with privacy and data protection laws. As a representative sample, data from initiation in 2016 to December 2022 were collected. RESULTS: As of December 2022, PASAR data comprise 26 tables, encompassing 153,312 patient admissions and 168,977 operation sessions. For this period, the median age of the patients was 60.0 years, sex distribution was balanced, and the majority were Chinese. Hypertension and cardiovascular comorbidities were also prevalent. Information including operation type and time, intensive care unit (ICU) length of stay, and 30-day and 1-year mortality rates were collected. Emergency surgeries resulted in longer ICU stays, but shorter operation times than elective surgeries. CONCLUSIONS: The PASAR provides a comprehensive and automated approach to gathering high-quality perioperative patient data.


Subject(s)
Anesthesia , Data Warehousing , Humans , Middle Aged , Elective Surgical Procedures , Patient Admission , Registries
11.
J Surg Res ; 294: 220-227, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37913729

ABSTRACT

INTRODUCTION: Clinical publications use mortality as a hard end point. It is unknown how many patient deaths are under-reported in institutional databases. The objective of this study was to query mortality in our patient cohort from our data warehouse and compare these deaths to those identified in different databases. METHODS: We passed the first/last name and date of birth of 134 patients through online mortality search engines (Find a Grave Index, US Cemetery and Funeral Home Collection, etc.) to assess their ability to capture patient deaths and compared that to deaths recorded from our institutional data warehouse. RESULTS: Our institutional data warehouse found approximately one-third of the total patient mortalities. After the Social Security Death Index, we found that the Find a Grave Index captured the most mortalities missed by the institutional data warehouse. These results highlight the advantages of incorporating readily available search engines into institutional data warehouses for the accurate collection of patient mortalities, particularly those that occur outside of index operative admission. CONCLUSIONS: The incorporation of the mortality search engines significantly augmented the capture of patient deaths. Our approach may be useful for tailored patient outreach and reporting mortalities with institutional data.


Subject(s)
Data Warehousing , Search Engine , Humans , Databases, Factual
12.
Acta Biomed ; 94(S3): e2023121, 2023 08 30.
Article in English | MEDLINE | ID: mdl-37695185

ABSTRACT

Digital health records can provide advantages to healthcare practice, policy, and research. Several countries have established population-based digitalised data collection, integrated through data linkage techniques. In Lombardy (Italy), a regional population-based registry was established in the 2000s. It collects data from the social and health sector, anonymised immediately after their acquisition and restructured in a single repository. Data can be used for public health interest, planning, monitoring, services evaluation, and research. Indeed, data can also be provided to universities and other scientific institutes. The availability of such data makes it possible to explore the epidemiology of infectious, chronic, and rare diseases. Thus, epidemiological research can support policymakers to tackle public health threats. However, analysis of electronic health records comes with several challenges, including data inaccuracy, incompleteness, and biases. Researchers should take into consideration limits and barriers related to quality of data. Moreover, health data use must adhere to the national and European privacy legislation, at times limiting the potential of data integration. Therefore, even if big data drives innovation and scientific knowledge, ethical issues regarding privacy should be considered in public debate.


Subject(s)
Data Warehousing , Public Health , Humans , Policy , Data Collection , Electronic Health Records
13.
J Patient Saf ; 19(8): 501-507, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37712829

ABSTRACT

OBJECTIVES: The aims of the study are to identify fall risk factors and to establish automatic risk assessments based on clinical data from electronic medical records of hospitalized patients. METHODS: In this retrospective case-control study, we reviewed the electronic medical records of 1454 patients (292 and 1162 patients in the fall and nonfall groups, respectively) who were hospitalized at a 1800-bed tertiary hospital in South Korea between January 1, 2017, and December 31, 2017. Patients' age, sex, and clinical department were matched, and all laboratory reports, clinical flow sheets, and nursing initial assessment records of cases from the Clinical Data Warehouse system were analyzed. The collated patient records data were analyzed using SAS (version 9.4) and logistic regression. RESULTS: Overall, 65 risk factors, including low body mass index, low blood pressure, low albumin levels, high fasting blood sugar level, low red blood cell counts, and high potassium levels, that significantly increased the incidence of falls were identified. Falls were also associated with 21 items from the clinical flow sheet and nursing initial assessment, including frequent bowel movements, 24-hour urine tests, imaging tests, biopsy, pain, intravenous tubes, unclear consciousness, and taking medication. CONCLUSIONS: Fall risk factors identified via the Clinical Data Warehouse can be used to build an automated detection system to detect fall risk in electronic medical records, enabling nurses to assess the fall risk in addition to using the fall scale.


Subject(s)
Accidental Falls , Inpatients , Humans , Case-Control Studies , Data Warehousing , Retrospective Studies , Risk Assessment/methods , Risk Factors , Tertiary Care Centers , Male , Female
14.
BMC Med Inform Decis Mak ; 23(1): 183, 2023 09 15.
Article in English | MEDLINE | ID: mdl-37715195

ABSTRACT

BACKGROUND: Aggregate electronic data repositories and population-level cross-sectional surveys play a critical role in HIV programme monitoring and surveillance for data-driven decision-making. However, these data sources have inherent limitations including inability to respond to public health priorities in real-time and to longitudinally follow up clients for ascertainment of long-term outcomes. Electronic medical records (EMRs) have tremendous potential to bridge these gaps when harnessed into a centralised data repository. We describe the evolution of EMRs and the development of a centralised national data warehouse (NDW) repository. Further, we describe the distribution and representativeness of data from the NDW and explore its potential for population-level surveillance of HIV testing, care and treatment in Kenya. MAIN BODY: Health information systems in Kenya have evolved from simple paper records to web-based EMRs with features that support data transmission to the NDW. The NDW design includes four layers: data warehouse application programming interface (DWAPI), central staging, integration service, and data visualization application. The number of health facilities uploading individual-level data to the NDW increased from 666 in 2016 to 1,516 in 2020, covering 41 of 47 counties in Kenya. By the end of 2020, the NDW hosted longitudinal data from 1,928,458 individuals ever started on antiretroviral therapy (ART). In 2020, there were 936,869 individuals who were active on ART in the NDW, compared to 1,219,276 individuals on ART reported in the aggregate-level Kenya Health Information System (KHIS), suggesting 77% coverage. The proportional distribution of individuals on ART by counties in the NDW was consistent with that from KHIS, suggesting representativeness and generalizability at the population level. 
CONCLUSION: The NDW presents opportunities for individual-level HIV programme monitoring and surveillance because of its longitudinal design and its ability to respond to public health priorities in real-time. A comparison with estimates from KHIS demonstrates that the NDW has high coverage and that the data may be representative and generalizable at the population level. The NDW is therefore a unique and complementary resource for HIV programme monitoring and surveillance with potential to strengthen timely data-driven decision-making towards HIV epidemic control in Kenya. DATABASE LINK: ( https://dwh.nascop.org/ ).
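
The coverage figure quoted in the abstract can be reproduced with a one-line ratio. The sketch below uses only the two counts stated in the abstract; the function name `coverage` and the variable names are illustrative assumptions, not part of the NDW or KHIS systems.

```python
def coverage(individual_level_count: int, aggregate_count: int) -> float:
    """Share of the aggregate-system total that is also present in the
    individual-level warehouse."""
    if aggregate_count <= 0:
        raise ValueError("aggregate_count must be positive")
    return individual_level_count / aggregate_count


# Counts for 2020 as stated in the abstract.
ndw_active_on_art = 936_869   # individuals active on ART in the NDW
khis_on_art = 1_219_276       # individuals on ART reported in KHIS

ndw_coverage = coverage(ndw_active_on_art, khis_on_art)
print(f"NDW coverage relative to KHIS: {ndw_coverage:.1%}")
```

The ratio is about 0.768, which rounds to the 77% reported in the abstract.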


Subject(s)
Data Warehousing , Electronic Health Records , Humans , Cross-Sectional Studies , Kenya/epidemiology , HIV Testing
15.
Article in English | MEDLINE | ID: mdl-37681826

ABSTRACT

BACKGROUND: Cannabis is the main illicit psychoactive substance used in French childbearing women and very few data are available about adverse events (AEs) related to its use during pregnancy. The aim of this study was to evaluate the association between recreational cannabis use during pregnancy and adverse outcomes from a real-world clinical data warehouse. METHODS: Data from the Poitiers University Hospital warehouse were analyzed between 1 January 2010 and 31 December 2019. Logistic regression models were used to evaluate associations between outcomes in three prenatal user groups: cannabis alone ± tobacco (C ± T) (n = 123), tobacco alone (T) (n = 191) and controls (CTRL) (n = 355). RESULTS: Pregnant women in the C ± T group were younger (mean age: 25.5 ± 5.7 years), had lower pre-pregnancy body mass index (22.8 ± 5.5 kg/m2), more psychiatric history (17.5%) and were more likely to benefit from universal free health-care coverage (18.2%) than those in the T and CTRL groups. Cannabis use increases the occurrence of voluntary interruption of pregnancy, at least one AE during pregnancy, at least one neonatal AE, the composite adverse pregnancy outcome over 28, prematurity and small for gestational age. CONCLUSION: Given the trivialization of recreational cannabis use during pregnancy, there is an urgent need to communicate on AEs of cannabis use during pregnancy.


Subject(s)
Cannabis , Hallucinogens , Infant, Newborn , Female , Humans , Pregnancy , Young Adult , Adult , Cannabis/adverse effects , Data Warehousing , Body Mass Index , Health Facilities
16.
J Med Internet Res ; 25: e49593, 2023 09 11.
Article in English | MEDLINE | ID: mdl-37615085

ABSTRACT

BACKGROUND: The use of real-world data (RWD) warehouses for research in Asia is on the rise, but current trends remain largely unexplored. Given the varied economic and health care landscapes in different Asian countries, understanding these trends can offer valuable insights. OBJECTIVE: We sought to discern the contemporary landscape of linked RWD warehouses and explore their trends and patterns in 3 Asian countries with contrasting economies and health care systems: Taiwan, India, and Thailand. METHODS: Using a systematic scoping review methodology, we conducted an exhaustive literature search on PubMed with filters for the English language and the past 5 years. The search combined Medical Subject Heading terms and specific keywords. Studies were screened against strict eligibility criteria to identify eligible studies using RWD databases from more than one health care facility in at least 1 of the 3 target countries. RESULTS: Our search yielded 2277 studies, of which 833 (36.6%) met our criteria. Overall, single-country studies (SCS) dominated at 89.4% (n=745), with cross-country collaboration studies (CCCS) being at 10.6% (n=88). However, the country-wise breakdown showed that of all the SCS, 623 (83.6%) were from Taiwan, 81 (10.9%) from India, and 41 (5.5%) from Thailand. Among the total studies conducted in each country, India at 39.1% (n=133) and Thailand at 43.1% (n=72) had a significantly higher percentage of CCCS compared to Taiwan at 7.6% (n=51). Over a 5-year span from 2017 to 2022, India and Thailand experienced an annual increase in RWD studies by approximately 18.2% and 13.8%, respectively, while Taiwan's contributions remained consistent. Comparative effectiveness research (CER) was predominant in Taiwan (n=410, or 65.8% of SCS) but less common in India (n=12, or 14.8% of SCS) and Thailand (n=11, or 26.8% of SCS). CER percentages in CCCS were similar across the 3 countries, ranging from 19.2% (n=10) to 29% (n=9). 
The type of RWD source also varied significantly across countries, with India demonstrating a high reliance on electronic medical records or electronic health records at 55.6% (n=45) of SCS and Taiwan showing an increasing trend in their use over the period. Registries were used in 26 (83.9%) CCCS and 31 (75.6%) SCS from Thailand but in <50% of SCS from Taiwan and India. Health insurance/administrative claims data were used in most of the SCS from Taiwan (n=458, 73.5%). There was a consistent predominant focus on cardiology/metabolic disorders in all studies, with a noticeable increase in oncology and infectious disease research from 2017 to 2022. CONCLUSIONS: This review provides a comprehensive understanding of the evolving landscape of RWD research in Taiwan, India, and Thailand. The observed differences and trends emphasize the unique economic, clinical, and research settings in each country, advocating for tailored strategies for leveraging RWD for future health care research and decision-making. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.2196/43741.


Subject(s)
Biomedical Research , Data Warehousing , Databases, Factual , Humans , Asian , India , Taiwan , Thailand
17.
Sci Data ; 10(1): 545, 2023 08 21.
Article in English | MEDLINE | ID: mdl-37604823

ABSTRACT

During the past decade, cognitive neuroscience has been calling for population diversity to address the challenge of validity and generalizability, ushering in a new era of population neuroscience. The developing Chinese Color Nest Project (devCCNP, 2013-2022), the first ten-year stage of the lifespan CCNP (2013-2032), is a two-stage project focusing on brain-mind development. The project aims to create and share a large-scale, longitudinal and multimodal dataset of typically developing children and adolescents (ages 6.0-17.9 at enrolment) in the Chinese population. The devCCNP houses not only phenotypes measured by demographic, biophysical, psychological and behavioural, cognitive, affective, and ocular-tracking assessments but also neurotypes measured with magnetic resonance imaging (MRI) of brain morphometry, resting-state function, naturalistic viewing function and diffusion structure. This Data Descriptor introduces the first data release of devCCNP including a total of 864 visits from 479 participants. Herein, we provide details of the experimental design, sampling strategies, and technical validation of the devCCNP resource. We demonstrate and discuss the potential of a multicohort longitudinal design to depict normative brain growth curves from the perspective of developmental population neuroscience. The devCCNP resource is shared as part of the "Chinese Data-sharing Warehouse for In-vivo Imaging Brain" in the Chinese Color Nest Project (CCNP) - Lifespan Brain-Mind Development Data Community ( https://ccnp.scidb.cn ) at the Science Data Bank.


Subject(s)
Asian People , Brain , Humans , Brain/diagnostic imaging , China , Data Warehousing , Databases, Factual , Neurosciences
19.
JCO Clin Cancer Inform ; 7: e2300019, 2023 08.
Article in English | MEDLINE | ID: mdl-37607323

ABSTRACT

PURPOSE: The goal of this study was to use real-world data sources that may be faster and more complete than self-reported data alone, and timelier than cancer registries, to ascertain breast cancer cases in the ongoing screening trial, the WISDOM Study. METHODS: We developed a data warehouse procedural process (DWPP) to identify breast cancer cases from a subgroup of WISDOM participants (n = 11,314) who received breast-related care from a University of California Health Center in the period 2012-2021 by searching electronic health records (EHRs) in the University of California Data Warehouse (UCDW). Incident breast cancer diagnoses identified by the DWPP were compared with those identified by self-report via annual follow-up online questionnaires. RESULTS: Our study identified 172 participants with confirmed breast cancer diagnoses in the period 2016-2021 by the following sources: 129 (75%) by both self-report and DWPP, 23 (13%) by DWPP alone, and 20 (12%) by self-report only. Among those with International Classification of Diseases 10th revision cancer diagnostic codes, no diagnosis was confirmed in 18% of participants. CONCLUSION: For diagnoses that occurred ≥20 months before the January 1, 2022, UCDW data pull, WISDOM self-reported data via annual questionnaire achieved high accuracy (96%), as confirmed by the cancer registry. More rapid cancer ascertainment can be achieved by combining self-reported data with EHR data from a health system data warehouse registry, particularly to address self-reported questionnaire issues such as timing delays (ie, time lag between participant diagnoses and the submission of their self-reported questionnaire typically ranges from a month to a year) and lack of response. Although cancer registry reporting often is not as timely, it does not require verification as does the DWPP or self-report from annual questionnaires.


Subject(s)
Breast Neoplasms , Humans , Female , Self Report , Breast Neoplasms/diagnosis , Breast Neoplasms/epidemiology , Electronic Health Records , Breast , Data Warehousing
20.
Med Image Anal ; 89: 102903, 2023 10.
Article in English | MEDLINE | ID: mdl-37523918

ABSTRACT

A variety of algorithms have been proposed for computer-aided diagnosis of dementia from anatomical brain MRI. These approaches achieve high accuracy when applied to research data sets but their performance on real-life clinical routine data has not been evaluated yet. The aim of this work was to study the performance of such approaches on clinical routine data, based on a hospital data warehouse, and to compare the results to those obtained on a research data set. The clinical data set was extracted from the hospital data warehouse of the Greater Paris area, which includes 39 different hospitals. The research set was composed of data from the Alzheimer's Disease Neuroimaging Initiative data set. In the clinical set, the population of interest was identified by exploiting the diagnostic codes from the 10th revision of the International Classification of Diseases that are assigned to each patient. We studied how the imbalance of the training sets, in terms of contrast agent injection and image quality, may bias the results. We demonstrated that computer-aided diagnosis performance was strongly biased upwards (over 17 percentage points of balanced accuracy) by the confounders of image quality and contrast agent injection, a phenomenon known as the Clever Hans effect or shortcut learning. When these biases were removed, the performance was very poor. In any case, the performance was considerably lower than on the research data set. Our study highlights that there are still considerable challenges for translating dementia computer-aided diagnosis systems to clinical routine.


Subject(s)
Alzheimer Disease , Contrast Media , Humans , Data Warehousing , Brain/diagnostic imaging , Magnetic Resonance Imaging/methods , Alzheimer Disease/diagnostic imaging , Machine Learning , Computers