ABSTRACT
OBJECTIVES: Accurate record linkage (RL) enables consolidation and de-duplication of data from disparate datasets, resulting in more comprehensive and complete patient data. However, conducting RL with low quality or unfit data can waste institutional resources on poor linkage results. We aim to evaluate data linkability to enhance the effectiveness of record linkage. MATERIALS AND METHODS: We describe a systematic approach using data fitness ("linkability") measures, defined as metrics that characterize the availability, discriminatory power, and distribution of potential variables for RL. We used the isolation forest algorithm to detect abnormal linkability values from 188 sites in Indiana and Colorado, and manually reviewed the data to understand the cause of anomalies. RESULTS: We calculated 10 linkability metrics for 11 potential linkage variables (LVs) across 188 sites for a total of 20 680 linkability metrics. Potential LVs such as first name, last name, date of birth, and sex have low missing data rates, while the completeness of Social Security Number varies widely across sites. We investigated anomalous linkability values to identify the cause of many records having identical values in certain LVs, issues with placeholder values disguising data missingness, and orphan records. DISCUSSION: The fitness of a variable for RL is determined by its availability and its discriminatory power to uniquely identify individuals. These results highlight the need for awareness of placeholder values, which inform the selection of variables and methods to optimize RL performance. CONCLUSION: Evaluating linkability measures using the isolation forest algorithm to highlight anomalous findings can help identify fitness-for-use issues that must be addressed before initiating the RL process to ensure high-quality linkage outcomes.
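The abstract does not list its 10 linkability metrics, so the following is only a minimal sketch of what availability and discriminatory-power measures for a single linkage variable might look like. The metric names, the `PLACEHOLDERS` set, and the sample values are all assumptions made for illustration; the isolation forest step itself is not shown.

```python
import math
from collections import Counter

# Hypothetical placeholder strings; each site would need its own list.
PLACEHOLDERS = {"", "unknown", "n/a", "000-00-0000", "999-99-9999"}

def linkability(values):
    """Return simple fitness measures for one candidate linkage variable."""
    total = len(values)
    # Treat None and placeholder strings as missing so that they do not
    # masquerade as real data.
    present = [v for v in values
               if v is not None and str(v).strip().lower() not in PLACEHOLDERS]
    counts = Counter(present)
    n = len(present)
    # Shannon entropy (in bits) of the observed value distribution.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
    return {
        "missing_rate": 1 - n / total if total else 0.0,
        "distinct_ratio": len(counts) / n if n else 0.0,
        "top_value_share": max(counts.values()) / n if n else 0.0,
        "entropy_bits": entropy,
    }

# Illustrative values only, not data from the study.
ssn = ["123-45-6789", "000-00-0000", None, "987-65-4321", "999-99-9999"]
sex = ["F", "M", "F", "M", "F"]
print(linkability(ssn))
print(linkability(sex))
```

In this toy input the SSN column is highly discriminating where present but mostly missing once placeholders are counted as missing, while the sex column is complete but carries less than one bit of entropy. A per-site vector of such measures is the kind of input an anomaly detector could then screen.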
Subjects
Algorithms , Medical Record Linkage , Humans , Medical Record Linkage/methods , Colorado , Electronic Health Records , Indiana , Data Accuracy
ABSTRACT
OBJECTIVE: To describe the operational use of privacy preserving linkage methods in Australia, and to present insights and key learnings from their implementation. METHODS: Privacy preserving record linkage (PPRL) utilising Bloom filters provides a unique practical mechanism that allows linkage to occur without the release of personally identifiable information (PII), while still ensuring high accuracy. RESULTS: The methodology has received wide uptake within Australia, with four state linkage units offering privacy preserving capability. It has enabled access to general practice and private pathology data, among others, both much sought-after datasets previously inaccessible for linkage. CONCLUSION: The Australian experience suggests privacy preserving linkage is a practical solution for improving data access for policy, planning and population health research. It is hoped that international interest in this methodology continues to grow.
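As a rough illustration of the Bloom filter idea, the sketch below hashes the character bigrams of a name into bit positions with a keyed hash and compares two encodings with the Dice coefficient. The filter size, number of hash functions, padding character, and the `shared-secret` key are assumptions chosen for the example, not parameters reported for the Australian linkage units.

```python
import hashlib
import hmac

def bigrams(text):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f"_{text.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}

def bloom_encode(text, secret, size=1024, num_hashes=4):
    """Return the set of bit positions that the bigrams of `text` switch on."""
    bits = set()
    for gram in bigrams(text):
        for k in range(num_hashes):
            # Keyed hash: parties without the secret cannot rebuild the filter.
            digest = hmac.new(secret.encode(), f"{k}|{gram}".encode(),
                              hashlib.sha256).digest()
            bits.add(int.from_bytes(digest[:8], "big") % size)
    return bits

def dice(a, b):
    """Dice coefficient between two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

key = "shared-secret"  # hypothetical key agreed between data custodians
smith = bloom_encode("Smith", key)
smyth = bloom_encode("Smyth", key)
jones = bloom_encode("Jones", key)
print(dice(smith, smyth), dice(smith, jones))
```

Because similar strings share most of their bigrams, their encodings share most of their set bits, so approximate matching still works although only the encodings, not the names, leave the data custodian.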
Subjects
Confidentiality , Medical Record Linkage , Australia , Medical Record Linkage/methods , Humans , Confidentiality/standards , Electronic Health Records , Privacy
ABSTRACT
Population Health Management - often abbreviated to PHM - is a relatively new approach for healthcare planning, requiring the application of analytical techniques to linked patient level data. Despite expectations for greater uptake of PHM, there is a deficit of available solutions to help health services embed it into routine use. This paper concerns the development, application and use of an interactive tool which can be linked to a healthcare system's data warehouse and employed to readily perform key PHM tasks such as population segmentation, risk stratification, and deriving various performance metrics and descriptive summaries. Developed through open-source code in a large healthcare system in South West England, and used by others around the country, this paper demonstrates the importance of a scalable, purpose-built solution for improving the uptake of PHM in health services.
Subjects
Electronic Health Records , Population Health Management , Humans , Electronic Health Records/statistics & numerical data , England , Medical Record Linkage/methods
ABSTRACT
BACKGROUND: Precision medicine has become a mainstay of cancer care in recent years. The National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) Program has been an authoritative source of cancer statistics and data since 1973. However, tumor genomic information has not been adequately captured in the cancer surveillance data, which impedes population-based research on molecular subtypes. To address this, the SEER Program has developed and implemented a centralized process to link SEER registries' tumor cases with genomic test results that are provided by molecular laboratories to the registries. METHODS: Data linkages were carried out following operating procedures for centralized linkages established by the SEER Program. The linkages used Match*Pro, a probabilistic linkage software, and were facilitated by the registries' trusted third party (an honest broker). The SEER registries provide to NCI limited datasets that undergo preliminary evaluation prior to their release to the research community. RESULTS: Recently conducted genomic linkages included OncotypeDX Breast Recurrence Score, OncotypeDX Breast Ductal Carcinoma in Situ, OncotypeDX Genomic Prostate Score, Decipher Prostate Genomic Classifier, DecisionDX Uveal Melanoma, DecisionDX Preferentially Expressed Antigen in Melanoma, DecisionDX Melanoma, and germline test results in the Georgia and California SEER registries. CONCLUSIONS: The linkages of cancer cases from SEER registries with genomic test results obtained from molecular laboratories offer an effective approach for data collection in cancer surveillance. By providing de-identified data to the research community, the NCI's SEER Program enables scientists to investigate numerous research questions.
Subjects
Genomics , Neoplasms , Registries , SEER Program , Humans , SEER Program/statistics & numerical data , United States/epidemiology , Neoplasms/genetics , Neoplasms/epidemiology , Neoplasms/diagnosis , Genomics/methods , Registries/statistics & numerical data , Female , Male , Genetic Testing/methods , Genetic Testing/statistics & numerical data , Medical Record Linkage/methods , National Cancer Institute (U.S.)
ABSTRACT
BACKGROUND: The National Cancer Institute funds many large cohort studies that rely on self-reported cancer data requiring medical record validation. This is labor intensive, costly, and prone to underreporting or misreporting of cancer and disparity-related differential response. US population-based central cancer registries identify incident cancer within their catchment area, yielding all malignant neoplasms and benign brain and central nervous system tumors with standardized data fields. This manuscript describes the development, implementation, and features of a system to facilitate linkage between cohort studies and cancer registries and the release of cancer registry data for matched cohort participants. METHODS: The Virtual Pooled Registry-Cancer Linkage System (VPR-CLS) provides an online system to link cohorts with multiple state cancer registries by 1) securely transmitting a study file to registries, 2) providing an optimized linkage algorithm to generate preliminary match counts, and 3) providing a streamlined process and templated forms for submitting and tracking data requests for cohort participants who matched with registries. RESULTS: In 2022, the VPR-CLS launched with 45 registries, covering 95% of the US state populations and Puerto Rico. Registries have linked with 15 studies having 14 273 to 10.9 million participants. Except in 1 study, linkage sensitivity ranged from 87.0% to 99.9%. Numerous registries have adopted the VPR-CLS templated institutional review board-registry application (n = 39), templated data use agreement (n = 25), and central institutional review board (n = 16). CONCLUSIONS: The VPR-CLS markedly improves ascertainment of cancer outcomes and is the preferred approach for determination of outcomes from cohort studies, postmarketing surveillance, and clinical trials.
Subjects
Medical Record Linkage , Neoplasms , Registries , Humans , Registries/statistics & numerical data , Neoplasms/epidemiology , Neoplasms/diagnosis , United States/epidemiology , Medical Record Linkage/methods , Cohort Studies , National Cancer Institute (U.S.)
ABSTRACT
HeXEHRS is a FHIR-based cloud EHR service designed to support healthcare in depopulated areas, powered by digital twin technology. Its core functionalities encompass standard EHR tasks including data exchange for healthcare processes. In the first year of this national project, we present the design and define the functionalities of the system.
Subjects
Cloud Computing , Electronic Health Records , Medical Record Linkage/methods , Humans
ABSTRACT
The Valkyrie project aims to develop a demonstration Federated Electronic Health Record for the use of mental health practitioners in Norway. Information for the record is drawn from existing records in Source Systems operating across primary and secondary care. Recording of information in any such system, in response to a healthcare event, triggers the generation of an Encrypted Token, containing summary metadata about the event, clinical coding indicating its clinical context and a locator that can be used to retrieve the full record of the event from the original Source System. The Valkyrie architecture consists of a number of interlinked Security Domains, each with its own private and public keys, through which the Encrypted Tokens are passed. Each Security Domain performs a specific function on a set of Tokens and only has access to the information within each Token that is necessary to perform that function. This paper describes the structure of the Encrypted Token, the function of each Security Domain and the orchestration of the flow of Tokens through the Domains. Together this allows a user to run a Valkyrie Session, in which they can view the content of a patient record, where all content has been drawn in real-time from heterogenous Source Systems (ISO13606- and openEHR-based) and is destroyed when the session terminates.
Subjects
Blockchain , Computer Security , Electronic Health Records , Norway , Humans , Medical Record Linkage/methods
ABSTRACT
This paper explores the challenges and lessons learned during the mapping of HL7 v2 messages structured using custom schema to openEHR for the Medical Data Integration Center (MeDIC) of the University Hospital, Schleswig-Holstein (UKSH). Missing timestamps in observations, missing units of measurement, inconsistencies in decimal separators and unexpected datatypes were identified as critical inconsistencies in this process. These anomalies highlight the difficulty of automating the transformation of HL7 v2 data to any standard, particularly openEHR, using off-the-shelf tools. Addressing these anomalies is crucial for enhancing data interoperability, supporting evidence-based research, and optimizing clinical decision-making. Implementing proper data quality measures and governance will unlock the potential of integrated clinical data, empowering clinicians and researchers and fostering a robust healthcare ecosystem.
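The four anomaly types named above can be screened for with a few lines of code. The sketch below assumes standard pipe-delimited OBX segments (OBX-2 value type, OBX-5 value, OBX-6 units, OBX-14 observation date/time), whereas the UKSH messages use a custom schema that the abstract does not describe. The function name and sample segments are hypothetical.

```python
def check_obx(segment):
    """Return (normalized_value, issues) for one pipe-delimited OBX segment."""
    fields = segment.split("|")
    # Pad so that optional trailing fields can be indexed safely.
    fields += [""] * (15 - len(fields))
    value_type, value, units, observed_at = (
        fields[2], fields[5], fields[6], fields[14])
    issues = []
    if not observed_at:
        issues.append("missing timestamp")
    if value_type == "NM":  # numeric observation
        if not units:
            issues.append("missing unit")
        if "," in value:
            issues.append("comma decimal separator")
            value = value.replace(",", ".")
        try:
            value = float(value)
        except ValueError:
            issues.append("unexpected datatype")
    return value, issues

# Illustrative segments, not taken from the MeDIC data.
print(check_obx("OBX|1|NM|718-7^Hemoglobin^LN||13,5||||||F"))
print(check_obx("OBX|2|NM|2345-7^Glucose^LN||abc|mg/dL|||||F|||20240101120000"))
```

Collecting the issues per segment, rather than failing on the first one, makes it possible to report data quality back to the source system before any openEHR composition is built.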
Subjects
Health Level Seven , Electronic Health Records , Health Information Interoperability , Germany , Systems Integration , Humans , Medical Record Linkage/methods
ABSTRACT
The integration of tumor-related diagnosis and therapy data is a key factor for cancer-related collaborative projects and on-site research projects. The Medical Data Integration Center (MeDIC) of the University Hospital Schleswig-Holstein, resulting from the Medical Informatics Initiative and Network University Medicine in Germany, has agreed on openEHR-based data management built on a centralized repository with harmonized annotated data. Consequently, the oncological data should be integrated into the MeDIC to interconnect the information and thus gain added value. A uniform national data set for tumor-related reports is already defined for the cancer registries. Therefore, this work aims to transform the national oncological basis data set for tumor documentation (oBDS) so that it can be stored and utilized properly in the openEHR repository of the MeDIC. In previous work, openEHR templates representing the oncological basis data set were modeled. These templates were used to implement a processing pipeline including a metadata repository, which defines the mappings between the elements, and a FHIR terminology service for annotation and validation, resulting in a tool to automatically build openEHR compositions from oBDS data. The prototype proved the feasibility of this mapping. Integration into the MeDIC is straightforward, and the architecture introduced is adaptable to future needs by design.
Subjects
Neoplasms , Humans , Germany , Neoplasms/therapy , Medical Oncology , Electronic Health Records , Medical Record Linkage/methods , Biomedical Research
ABSTRACT
Real-world data (RWD) (i.e., data from Electronic Healthcare Records - EHRs, ePrescription systems, patient registries, etc.) gain increasing attention as they could support observational studies on a large scale. OHDSI is one of the most prominent initiatives regarding the harmonization of RWD and the development of relevant tools via the use of a common data model, OMOP-CDM. OMOP-CDM is a crucial step towards syntactic and semantic data interoperability. Still, OMOP-CDM is based on a typical relational database format, and thus, the vision of a fully connected semantically enriched model is not fully realized. This work presents an open-source effort to map the OMOP-CDM model and the data it hosts, to an ontological model using RDF to support the FAIRness of RWD and their interlinking with Linked Open Data (LOD) towards the vision of the Semantic Web.
Subjects
Electronic Health Records , Semantic Web , Humans , Semantics , Medical Record Linkage/methods
ABSTRACT
Secondary use of data for research purposes is especially important in rare diseases (RD), since, by definition, data are sparse. The European Joint Programme on Rare Diseases (EJP RD) aims at developing an RD infrastructure which supports the secondary use of data. Significant amounts of RD data are a) distributed and b) available only in pseudonymised format. Privacy-Preserving Record Linkage (PPRL) concerns the linking of such distributed datasets without disclosing the participants' identities. We present a concept for linking a PPRL Service to the EJP RD Virtual Platform (VP). Level 1 (resource discovery) connection is provided by running an FDP within the PPRL Service. On Level 2 (data discoverability), the PPRL Service can represent both an individual and a catalog endpoint. Our solution can count patients in PPRL-supporting resources, count duplicates only once, and count only patients registered to multiple resources. Currently, we are preparing the deployment within the EJP RD VP.
Subjects
Medical Record Linkage , Rare Diseases , Humans , Europe , Medical Record Linkage/methods , Confidentiality , Anonyms and Pseudonyms , Electronic Health Records , Computer Security
ABSTRACT
Over the last decade, the exponential growth in patient data volume and velocity has transformed it into a valuable resource for researchers. Yet, accessing comprehensive, unique patient data sets remains a challenge, particularly when individuals have received treatments across various practices and hospitals. Traditional record linkage methods fall short in adequately protecting patient privacy in these scenarios. Privacy Preserving Record Linkage (PPRL) offers a solution, employing techniques such as cryptographic methods to identify common patients occurring in multiple datasets, while maintaining the privacy of other patients. This paper proposes an investigation into combined approaches of two common German PPRL tools, namely E-PIX and MainSEL. Each tool, while aiming for 'privacy preservation', employs distinct methods that offer unique advantages and drawbacks. Our research aims to explore these in a combined approach to leverage their respective strengths and mitigate their limitations. We anticipate that this synergistic approach will not only enhance data privacy but also allow for easier synchronisation of research data. This study is particularly pertinent in light of evolving privacy regulations and the increasing complexity of healthcare data management. By advancing PPRL methodologies, we aim to contribute to more robust, privacy-compliant data analysis practices in healthcare research.
Subjects
Computer Security , Confidentiality , Electronic Health Records , Medical Record Linkage , Germany , Medical Record Linkage/methods , Humans
ABSTRACT
Healthcare faces significant challenges in exchanging and utilizing health information across diverse providers, necessitating innovative solutions for improved interoperability. This study presents a comprehensive exploration of scalable technical and semantic solutions for patient care integration, emphasizing the implementation of these solutions within the framework of the Fast Healthcare Interoperability Resources (FHIR) standard. Our approach revolves around the development and deployment of Technical Interoperability Suite (TIS) and Semantic Interoperability Suite (SIS) technology solutions to integrate disparate health information systems, predominantly Electronic Health Records (EHRs), into a unified Patient Care Platform, fostering comprehensive data exchange and utilization. The integration process involves importing data from various EHR systems and transforming imported patient data into FHIR-standardized formats. The provided solution supports various functionalities, including automatic and manual importation of patient data, through standard computer-readable templates. The integration of TIS and SIS solutions is underpinned by a robust technological framework, incorporating technologies such as TypeScript, Deno, and document-oriented databases such as MongoDB. The effectiveness of our interoperability solutions was validated through deployment in multinational EU projects: ADLIFE and CAREPATH. The scalability and generalizability of our approach underscore its potential for diverse healthcare settings.
Subjects
Electronic Health Records , Health Information Interoperability , Humans , Medical Record Linkage/methods , Semantics , Systems Integration
ABSTRACT
Our initiative aims to enhance the public health informatics infrastructure for surveillance of maternal and child health (MCH) using data captured from electronic health records (EHRs), public health information systems, and administrative health data. Our work includes development, validation, and application of linkage algorithms across records for mothers and children; integration of data across myriad sources; design of routine surveillance reports; and design of longitudinal studies to examine determinants and outcomes in MCH populations. Our work is conducted in partnership with governmental public health agencies, health care providers, academic institutions, and community-based organizations. Future work will build on the enhanced informatics infrastructure to draw from additional public health data sources and/or expand surveillance efforts to include prioritized MCH outcomes. We will further translate knowledge gained from surveillance into action, working with our partners to improve and sustain better MCH equitably in our population.
Subjects
Electronic Health Records , Humans , Child , Female , Medical Record Linkage/methods , Public Health Surveillance/methods , Child Health , Maternal Health , United States
ABSTRACT
In this paper, we present the preliminary experiments for the development of an ingestion mechanism to move data from Electronic Health Records to machine learning processes, based on the concept of Linked Data and the JSON-LD format.
Subjects
Electronic Health Records , Machine Learning , Humans , Medical Record Linkage/methods
ABSTRACT
In a previous study, sepsis was noted as a diagnosis on the home health record only 4% of the time for 165,000 sepsis survivors transitioning from hospital to home health care in America. If sepsis and other conditions are not clearly documented in the transitional care record, this can lead to unpreparedness, missed care, and poor patient outcomes. Our implementation science study discovered a source of this problem in the sepsis documentation of 16 hospitals referring to five home care agencies. Together, researchers, hospital, and home care personnel developed and implemented two information technology solutions to address this deficit in seven hospitals. The automated method was more readily adopted and effective in improving information transfer between hospital and home health care.
Subjects
Electronic Health Records , Sepsis , Survivors , Sepsis/therapy , Humans , Transitional Care , United States , Documentation , Continuity of Patient Care , Patient Transfer , Home Care Services , Medical Record Linkage/methods
ABSTRACT
PURPOSE: Real-world data (RWD) collected on patients treated as part of routine clinical care form the basis of cancer clinical registries. Capturing accurate death data can be challenging, with inaccurate survival data potentially compromising the integrity of registry-based research. Here, we explore the utility of data linkage (DL) to state-based registries to enhance the capture of survival outcomes. METHODS: We identified consecutive adult patients with brain tumors treated in the state of Victoria from the Brain Tumour Registry Australia: Innovation and Translation (BRAIN) database, who had no recorded date of death and no follow-up within the last 6 months. Full name and date of birth were used to match patients in the BRAIN registry with those in the Victorian Births, Deaths and Marriages (BDM) registry. Overall survival (OS) outcomes were compared pre- and post-DL. RESULTS: Of the 7,346 clinical registry patients, 5,462 (74%) had no date of death and no follow-up recorded within the last 6 months. Of the 5,462 patients, 1,588 (29%) were matched with a date of death in BDM. Factors associated with an increased number of matches were poor prognosis tumors, older age, and social disadvantage. OS was significantly overestimated pre-DL compared with post-DL for the entire cohort (pre- v post-DL: hazard ratio, 1.43; P < .001; median, 29.9 months v 16.7 months) and for most individual tumor types. This finding was present independent of the tumor prognosis. CONCLUSION: As revealed by linkage with BDM, a high proportion of patients in a brain cancer clinical registry had missing death data, contributed to by informative censoring, inflating OS calculations. DL to pertinent registries on an ongoing basis should be considered to ensure accurate reporting of survival data and interpretation of RWD outcomes.
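The abstract states that full name and date of birth were used for matching but not how. The sketch below assumes an exact deterministic match on a normalized name plus date of birth, which is only one possible reading. The field names (`name`, `dob`, `date_of_death`) and the sample records are hypothetical.

```python
import unicodedata

def match_key(full_name, date_of_birth):
    """Build a normalized (name, DOB) key for exact matching."""
    # Decompose accented letters, then drop everything except letters and
    # whitespace, so accents and punctuation do not block a match.
    decomposed = unicodedata.normalize("NFKD", full_name)
    letters = "".join(c for c in decomposed if c.isalpha() or c.isspace())
    return (" ".join(letters.lower().split()), date_of_birth)

def add_death_dates(registry, deaths):
    """Fill in a missing date of death from a matched death record."""
    index = {match_key(d["name"], d["dob"]): d["date_of_death"] for d in deaths}
    updated = []
    for patient in registry:
        found = index.get(match_key(patient["name"], patient["dob"]))
        # Keep an existing date of death; otherwise use the linked one.
        updated.append({**patient,
                        "date_of_death": patient.get("date_of_death") or found})
    return updated

# Illustrative records only.
registry = [
    {"name": "José  O'Brien", "dob": "1950-02-01", "date_of_death": None},
    {"name": "Ann Lee", "dob": "1961-07-15", "date_of_death": None},
]
deaths = [{"name": "JOSE OBRIEN", "dob": "1950-02-01",
           "date_of_death": "2021-03-09"}]
print(add_death_dates(registry, deaths))
```

Patients who remain unmatched keep a missing date of death, so they would still be censored at last follow-up in a survival analysis.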
Subjects
Data Accuracy , Registries , Humans , Female , Male , Middle Aged , Aged , Adult , Brain Neoplasms/mortality , Brain Neoplasms/epidemiology , Brain Neoplasms/therapy , Medical Record Linkage/methods , Aged, 80 and over , Prognosis , Information Storage and Retrieval
ABSTRACT
Analysis of integrated data often requires record linkage in order to join together the data residing in separate sources. Where linkage errors cannot be avoided, due to the lack of a unique identity key that can be used to link the records unequivocally, standard statistical techniques may produce misleading inference if the linked data are treated as if they were true observations. In this paper, we propose methods for categorical data analysis based on linked data that are not prepared by the analyst, such that neither the match-key variables nor the unlinked records are available. The adjustment is based on the proportion of false links in the linked file, and our approach allows the probabilities of correct linkage to vary across the records without requiring that one is able to estimate this probability for each individual record. It also accommodates the general situation where unmatched records that cannot possibly be correctly linked exist in all the sources. The proposed methods are studied by simulation and applied to real data.
Subjects
Computer Simulation , Medical Record Linkage , Models, Statistical , Humans , Medical Record Linkage/methods , Data Interpretation, Statistical , Probability
ABSTRACT
Children with experience of maltreatment, abuse or neglect have higher prevalence of poor mental health. In the United Kingdom, child protection services identify children at risk of significant harm on the Child Protection Register (CPR) and intervene to reduce risk. Prevalence and incidence of mental health service use among this population of children are not well understood. We analysed records from one Scottish Local Authority's CPR, linked to electronic health records for all children in the broader health board region aged 0-17 years. We described mental health service use among children with a CPR registration using measures of mental health prescribing and referrals to child and adolescent mental health services (CAMHS). We calculated age- and sex-specific incidence rates for comparison with the general population. Between 2012 and 2022, we found 1498 children with a CPR registration, with 69% successfully linked to their health records. 20% were registered before birth and median age at registration was 3 years. Incidence rates in all measures of mental health service use were higher in children with a CPR record across all ages (at outcome) and genders compared to the general population. The largest absolute difference was for boys aged 5-9 with a CPR record, who had 31.8 additional mental health prescriptions per 1000 person-years compared to the general population (50.4 vs. 18.6 prescriptions per 1000 person-years, IRR: 2.7). Girls aged 0-4 years with a CPR registration had the largest relative difference, with a rate of CAMHS referral 5.4 times higher than the general population (12.3 vs. 2.3 per 1000 person-years). Our reproducible record linkage of the CPR to health records reveals an increased risk of mental health service use during childhood. Our findings have relevance to public mental health surveillance, service prioritisation and wider policy aiming to reduce childhood exposure to risk of harm.
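The absolute and relative differences quoted for boys aged 5-9 can be recomputed from the two reported rates. The sketch below does only that arithmetic; the function name is hypothetical, and recomputing from rates already rounded to one decimal will not always reproduce a published ratio exactly.

```python
def compare_rates(exposed, reference):
    """Return (rate difference, rate ratio), each rounded to one decimal."""
    return round(exposed - reference, 1), round(exposed / reference, 1)

# Mental health prescriptions per 1000 person-years, boys aged 5-9,
# as reported in the abstract.
difference, ratio = compare_rates(50.4, 18.6)
print(difference, ratio)  # 31.8 2.7
```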
Subjects
Child Abuse , Child Protective Services , Mental Health Services , Humans , Child , Male , Female , Adolescent , Child, Preschool , Mental Health Services/statistics & numerical data , Infant , Scotland/epidemiology , Child Protective Services/statistics & numerical data , Child Abuse/statistics & numerical data , Registries , Infant, Newborn , Incidence , Electronic Health Records/statistics & numerical data , Medical Record Linkage/methods
ABSTRACT
The digital health progress hubs pilot the extensibility of the concepts and solutions of the Medical Informatics Initiative to improve regional healthcare and research. The six funded projects address different diseases, areas in regional healthcare, and methods of cross-institutional data linking and use. Despite the diversity of the scenarios and regional conditions, the technical, regulatory, and organizational challenges and barriers that the progress hubs encounter in the actual implementation of the solutions are often similar. This results in some common approaches to solutions, but also in political demands that go beyond the Health Data Utilization Act, which is considered a welcome improvement by the progress hubs. In this article, we present the digital progress hubs and discuss achievements, challenges, and approaches to solutions that enable the shared use of data from university hospitals and non-academic institutions in the healthcare system and can make a sustainable contribution to improving medical care and research.