Results 1 - 20 of 47
1.
J Clin Epidemiol; 175: 111516, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39243872

ABSTRACT

OBJECTIVE: High-quality data entry in clinical trial databases is crucial to the usefulness, validity, and replicability of research findings, as it influences evidence-based medical practice and future research. Our aim was to assess the quality of self-reported data in trial registries and to present practical, systematic methods for identifying and evaluating data quality. STUDY DESIGN AND SETTING: We searched ClinicalTrials.gov (CTG) for interventional total knee arthroplasty (TKA) trials between 2000 and 2015. We extracted required and optional trial information elements and used CTG's variable definitions. We performed a literature review of data quality frameworks, checklists, and overviews of irregularities in healthcare databases. We identified and assessed four data quality attributes: consistency, accuracy, completeness, and timeliness. RESULTS: We included 816 interventional TKA trials. Data irregularities varied widely, from 0% to 100%. Inconsistency ranged from 0% to 36%; most often, an allocation labeled as nonrandomized was combined with a "single-group" assignment design. Inaccuracy ranged from 0% to 100%. Incompleteness ranged from 0% to 61%; 61% of completed TKA trials did not report their outcomes. Regarding timeliness, 49% of the trials were registered more than 3 months after their start date. CONCLUSION: We found substantial variation in the data quality of registered TKA trials. Trial sponsors should commit to ensuring that the information they provide is reliable, consistent, up-to-date, transparent, and accurate. CTG users need to be critical when drawing conclusions based on the registered data. We believe this awareness will support well-informed decisions about published articles and treatment protocols, including replicating and improving trial designs.
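
The registry irregularity checks described in this abstract translate naturally into simple, scriptable flags. The sketch below is illustrative only, not the authors' code: it assumes a pandas DataFrame with hypothetical column names loosely modelled on a ClinicalTrials.gov export and flags one example each of inconsistency, incompleteness, and late registration.

```python
# Minimal sketch (not the authors' code): flagging registry irregularities of the
# kind described above, using hypothetical column names.
import pandas as pd

trials = pd.DataFrame({
    "allocation": ["Randomized", "Non-Randomized", "Non-Randomized"],
    "assignment": ["Parallel", "Single Group", "Single Group"],
    "status": ["Completed", "Completed", "Recruiting"],
    "results_posted": [True, False, False],
    "start_date": pd.to_datetime(["2012-01-01", "2013-06-01", "2014-03-01"]),
    "registration_date": pd.to_datetime(["2011-12-15", "2013-10-20", "2014-02-01"]),
})

# Consistency: a non-randomized allocation label combined with a
# single-group assignment design.
trials["inconsistent"] = (trials["allocation"] == "Non-Randomized") & (
    trials["assignment"] == "Single Group")

# Completeness: completed trials with no posted results.
trials["incomplete"] = (trials["status"] == "Completed") & ~trials["results_posted"]

# Timeliness: registered more than 3 months (~92 days) after the start date.
trials["late_registration"] = (
    trials["registration_date"] - trials["start_date"]).dt.days > 92

print(trials[["inconsistent", "incomplete", "late_registration"]].mean())
```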

2.
Sensors (Basel); 24(13), 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39001065

ABSTRACT

Accelerometers measure the non-conservative forces acting at the center of mass of gravity satellites and are among their core payloads. Disturbances in the satellite platform and its environment affect the quality of the accelerometer data. This paper focuses on the quality assessment of accelerometer data from the GRACE-FO satellites. Based on the ACC1A data, we analyze accelerometer data anomalies caused by various disturbances in the satellite platform and environment, including thruster spikes, peaks, twangs, and magnetic torque disturbances. The data characteristics and accuracy of the accelerometer in different operational states and satellite observation modes are analyzed using observation data from different time periods. Finally, the consistency of the accelerometer data is analyzed using the accelerometer transplantation method. The results show that the amplitude spectral density of the three-axis linear acceleration is better than the specified accuracy (above 10⁻¹ Hz) when the accelerometer is in its nominal status. These results are helpful for understanding the characteristics and accuracy of GRACE-FO accelerometer observations.
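
As a rough illustration of the spectral quantity referred to above, the following sketch estimates an amplitude spectral density from an evenly sampled acceleration series using Welch's method. It is not GRACE-FO processing code; the sampling rate, signal, and noise level are invented.

```python
# Minimal sketch (not GRACE-FO processing code): estimating the amplitude
# spectral density (ASD) of an evenly sampled acceleration time series.
import numpy as np
from scipy.signal import welch

fs = 10.0                       # assumed sampling rate in Hz
t = np.arange(0, 3600, 1 / fs)  # one hour of synthetic data
rng = np.random.default_rng(0)
acc = 1e-9 * np.sin(2 * np.pi * 0.05 * t) + rng.normal(0, 1e-10, t.size)

# Welch's method returns a power spectral density (units^2/Hz);
# the ASD is its square root (units/sqrt(Hz)).
f, psd = welch(acc, fs=fs, nperseg=4096)
asd = np.sqrt(psd)

# Compare the ASD above 10^-1 Hz against a specified noise requirement.
band = f >= 0.1
print("median ASD above 0.1 Hz:", np.median(asd[band]))
```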

3.
J Comp Eff Res; 13(8): e240095, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38967245

ABSTRACT

In this update, we discuss recent US FDA guidance offering more specific recommendations on appropriate study design and analysis to support causal inference in non-interventional studies, as well as the launch of the European Medicines Agency (EMA) and Heads of Medicines Agencies (HMA) public electronic catalogues. We also highlight an article recommending that data quality and suitability be assessed before protocol finalization, and a framework endorsed by the Journal of the American Medical Association for using causal language when publishing real-world evidence studies. Finally, we explore the potential of large language models to automate the development of health economic models.


Subjects
Technology Assessment, Biomedical; Technology Assessment, Biomedical/methods; Technology Assessment, Biomedical/economics; Humans; United States; Comparative Effectiveness Research; Research Design; United States Food and Drug Administration; Models, Economic; Reimbursement Mechanisms
4.
Health Informatics J; 30(2): 14604582241259336, 2024.
Article in English | MEDLINE | ID: mdl-38848696

ABSTRACT

Keeping track of data semantics and data changes in databases is essential to support retrospective studies and the reproducibility of longitudinal clinical analyses, by preventing false conclusions from being drawn from outdated data. A knowledge model combined with a temporal model plays an essential role in organizing the data and improving query expressiveness across time and multiple institutions. This paper presents a modelling framework for temporal relational databases that uses an ontology to derive a shareable and interoperable data model. The framework is based on OntoRela, an ontology-driven database modelling approach, and the Unified Historicization Framework, a temporal database modelling approach. The method was applied to hospital organizational structures to show the impact of tracking organizational changes on data quality assessment, healthcare activities, and data access rights. The paper demonstrates the usefulness of an ontology in providing a formal, interoperable, and reusable definition of entities and their relationships, as well as the adequacy of a temporal database for storing, tracing, and querying data over time.
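
A minimal sketch of the temporal ("historicized") record idea discussed above follows. It is not OntoRela or Unified Historicization Framework code; the unit names and dates are hypothetical, and the point-in-time lookup simply shows how an organizational change can be traced over time.

```python
# Minimal sketch (not OntoRela/Unified Historicization Framework code):
# a valid-time record for hospital units and a point-in-time query.
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class UnitVersion:
    unit_id: str
    name: str
    parent: str
    valid_from: date
    valid_to: Optional[date]   # None = currently valid

history = [
    UnitVersion("U1", "Cardiology", "Dept. of Medicine", date(2015, 1, 1), date(2019, 6, 30)),
    UnitVersion("U1", "Cardiology", "Heart Institute", date(2019, 7, 1), None),
]

def as_of(versions, unit_id, when):
    """Return the version of a unit that was valid on a given date."""
    for v in versions:
        if v.unit_id == unit_id and v.valid_from <= when and (v.valid_to is None or when <= v.valid_to):
            return v
    return None

print(as_of(history, "U1", date(2018, 1, 1)).parent)  # Dept. of Medicine
print(as_of(history, "U1", date(2020, 1, 1)).parent)  # Heart Institute
```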


Subjects
Databases, Factual; Humans; Hospital Administration/methods; Data Management/methods
5.
Protein Sci; 33(4): e4946, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38501481

ABSTRACT

The two major challenges in synchrotron size-exclusion chromatography coupled in-line with small-angle x-ray scattering (SEC-SAXS) experiments are overlapping peaks in the elution profile and the fouling of radiation-damaged material on the walls of the sample cell. In recent years, many post-experimental analysis techniques have been developed and applied to extract scattering profiles from such problematic SEC-SAXS data. Here, we present three modes of data collection at the BioSAXS Beamline 4-2 of the Stanford Synchrotron Radiation Lightsource (SSRL BL4-2). The first, High-Resolution mode, enables SEC-SAXS data collection with excellent sample separation and virtually no additional peak broadening from the UHPLC UV detector to the x-ray position, by taking advantage of the low system dispersion of the UHPLC. The small bed volume of the analytical SEC column minimizes sample dilution in the column and facilitates data collection at higher sample concentrations, with sample consumption equal to or even lower than that of the conventional equilibrium SAXS method. Radiation damage problems during SEC-SAXS data collection are avoided by additional cleaning of the sample cell after buffer data collection and by using the x-ray shutter control options to eliminate unnecessary exposures, allowing sample data collection with a clean sample cell. Accurate background subtraction can therefore be performed at a level equivalent to the conventional equilibrium SAXS method without requiring baseline correction, leading to more reliable downstream structural analysis and quicker access to new science. The two other data collection modes, High-Throughput mode and Co-Flow mode, add agility to the planning and execution of experiments and help users achieve their scientific objectives efficiently at SSRL BL4-2.


Subjects
Synchrotrons; X-Ray Diffraction; Scattering, Small Angle; Chromatography, Gel
6.
JMIR Med Inform; 12: e47744, 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38446504

ABSTRACT

BACKGROUND: The importance of real-world evidence is widely recognized in observational oncology studies. However, the lack of interoperable data quality standards in the fragmented health information technology landscape represents an important challenge. Therefore, adopting validated systematic methods for evaluating data quality is important for oncology outcomes research leveraging real-world data (RWD). OBJECTIVE: This study aims to implement real-world time to treatment discontinuation (rwTTD) for a systemic anticancer therapy (SACT) as a new use case for the Use Case Specific Relevance and Quality Assessment, a framework linking data quality and relevance in fit-for-purpose RWD assessment. METHODS: To define the rwTTD use case, we mapped the operational definition of rwTTD to RWD elements commonly available from oncology electronic health record-derived data sets. We identified 20 tasks to check the completeness and plausibility of data elements concerning SACT use, line of therapy (LOT), death date, and length of follow-up. Using descriptive statistics, we illustrated how to implement the Use Case Specific Relevance and Quality Assessment on 2 oncology databases (Data sets A and B) to estimate the rwTTD of an SACT drug (target SACT) for patients with advanced head and neck cancer diagnosed on or after January 1, 2015. RESULTS: A total of 1200 (24.96%) of 4808 patients in Data set A and 237 (5.92%) of 4003 patients in Data set B received the target SACT, suggesting better relevance of the former in estimating the rwTTD of the target SACT. The 2 data sets differed with regard to the terminology used for SACT drugs, LOT format, and target SACT LOT distribution over time. Data set B appeared to have less complete SACT records, longer lags in incorporating the latest data, and incomplete mortality data, suggesting a lack of fitness for estimating rwTTD. CONCLUSIONS: The fit-for-purpose data quality assessment demonstrated substantial variability in the quality of the 2 real-world data sets. The data quality specifications applied for rwTTD estimation can be expanded to support a broad spectrum of oncology use cases.
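
For readers unfamiliar with the endpoint, the sketch below shows one common way rwTTD can be operationalized from administration-level records. It is not the study's algorithm; the column names, the 120-day gap rule, and the data cutoff are assumptions for illustration.

```python
# Minimal sketch (not the study's algorithm): one common operationalization of
# real-world time to treatment discontinuation (rwTTD) from administration-level
# records, assuming hypothetical column names and a 120-day gap rule.
import pandas as pd

admins = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "admin_date": pd.to_datetime(
        ["2016-01-05", "2016-01-26", "2016-02-16", "2017-03-01", "2017-03-22"]),
})
death = {2: pd.Timestamp("2017-06-10")}     # death dates, where known
data_cutoff = pd.Timestamp("2018-01-01")
GAP_DAYS = 120                              # assumed discontinuation gap

def rwttd(group):
    pid = group.name
    start = group["admin_date"].min()
    last = group["admin_date"].max()
    end_of_data = min(death.get(pid, data_cutoff), data_cutoff)
    # Discontinued if the patient died, or if no further administration occurred
    # within GAP_DAYS of the last recorded dose before the data cutoff.
    discontinued = pid in death or (end_of_data - last).days > GAP_DAYS
    end = death.get(pid, last) if discontinued else end_of_data
    return pd.Series({"rwttd_days": (end - start).days, "event": discontinued})

print(admins.groupby("patient_id").apply(rwttd))
```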

7.
JMIR Med Inform; 12: e51560, 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38446534

ABSTRACT

BACKGROUND: Health care has not reached the full potential of the secondary use of health data, partly because of concerns about the quality of the data being used. The shift toward digital health has led to an increase in the volume of health data; however, this increase in quantity has not been matched by a proportional improvement in quality. OBJECTIVE: This review aims to offer a comprehensive overview of existing frameworks for data quality dimensions and assessment methods for the secondary use of health data, and to consolidate the results into a unified framework. METHODS: A review of reviews was conducted, including reviews that describe frameworks of data quality dimensions and their assessment methods from a secondary use perspective. Reviews were excluded if they were not related to the health care ecosystem, lacked information relevant to our research objective, or were published in a language other than English. RESULTS: A total of 22 reviews were included, comprising 22 frameworks, 23 different terms for dimensions, and 62 definitions of dimensions. All dimensions were mapped to the data quality framework of the European Institute for Innovation through Health Data. In total, 8 reviews mentioned 38 different assessment methods, pertaining to 31 definitions of the dimensions. CONCLUSIONS: The findings of this review reveal a lack of consensus in the literature regarding the terminology, definitions, and assessment methods for data quality dimensions, which creates ambiguity and difficulties in developing specific assessment methods. This study goes a step further by assigning all observed definitions to a consolidated framework of 9 data quality dimensions.

8.
Ther Innov Regul Sci; 58(3): 483-494, 2024 May.
Article in English | MEDLINE | ID: mdl-38334868

ABSTRACT

BACKGROUND: Central monitoring aims to improve the quality of clinical research by proactively identifying risks and remediating emerging issues in the conduct of a clinical trial that may adversely affect patient safety and/or the reliability of trial results. This paper, focusing on statistical data monitoring (SDM), is the second in a series that attempts to quantify the impact of central monitoring in clinical trials. MATERIAL AND METHODS: Quality improvement was assessed in studies using SDM from a single large central monitoring platform. The analysis focused on a total of 1111 sites that were identified as at risk by the SDM tests and for which the study teams conducted a follow-up investigation. These sites were drawn from 159 studies conducted by 23 different clinical development organizations (including both sponsor companies and contract research organizations). Two quality improvement metrics were assessed for each selected site: one based on a site data inconsistency score (DIS, the overall -log10 P value of the site compared with all other sites) and the other based on the observed metric value associated with each risk signal. RESULTS: The SDM quality metrics showed improvement in 83% (95% CI, 80-85%) of the sites across therapeutic areas and study phases (primarily phases 2 and 3). In contrast, only 56% (95% CI, 41-70%) of sites showed improvement in 2 historical studies that did not use SDM during study conduct. CONCLUSION: The results of this analysis provide clear quantitative evidence supporting the hypothesis that the use of SDM in central monitoring leads to improved quality in clinical trial conduct and associated data across participating sites.
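
The data inconsistency score mentioned above is defined as an overall -log10 P value of a site compared with all other sites. The sketch below conveys the general idea for a single metric using a two-sample t-test; it is not the central monitoring platform's implementation, and the site data are simulated.

```python
# Minimal sketch (not the platform's implementation): a data inconsistency score
# expressed as -log10 of the p-value from comparing one site's values of a metric
# against all other sites pooled.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
site_values = {                       # e.g., per-patient values of some metric
    "site_A": rng.normal(10, 2, 40),
    "site_B": rng.normal(10, 2, 35),
    "site_C": rng.normal(13, 2, 30),  # a site that behaves differently
}

def dis(site, all_sites):
    """-log10 p-value of a two-sample t-test of one site vs. all other sites."""
    others = np.concatenate([v for k, v in all_sites.items() if k != site])
    _, p = stats.ttest_ind(all_sites[site], others, equal_var=False)
    return -np.log10(p)

for s in site_values:
    print(s, round(dis(s, site_values), 2))
```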


Subjects
Clinical Trials as Topic; Data Accuracy; Quality Improvement; Humans; Clinical Trials Data Monitoring Committees; Reproducibility of Results; Patient Safety
9.
Methods Mol Biol; 2739: 275-299, 2024.
Article in English | MEDLINE | ID: mdl-38006558

ABSTRACT

This chapter gives a brief overview of how to screen existing host genomic data for the presence of endosymbionts such as Wolbachia. The programs used provide test examples, and their corresponding manuals and discussion boards offer invaluable information; please do consult these resources.


Subjects
Wolbachia; Genome, Bacterial; Genomics; Phylogeny; Symbiosis/genetics; Wolbachia/genetics
10.
BMC Health Serv Res; 23(1): 1139, 2023 Oct 23.
Article in English | MEDLINE | ID: mdl-37872540

ABSTRACT

BACKGROUND: In this evaluation, we aim to strengthen Routine Health Information Systems (RHIS) through the digitization of data quality assessment (DQA) processes. We leverage electronic data from the Kenya Health Information System (KHIS), which is based on the District Health Information System version 2 (DHIS2), to perform DQAs at scale. We provide a systematic guide to developing composite data quality scores and use these scores to assess data quality in Kenya. METHODS: We evaluated 187 HIV care facilities with electronic medical records across Kenya. Using quarterly, longitudinal KHIS data from January 2011 to June 2018 (N = 30 quarters), we extracted indicators encompassing general HIV services, including services to prevent mother-to-child transmission (PMTCT). We assessed the accuracy of these data (the extent to which data were correct and free of error) using three data-driven composite scores: 1) a completeness score; 2) a consistency score; and 3) a discrepancy score. Completeness refers to the presence of the appropriate amount of data. Consistency refers to the uniformity of data across multiple indicators. Discrepancy (measured on a Z-scale) refers to the degree of alignment, or lack thereof, of data with rules that define the possible valid values for the data. RESULTS: A total of 5,610 unique facility-quarters were extracted from KHIS. The mean completeness score was 61.1% [standard deviation (SD) = 27%]. The mean consistency score was 80% (SD = 16.4%). The mean discrepancy score was 0.07 (SD = 0.22). A strong positive correlation was identified between the consistency and discrepancy scores (correlation coefficient = 0.77), whereas the correlation of either score with the completeness score was low: -0.12 (consistency) and -0.36 (discrepancy). General HIV indicators were more complete but less consistent and less plausible than PMTCT indicators. CONCLUSION: We observed a lack of correlation between the completeness score and the other two scores. As such, for a holistic DQA, completeness assessment should be paired with measurement of either consistency or discrepancy to reflect distinct dimensions of data quality. Given the complexity of the discrepancy score, we recommend the simpler consistency score, since the two were highly correlated. Routine use of composite scores on KHIS data could enhance the efficiency of DQA at scale as the digitization of health information expands, and could be applied to health sectors beyond HIV clinics.
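
A minimal sketch of the three composite scores, computed on toy facility-quarter data with hypothetical indicator columns, is shown below. It is not the study's code; a real implementation would use the full indicator set and the study's exact uniformity and validity rules.

```python
# Minimal sketch (not the study's code): completeness, consistency, and
# discrepancy scores on toy facility-quarter data with hypothetical indicators.
import numpy as np
import pandas as pd

df = pd.DataFrame({                          # one row = one facility-quarter
    "tested": [100, 80, np.nan, 120],
    "tested_register": [100, 75, 60, 120],   # same quantity from a second form
    "positive": [10, 95, 12, 15],            # validity rule: positive <= tested
})

# 1) Completeness: share of non-missing indicator values per row.
completeness = df.notna().mean(axis=1)

# 2) Consistency: agreement between indicators that should report the same
#    quantity (here, "tested" from two reporting forms).
consistency = 1 - (df["tested"] - df["tested_register"]).abs() / df[
    ["tested", "tested_register"]].max(axis=1)

# 3) Discrepancy: rate of validity-rule violations, standardized to a Z-scale.
violations = (df["positive"] > df["tested"]).astype(float)
discrepancy = (violations - violations.mean()) / violations.std()

print(pd.DataFrame({"completeness": completeness,
                    "consistency": consistency,
                    "discrepancy": discrepancy}))
```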


Subjects
Data Accuracy; HIV Infections; Humans; Female; Kenya/epidemiology; Retrospective Studies; Infectious Disease Transmission, Vertical/prevention & control; HIV Infections/diagnosis; HIV Infections/epidemiology; HIV Infections/prevention & control; Electronics
11.
J Public Health Policy; 44(4): 523-534, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37726394

ABSTRACT

Patient surgical registries are essential tools for public health specialists, creating research opportunities through the linkage of registry data with healthcare outcomes. However, little is known about the sources of data error in the management of surgical registries. In June 2022, we undertook a scoping study of the empirical literature, including publications selected from the PubMed and EMBASE databases. We selected 48 studies focussing on shared experiences in developing surgical patient registries. We identified seven types of data-specific challenges, grouped into three categories: data capture, data analysis, and result dissemination. Most studies underlined the risk of a high volume of missing data, non-uniform geographic representation, inclusion biases, and inappropriate coding, as well as variations in analysis reporting and limitations of the statistical analysis. Finally, to expand data usability, we discuss cost-effective ways of addressing these limitations, citing aspects of the protocols followed by established exemplary registries.


Subjects
Patients; Registries; Surgical Procedures, Operative; Humans
12.
J Med Syst; 47(1): 23, 2023 Feb 13.
Article in English | MEDLINE | ID: mdl-36781551

ABSTRACT

Information systems such as Electronic Health Record (EHR) systems are susceptible to data quality (DQ) issues. Given the growing importance of EHR data, there is increasing demand for strategies and tools to help ensure that available data are fit for use. However, developing the reliable data quality assessment (DQA) tools needed to guide and evaluate improvement efforts has remained a fundamental challenge. This review examines the state of research on operationalising EHR DQA, mainly automated tooling, and highlights considerations for future implementations. We reviewed 1841 articles from PubMed, Web of Science, and Scopus published between 2011 and 2021 and identified 23 DQA programs: 14 deployed in real-world settings to assess EHR data quality and 9 experimental prototypes. Many of these programs investigate the completeness (n = 15) and value conformance (n = 12) quality dimensions and are backed by knowledge items gathered from domain experts (n = 9) or from literature reviews and existing DQ measurements (n = 3). A few DQA programs also explore the feasibility of using data-driven techniques to assess EHR data quality automatically. Overall, the automation of EHR DQA is gaining traction, but current efforts are fragmented and not backed by relevant theory. Existing programs also vary in scope, in the types of data supported, and in how measurements are sourced. There is a need to standardise programs for assessing EHR data quality, as current evidence suggests their quality may be unknown.


Subjects
Data Accuracy; Electronic Health Records; Humans; Software
13.
JAMIA Open; 5(4): ooac093, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36339052

ABSTRACT

Objective: To gain insights into how data vendor companies (DVs), an important source of the de-identified/anonymized licensed patient-related data (D/ALD) used in clinical informatics research in the life sciences and the pharmaceutical industry, characterize, conduct, and communicate data quality assessments to researcher purchasers of D/ALD. Materials and Methods: A qualitative study with interviews of DV executives and decision-makers in data quality assessments (n = 12) and content analysis of the interview transcripts. Results: Data quality, from the perspective of DVs, is characterized by how it is defined, validated, and processed. DVs identify data quality as the main contributor to successful collaborations with life sciences/pharmaceutical research partners. Data quality feedback from clients provides the basis for DVs' reviews and inspections of their quality processes. DVs value customer interactions and view collaboration, shared goals, mutual expertise, and communication related to data quality as success factors. Conclusion: Data quality evaluation practices are important. However, no uniform industry standards for data quality assessment were identified. DVs describe their orientation to data quality evaluation as a direct result not only of the complex nature of data sources but also of the techniques, processes, and approaches used to construct data sets. Because real-world data (RWD), e.g., patient data from electronic medical records, are used for real-world evidence (RWE) generation, the use of D/ALD will expand and require refinement. The focus on (and rigor in) data quality assessment, particularly in research used to support regulatory decisions, will require more structure, standards, and collaboration among DVs, life sciences/pharmaceutical companies, informaticists, and RWD/RWE policy-making stakeholders.

14.
BMC Med Inform Decis Mak; 22(1): 213, 2022 Aug 11.
Article in English | MEDLINE | ID: mdl-35953813

ABSTRACT

BACKGROUND: With the growing impact of observational research studies, there is also a growing focus on data quality (DQ). In contrast to experimental study designs, observational research studies are performed on data mostly collected in a non-research context (secondary use). Depending on the number of data elements to be analyzed, DQ reports of data stored within research networks can grow very large; they may be cumbersome to read, and important information can easily be overlooked. To address this issue, a DQ assessment (DQA) tool with a graphical user interface (GUI) was developed and provided as a web application. METHODS: The aim was to provide an easy-to-use interface that allows users without prior programming knowledge to carry out DQ checks and to present the results in a clearly structured way. This interface serves as a starting point for a more detailed investigation of possible DQ irregularities. A user-centered development process ensured the practical feasibility of the interactive GUI. The interface was implemented in the R programming language and aligned to Kahn et al.'s DQ categories of conformance, completeness, and plausibility. RESULTS: DQAgui, an R package with a web-app frontend for DQ assessment, was developed. The GUI allows users to perform DQ analyses of tabular data sets and to systematically evaluate the results. During the development of the GUI, additional features were implemented, such as analyzing a subset of the data by defining time periods and restricting the analyses to certain data elements. CONCLUSIONS: As part of the MIRACUM project, DQAgui is now being used at ten German university hospitals for DQ assessment and to provide a central overview of the availability of important data elements in a data map covering 2 years. Future development efforts should focus on design optimization and include a usability evaluation.


Subjects
Data Accuracy; Software; Hospitals, University; Humans; User-Computer Interface
15.
Sensors (Basel); 22(4), 2022 Feb 18.
Article in English | MEDLINE | ID: mdl-35214509

ABSTRACT

It is sometimes difficult, or even impossible, to acquire real data from the sensors and machines that must be used in research; modern industrial platforms, for example, are frequently reluctant to share data. In such situations, the only option is to work with synthetic data obtained by simulation. A limitation of simulated data is that they may not be appropriate for research because of poor quality or limited quantity, in which case algorithms tested on those data do not give credible results. To avoid such situations, we argue that mathematically grounded data quality assessments should be designed for the specific type of problem to be solved. In this paper, we address a multivariate prediction problem whose results can ultimately be used for binary classification. We propose a mathematically grounded data quality assessment that includes, among other things, an analysis of the predictive power of the independent variables used for prediction. We present the assumptions that the synthetic data should satisfy; the corresponding threshold values are set by a human assessor. If all the assumptions hold, the data can be considered appropriate for research and can also be used with other methods for solving the same type of problem. The method ultimately delivers a classification table to which any indicator of classification quality can be applied, such as sensitivity, specificity, accuracy, F1 score, area under the curve (AUC), the receiver operating characteristic (ROC) curve, the true skill statistic (TSS), and the kappa coefficient. These indicator values allow the results of the proposed method to be compared with those of any other method applied to the same type of problem. For evaluation and validation purposes, we performed an experimental case study on a novel synthetic dataset from the well-known UCI data repository.
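
The classification-quality indicators listed above can all be computed from a 2x2 classification table, except the AUC, which needs the underlying prediction scores. The sketch below shows the standard formulas on invented counts and toy scores; it is not the paper's implementation.

```python
# Minimal sketch: standard classification-quality indicators from a 2x2 table
# (TP, FP, FN, TN); AUC is computed separately from toy prediction scores.
import numpy as np
from sklearn.metrics import roc_auc_score

tp, fp, fn, tn = 40, 10, 5, 45

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)
f1 = 2 * tp / (2 * tp + fp + fn)
tss = sensitivity + specificity - 1          # true skill statistic
p_observed = accuracy
p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (tp + fp + fn + tn) ** 2
kappa = (p_observed - p_expected) / (1 - p_expected)

print(dict(sensitivity=sensitivity, specificity=specificity, accuracy=accuracy,
           f1=f1, tss=tss, kappa=kappa))

# AUC from toy scores and labels (the ROC curve underlies this value).
y_true = np.array([1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.7, 0.4, 0.2, 0.6, 0.55])
print("AUC:", roc_auc_score(y_true, y_score))
```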


Subjects
Algorithms; Data Accuracy; Area Under Curve; Computer Simulation; Humans; ROC Curve
16.
Drug Discov Today; 27(5): 1441-1447, 2022 May.
Article in English | MEDLINE | ID: mdl-35066138

ABSTRACT

Over recent years, there has been exciting growth in collaboration between academia and industry in the life sciences to make data more Findable, Accessible, Interoperable and Reusable (FAIR) to achieve greater value. Despite considerable progress, the transformative shift from an application-centric to a data-centric perspective, enabled by FAIR implementation, remains very much a work in progress on the 'FAIR journey'. In this review, we consider use cases for FAIR implementation. These can be deployed alongside assessment of data quality to maximize the value of data generated from research, clinical trials, and real-world healthcare data, which are essential for the discovery and development of new medical treatments by biopharma.


Subjects
Biological Science Disciplines; Data Accuracy; Industry
17.
BMC Med Inform Decis Mak; 21(1): 297, 2021 Oct 30.
Article in English | MEDLINE | ID: mdl-34717599

ABSTRACT

BACKGROUND: The use of general practice electronic health records (EHRs) for research purposes is in its infancy in Australia. Given that these data were collected for clinical purposes, questions remain about data quality and whether the data are suitable for use in prediction model development. In this study, we assess the quality of data recorded in 201,462 patient EHRs from 483 Australian general practices to determine their usefulness in the development of a clinical prediction model for total knee replacement (TKR) surgery in patients with osteoarthritis (OA). METHODS: Variables to be used in model development were assessed for completeness and plausibility. Accuracy for the outcome and the competing risk was assessed through record-level linkage with two gold-standard national registries, the Australian Orthopaedic Association National Joint Replacement Registry (AOANJRR) and the National Death Index (NDI). The validity of the EHR data was tested against participant characteristics from the 2014-15 Australian National Health Survey (NHS). RESULTS: There were substantial missing data for body mass index and for weight gain between early adulthood and middle age. TKR and death were recorded with good accuracy; however, year of TKR, year of death, and side of TKR were poorly recorded. Patient characteristics recorded in the EHR were comparable to participant characteristics from the NHS, except for OA medication and metastatic solid tumour. CONCLUSIONS: In this study, the data relating to the outcome, the competing risk, and two predictors were unfit for prediction model development. This study highlights the need for more accurate and complete recording of patient data within EHRs if these data are to be used to develop clinical prediction models. In the meantime, data linkage with gold-standard data sets and registries may help overcome some of the current data quality challenges in general practice EHRs when developing prediction models.


Subjects
Data Accuracy; Electronic Health Records; Adult; Australia; Family Practice; Humans; Middle Aged; Models, Statistical; Prognosis
18.
BMC Med Inform Decis Mak; 21(1): 289, 2021 Oct 20.
Article in English | MEDLINE | ID: mdl-34670548

ABSTRACT

BACKGROUND: To describe an automated method for assessing the plausibility of continuous variables collected in electronic health record (EHR) data for real-world evidence research. METHODS: The most widely used approach in quality assessment (QA) for continuous variables is to detect implausible numbers using prespecified thresholds. To augment the thresholding method, we developed a score-based method that leverages the longitudinal characteristics of EHR data to detect observations inconsistent with a patient's history. The method was applied to height and weight data in the EHR from the Million Veteran Program of the Veterans Health Administration (VHA). A validation study was also conducted. RESULTS: The receiver operating characteristic (ROC) metrics of the developed method outperform those of the widely used thresholding method. We also demonstrate that different quality assessment methods have a non-ignorable impact on the body mass index (BMI) classification calculated from height and weight data in the VHA's database. CONCLUSIONS: The score-based method enables automated, scalable detection of problematic data points in healthcare big data while allowing investigators to select high-quality data based on their needs. Leveraging the longitudinal characteristics of the EHR significantly improves QA performance.
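
The contrast between the two QA approaches can be sketched as follows, assuming a single patient's weight history with one value that passes a global plausibility range but is inconsistent with the rest of the history. This is an illustration only, not the VHA/Million Veteran Program implementation; the thresholds, the robust z-score formulation, and the cutoff are assumptions.

```python
# Minimal sketch (not the VHA/MVP implementation): a fixed-threshold plausibility
# check vs. a simple history-based score for one patient's weight measurements.
import numpy as np

weights_kg = np.array([82.1, 81.5, 83.0, 82.4, 48.2, 82.8])  # one patient's history

# Thresholding method: only values outside a prespecified global range are flagged.
low, high = 25.0, 300.0                     # assumed plausibility thresholds
threshold_flags = (weights_kg < low) | (weights_kg > high)

# Score-based method: robust z-score of each value against the rest of the
# patient's history (median and median absolute deviation).
def history_scores(values):
    scores = np.zeros_like(values)
    for i, v in enumerate(values):
        rest = np.delete(values, i)
        mad = np.median(np.abs(rest - np.median(rest))) or 1e-6
        scores[i] = 0.6745 * abs(v - np.median(rest)) / mad
    return scores

score_flags = history_scores(weights_kg) > 5  # assumed cutoff
print(threshold_flags)   # 48.2 kg passes the global thresholds, so nothing is flagged
print(score_flags)       # the longitudinal score flags 48.2 kg as inconsistent
```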


Subjects
Electronic Health Records; Veterans; Big Data; Data Accuracy; Data Management; Humans
19.
BMC Health Serv Res; 21(Suppl 1): 214, 2021 Sep 13.
Article in English | MEDLINE | ID: mdl-34511104

ABSTRACT

BACKGROUND: Monitoring medically certified causes of death is essential to shape national health policies, track progress toward the Sustainable Development Goals, and gauge responses to epidemic and pandemic disease. The combination of electronic health information systems with new methods for data quality monitoring can facilitate quality assessments and help target quality improvement. Since 2015, Tanzania has been upgrading its Civil Registration and Vital Statistics system, including efforts to improve the availability and quality of mortality data. METHODS: We used a computer application (ANACONDA v4.01) to assess the quality of medical certification of cause of death (MCCD) and of ICD-10 coding for the underlying cause of death for 155,461 deaths from health facilities from 2014 to 2018. From 2018 to 2019, we continued the quality analysis for 2690 deaths in one large administrative region during the 9 months before and the 9 months following MCCD quality improvement interventions. The interventions addressed governance, training, process, and practice. We assessed changes in the levels, distributions, and nature of unusable and insufficiently specified codes, and how these influenced estimates of the leading causes of death. RESULTS: 9.7% of expected annual deaths in Tanzania obtained a medically certified cause of death. Of these, 52% of MCCD ICD-10 codes were usable for health policy and planning, with no significant improvement over the 5 years. Of certified deaths, 25% had unusable codes, 17% had insufficiently specified codes, and 6% were undetermined causes. Comparing the periods before and after the intervention in one region, the proportion of codes usable for public health policy purposes improved from 48% to 65% within 1 year, and the resulting distortion in the top twenty cause-specific mortality fractions due to unusable causes fell from 27.4% to 13.5%. CONCLUSION: Data from less than 5% of annual deaths in Tanzania are usable for informing policy, and errors were prevalent in almost half of the deaths with medical certification. This constrains the capacity to monitor the 15 SDG indicators that require cause-specific mortality. Sustainable quality assurance mechanisms and interventions can produce rapid improvements in the quality of medically certified causes of death. ANACONDA provides an effective means of evaluating such changes and helps target interventions to remaining weaknesses.


Subjects
Data Accuracy; Health Facilities; Cause of Death; Certification; Humans; Tanzania/epidemiology
20.
BMC Med Inform Decis Mak; 21(1): 113, 2021 Apr 03.
Article in English | MEDLINE | ID: mdl-33812388

ABSTRACT

BACKGROUND: Ensuring that data are of appropriate quality is essential for the secondary use of electronic health records (EHRs) in research and clinical decision support. An effective approach to data quality assessment (DQA) is to automate the creation of data quality rules (DQRs), replacing a time-consuming, labor-intensive manual process that makes it difficult to guarantee standardized and comparable DQA results. This paper presents a case study of automatically creating DQRs based on openEHR archetypes in a Chinese hospital, to investigate the feasibility and challenges of automating DQA for EHR data. METHODS: The clinical data repository (CDR) of Shanxi Dayi Hospital is an archetype-based relational database. Four steps were undertaken to automatically create DQRs in this CDR database. First, the keywords and features of the archetypes relevant to DQA were identified by mapping them to Kahn's well-established DQA framework. Second, DQR templates corresponding to these keywords and features were created in Structured Query Language (SQL). Third, the quality constraints were retrieved from the archetypes. Fourth, these constraints were automatically converted into DQRs according to the pre-designed templates and the mapping between archetypes and data tables. We used the archetypes of the CDR to automatically create DQRs meeting the quality requirements of the Chinese Application-Level Ranking Standard for EHR Systems (CARSES) and evaluated their coverage by comparison with expert-created DQRs. RESULTS: We used 27 archetypes to automatically create 359 DQRs, of which 319 agreed with the expert-created DQRs, covering 84.97% (311/366) of the CARSES requirements. The auto-created DQRs had varying levels of coverage across the four quality domains mandated by CARSES: 100% (45/45) for consistency, 98.11% (208/212) for completeness, 54.02% (57/87) for conformity, and 50% (11/22) for timeliness. CONCLUSION: It is feasible to create DQRs automatically based on openEHR archetypes. This study evaluated the coverage of the auto-created DQRs for a typical DQA task in Chinese hospitals, the CARSES. Challenges to automating DQR creation were identified, such as quality requirements based on semantics and complex constraints involving multiple elements. This research can inform the exploration of automatic DQR creation and contribute to automated DQA.
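
The general pattern of the four-step method, filling an SQL template from a declarative constraint, can be sketched as follows. The archetype elements, table and column names, and templates below are hypothetical and are not the hospital's actual archetypes, schema, or rule set.

```python
# Minimal sketch (hypothetical names throughout): turning declarative constraints
# from archetype-like definitions into SQL data quality rules via templates.
constraints = [
    # (archetype element, mapped table, mapped column, constraint type, params)
    ("body_weight.magnitude", "obs_body_weight", "weight_kg", "range", {"min": 0.3, "max": 1000}),
    ("blood_pressure.systolic", "obs_blood_pressure", "systolic_mmhg", "not_null", {}),
]

TEMPLATES = {
    "range": ("SELECT COUNT(*) AS violations FROM {table} "
              "WHERE {column} < {min} OR {column} > {max};"),
    "not_null": ("SELECT COUNT(*) AS violations FROM {table} "
                 "WHERE {column} IS NULL;"),
}

def build_dqrs(constraints):
    """Fill the SQL template that matches each constraint type."""
    rules = []
    for element, table, column, ctype, params in constraints:
        sql = TEMPLATES[ctype].format(table=table, column=column, **params)
        rules.append({"element": element, "dimension": ctype, "sql": sql})
    return rules

for rule in build_dqrs(constraints):
    print(rule["element"], "->", rule["sql"])
```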


Subjects
Decision Support Systems, Clinical; Electronic Health Records; Data Accuracy; Humans; Language; Semantics