Results 1 - 10 of 10
1.
J Am Med Inform Assoc; 28(7): 1591-1599, 2021 Jul 14.
Article in English | MEDLINE | ID: mdl-33496785

ABSTRACT

OBJECTIVE: Data quality (DQ) must be consistently defined in context. The attributes, metadata, and context of longitudinal real-world data (RWD) have not been formalized for quality improvement across the data production and curation life cycle. We sought to complete a literature review on DQ assessment frameworks, indicators, and tools for research, public health, service, and quality improvement across the data life cycle. MATERIALS AND METHODS: The review followed PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Databases from the health, physical, and social sciences were used: CINAHL, Embase, Scopus, ProQuest, Emcare, PsycINFO, Compendex, and Inspec. Embase was used instead of PubMed (an interface to search MEDLINE) because it includes all MeSH (Medical Subject Headings) terms and journals indexed in MEDLINE, as well as additional unique journals and conference abstracts. A combined data life cycle and quality framework guided the search of published and gray literature for DQ frameworks, indicators, and tools. At least 2 authors independently identified articles for inclusion and extracted and categorized DQ concepts and constructs. All authors discussed findings iteratively until consensus was reached. RESULTS: The 120 included articles yielded concepts related to contextual (data source, custodian, and user) and technical (interoperability) factors across the data life cycle. Contextual DQ subcategories included relevance, usability, accessibility, timeliness, and trust. Well-tested computable DQ indicators and assessment tools were also found. CONCLUSIONS: A DQ assessment framework that covers intrinsic, technical, and contextual categories across the data life cycle enables assessment and management of RWD repositories to ensure fitness for purpose. Balancing security, privacy, and FAIR principles requires trust and reciprocity, transparent governance, and organizational cultures that value good documentation.


Subjects
Data Accuracy; Quality Improvement; Animals; Life Cycle Stages
2.
Am Heart J; 226: 75-84, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32526532

ABSTRACT

BACKGROUND: The objective was to describe the design of a population-level electronic health record (EHR) and insurance claims-based surveillance system of adolescents and adults with congenital heart defects (CHDs) in Colorado and to evaluate the bias introduced by duplicate cases across data sources. METHODS: The Colorado CHD Surveillance System ascertained individuals aged 11-64 years with a CHD based on International Classification of Diseases, Ninth Revision, Clinical Modification diagnostic coding between 2011 and 2013 from a diverse network of health care systems and an All Payer Claims Database (APCD). A probability-based identity reconciliation algorithm identified duplicate cases. Logistic regression was conducted to investigate bias introduced by duplicate cases on the relationship between CHD severity (severe compared to moderate/mild) and adverse outcomes including all-cause mortality, inpatient hospitalization, and major adverse cardiac events (myocardial infarction, congestive heart failure, or cerebrovascular event). Sensitivity analyses were conducted to investigate bias introduced by the sole use or exclusion of APCD data. RESULTS: A total of 12,293 unique cases were identified, of which 3,476 had a within- or between-data-source duplicate. Duplicate cases were more likely to be in the youngest age group and to have private health insurance, a severe heart defect, a CHD comorbidity, and higher health care utilization. We found that failure to resolve duplicate cases between data sources would inflate the relationship between CHD severity and both morbidity and mortality outcomes by ~15%. Sensitivity analyses indicate that scenarios in which APCD was excluded from case finding or relied upon as the sole source of case finding would also result in an overestimation of the relationship between CHD severity and major adverse outcomes. DISCUSSION: Aggregated EHR- and claims-based surveillance systems of adolescents and adults with CHD that fail to account for duplicate records will introduce considerable bias into research findings. CONCLUSION: Population-level surveillance systems for rare chronic conditions, such as congenital heart disease, based on aggregation of EHR and claims data require sophisticated identity reconciliation methods to prevent bias introduced by duplicate cases.
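To make the deduplication concern concrete, the following is a minimal sketch (not taken from the study) of how leaving duplicate cases unresolved can inflate a severity-outcome odds ratio when severe cases with adverse outcomes are more likely to appear in several data sources. All counts and duplication rates are invented for illustration.

```python
# Toy illustration: how unresolved duplicate cases can inflate the association
# between CHD severity and an adverse outcome. Counts and duplication rates
# below are invented for the sketch; they are not the study's data.

def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: severe/outcome=a, severe/no-outcome=b,
    mild-moderate/outcome=c, mild-moderate/no-outcome=d."""
    return (a * d) / (b * c)

# Unique (deduplicated) cases: (severity, outcome) -> count
unique_cases = {
    ("severe", True): 200, ("severe", False): 800,
    ("mild_moderate", True): 600, ("mild_moderate", False): 5400,
}

# Hypothetical duplication: severe cases with an outcome use more care and are
# therefore more likely to appear in several data sources.
dup_rate = {("severe", True): 0.50, ("severe", False): 0.20,
            ("mild_moderate", True): 0.15, ("mild_moderate", False): 0.10}

def table(counts):
    return (counts[("severe", True)], counts[("severe", False)],
            counts[("mild_moderate", True)], counts[("mild_moderate", False)])

deduped_or = odds_ratio(*table(unique_cases))

# Leave duplicates in: every case appears 1 + dup_rate times, on average.
inflated = {k: round(n * (1 + dup_rate[k])) for k, n in unique_cases.items()}
inflated_or = odds_ratio(*table(inflated))

print(f"OR with duplicates resolved: {deduped_or:.2f}")
print(f"OR with duplicates left in:  {inflated_or:.2f}")
print(f"Relative inflation:          {(inflated_or / deduped_or - 1):.1%}")
```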


Subjects
Heart Defects, Congenital/epidemiology; Information Storage and Retrieval/statistics & numerical data; Medical Record Linkage; Population Surveillance/methods; Adolescent; Adult; Bias; Child; Colorado/epidemiology; Electronic Health Records; Female; Humans; Insurance Claim Reporting; Male; Middle Aged; Young Adult
3.
J Am Med Inform Assoc; 27(4): 505-513, 2020 Apr 01.
Article in English | MEDLINE | ID: mdl-32049329

ABSTRACT

OBJECTIVE: The disjointed healthcare system and the absence of a universal patient identifier across systems necessitate accurate record linkage (RL). We aim to describe the implementation and evaluation of a hybrid record linkage method in a statewide surveillance system for congenital heart disease. MATERIALS AND METHODS: Clear-text personally identifiable information on individuals in the Colorado Congenital Heart Disease surveillance system was obtained from 5 electronic health record and medical claims data sources. Two deterministic methods and 1 probabilistic RL method using first name, last name, social security number, date of birth, and house number were initially implemented independently and then sequentially in a hybrid approach to assess RL performance. RESULTS: 16,480 nonunique individuals with congenital heart disease were ascertained. Deterministic linkage methods, when performed independently, yielded 4,505 linked pairs (consisting of 2 records linked together within or across data sources). Probabilistic RL, using 3 initial characters of last name and gender for blocking, yielded 6,294 linked pairs when executed independently. Using a hybrid linkage routine resulted in 6,451 linkages and an additional 18%-24% correct linked pairs as compared to the independent methods. A hybrid linkage routine resulted in higher recall and F-measure scores compared to probabilistic and deterministic methods performed independently. DISCUSSION: The hybrid approach resulted in increased linkage accuracy and identified pairs of linked records that would have otherwise been missed when using any independent linkage technique. CONCLUSION: When performing RL within and across disparate data sources, the hybrid RL routine outperformed independent deterministic and probabilistic methods.
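A minimal sketch of the hybrid routine described above: a deterministic exact-match pass combined with probabilistic scoring inside blocks defined by the first 3 characters of last name plus gender. The toy records, agreement weights, and threshold are illustrative assumptions, not the study's parameters.

```python
from itertools import combinations
from collections import defaultdict

# Hybrid record linkage sketch: deterministic exact-match rule, then probabilistic
# scoring within blocks (first 3 letters of last name + sex). Illustrative only.

records = [
    {"id": 1, "first": "ANA",  "last": "GARCIA", "ssn": "123456789", "dob": "2001-02-03", "sex": "F"},
    {"id": 2, "first": "ANNA", "last": "GARCIA", "ssn": "",          "dob": "2001-02-03", "sex": "F"},
    {"id": 3, "first": "JOHN", "last": "SMITH",  "ssn": "987654321", "dob": "1990-07-21", "sex": "M"},
    {"id": 4, "first": "JON",  "last": "SMITHE", "ssn": "987654321", "dob": "1990-07-21", "sex": "M"},
]

def deterministic_match(a, b):
    # Rule: exact SSN match, or exact first name + last name + date of birth.
    return (a["ssn"] and a["ssn"] == b["ssn"]) or (
        (a["first"], a["last"], a["dob"]) == (b["first"], b["last"], b["dob"]))

WEIGHTS = {"first": 2.0, "last": 3.0, "dob": 4.0, "ssn": 6.0}  # illustrative agreement weights
THRESHOLD = 6.0

def probabilistic_score(a, b):
    # Sum agreement weights over fields that are populated and identical.
    return sum(w for f, w in WEIGHTS.items() if a[f] and a[f] == b[f])

# Blocking: only compare records sharing the first 3 letters of last name and sex.
blocks = defaultdict(list)
for r in records:
    blocks[(r["last"][:3], r["sex"])].append(r)

linked_pairs = set()
for block in blocks.values():
    for a, b in combinations(block, 2):
        if deterministic_match(a, b) or probabilistic_score(a, b) >= THRESHOLD:
            linked_pairs.add((a["id"], b["id"]))

print(sorted(linked_pairs))  # [(1, 2), (3, 4)] with these toy records and weights
```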


Subjects
Electronic Health Records; Heart Defects, Congenital; Medical Record Linkage/methods; Population Surveillance; Adolescent; Adult; Algorithms; Colorado; Humans; Probability
4.
EGEMS (Wash DC); 7(1): 17, 2019 Apr 23.
Article in English | MEDLINE | ID: mdl-31065558

ABSTRACT

INTRODUCTION: In aggregate, existing data quality (DQ) checks are currently represented in heterogeneous formats, making it difficult to compare, categorize, and index checks. This study contributes a data element-function conceptual model to facilitate the categorization and indexing of DQ checks and explores the feasibility of leveraging natural language processing (NLP) for scalable acquisition of knowledge of common data elements and functions from DQ check narratives. METHODS: The model defines a "data element", the primary focus of the check, and a "function", the qualitative or quantitative measure over a data element. We applied NLP techniques to extract both from 172 checks for Observational Health Data Sciences and Informatics (OHDSI) and 3,434 checks for Kaiser Permanente's Center for Effectiveness and Safety Research (CESR). RESULTS: The model was able to classify all checks. A total of 751 unique data elements and 24 unique functions were extracted. The top five frequent data element-function pairings for OHDSI were Person-Count (55 checks), Insurance-Distribution (17), Medication-Count (16), Condition-Count (14), and Observations-Count (13); for CESR, they were Medication-Variable Type (175), Medication-Missing (172), Medication-Existence (152), Medication-Count (127), and Socioeconomic Factors-Variable Type (114). CONCLUSIONS: This study shows the efficacy of the data element-function conceptual model for classifying DQ checks, demonstrates early promise of NLP-assisted knowledge acquisition, and reveals the great heterogeneity in the focus of DQ checks, confirming variation in intrinsic checks and use-case-specific "fitness-for-use" checks.
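A minimal sketch of the data element-function idea: mapping a free-text DQ check description to a (data element, function) pair. Simple keyword rules stand in for the NLP techniques used in the study; the vocabularies and example checks are invented.

```python
# Sketch: classify a DQ check narrative into a (data element, function) pair
# using keyword rules. The vocabularies below are invented stand-ins for the
# study's NLP pipeline.

DATA_ELEMENTS = {
    "person": ["person", "patient", "member"],
    "medication": ["medication", "drug", "rx"],
    "condition": ["condition", "diagnosis", "dx"],
    "insurance": ["insurance", "payer", "coverage"],
}

FUNCTIONS = {
    "count": ["count", "number of", "frequency"],
    "missing": ["missing", "null", "blank"],
    "distribution": ["distribution", "proportion", "percent"],
    "existence": ["exists", "present", "populated"],
}

def classify_check(text):
    """Return the first matching data element and function, else 'unclassified'."""
    text = text.lower()
    element = next((e for e, kws in DATA_ELEMENTS.items()
                    if any(kw in text for kw in kws)), "unclassified")
    function = next((f for f, kws in FUNCTIONS.items()
                     if any(kw in text for kw in kws)), "unclassified")
    return element, function

checks = [
    "Number of persons with a visit after their recorded death date",
    "Percent of drug exposure records with missing quantity",
]
for c in checks:
    print(classify_check(c))
# ('person', 'count') and ('medication', 'missing')
```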

5.
EGEMS (Wash DC); 5(1): 8, 2017 Jun 12.
Article in English | MEDLINE | ID: mdl-29881733

ABSTRACT

OBJECTIVE: To compare rule-based data quality (DQ) assessment approaches across multiple national clinical data sharing organizations. METHODS: Six organizations with established data quality assessment (DQA) programs provided documentation or source code describing current DQ checks. DQ checks were mapped to the categories within the data verification context of the harmonized DQA terminology. To ensure all DQ checks were consistently mapped, conventions were developed and four iterations of mapping were performed. Difficult-to-map DQ checks were discussed with research team members until consensus was achieved. RESULTS: Participating organizations provided 11,026 DQ checks, of which 99.97 percent were successfully mapped to a DQA category. Of the mapped DQ checks (N=11,023), 214 (1.94 percent) mapped to multiple DQA categories. The majority of DQ checks mapped to the Atemporal Plausibility (49.60 percent), Value Conformance (17.84 percent), and Atemporal Completeness (12.98 percent) categories. DISCUSSION: Using the common DQA terminology, near-complete (99.97 percent) coverage across a wide range of DQA programs and specifications was reached. Comparing the distributions of mapped DQ checks revealed important differences between participating organizations. This variation may be related to an organization's stakeholder requirements, primary analytical focus, or maturity of its DQA program. Although not within the scope of this study, mapping checks within the data validation context of the terminology may provide additional insights into DQA practice differences. CONCLUSION: A common DQA terminology provides a means to help organizations and researchers understand the coverage of their current DQA efforts as well as highlight potential areas for additional DQA development. Sharing DQ checks between organizations could help expand the scope of DQA across clinical data networks.
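A minimal sketch of the kind of summary this mapping enables: counting how an organization's mapped DQ checks distribute across the harmonized verification categories, allowing a check to map to more than one category. The mapped checks shown are invented; in the study, mapping was manual and consensus-driven.

```python
from collections import Counter

# Sketch: summarize how one organization's DQ checks distribute across the
# harmonized verification categories. The mapped checks below are invented.

mapped_checks = [
    ("check_001", ["Atemporal Plausibility"]),
    ("check_002", ["Value Conformance"]),
    ("check_003", ["Atemporal Completeness"]),
    ("check_004", ["Atemporal Plausibility", "Value Conformance"]),  # multi-category check
    ("check_005", ["Atemporal Plausibility"]),
]

category_counts = Counter(cat for _, cats in mapped_checks for cat in cats)
total_checks = len(mapped_checks)

# Percentages can sum to more than 100% because checks may map to multiple categories.
for category, n in category_counts.most_common():
    print(f"{category:25s} {n:3d} ({n / total_checks:.1%} of checks)")
```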

6.
EGEMS (Wash DC); 4(1): 1244, 2016.
Article in English | MEDLINE | ID: mdl-27713905

ABSTRACT

OBJECTIVE: Harmonized data quality (DQ) assessment terms, methods, and reporting practices can establish a common understanding of the strengths and limitations of electronic health record (EHR) data for operational analytics, quality improvement, and research. Existing published DQ terms were harmonized to a comprehensive unified terminology with definitions and examples and organized into a conceptual framework to support a common approach to defining whether EHR data is 'fit' for specific uses. MATERIALS AND METHODS: DQ publications, informatics and analytics experts, managers of established DQ programs, and operational manuals from several mature EHR-based research networks were reviewed to identify potential DQ terms and categories. Two face-to-face stakeholder meetings were used to vet an initial set of DQ terms and definitions that were grouped into an overall conceptual framework. Feedback received from data producers and users was used to construct a draft set of harmonized DQ terms and categories. Multiple rounds of iterative refinement resulted in a set of terms and an organizing framework consisting of DQ categories, subcategories, terms, definitions, and examples. The inclusiveness of the harmonized terminology and logical framework was evaluated against ten published DQ terminologies. RESULTS: Existing DQ terms were harmonized and organized into a framework by defining three DQ categories: (1) Conformance, (2) Completeness, and (3) Plausibility, and two DQ assessment contexts: (1) Verification and (2) Validation. The Conformance and Plausibility categories were further divided into subcategories. Each category and subcategory was defined with respect to whether the data may be verified with organizational data or validated against an accepted gold standard, depending on proposed context and uses. The coverage of the harmonized DQ terminology was validated by successfully aligning it to multiple published DQ terminologies. DISCUSSION: Existing DQ concepts, community input, and expert review informed the development of a distinct set of terms, organized into categories and subcategories. The resulting DQ terms successfully encompassed a wide range of disparate DQ terminologies. Operational definitions were developed to provide guidance for implementing DQ assessment procedures. The resulting structure is an inclusive DQ framework for standardizing DQ assessment and reporting. While our analysis focused on the DQ issues often found in EHR data, the new terminology may be applicable to a wide range of electronic health data such as administrative, research, and patient-reported data. CONCLUSION: A consistent, common DQ terminology, organized into a logical framework, is an initial step in enabling data owners and users, patients, and policy makers to evaluate and communicate data quality findings in a well-defined manner with a shared vocabulary. Future work will leverage the framework and terminology to develop reusable data quality assessment and reporting methods.
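A minimal sketch of the framework's structure as a small data model: three categories crossed with two assessment contexts, with subcategories under Conformance and Plausibility. The subcategory names follow the published framework; the class design and example check are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# Sketch of the harmonized DQ framework's structure: three categories (two with
# subcategories) assessed in a Verification or Validation context. Field names
# and the example check are illustrative, not part of the terminology itself.

class Context(Enum):
    VERIFICATION = "verification"   # checked against organizational expectations
    VALIDATION = "validation"       # checked against an accepted gold standard

CATEGORIES = {
    "Conformance": ["Value", "Relational", "Calculation"],
    "Completeness": [],
    "Plausibility": ["Uniqueness", "Atemporal", "Temporal"],
}

@dataclass
class DQCheck:
    name: str
    category: str
    subcategory: Optional[str]
    context: Context

    def __post_init__(self):
        assert self.category in CATEGORIES, f"unknown category: {self.category}"
        if self.subcategory is not None:
            assert self.subcategory in CATEGORIES[self.category], \
                f"{self.subcategory} is not a subcategory of {self.category}"

example = DQCheck(
    name="Birth dates precede encounter dates",
    category="Plausibility",
    subcategory="Temporal",
    context=Context.VERIFICATION,
)
print(example)
```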

8.
Drug Saf; 38(8): 749-65, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26055920

ABSTRACT

INTRODUCTION: A key component of coordinating surveillance activities across distributed networks is often the design and implementation of a common data model (CDM). The purpose of this study was to evaluate two drug safety surveillance CDMs from an ecosystem perspective to better understand how differences in CDMs and analytic tools affect usability and interpretation of results. METHODS: Humana claims data from 2007 to 2012 were mapped to the Observational Medical Outcomes Partnership (OMOP) and Mini-Sentinel CDMs. Data were described and compared at the patient level by source code and mapped concepts. Study cohort construction and effect estimates were also compared across six established positive drug-outcome pairs using two different analytical methods, one based on a new-user design implementing a high-dimensional propensity score (HDPS) algorithm and the other based on a univariate self-controlled case series (SCCS) design, to learn how differences in CDMs and analytics influence steps in the database analytic process and results. RESULTS: Claims data for approximately 7.7 million Humana health plan members were transformed into the two CDMs. Three health outcome cohorts and two drug cohorts showed differences in cohort size and constituency between the Mini-Sentinel and OMOP CDMs, which was a result of multiple factors. Overall, the implementation of the HDPS procedure on the Mini-Sentinel CDM detected more known positive associations than that on the OMOP CDM. The SCCS method results were comparable on both CDMs. Differences in the implementation of the HDPS procedure between the two CDMs were identified; analytic model and risk period specification had a significant impact on the performance of the HDPS procedure on the OMOP CDM. CONCLUSIONS: Differences were observed between the OMOP and Mini-Sentinel CDMs. The analysis of both CDMs at the data model level indicated that such conceptual differences had only a slight and nonsignificant impact on identifying known safety associations. Our results show that differences at the ecosystem level of analyses across the CDMs can lead to strikingly different risk estimations, but this can be primarily attributed to the choices of analytic approach and their implementation in the community-developed analytic tools. The opportunities of using CDMs are clear, but our study shows the need for judicious comparison of analyses across the CDMs. Our work emphasizes the need for ongoing efforts to ensure sustainable, transparent platforms to maintain and develop CDMs and associated tools for effective safety surveillance.
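As a companion to the analytic comparison above, here is a minimal sketch of the univariate self-controlled case series (SCCS) idea, the simpler of the two designs: each case's event rate during exposed time is compared against the same person's unexposed time. The toy data are invented, and a crude pooled rate ratio stands in for the conditional Poisson model a real SCCS analysis would fit.

```python
# Toy SCCS-style comparison of event rates during exposed vs unexposed person-time.
# Data are invented; a real SCCS uses within-person conditional Poisson regression.

# (days_exposed, events_during_exposure, days_unexposed, events_unexposed) per case
cases = [
    (30, 1, 335, 1),
    (60, 2, 305, 1),
    (30, 0, 335, 2),
    (90, 2, 275, 1),
]

exposed_events = sum(c[1] for c in cases)
exposed_days = sum(c[0] for c in cases)
unexposed_events = sum(c[3] for c in cases)
unexposed_days = sum(c[2] for c in cases)

# Crude (pooled) incidence rate ratio across all cases.
irr = (exposed_events / exposed_days) / (unexposed_events / unexposed_days)
print(f"Exposed rate:   {exposed_events / exposed_days:.4f} events/day")
print(f"Unexposed rate: {unexposed_events / unexposed_days:.4f} events/day")
print(f"Crude IRR:      {irr:.2f}")
```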


Subjects
Databases, Factual/statistics & numerical data; Drug-Related Side Effects and Adverse Reactions/epidemiology; Electronic Health Records/statistics & numerical data; Product Surveillance, Postmarketing/statistics & numerical data; Sentinel Surveillance; Cohort Studies; Drug-Related Side Effects and Adverse Reactions/diagnosis; Humans
9.
Med Care; 50 Suppl: S21-9, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22692254

ABSTRACT

INTRODUCTION: Answers to clinical and public health research questions increasingly require aggregated data from multiple sites. Data from electronic health records and other clinical sources are useful for such studies, but require stringent quality assessment. Data quality assessment is particularly important in multisite studies to distinguish true variations in care from data quality problems. METHODS: We propose a "fit-for-use" conceptual model for data quality assessment and a process model for planning and conducting single-site and multisite data quality assessments. These approaches are illustrated using examples from prior multisite studies. APPROACH: Critical components of multisite data quality assessment include: thoughtful prioritization of variables and data quality dimensions for assessment; development and use of standardized approaches to data quality assessment that can improve data utility over time; iterative cycles of assessment within and between sites; targeting assessment toward data domains known to be vulnerable to quality problems; and detailed documentation of the rationale and outcomes of data quality assessments to inform data users. The assessment process requires constant communication between site-level data providers, data coordinating centers, and principal investigators. DISCUSSION: A conceptually based and systematically executed approach to data quality assessment is essential to achieve the potential of the electronic revolution in health care. High-quality data allow "learning health care organizations" to analyze and act on their own information, to compare their outcomes to peers, and to address critical scientific questions from the population perspective.
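One recurring, concrete step implied by the process model above is comparing a prioritized data quality indicator across sites in order to separate true variation in care from data quality problems. The sketch below flags sites whose missingness for one variable deviates sharply from the network; the variable, site values, and threshold are invented.

```python
from statistics import mean, pstdev

# Sketch of one multisite DQ assessment step: compare percent missing for a
# prioritized variable across sites and flag outlier sites for follow-up with
# the site-level data providers. All values and the flag threshold are invented.

pct_missing_bmi = {"site_A": 2.1, "site_B": 2.8, "site_C": 14.5, "site_D": 3.0, "site_E": 2.4}

values = list(pct_missing_bmi.values())
mu, sigma = mean(values), pstdev(values)

for site, pct in pct_missing_bmi.items():
    z = (pct - mu) / sigma if sigma else 0.0
    flag = "  <-- review with site data providers" if abs(z) > 1.5 else ""
    print(f"{site}: {pct:5.1f}% missing (z = {z:+.2f}){flag}")
```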


Subjects
Comparative Effectiveness Research/organization & administration; Electronic Health Records; Outcome and Process Assessment, Health Care; Quality Assurance, Health Care/standards; Comparative Effectiveness Research/standards; Cooperative Behavior; Humans; Medical Informatics; Patient-Centered Care; United States
10.
J Pediatr; 157(1): 98-102.e1, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20304421

ABSTRACT

OBJECTIVES: To assess the relationship between children's hospital readmission and the performance of child health systems in the states in which hospitals are located. STUDY DESIGN: We conducted a retrospective cohort study of 197,744 patients 2 to 18 years old from 39 children's hospitals located in 24 states in the United States in 2005. Subjects were observed for a year after discharge for readmission to the same hospital. The odds of readmission were modeled on the basis of patient-level characteristics and state child health system performance as ranked by the Commonwealth Fund. RESULTS: A total of 1.8% of patients were readmitted within a week, 4.8% within a month, and 16.3% within 365 days. After adjustment for patient-level characteristics, the probability of readmission varied significantly between states (P=.001), and the likelihood of readmission during the ensuing year increased as the states' health system performance ranking improved. States in the top-ranked quartile had a 2.03% higher readmission rate than states in the lowest-ranked quartile (P=.02); the same directional relationship was observed for readmission intervals from 1 to 365 days after discharge. CONCLUSIONS: Hospital readmission rates are significantly related to the performance of the surrounding health care system.
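A minimal sketch of the modeling approach described above: logistic regression of 365-day readmission on patient-level characteristics plus a state performance-quartile term. The simulated data, covariates, and effect sizes are invented, and statsmodels is assumed to be available; the sketch illustrates the model form only, not the study's estimates.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Sketch: model readmission odds on patient-level characteristics plus the state's
# health system performance quartile. Simulated data; coefficients are invented.

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "age": rng.integers(2, 19, n),
    "chronic_condition": rng.integers(0, 2, n),
    "state_rank_quartile": rng.integers(1, 5, n),  # 1 = best-performing states
})
# Simulate a slightly higher readmission rate in better-ranked quartiles,
# mirroring the direction of the reported association (purely illustrative).
logit = -2.0 + 0.8 * df["chronic_condition"] - 0.05 * df["state_rank_quartile"]
df["readmit_365d"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

model = smf.logit(
    "readmit_365d ~ age + chronic_condition + C(state_rank_quartile)", data=df
).fit(disp=False)
print(np.exp(model.params).round(2))  # odds ratios relative to quartile 1
```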


Subjects
Child, Hospitalized/statistics & numerical data; Health Services Accessibility/statistics & numerical data; Healthcare Disparities/statistics & numerical data; Patient Readmission/statistics & numerical data; Adolescent; Child; Child, Preschool; Female; Follow-Up Studies; Hospitals/statistics & numerical data; Humans; Male; Retrospective Studies; United States