1.
J Card Fail ; 26(7): 610-617, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32304875

ABSTRACT

BACKGROUND: Surveillance and outcome studies for heart failure (HF) require accurate identification of patients with HF. Algorithms based on International Classification of Diseases (ICD) codes to identify HF from administrative data are inadequate owing to their relatively low sensitivity. Detailed clinical information from electronic medical records (EMRs) is potentially useful for improving ICD algorithms. This study aimed to enhance the ICD algorithm for HF definition by incorporating comprehensive information from EMRs. METHODS: The study included 2106 inpatients in Calgary, Alberta, Canada. Medical chart review was used as the reference gold standard for evaluating developed algorithms. The commonly used ICD codes for defining HF were used (namely, the ICD algorithm). The performance of different algorithms using the free-text discharge summaries from a population-based EMR was compared with that of the ICD algorithm. These algorithms included a keyword search algorithm looking for HF-specific terms, a machine learning-based HF concept (HFC) algorithm, an EMR structured data-based algorithm, and combined algorithms (the ICD and HFC combined algorithm). RESULTS: Of 2106 patients, 296 (14.1%) were patients with HF as determined by chart review. The ICD algorithm had 92.4% positive predictive value (PPV) but low sensitivity (57.4%). The EMR keyword search algorithm achieved a higher sensitivity (65.5%) than the ICD algorithm, but with a lower PPV (77.6%). The HFC algorithm achieved a better sensitivity (80.0%) and maintained a reasonable PPV (88.9%) compared with the ICD algorithm and the keyword algorithm. An even higher sensitivity (83.3%) was reached by combining the HFC and ICD algorithms, with a lower PPV (83.3%). The structured EMR data algorithm reached a sensitivity of 78% and a PPV of 54.2%. The combined EMR structured data and ICD algorithm had a higher sensitivity (82.4%), but the PPV remained low at 54.8%. All algorithms had a specificity ranging from 87.5% to 99.2%. CONCLUSIONS: Applying natural language processing and machine learning to the discharge summaries of inpatient EMR data can improve the capture of cases of HF compared with the widely used ICD algorithm. The utility of the HFC algorithm is straightforward, making it easily applied for HF case identification.
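The three validation metrics quoted in this abstract (sensitivity, PPV, specificity) all derive from a 2x2 table of algorithm result against chart review. The following is a minimal sketch of that arithmetic, not the authors' code. The cell counts are an assumption: they were back-calculated from the reported percentages for the ICD algorithm (296 HF cases among 2106 patients) and are not figures taken from the paper. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of algorithm result versus chart review (the reference standard)."""
    tp: int  # algorithm positive, chart review positive
    fp: int  # algorithm positive, chart review negative
    fn: int  # algorithm negative, chart review positive
    tn: int  # algorithm negative, chart review negative


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true HF cases that the algorithm flags."""
    return c.tp / (c.tp + c.fn)


def ppv(c: ConfusionCounts) -> float:
    """Share of flagged patients who truly have HF."""
    return c.tp / (c.tp + c.fp)


def specificity(c: ConfusionCounts) -> float:
    """Share of non-HF patients that the algorithm leaves unflagged."""
    return c.tn / (c.tn + c.fp)


# ASSUMPTION: counts reconstructed from the reported percentages, not reported data.
icd_like = ConfusionCounts(tp=170, fp=14, fn=126, tn=1796)

for name, fn_ in (("sensitivity", sensitivity), ("PPV", ppv), ("specificity", specificity)):
    print(f"{name}: {fn_(icd_like) * 100:.1f}%")
```

With these reconstructed counts, a rule that misses 126 of 296 true cases still has a high PPV because it produces few false positives, which is the trade-off the abstract describes for the ICD algorithm.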


Subjects
Heart Failure; International Classification of Diseases; Algorithms; Electronic Health Records; Heart Failure/diagnosis; Heart Failure/epidemiology; Heart Failure/therapy; Humans; Natural Language Processing
2.
BMC Med Inform Decis Mak ; 20(1): 75, 2020 Apr 25.
Article in English | MEDLINE | ID: mdl-32334599

ABSTRACT

BACKGROUND: Data quality assessment presents a challenge for research using coded administrative health data. The objective of this study is to develop and validate a set of coding association rules for coded diagnostic data. METHODS: We used the Canadian re-abstracted hospital discharge abstract data coded in International Classification of Diseases, 10th revision (ICD-10) codes. Association rule mining was conducted on the re-abstracted data in four age groups (0-4, 20-44, 45-64, and ≥65) to extract ICD-10 coding association rules at the three-digit (category of diagnosis) and four-digit (category of diagnosis with etiology, anatomy, or severity) levels. The rules were reviewed by a panel of 5 physicians and 2 classification specialists using a modified Delphi rating process. We proposed and defined the variance and bias to assess data quality using the rules. RESULTS: After the rule mining process and the panel review, 388 rules at the three-digit level and 275 rules at the four-digit level were developed. Half of the rules were from the ≥65 age group. Rules captured meaningful age-specific clinical associations, with rules in the ≥65 age group being more complex and comprehensive than those in other age groups. The variance and bias measures can identify rules with high bias and variance in Alberta data and provide direction for quality improvement. CONCLUSIONS: A set of ICD-10 data quality rules was developed and validated by a clinical and classification expert panel. The rules can be used as a tool to assess ICD-coded data, enabling the monitoring and comparison of data quality across institutions, provinces, and countries.


Subjects
Data Accuracy; Adolescent; Adult; Aged; Canada; Child; Child, Preschool; Data Mining; Health; Humans; Infant; Infant, Newborn; International Classification of Diseases; Middle Aged; Young Adult
3.
J Biomed Inform ; 79: 41-47, 2018 Mar.
Article in English | MEDLINE | ID: mdl-29425732

ABSTRACT

OBJECTIVE: Data quality assessment is a challenging facet for research using coded administrative health data. Current assessment approaches are time and resource intensive. We explored whether association rule mining (ARM) can be used to develop rules for assessing data quality. MATERIALS AND METHODS: We extracted 2013 and 2014 records from the hospital discharge abstract database (DAD) for patients between the ages of 55 and 65 from five acute care hospitals in Alberta, Canada. The ARM was conducted using the 2013 DAD to extract rules with support ≥0.0019 and confidence ≥0.5 using the bootstrap technique, and tested in the 2014 DAD. The rules were compared against the method of coding frequency and assessed for their ability to detect error introduced by two kinds of data manipulation: random permutation and random deletion. RESULTS: The association rules generally had clear clinical meanings. Comparing 2014 data to 2013 data (both original), there were 3 rules with a confidence difference >0.1, while coding frequency difference of codes in the right hand of rules was less than 0.004. After random permutation of 50% of codes in the 2014 data, average rule confidence dropped from 0.72 to 0.27 while coding frequency remained unchanged. Rule confidence decreased with the increase of coding deletion, as expected. Rule confidence was more sensitive to code deletion compared to coding frequency, with slope of change ranging from 1.7 to 184.9 with a median of 9.1. CONCLUSION: The ARM is a promising technique to assess data quality. It offers a systematic way to derive coding association rules hidden in data, and potentially provides a sensitive and efficient method of assessing data quality compared to standard methods.
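Support and confidence, the two thresholds named in this abstract, can be illustrated with a small sketch. It is not the authors' pipeline (the abstract does not describe their implementation) and it does no rule mining; it only evaluates one given rule. The five discharge records are made up for illustration, and the function names are hypothetical. Each record is treated as a set of ICD-10 category codes.

```python
from typing import Iterable, Sequence, Set


def support(records: Sequence[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of records that contain every code in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= r for r in records) / len(records)


def confidence(records: Sequence[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Among records containing the antecedent, the fraction that also contain the consequent."""
    lhs = frozenset(antecedent)
    both = lhs | frozenset(consequent)
    n_lhs = sum(lhs <= r for r in records)
    if n_lhs == 0:
        return 0.0  # rule never fires; treat confidence as 0 for this sketch
    return sum(both <= r for r in records) / n_lhs


# ASSUMPTION: toy records invented for illustration, not data from the study.
records = [
    {"E11", "I10"},
    {"E11", "I10", "N18"},
    {"E11"},
    {"I10"},
    {"J44"},
]

print(support(records, {"E11", "I10"}))       # support of the rule E11 -> I10
print(confidence(records, {"E11"}, {"I10"}))  # confidence of the rule E11 -> I10

# Simulate a coding omission: drop I10 from one record, then re-evaluate the rule.
degraded = [set(r) for r in records]
degraded[1].discard("I10")
print(confidence(degraded, {"E11"}, {"I10"}))
print(support(degraded, {"I10"}))             # plain coding frequency of I10
```

In this toy data, deleting a single code halves the rule's confidence (from 2/3 to 1/3), while the coding frequency of I10 falls only from 3/5 to 2/5. This mirrors the kind of comparison between rule confidence and coding frequency that the abstract reports, at a scale far too small to say anything about the study's actual slopes.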


Subjects
Clinical Coding; Data Mining/methods; Inpatients; Medical Informatics/methods; Aged; Alberta; Algorithms; Computer Simulation; Databases, Factual; Female; Hospitalization; Hospitals; Humans; International Classification of Diseases; Male; Middle Aged; Patient Discharge; Reproducibility of Results
4.
PLoS One ; 15(12): e0242404, 2020.
Article in English | MEDLINE | ID: mdl-33259520

ABSTRACT

BACKGROUND: The All Our Families (AOF) cohort study is a longitudinal population-based study which collected biological samples from 1948 pregnant women between May 2008 and December 2010. As the quality of samples can decline over time, the objective of the current study was to assess the association between storage time and RNA (ribonucleic acid) yield and purity, and to confirm the quality of these samples after 7-10 years in long-term storage. METHODS: Maternal whole blood samples were previously collected by trained phlebotomists and stored in four separate PAXgene Blood RNA Tubes (PreAnalytiX) between 2008 and 2011. RNA was isolated in 2011 and 2018 using PAXgene Blood RNA Kits (PreAnalytiX) as per the manufacturer's instructions. RNA purity (260/280), as well as RNA yield, was measured using a Nanodrop. The RNA integrity number (RIN) was also assessed from 5-25 and 111-130 months of storage using the RNA 6000 Nano Kit and Agilent 2100 BioAnalyzer. Descriptive statistics, paired t-test, and response feature analysis using linear regression were used to assess the association between various predictor variables and quality of the RNA isolated. RESULTS: Overall, RNA purity and yield of the samples did not decline over time. RNA purity of samples isolated in 2011 (2.08, 95% CI: 2.08-2.09) was statistically lower (p<0.000) than that of samples isolated in 2018 (2.101, 95% CI: 2.097-2.104), and there was no statistical difference between the 2011 (13.08 µg/tube, 95% CI: 12.27-13.89) and 2018 (12.64 µg/tube, 95% CI: 11.83-13.46) RNA yield (p = 0.2964). For every month of storage, the change in RNA purity is -0.01 (260/280), and the change in RNA yield between 2011 and 2018 is -0.90 µg/tube. The mean RIN was 8.49 (95% CI: 8.44-8.54), and it ranged from 7.2 to 9.5. The rate of change in expected RIN per month of storage is 0.003 (95% CI: 0.002-0.004), so while statistically significant, these results are not relevant. CONCLUSIONS: RNA quality does not decrease over time, and the methods used to collect and store samples within a population-based study are robust to inherent operational factors which may degrade sample quality over time.


Subjects
Blood Specimen Collection/standards; RNA Stability/genetics; RNA/blood; Specimen Handling/standards; Diagnostic Tests, Routine; Female; Humans; Pregnancy; Quality Control; RNA/genetics
5.
Data Brief ; 18: 710-712, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29896537

ABSTRACT

Data presented in this article relate to the research article entitled "Exploration of association rule mining for coding consistency and completeness assessment in inpatient administrative health data" (Peng et al. [1], in preparation). We provide a set of ICD-10 coding association rules for the age group of 55 to 65. The rules were extracted from inpatient administrative health data at five acute care hospitals in Alberta, Canada, using association rule mining. Thresholds of support and confidence for the association rule mining process were set at 0.19% and 50%, respectively. The data set contains 426 rules, of which 86 are not nested. Data are provided in the supplementary material. The presented coding association rules provide a reference for future research on the use of association rule mining for data quality assessment.
