ABSTRACT
BACKGROUND: The anatomic and physiologic (AP) classification in the 2018 American Heart Association/American College of Cardiology (AHA/ACC) Guidelines for Adults with Congenital Heart Disease (ACHD) encompasses both native and post-operative anatomy and physiology to guide care management. Because some physiologic conditions and post-operative states lack specific International Classification of Diseases, Ninth and Tenth Revision, Clinical Modification (ICD-9-CM and ICD-10-CM) codes, an ICD code-based classification approximating the ACHD AP classification is needed for population-based studies. METHODS: A total of 232 individuals, aged ≥ 18 years at the time of a health encounter between January 1, 2010 and December 31, 2019 and identified with at least one of 87 ICD codes for a congenital heart defect, were validated through medical chart review. Individuals were assigned to one of 4 mutually exclusive modified AP classification categories: (1) severe AB, (2) severe CD, (3) non-severe AB, or (4) non-severe CD, based on native anatomy ("severe" or "non-severe") and physiology, AB ("none" or "mild") or CD ("moderate" or "severe"), by two methods: (1) medical record review, and (2) ICD and Current Procedural Terminology (CPT) code-based classification. The composite outcome was defined as death, an emergency department (ED) visit, or any hospitalization occurring at least 6 months after the index date and was assessed by each modified AP classification method. RESULTS: Of 232 cases (52.2% male, 71.1% White), 28.4% experienced a composite outcome a median of 1.6 years after the index encounter. No difference in prediction of the composite outcome was seen between modified AP classification by chart review and by ICD code-based methodology.
CONCLUSION: Modified AP classification by chart review and by ICD codes are comparable in predicting the composite outcome at least 6 months after classification. Modified AP classification using ICD code-based classification of CHD native anatomy and physiology is an important tool for population-based ACHD surveillance using administrative data.
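The four-category assignment and the composite-outcome window described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the string labels for the inputs, and the approximation of "6 months" as 183 days are all assumptions, and the mapping from ICD/CPT codes to anatomy and physiology levels is not given in the abstract and is not attempted here.

```python
from datetime import date, timedelta
from typing import Iterable

# Physiology levels grouped as stated in the abstract.
AB_LEVELS = {"none", "mild"}
CD_LEVELS = {"moderate", "severe"}
ANATOMY_LEVELS = {"severe", "non-severe"}


def modified_ap_category(anatomy: str, physiology: str) -> str:
    """Return one of the 4 mutually exclusive modified AP categories."""
    if anatomy not in ANATOMY_LEVELS:
        raise ValueError(f"unknown anatomy level: {anatomy!r}")
    if physiology in AB_LEVELS:
        group = "AB"
    elif physiology in CD_LEVELS:
        group = "CD"
    else:
        raise ValueError(f"unknown physiology level: {physiology!r}")
    return f"{anatomy} {group}"


def has_composite_outcome(
    index_date: date,
    event_dates: Iterable[date],
    min_gap_days: int = 183,  # assumption: "6 months" approximated as 183 days
) -> bool:
    """True if any death, ED visit, or hospitalization date falls at least
    min_gap_days after the index date."""
    cutoff = index_date + timedelta(days=min_gap_days)
    return any(event >= cutoff for event in event_dates)
```

For example, `modified_ap_category("non-severe", "moderate")` yields `"non-severe CD"`, and an event two months after the index date does not count toward the composite outcome.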
Subject(s)
Heart Defects, Congenital , International Classification of Diseases , Humans , Heart Defects, Congenital/classification , Heart Defects, Congenital/physiopathology , Male , Female , Adult , Middle Aged , United States/epidemiology , Retrospective Studies , Severity of Illness Index
ABSTRACT
BACKGROUND: Socioeconomic factors may contribute to disproportionate health care usage and death among individuals with congenital heart defects (CHD) across racial and ethnic groups. How neighborhood poverty affects racial and ethnic disparities in health care usage and death among individuals with CHD across the life span is not well described. METHODS AND RESULTS: Individuals aged 1 to 64 years with at least 1 CHD-related International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) code were identified from health care encounters between January 1, 2011, and December 31, 2013, from 4 US sites. Residence was classified into lower- or higher-poverty neighborhoods on the basis of zip code tabulation area from the 2014 American Community Survey 5-year estimates. Multivariable logistic regression models, adjusting for site, sex, CHD anatomic severity, and insurance, evaluated associations of race and ethnicity with health care usage and death, stratified by neighborhood poverty. Of 31 542 individuals, 22.2% were non-Hispanic Black and 17.0% Hispanic. In high-poverty neighborhoods, non-Hispanic Black (44.4%) and Hispanic (47.7%) individuals, respectively, were more likely to be hospitalized (adjusted odds ratio [aOR], 1.2 [95% CI, 1.1-1.3]; and aOR, 1.3 [95% CI, 1.2-1.5]) and to have emergency department visits (aOR, 1.3 [95% CI, 1.2-1.5] and aOR, 1.8 [95% CI, 1.5-2.0]) compared with non-Hispanic White individuals. Also in high-poverty neighborhoods, non-Hispanic Black individuals with CHD had 1.7 times the odds of death compared with non-Hispanic White individuals (95% CI, 1.1-2.7). Racial and ethnic disparities in health care usage were similar in low-poverty neighborhoods, but disparities in death were attenuated (aOR for non-Hispanic Black, 1.2 [95% CI, 0.9-1.7]).
CONCLUSIONS: Racial and ethnic disparities in health care usage were found among individuals with CHD in low- and high-poverty neighborhoods, but mortality disparities were larger in high-poverty neighborhoods. Understanding individual- and community-level social determinants of health, including access to health care, may help address racial and ethnic inequities in health care usage and death among individuals with CHD.
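For readers less familiar with the odds ratios reported above, the sketch below shows how an unadjusted odds ratio and its 95% CI (Woolf log method) are computed from a 2x2 table. It is an illustration only: the abstract's estimates are adjusted odds ratios from multivariable logistic regression, which this calculation does not reproduce, and the function name and the counts in the usage note are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(
    exposed_events: int,
    exposed_nonevents: int,
    unexposed_events: int,
    unexposed_nonevents: int,
    z: float = 1.96,  # normal quantile for a two-sided 95% interval
) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    Returns (odds_ratio, lower_bound, upper_bound).
    """
    cells = (exposed_events, exposed_nonevents, unexposed_events, unexposed_nonevents)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cells must be positive for the Woolf method")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With hypothetical counts of 20 events and 80 non-events in one group versus 10 and 90 in the comparison group, `odds_ratio_ci(20, 80, 10, 90)` gives an odds ratio of 2.25 with an interval that brackets it.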
Subject(s)
Healthcare Disparities , Heart Defects, Congenital , Adolescent , Adult , Child , Child, Preschool , Female , Humans , Infant , Male , Middle Aged , Young Adult , Black or African American/statistics & numerical data , Ethnicity/statistics & numerical data , Healthcare Disparities/ethnology , Healthcare Disparities/statistics & numerical data , Heart Defects, Congenital/ethnology , Heart Defects, Congenital/mortality , Heart Defects, Congenital/therapy , Hispanic or Latino/statistics & numerical data , Neighborhood Characteristics , Patient Acceptance of Health Care/ethnology , Patient Acceptance of Health Care/statistics & numerical data , Poverty/statistics & numerical data , Residence Characteristics/statistics & numerical data , United States/epidemiology , White People/statistics & numerical data
ABSTRACT
Background The Fontan operation is associated with significant morbidity and premature mortality. Fontan cases cannot always be identified by International Classification of Diseases (ICD) codes, making it challenging to create large Fontan patient cohorts. We sought to develop natural language processing-based machine learning models to automatically detect Fontan cases from free text in electronic health records, and to compare their performance with ICD code-based classification. Methods and Results We included free-text notes of 10 935 manually validated patients, 778 (7.1%) Fontan and 10 157 (92.9%) non-Fontan, from 2 health care systems. Using 80% of the patient data, we trained and optimized multiple machine learning models, namely support vector machines and 2 versions of RoBERTa (a robustly optimized transformer-based model for language understanding), to automatically identify Fontan cases based on notes. For RoBERTa, we implemented a novel sliding window strategy to overcome its length limit. We evaluated the machine learning models and ICD code-based classification on the held-out 20% of the patient data using the F1 score metric. The ICD classification model, support vector machine, and RoBERTa achieved F1 scores of 0.81 (95% CI, 0.79-0.83), 0.95 (95% CI, 0.92-0.97), and 0.89 (95% CI, 0.88-0.85) for the positive (Fontan) class, respectively. Support vector machines obtained the best performance (P<0.05), and both natural language processing models outperformed ICD code-based classification (P<0.05). The sliding window strategy improved performance over the base model (P<0.05) but did not outperform support vector machines. ICD code-based classification produced more false positives. Conclusions Natural language processing models can automatically detect Fontan patients based on clinical notes with higher accuracy than ICD codes, and these models showed potential for further improvement.
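The abstract does not describe how the sliding window strategy works, so the following is only a generic sketch of the idea: split a long token sequence into overlapping windows that fit the model's length limit, score each window, and aggregate. The window size of 512, the stride of 256, the max-probability aggregation, the 0.5 threshold, and the `score_window` callback are all assumptions chosen for illustration, not details from the study. The F1 helper uses the standard definition for the positive class.

```python
from typing import Callable, List, Sequence


def sliding_windows(
    tokens: Sequence[int], window: int = 512, stride: int = 256
) -> List[List[int]]:
    """Split tokens into overlapping windows of at most `window` tokens."""
    if window <= 0 or stride <= 0 or stride > window:
        raise ValueError("require 0 < stride <= window")
    if len(tokens) <= window:
        return [list(tokens)]
    windows = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + window]))
        if start + window >= len(tokens):  # this window reached the end
            break
        start += stride
    return windows


def classify_document(
    tokens: Sequence[int],
    score_window: Callable[[List[int]], float],  # hypothetical per-window model
    threshold: float = 0.5,
    window: int = 512,
    stride: int = 256,
) -> bool:
    """Label a document positive if any window's score reaches the threshold
    (max aggregation, an assumption made for this sketch)."""
    scores = [score_window(w) for w in sliding_windows(tokens, window, stride)]
    return max(scores) >= threshold


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0
```

In practice `score_window` would wrap a fine-tuned classifier; other aggregation rules (mean score, majority vote) are equally plausible and would need to be compared on validation data.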
Subject(s)
International Classification of Diseases , Natural Language Processing , Humans , Machine Learning , Electronic Health Records , Electronics
ABSTRACT
Background Administrative data permit analysis of large cohorts but rely on International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), and International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) codes that may not reflect true congenital heart defects (CHDs). Methods and Results CHDs in 1497 cases with at least 1 encounter between January 1, 2010 and December 31, 2019 in 2 health care systems, identified by at least 1 of 87 ICD-9-CM/ICD-10-CM CHD codes, were validated through medical record review for the presence of CHD and CHD native anatomy. Interobserver and intraobserver reliability averaged >95%. Positive predictive value (PPV) of ICD-9-CM/ICD-10-CM codes for CHD was 68.1% (1020/1497) overall, 94.6% (123/130) for cases identified in both health care systems, 95.8% (249/260) for severe codes, 52.6% (370/703) for shunt codes, 75.9% (243/320) for valve codes, 73.5% (119/162) for shunt and valve codes, and 75.0% (39/52) for "other CHD" (7 ICD-9-CM/ICD-10-CM codes). PPV for cases with >1 unique CHD code was 85.4% (503/589) versus 56.3% (498/884) for 1 CHD code. Among cases with secundum atrial septal defect ICD-9-CM/ICD-10-CM codes 745.5/Q21.1 in isolation, PPV was 30.9% (123/398). Patent foramen ovale was present in 66.2% (316/477) of false positives. True positives had a younger mean age at first encounter with a CHD code than false positives (22.4 versus 26.3 years; P=0.0017). Conclusions CHD ICD-9-CM/ICD-10-CM codes have modest PPV and may not represent true CHD cases. PPV was improved by selecting certain features, but most true cases did not have these characteristics. Developing algorithms to improve case identification may improve the accuracy of electronic health records for CHD surveillance.
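The PPV figures above are simple proportions of chart-confirmed cases among code-identified cases, and can be recomputed from the counts in the abstract. The sketch below does that; the function and dictionary names are illustrative, and only the counts come from the text.

```python
def ppv_percent(true_positives: int, code_identified: int) -> float:
    """Positive predictive value as a percentage, rounded to one decimal."""
    if code_identified <= 0:
        raise ValueError("code_identified must be positive")
    return round(100 * true_positives / code_identified, 1)


# (confirmed CHD, cases identified by code) as reported in the abstract.
REPORTED_COUNTS = {
    "overall": (1020, 1497),
    "both health care systems": (123, 130),
    "severe codes": (249, 260),
    "shunt codes": (370, 703),
    "valve codes": (243, 320),
    "shunt and valve codes": (119, 162),
    "other CHD": (39, 52),
    ">1 unique CHD code": (503, 589),
    "1 CHD code": (498, 884),
    "isolated secundum ASD code": (123, 398),
}

if __name__ == "__main__":
    for label, (confirmed, identified) in REPORTED_COUNTS.items():
        print(f"{label}: {ppv_percent(confirmed, identified)}%")
```

The overall count also implies 1497 - 1020 = 477 false positives, which is the denominator used for the patent foramen ovale proportion (316/477).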