Results 1 - 20 of 87
1.
J Biomed Inform ; 154: 104647, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38692465

ABSTRACT

OBJECTIVE: To use software, datasets, and data formats in the domain of Infectious Disease Epidemiology as a test collection to evaluate a novel M1 use case, which we introduce in this paper. M1 is a machine that, upon receipt of a new digital object of research, exhaustively finds all valid compositions of it with existing objects. METHOD: We implemented a data-format-matching-only M1 using exhaustive search, which we refer to as M1DFM. We then ran M1DFM on the test collection and used error analysis to identify needed semantic constraints. RESULTS: Precision of M1DFM search was 61.7%. Error analysis identified needed semantic constraints and needed changes in handling of data services. Most semantic constraints were simple, but one data format was sufficiently complex that representing semantic constraints over it was practically impossible, from which we conclude that software developers will have to meet the machines halfway by engineering software whose inputs are sufficiently simple that their semantic constraints can be represented, akin to the simple APIs of services. We summarize these insights as M1-FAIR guiding principles for composability and suggest a roadmap for progressively capable devices in the service of reuse and accelerated scientific discovery. CONCLUSION: Algorithmic search of digital repositories for valid workflow compositions has the potential to accelerate scientific discovery but requires a scalable solution to the problem of knowledge acquisition about semantic constraints on software inputs. Additionally, practical limitations on the logical complexity of semantic constraints must be respected, which has implications for the design of software.
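The data-format-matching-only search described above is concrete enough to sketch. Below is a minimal illustration of exhaustively pairing a new digital object with repository objects whose formats are compatible; the object records, names, and repository contents are invented for illustration, not the authors' implementation.

```python
# Hypothetical M1DFM-style sketch: find every ordered (producer, consumer)
# pairing of a new object with existing objects, matching on data format only.
# All names and the record structure below are illustrative assumptions.

def compositions(new_obj, repository):
    """Return (producer, consumer) pairs that are format-compatible."""
    found = []
    for other in repository:
        # the new object's output format feeds the other's input
        if new_obj["out"] in other["in"]:
            found.append((new_obj["name"], other["name"]))
        # the other's output format feeds the new object's input
        if other["out"] in new_obj["in"]:
            found.append((other["name"], new_obj["name"]))
    return found

repo = [
    {"name": "line-list-cleaner", "in": {"csv"}, "out": "csv"},
    {"name": "seir-fitter", "in": {"csv"}, "out": "json"},
]
new = {"name": "case-count-export", "in": {"json"}, "out": "csv"}
print(compositions(new, repo))
```

As the abstract notes, format matching alone admits invalid pairings (61.7% precision), which is why semantic constraints on inputs are needed on top of this kind of search.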

2.
Sci Rep ; 14(1): 7831, 2024 04 03.
Article in English | MEDLINE | ID: mdl-38570569

ABSTRACT

The objective of this study is to develop and evaluate natural language processing (NLP) and machine learning models to predict infant feeding status from clinical notes in the Epic electronic health records system. The primary outcome was the classification of infant feeding status from clinical notes using Medical Subject Headings (MeSH) terms. Annotation of notes was completed using TeamTat to uniquely classify clinical notes according to infant feeding status. We trained 6 machine learning models to classify infant feeding status: logistic regression, random forest, XGBoost (gradient-boosted trees), k-nearest neighbors, and support-vector classifier. Model comparison was evaluated based on overall accuracy, precision, recall, and F1 score. Our modeling corpus included a balanced sample of clinical notes across each class. We manually reviewed 999 notes that represented 746 mother-infant dyads with a mean gestational age of 38.9 weeks and a mean maternal age of 26.6 years. The most frequent feeding status classification present for this study was exclusive breastfeeding [n = 183 (18.3%)], followed by exclusive formula bottle feeding [n = 146 (14.6%)] and exclusive feeding of expressed mother's milk [n = 102 (10.2%)], with mixed feeding being the least frequent [n = 23 (2.3%)]. Our final analysis evaluated the classification of clinical notes as breast, formula/bottle, and missing. The machine learning models were trained on these three classes after balancing and downsampling. The XGBoost model outperformed all others, achieving an accuracy of 90.1%, a macro-averaged precision of 90.3%, a macro-averaged recall of 90.1%, and a macro-averaged F1 score of 90.1%. Our results demonstrate that natural language processing can be applied to clinical notes stored in electronic health records to classify infant feeding status.
Early identification of breastfeeding status using NLP on unstructured electronic health records data can be used to inform precision public health interventions focused on improving lactation support for postpartum patients.
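The abstract above reports macro-averaged precision, recall, and F1, i.e., per-class scores averaged with equal weight per class. A minimal standard-library sketch of macro-F1; the toy labels below are invented and are not the study's notes or classes.

```python
# Macro-averaged F1: compute F1 per class, then take the unweighted mean.
from collections import Counter

def macro_f1(true, pred):
    labels = sorted(set(true) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true, pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but p was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for lab in labels:
        prec = tp[lab] / (tp[lab] + fp[lab]) if tp[lab] + fp[lab] else 0.0
        rec = tp[lab] / (tp[lab] + fn[lab]) if tp[lab] + fn[lab] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)

true = ["breast", "breast", "formula", "missing"]
pred = ["breast", "formula", "formula", "missing"]
print(round(macro_f1(true, pred), 3))
```

Macro averaging treats each feeding-status class equally, which matters when classes are imbalanced, as they were before the study's balancing step.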


Subject(s)
Machine Learning; Natural Language Processing; Female; Humans; Infant; Software; Electronic Health Records; Mothers
3.
J Biomed Inform ; 153: 104642, 2024 May.
Article in English | MEDLINE | ID: mdl-38621641

ABSTRACT

OBJECTIVE: To develop a natural language processing (NLP) package to extract social determinants of health (SDoH) from clinical narratives, examine the bias among race and gender groups, test the generalizability of extracting SDoH for different disease groups, and examine the population-level extraction ratio. METHODS: We developed SDoH corpora using clinical notes identified at the University of Florida (UF) Health. We systematically compared 7 transformer-based large language models (LLMs) and developed an open-source package - SODA (i.e., SOcial DeterminAnts) - to facilitate SDoH extraction from clinical narratives. We examined the performance and potential bias of SODA for different race and gender groups, tested the generalizability of SODA using two disease domains including cancer and opioid use, and explored strategies for improvement. We applied SODA to extract 19 categories of SDoH from the breast (n = 7,971), lung (n = 11,804), and colorectal cancer (n = 6,240) cohorts to assess the patient-level extraction ratio and examine the differences among race and gender groups. RESULTS: We developed an SDoH corpus using 629 clinical notes of cancer patients with annotations of 13,193 SDoH concepts/attributes from 19 categories of SDoH, and another cross-disease validation corpus using 200 notes from opioid use patients with 4,342 SDoH concepts/attributes. We compared 7 transformer models; the GatorTron model achieved the best mean average strict/lenient F1 scores of 0.9122 and 0.9367 for SDoH concept extraction and 0.9584 and 0.9593 for linking attributes to SDoH concepts. There is a small performance gap (∼4%) between males and females, but a large performance gap (>16%) among race groups. The performance dropped when we applied the cancer SDoH model to the opioid cohort; fine-tuning using a smaller opioid SDoH corpus improved the performance.
The extraction ratio varied in the three cancer cohorts: 10 SDoH could be extracted from over 70% of cancer patients, but 9 SDoH could be extracted from less than 70% of cancer patients. Individuals from the White and Black groups had a higher extraction ratio than other minority race groups. CONCLUSIONS: Our SODA package achieved good performance in extracting 19 categories of SDoH from clinical narratives. The SODA package with pre-trained transformer models is available at https://github.com/uf-hobi-informatics-lab/SODA_Docker.
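The patient-level "extraction ratio" above is, per SDoH category, the share of a cohort with at least one extracted mention. A toy sketch under that reading; the patient IDs and category names below are invented, not the study's data.

```python
# Extraction ratio: fraction of cohort patients with >=1 extracted mention
# of a given SDoH category. Input is (patient_id, category) pairs.

def extraction_ratio(extractions, cohort_size, category):
    hits = {pid for pid, cat in extractions if cat == category}
    return len(hits) / cohort_size

ex = [(1, "smoking"), (1, "housing"), (2, "smoking"), (3, "employment")]
print(extraction_ratio(ex, 4, "smoking"))   # 2 of 4 patients mention smoking
print(extraction_ratio(ex, 4, "housing"))   # 1 of 4 patients mention housing
```

Under this definition, a category like smoking status tends to exceed the 70% threshold reported above, while rarely documented categories fall below it.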


Subject(s)
Narration; Natural Language Processing; Social Determinants of Health; Humans; Female; Male; Bias; Electronic Health Records; Documentation/methods; Data Mining/methods
5.
Am J Hypertens ; 37(1): 60-68, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-37712350

ABSTRACT

BACKGROUND: Apparent treatment-resistant hypertension (aTRH) is defined as uncontrolled blood pressure (BP) despite using ≥3 antihypertensive classes or controlled BP while using ≥4 antihypertensive classes. Patients with aTRH have a higher risk for adverse cardiovascular outcomes compared with patients with controlled hypertension (HTN). Although there have been prior reports on the prevalence, characteristics, and predictors of aTRH, these have been broadly derived from smaller datasets, randomized controlled trials, or closed healthcare systems. METHODS: We extracted patients with HTN defined by ICD-9 and ICD-10 codes during 1/1/2015-12/31/2018, from 2 large electronic health record databases: the OneFlorida Data Trust (n = 223,384) and Research Action for Health Network (REACHnet) (n = 175,229). We applied our previously validated aTRH and stable controlled HTN computable phenotype algorithms and performed univariate and multivariate analyses to identify the prevalence, characteristics, and predictors of aTRH in these populations. RESULTS: The prevalence of aTRH among patients with HTN in OneFlorida (16.7%) and REACHnet (11.3%) was similar to prior reports. Both populations had a significantly higher proportion of Black patients with aTRH compared with those with stable controlled HTN. aTRH in both populations shared similar significant predictors, including Black race, diabetes, heart failure, chronic kidney disease, cardiomegaly, and higher body mass index. In both populations, aTRH was significantly associated with similar comorbidities, when compared with stable controlled HTN. CONCLUSIONS: In 2 large, diverse real-world populations, we observed similar comorbidities and predictors of aTRH as prior studies. In the future, these results may be used to improve healthcare professionals' understanding of aTRH predictors and associated comorbidities.
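The aTRH definition above is directly computable: uncontrolled BP on ≥3 antihypertensive classes, or controlled BP on ≥4 classes. A hedged sketch of that rule; the BP goal and field layout are illustrative simplifications, not the study's validated computable phenotype algorithm.

```python
# Simplified aTRH rule. goal is an assumed (systolic, diastolic) control
# threshold; the actual phenotype algorithm is more involved.

def is_atrh(sbp, dbp, n_classes, goal=(140, 90)):
    controlled = sbp < goal[0] and dbp < goal[1]
    if not controlled and n_classes >= 3:
        return True   # uncontrolled despite >=3 antihypertensive classes
    if controlled and n_classes >= 4:
        return True   # controlled, but requiring >=4 classes
    return False

print(is_atrh(152, 94, 3))   # uncontrolled on 3 classes
print(is_atrh(128, 78, 4))   # controlled on 4 classes
print(is_atrh(128, 78, 3))   # controlled on 3 classes -> stable controlled HTN
```

In the study, this logic is applied over longitudinal medication and BP records rather than a single reading.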


Subject(s)
Antihypertensive Agents; Hypertension; Humans; Antihypertensive Agents/therapeutic use; Antihypertensive Agents/pharmacology; Electronic Health Records; Risk Factors; Hypertension/diagnosis; Hypertension/drug therapy; Hypertension/epidemiology; Blood Pressure; Prevalence
6.
Med Care ; 61(12 Suppl 2): S153-S160, 2023 12 01.
Article in English | MEDLINE | ID: mdl-37963035

ABSTRACT

PCORnet, the National Patient-Centered Clinical Research Network, provides the ability to conduct prospective and observational pragmatic research by leveraging standardized, curated electronic health records data together with patient and stakeholder engagement. PCORnet is funded by the Patient-Centered Outcomes Research Institute (PCORI) and is composed of 8 Clinical Research Networks that incorporate a total of 79 health system "sites." As the network developed, linkage to commercial health plans, federal insurance claims, disease registries, and other data resources demonstrated the value of extending the network's infrastructure to provide a more complete representation of patients' health and lived experiences. Initially, PCORnet studies avoided direct economic comparative effectiveness as a topic. However, PCORI's authorizing law was amended in 2019 to allow studies to incorporate patient-centered economic outcomes in primary research aims. With PCORI's expanded scope and PCORnet's phase 3 beginning in January 2022, there are opportunities to strengthen the network's ability to support economic patient-centered outcomes research. This commentary will discuss approaches that have been incorporated to date by the network and point to opportunities for the network to incorporate economic variables for analysis, informed by patient and stakeholder perspectives. Topics addressed include: (1) data linkage infrastructure; (2) commercial health plan partnerships; (3) Medicare and Medicaid linkage; (4) health system billing-based benchmarking; (5) area-level measures; (6) individual-level measures; (7) pharmacy benefits and retail pharmacy data; and (8) the importance of transparency and engagement while addressing the biases inherent in linking real-world data sources.


Subject(s)
Medicare; Patient Outcome Assessment; Aged; Humans; United States; Prospective Studies; Outcome Assessment, Health Care; Patient-Centered Care
7.
NPJ Digit Med ; 6(1): 210, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37973919

ABSTRACT

There is enormous enthusiasm, as well as concern, about applying large language models (LLMs) to healthcare. Yet current assumptions are based on general-purpose LLMs such as ChatGPT, which are not developed for medical use. This study develops a generative clinical LLM, GatorTronGPT, using 277 billion words of text including (1) 82 billion words of clinical text from 126 clinical departments and approximately 2 million patients at the University of Florida Health and (2) 195 billion words of diverse general English text. We train GatorTronGPT using a GPT-3 architecture with up to 20 billion parameters and evaluate its utility for biomedical natural language processing (NLP) and healthcare text generation. GatorTronGPT improves biomedical natural language processing. We apply GatorTronGPT to generate 20 billion words of synthetic text. NLP models trained using synthetic text generated by GatorTronGPT outperform models trained using real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows that there are no significant differences in linguistic readability (p = 0.22; 6.57 for GatorTronGPT compared with 6.93 for human) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT compared with 6.97 for human) and that physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.

8.
Standards (Basel) ; 3(3): 316-340, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37873508

ABSTRACT

The translational research community, in general, and the Clinical and Translational Science Awards (CTSA) community, in particular, share the vision of repurposing EHRs for research that will improve the quality of clinical practice. Many members of these communities are also aware that electronic health records (EHRs) suffer from data that become poorly structured, biased, and unusable outside their original context. This creates obstacles to the continuity of care, utility, quality improvement, and translational research. Analogous limitations to sharing objective data in other areas of the natural sciences have been successfully overcome by developing and using common ontologies. This White Paper presents the authors' rationale for the use of ontologies with computable semantics to improve clinical data quality and EHR usability, formulated for researchers with a stake in clinical and translational science who are advocates for the use of information technology in medicine but at the same time are concerned by current major shortfalls. This White Paper outlines pitfalls, opportunities, and solutions and recommends increased investment in research and development of ontologies with computable semantics for a new generation of EHRs.

9.
J Am Soc Mass Spectrom ; 34(12): 2857-2863, 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-37874901

ABSTRACT

Liquid chromatography-mass spectrometry (LC-MS) metabolomics studies produce high-dimensional data that must be processed by a complex network of informatics tools to generate analysis-ready data sets. As the first computational step in metabolomics, data processing increasingly challenges researchers to develop customized computational workflows applicable to LC-MS metabolomics analysis. Ontology-based automated workflow composition (AWC) systems provide a feasible approach for developing computational workflows that consume high-dimensional molecular data. We used the Automated Pipeline Explorer (APE) to create an AWC for LC-MS metabolomics data processing across three use cases. Our results show that APE predicted 145 data processing workflows across all three use cases. We identified six traditional workflows and six novel workflows. Through manual review, we found that one-third of the novel workflows were executable, meaning the data processing function could be completed without error. When selecting the top six workflows from each use case, the computationally viable rate of our predicted workflows reached 45%. Collectively, our study demonstrates the feasibility of developing an AWC system for LC-MS metabolomics data processing.


Subject(s)
Hominidae; Software; Animals; Workflow; Metabolomics/methods; Mass Spectrometry; Chromatography, Liquid/methods
10.
J Am Med Inform Assoc ; 31(1): 165-173, 2023 12 22.
Article in English | MEDLINE | ID: mdl-37812771

ABSTRACT

OBJECTIVE: Having sufficient population coverage from the electronic health records (EHRs)-connected health system is essential for building a comprehensive EHR-based diabetes surveillance system. This study aimed to establish an EHR-based type 1 diabetes (T1D) surveillance system for children and adolescents across racial and ethnic groups by identifying the minimum population coverage from EHR-connected health systems needed to accurately estimate T1D prevalence. MATERIALS AND METHODS: We conducted a retrospective, cross-sectional analysis involving children and adolescents <20 years old identified from the OneFlorida+ Clinical Research Network (2018-2020). T1D cases were identified using a previously validated computable phenotyping algorithm. The T1D prevalence for each ZIP Code Tabulation Area (ZCTA, 5 digits), defined as the number of T1D cases divided by the total number of residents in the corresponding ZCTA, was calculated. Population coverage for each ZCTA was measured using observed health system penetration rates (HSPR), calculated as the ratio of residents in the corresponding ZCTA captured by OneFlorida+ to the overall population in the same ZCTA reported by the Census. We used a recursive partitioning algorithm to identify the minimum required observed HSPR to estimate T1D prevalence and compared our estimate with the reported T1D prevalence from the SEARCH study. RESULTS: Observed HSPRs of 55%, 55%, and 60% were identified as the minimum thresholds for the non-Hispanic White, non-Hispanic Black, and Hispanic populations, respectively. The estimated T1D prevalence for non-Hispanic White and non-Hispanic Black youth was 2.87 and 2.29 per 1,000 youth, which is comparable to the reference study's estimates. The estimated prevalence of T1D for Hispanic youth (2.76 per 1,000 youth) was higher than the reference study's estimate (1.48-1.64 per 1,000 youth). The standardized T1D prevalence in the overall Florida population was 2.81 per 1,000 youth in 2019.
CONCLUSION: Our study provides a method to estimate T1D prevalence in children and adolescents using EHRs and reports the estimated HSPRs and prevalence of T1D for different race and ethnicity groups to facilitate EHR-based diabetes surveillance.
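The surveillance calculation above reduces to two ratios per ZCTA: prevalence (T1D cases divided by residents) and HSPR (residents captured by the network divided by the Census population), with ZCTAs kept only when they meet the minimum observed HSPR. A toy sketch; the ZCTA codes, counts, and 0.55 threshold used here are invented illustrations of the published thresholds.

```python
# Per-ZCTA prevalence and health system penetration rate (HSPR), with a
# minimum-HSPR filter. All numbers are invented toy data.

def prevalence_per_1000(cases, residents):
    return 1000 * cases / residents

def hspr(captured, census_pop):
    return captured / census_pop

zctas = {
    "32601": {"cases": 29, "captured": 6200, "census": 10000},
    "32603": {"cases": 4, "captured": 2100, "census": 10000},
}
# Keep only ZCTAs meeting the minimum observed HSPR (e.g., 0.55).
eligible = {z: prevalence_per_1000(d["cases"], d["census"])
            for z, d in zctas.items()
            if hspr(d["captured"], d["census"]) >= 0.55}
print(eligible)
```

Filtering by HSPR first is what prevents under-captured ZCTAs (like the second one above) from dragging the prevalence estimate down.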


Subject(s)
Diabetes Mellitus, Type 1; Child; Humans; Adolescent; Young Adult; Adult; Diabetes Mellitus, Type 1/epidemiology; Prevalence; Electronic Health Records; Cross-Sectional Studies; Retrospective Studies
11.
J Am Med Inform Assoc ; 30(9): 1486-1493, 2023 08 18.
Article in English | MEDLINE | ID: mdl-37316988

ABSTRACT

OBJECTIVE: To develop a natural language processing system that solves both clinical concept extraction and relation extraction in a unified prompt-based machine reading comprehension (MRC) architecture with good generalizability for cross-institution applications. METHODS: We formulate both clinical concept extraction and relation extraction using a unified prompt-based MRC architecture and explore state-of-the-art transformer models. We compare our MRC models with existing deep learning models for concept extraction and end-to-end relation extraction using 2 benchmark datasets developed by the 2018 National NLP Clinical Challenges (n2c2) challenge (medications and adverse drug events) and the 2022 n2c2 challenge (relations of social determinants of health [SDoH]). We also evaluate the transfer learning ability of the proposed MRC models in a cross-institution setting. We perform error analyses and examine how different prompting strategies affect the performance of MRC models. RESULTS AND CONCLUSION: The proposed MRC models achieve state-of-the-art performance for clinical concept and relation extraction on the 2 benchmark datasets, outperforming previous non-MRC transformer models. GatorTron-MRC achieves the best strict and lenient F1-scores for concept extraction, outperforming previous deep learning models on the 2 datasets by 1%-3% and 0.7%-1.3%, respectively. For end-to-end relation extraction, GatorTron-MRC and BERT-MIMIC-MRC achieve the best F1-scores, outperforming previous deep learning models by 0.9%-2.4% and 10%-11%, respectively. For cross-institution evaluation, GatorTron-MRC outperforms traditional GatorTron by 6.4% and 16% for the 2 datasets, respectively. The proposed method is better at handling nested/overlapped concepts and extracting relations, and it has good portability for cross-institution applications. Our clinical MRC package is publicly available at https://github.com/uf-hobi-informatics-lab/ClinicalTransformerMRC.


Subject(s)
Comprehension; Drug-Related Side Effects and Adverse Reactions; Humans; Natural Language Processing
12.
medRxiv ; 2023 May 01.
Article in English | MEDLINE | ID: mdl-37205447

ABSTRACT

Background: Apparent treatment-resistant hypertension (aTRH) is defined as uncontrolled blood pressure (BP) despite using ≥3 antihypertensive classes or controlled BP while using ≥4 antihypertensive classes. Patients with aTRH have a higher risk for adverse cardiovascular outcomes compared to patients with controlled hypertension. Although there have been prior reports on the prevalence, characteristics, and predictors of aTRH, these have been broadly derived from smaller datasets, randomized controlled trials, or closed healthcare systems. Methods: We extracted patients with hypertension defined by ICD-9 and ICD-10 codes during 1/1/2015-12/31/2018, from two large electronic health record databases: the OneFlorida Data Trust (n=223,384) and Research Action for Health Network (REACHnet) (n=175,229). We applied our previously validated aTRH and stable controlled hypertension (HTN) computable phenotype algorithms and performed univariate and multivariate analyses to identify the prevalence, characteristics, and predictors of aTRH in these real-world populations. Results: The prevalence of aTRH in OneFlorida (16.7%) and REACHnet (11.3%) was similar to prior reports. Both populations had a significantly higher proportion of Black patients with aTRH compared to those with stable controlled HTN. aTRH in both populations shared similar significant predictors, including Black race, diabetes, heart failure, chronic kidney disease, cardiomegaly, and higher body mass index. In both populations, aTRH was significantly associated with similar comorbidities, when compared with stable controlled HTN. Conclusion: In two large, diverse real-world populations, we observed similar comorbidities and predictors of aTRH as prior studies. In the future, these results may be used to improve healthcare professionals' understanding of aTRH predictors and associated comorbidities.
Clinical Perspective: What Is New?: Prior studies of apparent treatment-resistant hypertension have focused on cohorts from smaller datasets, randomized controlled trials, or closed healthcare systems. We used validated computable phenotype algorithms for apparent treatment-resistant hypertension and stable controlled hypertension to identify the prevalence, characteristics, and predictors of apparent treatment-resistant hypertension in two large, diverse real-world populations. What Are the Clinical Implications?: Large, diverse real-world populations showed a similar prevalence of aTRH (16.7% in OneFlorida and 11.3% in REACHnet) compared to that observed in other cohorts. Patients classified as having apparent treatment-resistant hypertension were significantly older and had a higher prevalence of comorbid conditions such as diabetes, dyslipidemia, coronary artery disease, heart failure with preserved ejection fraction, and chronic kidney disease stages 1-3. Within diverse, real-world populations, the strongest predictors of apparent treatment-resistant hypertension were Black race, higher body mass index, heart failure, chronic kidney disease, and diabetes.

13.
Metabolomics ; 19(2): 11, 2023 02 06.
Article in English | MEDLINE | ID: mdl-36745241

ABSTRACT

BACKGROUND: Liquid chromatography-high resolution mass spectrometry (LC-HRMS) is a popular approach for metabolomics data acquisition and requires many data processing software tools. The FAIR Principles - Findability, Accessibility, Interoperability, and Reusability - were proposed to promote open science and reusable data management, and to maximize the benefit obtained from contemporary and formal scholarly digital publishing. More recently, the FAIR principles were extended to include Research Software (FAIR4RS). AIM OF REVIEW: This study facilitates open science in metabolomics by providing an implementation solution for adopting FAIR4RS in LC-HRMS metabolomics data processing software. We believe our evaluation guidelines and results can help improve the FAIRness of research software. KEY SCIENTIFIC CONCEPTS OF REVIEW: We evaluated 124 LC-HRMS metabolomics data processing software tools obtained from a systematic review and selected 61 for detailed evaluation using FAIR4RS-related criteria, which were extracted from the literature and refined through internal discussion. We assigned each criterion one or more FAIR4RS categories through discussion. The minimum, median, and maximum percentages of criteria fulfillment across software were 21.6%, 47.7%, and 71.8%. Statistical analysis revealed no significant improvement in FAIRness over time. We identified four criteria covering multiple FAIR4RS categories that had low fulfillment: (1) no software had semantic annotation of key information; (2) only 6.3% of evaluated software were registered to Zenodo and received DOIs; (3) only 14.5% of selected software had official software containerization or a virtual machine; (4) only 16.7% of evaluated software had fully documented functions in code. Based on these results, we discuss improvement strategies and future directions.
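The evaluation above scores each tool as the percentage of FAIR4RS-derived criteria it fulfills, then summarizes the minimum, median, and maximum across tools. A minimal sketch of that scoring; the tool names and four-item checklist below are invented stand-ins for the study's criteria.

```python
# Percentage of fulfilled criteria per tool, then min/median/max across tools.
from statistics import median

def pct_fulfilled(checklist):
    """checklist: criterion name -> bool (fulfilled or not)."""
    return 100 * sum(checklist.values()) / len(checklist)

tools = {
    "tool-a": {"doi": True, "container": False, "docs": True, "license": True},
    "tool-b": {"doi": False, "container": False, "docs": True, "license": False},
    "tool-c": {"doi": True, "container": True, "docs": True, "license": True},
}
scores = sorted(pct_fulfilled(c) for c in tools.values())
print(min(scores), median(scores), max(scores))
```

The study's 21.6%/47.7%/71.8% summary is exactly this computation over 61 tools and a much longer criteria list.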


Subject(s)
Metabolomics; Software; Metabolomics/methods; Chromatography, Liquid/methods; Mass Spectrometry/methods; Data Management
14.
J Am Coll Surg ; 236(2): 279-291, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36648256

ABSTRACT

BACKGROUND: In single-institution studies, overtriaging low-risk postoperative patients to ICUs has been associated with a low value of care; undertriaging high-risk postoperative patients to general wards has been associated with increased mortality and morbidity. This study tested the reproducibility of an automated postoperative triage classification system for generating an actionable, explainable decision support system. STUDY DESIGN: This longitudinal cohort study included adults undergoing inpatient surgery at two university hospitals. Triage classifications were generated by an explainable deep learning model using preoperative and intraoperative electronic health record features. Nearest neighbor algorithms identified risk-matched controls. Primary outcomes were mortality, morbidity, and value of care (inverted risk-adjusted mortality/total direct costs). RESULTS: Among 4,669 ICU admissions, 237 (5.1%) were overtriaged. Compared with 1,021 control ward admissions, overtriaged admissions had similar outcomes but higher costs ($15.9K [interquartile range $9.8K to $22.3K] vs $10.7K [$7.0K to $17.6K], p < 0.001) and lower value of care (0.2 [0.1 to 0.3] vs 1.5 [0.9 to 2.2], p < 0.001). Among 8,594 ward admissions, 1,029 (12.0%) were undertriaged. Compared with 2,498 control ICU admissions, undertriaged admissions had longer hospital lengths of stay (6.4 [3.4 to 12.4] vs 5.4 [2.6 to 10.4] days, p < 0.001); greater incidence of hospital mortality (1.7% vs 0.7%, p = 0.03), cardiac arrest (1.4% vs 0.5%, p = 0.04), and persistent acute kidney injury without renal recovery (5.2% vs 2.8%, p = 0.002); similar costs ($21.8K [$13.3K to $34.9K] vs $21.9K [$13.1K to $36.3K]); and lower value of care (0.8 [0.5 to 1.3] vs 1.2 [0.7 to 2.0], p < 0.001). CONCLUSIONS: Overtriage was associated with low value of care; undertriage was associated with both low value of care and increased mortality and morbidity.
The proposed framework for generating automated postoperative triage classifications is reproducible.
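The value-of-care outcome above is the inverse observed-to-expected mortality ratio divided by total direct costs. A hedged sketch of that ratio; the cost unit ($10K here) and all inputs are illustrative assumptions chosen only to produce plausible magnitudes, not the study's actual scaling or risk adjustment.

```python
# Value of care = (1 / (observed/expected mortality)) / cost.
# Cost is expressed in assumed units of $10K; the study's exact scaling
# and risk-adjustment method are not reproduced here.

def value_of_care(observed_deaths, expected_deaths, total_cost_usd):
    oe_ratio = observed_deaths / expected_deaths
    cost_10k = total_cost_usd / 10_000
    return (1 / oe_ratio) / cost_10k

# Same risk-adjusted mortality, higher cost -> lower value, which is the
# pattern the overtriage comparison above reports.
low_cost = value_of_care(7, 10, 10_700)   # ward-like cost
high_cost = value_of_care(7, 10, 15_900)  # ICU-like cost
print(high_cost < low_cost)
```

The metric rewards better-than-expected mortality (O/E < 1) and penalizes spending, so overtriaged admissions with similar outcomes but higher costs necessarily score lower.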


Subject(s)
Deep Learning; Adult; Humans; Longitudinal Studies; Reproducibility of Results; Triage; Cohort Studies; Retrospective Studies
15.
J Am Heart Assoc ; 12(1): e026652, 2023 01 03.
Article in English | MEDLINE | ID: mdl-36565195

ABSTRACT

Background Knowledge of real-world antihypertensive use is limited to prevalent hypertension, limiting our understanding of how treatment evolves and its contribution to persistently poor blood pressure control. We sought to characterize antihypertensive initiation among new users. Methods and Results Using Medicaid and Medicare data from the OneFlorida+ Clinical Research Consortium, we identified new users of ≥1 first-line antihypertensives (angiotensin-converting enzyme inhibitor, calcium channel blocker, angiotensin receptor blocker, thiazide diuretic, or β-blocker) between 2013 and 2021 among adults with diagnosed hypertension and no antihypertensive fill during the prior 12 months. We evaluated initial antihypertensive regimens by class and drug overall and across study years and examined variation in antihypertensive initiation across demographics (sex, race, and ethnicity) and comorbidity (chronic kidney disease, diabetes, and atherosclerotic cardiovascular disease). We identified 143,054 patients initiating 188,995 antihypertensives (75% monotherapy; 25% combination therapy), with a mean age of 59 years; 57% were women. The most commonly initiated antihypertensive class overall was angiotensin-converting enzyme inhibitors (39%), followed by β-blockers (31%), calcium channel blockers (24%), thiazides (19%), and angiotensin receptor blockers (11%). With the exception of β-blockers, a single drug accounted for ≥75% of use within each class. β-blocker use decreased (35%-26%) and calcium channel blocker use increased (24%-28%) over the study period, while initiation of most other classes remained relatively stable. We also observed significant differences in antihypertensive selection across demographic and comorbidity strata.
Conclusions These findings indicate that substantial variation exists in initial antihypertensive prescribing, and there remain significant gaps between current guideline recommendations and real-world implementation in early hypertension care.
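The new-user definition above (a first antihypertensive fill with no fill in the prior 12 months) can be sketched over a patient's fill dates. The function name, 365-day washout, and dates below are illustrative; the study's cohort logic over claims data is more involved.

```python
# Identify fills that qualify as treatment initiation: no earlier fill
# within the preceding washout window (assumed 365 days here).
from datetime import date, timedelta

def new_use_dates(fill_dates, washout_days=365):
    """Return fill dates qualifying as new use under the washout rule."""
    fills = sorted(fill_dates)
    starts = []
    for i, d in enumerate(fills):
        recent = [f for f in fills[:i] if d - f <= timedelta(days=washout_days)]
        if not recent:
            starts.append(d)
    return starts

fills = [date(2014, 1, 10), date(2016, 3, 1), date(2016, 4, 1)]
print(new_use_dates(fills))
```

Note that a long enough gap in therapy (over a year between the 2014 and 2016 fills above) makes a patient a "new user" again, which is how washout-based designs capture re-initiation.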


Subject(s)
Antihypertensive Agents; Hypertension; Humans; Female; Aged; United States/epidemiology; Middle Aged; Male; Antihypertensive Agents/therapeutic use; Medicare; Angiotensin-Converting Enzyme Inhibitors/therapeutic use; Calcium Channel Blockers/therapeutic use; Hypertension/drug therapy; Hypertension/epidemiology; Adrenergic beta-Antagonists/therapeutic use; Angiotensin Receptor Antagonists/therapeutic use
16.
Ann Surg ; 277(2): 179-185, 2023 02 01.
Article in English | MEDLINE | ID: mdl-35797553

ABSTRACT

OBJECTIVE: We test the hypothesis that for low-acuity surgical patients, postoperative intensive care unit (ICU) admission is associated with lower value of care compared with ward admission. BACKGROUND: Overtriaging low-acuity patients to the ICU consumes valuable resources and may not confer better patient outcomes. Associations among postoperative overtriage, patient outcomes, costs, and value of care have not been previously reported. METHODS: In this longitudinal cohort study, postoperative ICU admissions were classified as overtriaged or appropriately triaged according to machine learning-based patient acuity assessments and requirements for immediate postoperative mechanical ventilation or vasopressor support. The nearest neighbors algorithm identified risk-matched control ward admissions. The primary outcome was value of care, calculated as inverse observed-to-expected mortality ratios divided by total costs. RESULTS: Acuity assessments had an area under the receiver operating characteristic curve of 0.92 in generating predictions for triage classifications. Of 8,592 postoperative ICU admissions, 423 (4.9%) were overtriaged. These were matched with 2,155 control ward admissions with similar comorbidities, incidence of emergent surgery, immediate postoperative vital signs, and do-not-resuscitate order placement and rescindment patterns. Compared with controls, overtriaged admissions did not have a lower incidence of any measured complications. Total costs for admission were $16.4K for overtriage and $15.9K for controls (P = 0.03). Value of care was lower for overtriaged admissions [2.9 (2.0-4.0)] compared with controls [24.2 (14.1-34.5), P < 0.001]. CONCLUSIONS: Low-acuity postoperative patients who were overtriaged to ICUs had increased total costs, no improvements in outcomes, and received low-value care.


Subject(s)
Hospitalization , Intensive Care Units , Humans , Longitudinal Studies , Retrospective Studies , Cohort Studies
17.
NPJ Digit Med ; 5(1): 194, 2022 Dec 26.
Article in English | MEDLINE | ID: mdl-36572766

ABSTRACT

There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, the largest of which trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model-GatorTron-using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on five clinical NLP tasks including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve five clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og.

18.
Article in English | MEDLINE | ID: mdl-36532301

ABSTRACT

Established guidelines describe minimum requirements for reporting algorithms in healthcare; it is equally important to objectify the characteristics of ideal algorithms that confer maximum potential benefits to patients, clinicians, and investigators. We propose a framework for ideal algorithms, including 6 desiderata: explainable (convey the relative importance of features in determining outputs), dynamic (capture temporal changes in physiologic signals and clinical events), precise (use high-resolution, multimodal data and aptly complex architecture), autonomous (learn with minimal supervision and execute without human input), fair (evaluate and mitigate implicit bias and social inequity), and reproducible (validated externally and prospectively and shared with academic communities). We present an ideal algorithms checklist and apply it to highly cited algorithms. Strategies and tools such as the predictive, descriptive, relevant (PDR) framework, the Standard Protocol Items: Recommendations for Interventional Trials-Artificial Intelligence (SPIRIT-AI) extension, sparse regression methods, and minimizing concept drift can help healthcare algorithms achieve these objectives, toward ideal algorithms in healthcare.
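Among the tools the abstract names, sparse regression supports the "explainable" desideratum by driving uninformative feature weights to exactly zero. A minimal Lasso-style coordinate-descent sketch is shown below; it is an illustrative toy, not the paper's reference implementation, and the synthetic data and penalty value are assumptions:

```python
import numpy as np

def lasso_coordinate_descent(X, y, lam=0.2, n_iter=200):
    """Minimize (1/2n)||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's contribution added back.
            residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ residual / n
            # Soft-thresholding shrinks coefficients and zeroes small ones.
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

# Synthetic data: only the first two of five features carry signal,
# so the fitted weight vector should be sparse.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.01 * rng.standard_normal(200)
w = lasso_coordinate_descent(X, y)
```

The sparsity of the recovered weights is what makes the relative importance of features directly readable, which is the explainability property the framework asks for.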

19.
Database (Oxford) ; 2022, 2022 10 08.
Article in English | MEDLINE | ID: mdl-36208225

ABSTRACT

Similar to managing software packages, managing the ontology life cycle involves multiple complex workflows such as preparing releases, continuous quality control checking and dependency management. To manage these processes, a diverse set of tools is required, from command-line utilities to powerful ontology-engineering environments. Particularly in the biomedical domain, which has developed a set of highly diverse yet inter-dependent ontologies, standardizing release practices and metadata and establishing shared quality standards are crucial to enable interoperability. The Ontology Development Kit (ODK) provides a set of standardized, customizable and automatically executable workflows, and packages all required tooling in a single Docker image. In this paper, we provide an overview of how the ODK works, show how it is used in practice and describe how we envision it driving standardization efforts in our community. Database URL: https://github.com/INCATools/ontology-development-kit.


Subject(s)
Biological Ontologies , Databases, Factual , Metadata , Quality Control , Software , Workflow
20.
Front Artif Intell ; 5: 842306, 2022.
Article in English | MEDLINE | ID: mdl-36034597

ABSTRACT

Human pathophysiology is occasionally too complex for unaided hypothetical-deductive reasoning and the isolated application of additive or linear statistical methods. Clustering algorithms use input data patterns and distributions to form groups of similar patients or diseases that share distinct properties. Although clinicians frequently perform tasks that may be enhanced by clustering, few receive formal training and clinician-centered literature in clustering is sparse. To add value to clinical care and research, optimal clustering practices require a thorough understanding of how to process and optimize data, select features, weigh strengths and weaknesses of different clustering methods, select the optimal clustering method, and apply clustering methods to solve problems. These concepts and our suggestions for implementing them are described in this narrative review of published literature. All clustering methods share the weakness of finding potential clusters even when natural clusters do not exist, underscoring the importance of applying data-driven techniques as well as clinical and statistical expertise to clustering analyses. When applied properly, patient and disease phenotype clustering can reveal obscured associations that can help clinicians understand disease pathophysiology, predict treatment response, and identify patients for clinical trial enrollment.
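The caveat in the abstract above, that clustering methods find potential clusters even when natural clusters do not exist, can be demonstrated with a minimal k-means sketch on structureless uniform noise. This is an illustrative toy (Lloyd's algorithm in plain NumPy); real analyses should use validated libraries plus the internal-validity checks the review recommends:

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm: assign points to the nearest centroid,
    then recompute each centroid as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Uniform noise has no natural clusters, yet asking for k=3 dutifully
# yields three "clusters" -- the algorithm cannot refuse to partition.
rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 2))
labels, centroids = kmeans(X, k=3)
```

Because the partition is produced regardless of whether structure exists, cluster validity must be argued from the data and the domain, not from the mere fact that the algorithm returned groups.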
