Results 1 - 20 of 488
1.
JMIR Med Inform ; 12: e52896, 2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39087585

ABSTRACT

Background: The application of machine learning in health care often necessitates the use of hierarchical codes such as the International Classification of Diseases (ICD) and Anatomical Therapeutic Chemical (ATC) systems. These codes classify diseases and medications, respectively, thereby forming extensive data dimensions. Unsupervised feature selection tackles the "curse of dimensionality" and helps to improve the accuracy and performance of supervised learning models by reducing the number of irrelevant or redundant features and avoiding overfitting. Techniques for unsupervised feature selection, such as filter, wrapper, and embedded methods, are implemented to select the most important features with the most intrinsic information. However, they face challenges due to the sheer volume of ICD and ATC codes and the hierarchical structures of these systems. Objective: The objective of this study was to compare several unsupervised feature selection methods for ICD and ATC code databases of patients with coronary artery disease in different aspects of performance and complexity and select the best set of features representing these patients. Methods: We compared several unsupervised feature selection methods for 2 ICD and 1 ATC code databases of 51,506 patients with coronary artery disease in Alberta, Canada. Specifically, we used the Laplacian score, unsupervised feature selection for multicluster data, autoencoder-inspired unsupervised feature selection, principal feature analysis, and concrete autoencoders with and without ICD or ATC tree weight adjustment to select the 100 best features from over 9000 ICD and 2000 ATC codes. We assessed the selected features based on their ability to reconstruct the initial feature space and predict 90-day mortality following discharge. 
We also compared the complexity of the selected features by mean code level in the ICD or ATC tree and the interpretability of the features in the mortality prediction task using Shapley analysis. Results: In feature space reconstruction and mortality prediction, the concrete autoencoder-based methods outperformed other techniques. Particularly, a weight-adjusted concrete autoencoder variant demonstrated improved reconstruction accuracy and significant predictive performance enhancement, confirmed by DeLong and McNemar tests (P<.05). Concrete autoencoders preferred more general codes, and they consistently reconstructed all features accurately. Additionally, features selected by weight-adjusted concrete autoencoders yielded higher Shapley values in mortality prediction than most alternatives. Conclusions: This study scrutinized 5 feature selection methods in ICD and ATC code data sets in an unsupervised context. Our findings underscore the superiority of the concrete autoencoder method in selecting salient features that represent the entire data set, offering a potential asset for subsequent machine learning research. We also present a novel weight adjustment approach for the concrete autoencoders specifically tailored for ICD and ATC code data sets to enhance the generalizability and interpretability of the selected features.
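
The "mean code level" complexity measure mentioned above can be illustrated for ATC codes, whose five hierarchy levels are identified by code length (1, 3, 4, 5, and 7 characters). The following is a minimal Python sketch; the function names are hypothetical, and it is not the authors' implementation, which also covers ICD codes and is not described in the abstract.

```python
from __future__ import annotations

# ATC codes encode their hierarchy level in their length, for example:
# "A" (level 1), "A10" (level 2), "A10B" (level 3),
# "A10BA" (level 4), "A10BA02" (level 5).
ATC_LEVEL_BY_LENGTH = {1: 1, 3: 2, 4: 3, 5: 4, 7: 5}


def atc_level(code: str) -> int:
    """Return the ATC hierarchy level (1-5) implied by the code length."""
    cleaned = code.strip().upper()
    if len(cleaned) not in ATC_LEVEL_BY_LENGTH:
        raise ValueError(f"not a valid ATC code length: {code!r}")
    return ATC_LEVEL_BY_LENGTH[len(cleaned)]


def mean_code_level(codes: list[str]) -> float:
    """Average hierarchy level of a set of selected ATC codes.

    A lower value means the selection favours more general codes.
    """
    if not codes:
        raise ValueError("codes must not be empty")
    return sum(atc_level(code) for code in codes) / len(codes)
```

Under this measure, a selection dominated by short codes such as "A10" scores lower than one dominated by full substance codes such as "A10BA02", which is consistent with the reported preference of concrete autoencoders for more general codes.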

2.
J Am Geriatr Soc ; 2024 Aug 10.
Article in English | MEDLINE | ID: mdl-39126234

ABSTRACT

BACKGROUND: Older adults with severe aortic stenosis (AS) may receive care in a nursing home (NH) prior to undergoing transcatheter aortic valve replacement (TAVR). NH level of care can be used to stabilize medical conditions, to provide rehabilitation services, or for long-term care services. Our primary objective is to determine whether NH utilization pre-TAVR can be used to stratify patients at risk for higher mortality and poor disposition outcomes at 30 and 365 days post-TAVR. METHODS: We conducted a retrospective cohort study among Medicare beneficiaries who spent ≥1 day in an NH 6 months before TAVR (2011-2019). The intensity of NH utilization was categorized as low users (1-30 days), medium users (31-89 days), long-stay NH residents (≥ 100 days, with no more than a 10-day gap in care), and high post-acute rehabilitation patients (≥90 days, with more than a 10-day gap in care). The probabilities of death and disposition were estimated using multinomial logistic regression, adjusting for age, sex, and race. RESULTS: Among 15,581 patients, 9908 (63.6%) were low users, 4312 (27.7%) were medium users, 663 (4.3%) were high post-acute care rehab users, and 698 (4.4%) were long-stay NH residents before TAVR. High post-acute care rehabilitation patients were more likely to have dementia, weight loss, falls, and extensive dependence of activities of daily living (ADLs) as compared with low NH users. Mortality was the greatest in high post-acute care rehab users: 5.5% at 30 days, and 36.4% at 365 days. In contrast, low NH users had similar mortality rates compared with long-stay NH residents: 4.8% versus 4.8% at 30 days, and 24.9% versus 27.0% at 365 days. CONCLUSION: Frequent bouts of post-acute rehabilitation before TAVR were associated with adverse outcomes, yet this metric may be helpful to determine which patients with severe AS could benefit from palliative and geriatric services.

3.
Online J Public Health Inform ; 16: e55104, 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39121466

ABSTRACT

BACKGROUND: Vaccine hesitancy is a growing global health threat that is increasingly studied through the monitoring and analysis of social media platforms. One understudied area is the impact of echo chambers and influential users on disseminating vaccine information in social networks. Assessing the temporal development of echo chambers and the influence of key users on their growth provides valuable insights into effective communication strategies to prevent increases in vaccine hesitancy. This also aligns with the World Health Organization's (WHO) infodemiology research agenda, which aims to propose new methods for social listening. OBJECTIVE: Using data from a Taiwanese forum, this study aims to examine how engagement patterns of influential users, both within and across different COVID-19 stances, contribute to the formation of echo chambers over time. METHODS: Data for this study come from a Taiwanese forum called PTT. All vaccine-related posts on the "Gossiping" subforum were scraped from January 2021 to December 2022 using the keyword "vaccine." A multilayer network model was constructed to assess the existence of echo chambers. Each layer represents either provaccination, vaccine hesitant, or antivaccination posts based on specific criteria. Layer-level metrics, such as average diversity and Spearman rank correlations, were used to measure chambering. To understand the behavior of influential users-or key nodes-in the network, the activity of high-diversity and hardliner nodes was analyzed. RESULTS: Overall, the provaccination and antivaccination layers are strongly polarized. This trend is temporal and becomes more apparent after November 2021. Diverse nodes primarily participate in discussions related to provaccination topics, both receiving comments and contributing to them. Interactions with the antivaccination layer are comparatively minimal, likely due to its smaller size, suggesting that the forum is a "healthy community." 
Overall, diverse nodes exhibit cross-cutting engagement. By contrast, hardliners in the vaccine hesitant and antivaccination layers are more active in commenting within their own communities. This trend is temporal, showing an increase during the Omicron outbreak. Hardliner activity potentially reinforces their stances over time. Thus, there are opposing forces of chambering and cross-cutting. CONCLUSIONS: Efforts should be made to moderate hardliner and influential nodes in the antivaccination layer and to support provaccination users engaged in cross-cutting exchanges. There are several limitations to this study. One is the bias of the platform used, and another is the lack of a comprehensive definition of "influence." To address these issues, comparative studies across different platforms can be conducted, and various metrics of influence should be explored. Additionally, examining the impact of influential users on network structure and chambering through network simulations and regression analysis provides more robust insights. The study also lacks an explanation for the reasons behind chambering trends. Conducting content analysis can help to understand the nature of engagement and inform interventions to address echo chambers. These approaches align with and further the WHO infodemic research agenda.

4.
Sensors (Basel) ; 24(15)2024 Jul 24.
Article in English | MEDLINE | ID: mdl-39123849

ABSTRACT

As an indispensable part of the vehicle environment perception task, road traffic marking detection plays a vital role in correctly understanding the current traffic situation. However, the existing traffic marking detection algorithms still have some limitations. Taking lane detection as an example, the current detection methods mainly focus on the location information detection of lane lines, and they only judge the overall attribute of each detected lane line instance, thus lacking more fine-grained dynamic detection of lane line attributes. In order to meet the needs of intelligent vehicles for the dynamic attribute detection of lane lines and more perfect road environment information in urban road environment, this paper constructs a fine-grained attribute detection method for lane lines, which uses pixel-level attribute sequence points to describe the complete attribute distribution of lane lines and then matches the detection results of the lane lines. Realizing the attribute judgment of different segment positions of lane instances is called the fine-grained attribute detection of lane lines (Lane-FGA). In addition, in view of the lack of annotation information in the current open-source lane data set, this paper constructs a lane data set with both lane instance information and fine-grained attribute information by combining manual annotation and intelligent annotation. At the same time, a cyclic iterative attribute inference algorithm is designed to solve the difficult problem of lane attribute labeling in areas without visual cues such as occlusion and damage. In the end, the average accuracy of the proposed algorithm reaches 97% on various types of lane attribute detection.

5.
Spectrochim Acta A Mol Biomol Spectrosc ; 323: 124868, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39128307

ABSTRACT

Hyperspectral Raman imaging not only offers spectroscopic fingerprints but also reveals morphological information such as spatial distributions in an analytical sample. However, the spectrum-per-pixel nature of hyperspectral imaging (HSI) results in a vast amount of data. Furthermore, HSI often requires pre- and post-processing steps to extract valuable chemical information. To derive pure spectral signatures and concentration abundance maps of the active spectroscopic compounds, both endmember extraction (EX) and Multivariate Curve Resolution (MCR) techniques are widely employed. The objective of this study is to carry out a systematic investigation based on Raman mapping datasets to highlight the similarities and differences between these two approaches in retrieving pure variables, and ultimately provide guidelines for pure variable extraction. Numerical simulations and Raman mapping experiments on a mixture of pharmaceutical powders and on a layered plastic foil sample were conducted to underscore the distinctions between MCR and EX algorithms (in particular Vertex Component Analysis, VCA) and their outputs. Both methods were found to perform well if the dataset contains pure pixels for each of the individual components. However, in cases where such pure pixels do not exist, only MCR was found to be capable of extracting the pure component spectra.

6.
PeerJ Comput Sci ; 10: e2152, 2024.
Article in English | MEDLINE | ID: mdl-38983193

ABSTRACT

With the rapid extensive development of the Internet, users not only enjoy great convenience but also face numerous serious security problems. The increasing frequency of data breaches has made it clear that the network security situation is becoming increasingly urgent. In the realm of cybersecurity, intrusion detection plays a pivotal role in monitoring network attacks. However, the efficacy of existing solutions in detecting such intrusions remains suboptimal, perpetuating the security crisis. To address this challenge, we propose a sparse autoencoder-Bayesian optimization-convolutional neural network (SA-BO-CNN) system based on convolutional neural network (CNN). Firstly, to tackle the issue of data imbalance, we employ the SMOTE resampling function during system construction. Secondly, we enhance the system's feature extraction capabilities by incorporating SA. Finally, we leverage BO in conjunction with CNN to enhance system accuracy. Additionally, a multi-round iteration approach is adopted to further refine detection accuracy. Experimental findings demonstrate an impressive system accuracy of 98.36%. Comparative analyses underscore the superior detection rate of the SA-BO-CNN system.
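
The SMOTE resampling step mentioned above addresses class imbalance by creating synthetic minority-class samples, each interpolated between a minority sample and one of its nearest minority neighbours. The following is a simplified pure-Python sketch of that idea; the function and parameter names are hypothetical, and it is not the code of the SA-BO-CNN system, whose implementation is not given in the abstract.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric tuples (at least two samples).
    n_synthetic: number of synthetic samples to create.
    k: number of nearest minority neighbours to choose from.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic
```

Because every synthetic point lies on a segment between two existing minority samples, the new samples stay inside the region already occupied by the minority class rather than being exact duplicates.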

7.
BMC Pregnancy Childbirth ; 24(1): 460, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38961444

ABSTRACT

BACKGROUND AND AIMS: Although minimally invasive hysterectomy offers advantages, abdominal hysterectomy remains the predominant surgical method. Creating a standardized dataset and establishing a hysterectomy registry system present opportunities for early interventions in reducing volume and selecting benign hysterectomy methods. This research aims to develop a dataset for designing a benign hysterectomy registration system. METHODS: Between April and September 2020, a qualitative study was carried out to create a data set for enrolling patients who were candidates for hysterectomy. At this stage, the research team conducted an information needs assessment, relevant data element identification, registry software development, and field testing. Subsequently, a web-based application was designed. In June 2023, the registry software was evaluated using data extracted from medical records of patients admitted at Al-Zahra Hospital in Tabriz, Iran. RESULTS: During two months, 40 patients with benign hysterectomy were successfully registered. The final dataset for the hysterectomy patient registry comprises 11 main groups, 27 subclasses, and a total of 91 data elements. Mandatory data and essential reports were defined. Furthermore, a web-based registry system was designed and evaluated based on the data set and various scenarios. CONCLUSION: Creating a hysterectomy registration system is the initial stride toward identifying and registering hysterectomy candidate patients. This system captures information about the procedure techniques and associated complications. In Iran, this registry can serve as a valuable resource for assessing the quality of care delivered and the distribution of clinical measures.


Subject(s)
Hospitals, Teaching , Hysterectomy , Registries , Humans , Female , Iran , Hysterectomy/methods , Hysterectomy/statistics & numerical data , Adult , Middle Aged , Referral and Consultation/statistics & numerical data , Qualitative Research , Datasets as Topic
8.
JMIR Med Inform ; 12: e57674, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38952020

ABSTRACT

Background: Large language models (LLMs) have achieved great progress in natural language processing tasks and demonstrated the potential for use in clinical applications. Despite their capabilities, LLMs in the medical domain are prone to generating hallucinations (not fully reliable responses). Hallucinations in LLMs' responses create substantial risks, potentially threatening patients' physical safety. Thus, to perceive and prevent this safety risk, it is essential to evaluate LLMs in the medical domain and build a systematic evaluation. Objective: We developed a comprehensive evaluation system, MedGPTEval, composed of criteria, medical data sets in Chinese, and publicly available benchmarks. Methods: First, a set of evaluation criteria was designed based on a comprehensive literature review. Second, existing candidate criteria were optimized by using a Delphi method with 5 experts in medicine and engineering. Third, 3 clinical experts designed medical data sets to interact with LLMs. Finally, benchmarking experiments were conducted on the data sets. The responses generated by chatbots based on LLMs were recorded for blind evaluations by 5 licensed medical experts. The evaluation criteria that were obtained covered medical professional capabilities, social comprehensive capabilities, contextual capabilities, and computational robustness, with 16 detailed indicators. The medical data sets include 27 medical dialogues and 7 case reports in Chinese. Three chatbots were evaluated: ChatGPT by OpenAI; ERNIE Bot by Baidu, Inc; and Doctor PuJiang (Dr PJ) by Shanghai Artificial Intelligence Laboratory. Results: Dr PJ outperformed ChatGPT and ERNIE Bot in the multiple-turn medical dialogues and case report scenarios. Dr PJ also outperformed ChatGPT in the semantic consistency rate and complete error rate category, indicating better robustness. 
However, Dr PJ had slightly lower scores in medical professional capabilities compared with ChatGPT in the multiple-turn dialogue scenario. Conclusions: MedGPTEval provides comprehensive criteria to evaluate chatbots by LLMs in the medical domain, open-source data sets, and benchmarks assessing 3 LLMs. Experimental results demonstrate that Dr PJ outperforms ChatGPT and ERNIE Bot in social and professional contexts. Therefore, such an assessment system can be easily adopted by researchers in this community to augment an open-source data set.

9.
J Agric Food Chem ; 72(29): 16496-16505, 2024 Jul 24.
Article in English | MEDLINE | ID: mdl-38996189

ABSTRACT

For a better understanding of cadmium (Cd) accumulation over long time periods in cereals, Cd levels of the German wheat and rye harvest from 1975 to 2021 were analyzed. Overall, wheat had higher grain Cd concentrations than rye. Comparing mean values from different time periods showed that Cd levels in winter rye have stabilized, while Cd concentrations in winter wheat have decreased. Furthermore, Cd concentrations in almost all samples were below the newly introduced European Commission limits specifying the maximum permissible contaminant levels in foodstuffs (Cd in grains: rye 50 µg/kg FW; wheat 100 µg/kg FW). However, it is important to note that Cd is still ubiquitous in the German wheat and rye harvest. Although there has been a significant reduction in emissions and immissions for around 30 years, the extraordinarily long biological half-life and carcinogenicity of Cd still make it a relevant substance to food safety and human health.


Subject(s)
Cadmium , Food Contamination , Secale , Triticum , Cadmium/analysis , Triticum/chemistry , Secale/chemistry , Germany , Food Contamination/analysis , Soil Pollutants/analysis
10.
Heliyon ; 10(12): e32674, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-39021911

ABSTRACT

Color plays a pivotal role in product design, as it can evoke emotional responses from users. Understanding these emotional needs is crucial for effective brand image design. This paper introduces a novel approach, the Brand Image Design using Deep Multi-Scale Fusion Neural Network optimized with Cheetah Optimization Algorithm (BID-DMSFNN-COA), for classifying product color brand images as "Stylish" and "Natural". By leveraging deep learning techniques and optimization algorithms, the proposed method aims to enhance brand image accuracy and address existing challenges in product color trend forecasting research. Initially, data are collected from the MNIST data set. The data are then supplied to the pre-processing section, which removes noise and enhances the input image using a master-slave adaptive notch filter. The Deep Multi-Scale Fusion Neural Network optimized with the Cheetah Optimization Algorithm then classifies the product color brand image as "Stylish" or "Natural". Implemented on the MATLAB platform, the BID-DMSFNN-COA technique achieves accuracy rates of 99% for both "Natural" and "Stylish" classifications. In comparison, existing methods such as BID-GNN, BID-ANN, and BID-CNN yield lower accuracy rates ranging from 65% to 85% for "Stylish" and 65%-70% for "Natural" product color brand image design. The simulation outcomes reveal the superior performance of the BID-DMSFNN-COA technique across various metrics including accuracy, F-score, precision, recall, sensitivity, specificity, and ROC analysis. Notably, the proposed method consistently outperforms existing approaches, providing higher values across all evaluation criteria. These findings underscore the effectiveness of the BID-DMSFNN-COA technique in enhancing brand image design through accurate product color classification.

11.
NASN Sch Nurse ; 39(4): 221-228, 2024 07.
Article in English | MEDLINE | ID: mdl-39078169

ABSTRACT

The National School Health Data Set: Every Student Counts! (ESC!) is NASN's data initiative focusing on building data capacity for school nurses, developing a uniform data set with standardized definitions, promoting data infrastructure (including school nurse access to electronic documentation and interoperability of educational systems and school health records), and building partnerships to increase data collection, storage, retrieval, and utilization. Each year since 2018, states have submitted data to NASN for inclusion in the National School Health Data Set. Participation is built on a tiered programming model to include school nurses at the school, state, and national levels. Every state has identified a State Data Coordinator (SDC) who serves as a liaison to NASN to support ESC! and also provides support to school nurses in their state. This article provides an overview of the ESC! data initiative for the 2023-2024 school year, which includes the data from the 2022-2023 school year.


Subject(s)
School Health Services , School Nursing , Humans , United States , Child , Societies, Nursing , Adolescent
12.
Clin Neuropsychol ; : 1-17, 2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39060956

ABSTRACT

Objective: Reports of financial exploitation have steadily increased among older adults. Few studies have examined neuropsychological profiles for individuals vulnerable to financial exploitation, and existing studies have focused on susceptibility to scams, one specific type of financial exploitation. The current study therefore examines whether a general measure of financial exploitation vulnerability is associated with neuropsychological performance in a community sample. Methods: A sample (n = 116) of adults aged 50 or older without dementia completed a laboratory visit that measures physical and psychological functioning and a neuropsychological assessment, the Uniform Data Set-3 (UDS-3) and California Verbal Learning Test-II. Results: After covarying for demographics, current medical problems, financial literacy, and a global cognition screen, financial exploitation vulnerability was negatively associated with scores on the Multilingual Naming Test, Craft Story Recall and Delayed Recall, California Verbal Learning Test-II Delayed Recall and Recognition Discriminability, Phonemic Fluency, and Trails B. Financial exploitation vulnerability was not associated with performance on Digit Span, Semantic Fluency, Benson Complex Figure Recall, or Trails A. Conclusions: Among older adults without dementia, individuals at higher risk for financial exploitation demonstrated worse verbal memory, confrontation naming, phonemic fluency, and set-shifting. These tests are generally sensitive to Default Mode Network functioning and Alzheimer's Disease neuropathology. Longitudinal studies in more impaired samples are warranted to further corroborate and elucidate these relationships.

13.
JMIR Nurs ; 7: e55793, 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38913994

ABSTRACT

BACKGROUND: Increased workload, including workload related to electronic health record (EHR) documentation, is reported as a main contributor to nurse burnout and adversely affects patient safety and nurse satisfaction. Traditional methods for workload analysis are either administrative measures (such as the nurse-patient ratio) that do not represent actual nursing care or are subjective and limited to snapshots of care (eg, time-motion studies). Observing care and testing workflow changes in real time can be obstructive to clinical care. An examination of EHR interactions using EHR audit logs could provide a scalable, unobtrusive way to quantify the nursing workload, at least to the extent that nursing work is represented in EHR documentation. EHR audit logs are extremely complex; however, simple analytical methods cannot discover complex temporal patterns, requiring use of state-of-the-art temporal data-mining approaches. To effectively use these approaches, it is necessary to structure the raw audit logs into a consistent and scalable logical data model that can be consumed by machine learning (ML) algorithms. OBJECTIVE: We aimed to conceptualize a logical data model for nurse-EHR interactions that would support the future development of temporal ML models based on EHR audit log data. METHODS: We conducted a preliminary review of EHR audit logs to understand the types of nursing-specific data captured. Using concepts derived from the literature and our previous experience studying temporal patterns in biomedical data, we formulated a logical data model that can describe nurse-EHR interactions, the nurse-intrinsic and situational characteristics that may influence those interactions, and outcomes of relevance to the nursing workload in a scalable and extensible manner. RESULTS: We describe the data structure and concepts from EHR audit log data associated with nursing workload as a logical data model named RNteract. 
We conceptually demonstrate how using this logical data model could support temporal unsupervised ML and state-of-the-art artificial intelligence (AI) methods for predictive modeling. CONCLUSIONS: The RNteract logical data model appears capable of supporting a variety of AI-based systems and should be generalizable to any type of EHR system or health care setting. Quantitatively identifying and analyzing temporal patterns of nurse-EHR interactions is foundational for developing interventions that support the nursing documentation workload and address nurse burnout.


Subject(s)
Data Mining , Electronic Health Records , Workload , Electronic Health Records/statistics & numerical data , Humans , Data Mining/methods , Workload/statistics & numerical data , Documentation/standards , Documentation/statistics & numerical data , Medical Audit/methods , Machine Learning
14.
Front Physiol ; 15: 1399374, 2024.
Article in English | MEDLINE | ID: mdl-38872836

ABSTRACT

Background: Infections and seizures are some of the most common complications in stroke survivors. Infections are the most common risk factor for seizures, and stroke survivors who experience an infection are at greater risk of experiencing seizures. A predictive model to determine which stroke survivors are at the greatest risk for a seizure after an infection can be used to help providers focus on prevention of seizures in higher risk residents who experience an infection. Methods: A predictive model was generated from a retrospective study of the Long-Term Care Minimum Data Set (MDS) 3.0 (2014-2018, n = 262,301). Techniques included three data balancing methods (SMOTE for up sampling, ENN for down sampling, and SMOTEENN for up and down sampling) and three feature selection methods (LASSO, Recursive Feature Elimination, and Principal Component Analysis). One balancing technique and one feature selection technique were applied, and the resulting dataset was then used to train four machine learning models (Logistic Regression, Random Forest, XGBoost, and Neural Network). Model performance was evaluated with AUC and accuracy, and interpretation used SHapley Additive exPlanations. Results: Using data balancing methods improved the prediction performances of the machine learning models, but feature selection did not remove any features and did not affect performance. With all models having a high accuracy (76.5%-99.9%), interpretation on all four models yielded the most holistic view. SHAP values indicated that therapy (speech, physical, occupational, and respiratory), independence (activities of daily living for walking, mobility, eating, dressing, and toilet use), and mood (severity score, anti-anxiety medications, antidepressants, and antipsychotics) features contributed the most. That is, stroke survivors who received fewer therapy hours, were less independent, and had a worse overall mood were at a greater risk of having a seizure after an infection.
Conclusion: The development of a tool to predict seizure following an infection in stroke survivors can be interpreted by providers to guide treatment and prevent complications long term. This promotes individualized treatment plans that can increase the quality of resident care.

15.
JMIRx Med ; 5: e45973, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38889069

ABSTRACT

Background: The Society of Thoracic Surgeons and European System for Cardiac Operative Risk Evaluation (EuroSCORE) II risk scores are the most commonly used risk prediction models for in-hospital mortality after adult cardiac surgery. However, they are prone to miscalibration over time and poor generalization across data sets; thus, their use remains controversial. Despite increased interest, a gap in understanding the effect of data set drift on the performance of machine learning (ML) over time remains a barrier to its wider use in clinical practice. Data set drift occurs when an ML system underperforms because of a mismatch between the data it was developed from and the data on which it is deployed. Objective: In this study, we analyzed the extent of performance drift using models built on a large UK cardiac surgery database. The objectives were to (1) rank and assess the extent of performance drift in cardiac surgery risk ML models over time and (2) investigate any potential influence of data set drift and variable importance drift on performance drift. Methods: We conducted a retrospective analysis of prospectively, routinely gathered data on adult patients undergoing cardiac surgery in the United Kingdom between 2012 and 2019. We temporally split the data 70:30 into a training and validation set and a holdout set. Five novel ML mortality prediction models were developed and assessed, along with EuroSCORE II, for relationships between and within variable importance drift, performance drift, and actual data set drift. Performance was assessed using a consensus metric. Results: A total of 227,087 adults underwent cardiac surgery during the study period, with a mortality rate of 2.76% (n=6258). There was strong evidence of a decrease in overall performance across all models (P<.0001). 
Extreme gradient boosting (clinical effectiveness metric [CEM] 0.728, 95% CI 0.728-0.729) and random forest (CEM 0.727, 95% CI 0.727-0.728) were the overall best-performing models, both temporally and nontemporally. EuroSCORE II performed the worst across all comparisons. Sharp changes in variable importance and data set drift from October to December 2017, from June to July 2018, and from December 2018 to February 2019 mirrored the effects of performance decrease across models. Conclusions: All models show a decrease in at least 3 of the 5 individual metrics. CEM and variable importance drift detection demonstrate the limitation of logistic regression methods used for cardiac surgery risk prediction and the effects of data set drift. Future work will be required to determine the interplay between ML models and whether ensemble models could improve on their respective performance advantages.
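
A temporal 70:30 split like the one described in the Methods above orders records by date and holds out the most recent 30%, so that the holdout set reflects later practice than the data used for model development. The following is a minimal Python sketch, assuming each record carries a date; the record layout, function name, and parameter name are hypothetical and not taken from the study.

```python
from datetime import date


def temporal_split(records, train_percent=70):
    """Split (date, payload) records chronologically.

    The earliest train_percent of records form the training and validation
    set; the remaining, most recent records form the holdout set.
    """
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")
    ordered = sorted(records, key=lambda record: record[0])
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(ordered) * train_percent // 100
    return ordered[:cut], ordered[cut:]


# Illustrative records only: one made-up entry per year.
example = [(date(2012 + i, 1, 1), f"patient-{i}") for i in range(8)]
development, holdout = temporal_split(list(reversed(example)))
```

Unlike a random split, this keeps every holdout record later in time than every development record, which is what allows performance drift over time to be observed at all.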

16.
Article in English | MEDLINE | ID: mdl-38928942

ABSTRACT

BACKGROUND: Standardized health-data collection enables effective disaster responses and patient care. Emergency medical teams (EMTs) use the Japan Surveillance in Post-Extreme Emergencies and Disasters (J-SPEED) reporting template to collect patient data. EMTs submit data on treated patients to an EMT coordination cell. The World Health Organization's (WHO) EMT minimum dataset (MDS) offers an international standard for disaster data collection. GOAL: The goal of this study was to analyze the age and gender distribution of medical consultations by EMTs during disasters. METHODS: Data collected from 2016 to 2020 using the J-SPEED/MDS tools during six disasters in Japan and Mozambique were analyzed. Linear regression with data smoothing via the moving average method was employed to identify trends in medical consultations based on age and gender. RESULTS: A total of 31,056 consultations were recorded: 13,958 in Japan and 17,098 in Mozambique. Women accounted for 56.3% and 55.7% of examinees in Japan and Mozambique, respectively. Children accounted for 6.8% of consultations in Japan and 28.1% in Mozambique. Elders accounted for 1.32 and 1.52 times more consultations than adults in Japan and Mozambique, respectively. CONCLUSIONS: Study findings highlight the importance of considering age-specific healthcare requirements in disaster planning. Real-time data collection tools such as J-SPEED and MDS, which generate both daily reports and raw data for in-depth analysis, facilitate the validation of equitable access to healthcare services, emphasize the specific needs of vulnerable groups, and enable the consideration of cultural preferences to improve healthcare provision by EMTs.
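
The Methods name moving-average smoothing followed by linear regression. A minimal sketch of that combination is given below. The window length, the function names, and the toy daily counts are assumptions for illustration only; the abstract does not report the window used or the exact regression setup.

```python
def moving_average(values, window=3):
    """Mean of each full window of consecutive values (no padding at the edges)."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical daily consultation counts, not data from the study.
daily_counts = [10, 12, 11, 15, 14, 18, 17]
smoothed = moving_average(daily_counts, window=3)
slope, intercept = fit_line(list(range(len(smoothed))), smoothed)
```

Smoothing first damps day-to-day noise, so the sign of `slope` reflects the underlying trend rather than a single unusual day.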


Subject(s)
Disasters , Humans , Female , Japan , Mozambique , Male , Aged , Middle Aged , Adult , Adolescent , Young Adult , Child , Child, Preschool , Infant , Emergency Medical Services/statistics & numerical data , Aged, 80 and over , Age Factors , Infant, Newborn , Sex Factors
17.
JMIR AI ; 3: e47805, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38875667

ABSTRACT

BACKGROUND: Passive mobile sensing provides opportunities for measuring and monitoring health status in the wild and outside of clinics. However, longitudinal, multimodal mobile sensor data can be small, noisy, and incomplete. This makes processing, modeling, and prediction of these data challenging. The small size of the data set restricts it from being modeled using complex deep learning networks. The current state of the art (SOTA) tackles small sensor data sets following a singular modeling paradigm based on traditional machine learning (ML) algorithms. These opt for either a user-agnostic modeling approach, making the model susceptible to a larger degree of noise, or a personalized approach, where training on individual data relies on a more limited data set and gives rise to overfitting. Ultimately, a trade-off must be made by choosing 1 of the 2 modeling approaches to reach predictions. OBJECTIVE: The objective of this study was to filter, rank, and output the best predictions for small, multimodal, longitudinal sensor data using a framework that is designed to tackle data sets that are limited in size (particularly targeting health studies that use passive multimodal sensors) and that combines both user-agnostic and personalized approaches, along with a combination of ranking strategies to filter predictions. METHODS: In this paper, we introduced a novel ranking framework for longitudinal multimodal sensors (FLMS) to address challenges encountered in health studies involving passive multimodal sensors. Using the FLMS, we (1) built a tensor-based aggregation and ranking strategy for final interpretation, (2) processed various combinations of sensor fusions, and (3) balanced user-agnostic and personalized modeling approaches with appropriate cross-validation strategies. 
The performance of the FLMS was validated with the help of a real data set of adolescents diagnosed with major depressive disorder for the prediction of change in depression in the adolescent participants. RESULTS: Predictions output by the proposed FLMS achieved a 7% increase in accuracy and a 13% increase in recall for the real data set. Experiments with existing SOTA ML algorithms showed an 11% increase in accuracy for the depression data set and how overfitting and sparsity were handled. CONCLUSIONS: The FLMS aims to fill the gap that currently exists when modeling passive sensor data with a small number of data points. It achieves this through leveraging both user-agnostic and personalized modeling techniques in tandem with an effective ranking strategy to filter predictions.

18.
Neurourol Urodyn ; 2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38837735

ABSTRACT

INTRODUCTION AND OBJECTIVES: Relevant, meaningful, and achievable data points are critical in objectively assessing quality, utility, and outcomes in female stress urinary incontinence (SUI) surgery. A minimum data set for female SUI surgery studies was proposed by the first American Urological Association (AUA) guidelines on the surgical management of female SUI in 1997, but recommendation adherence has been suboptimal. The Female Stress Urinary Incontinence Surgical Publication Working Group (WG) was created from members of several prominent organizations to formulate a recommended standard of study structure, description, and minimum outcome data set to be utilized in designing and publishing future SUI studies. The goal of this WG was to create a body of evidence better able to assess the outcomes of female SUI surgery. METHODS: The WG reviewed the minimum data set proposed in the 1997 AUA SUI Guideline document and other relevant literature. The body of literature was examined in the context of the profound changes in the field over the past 25 years. Through a DELPHI process, a standard study structure and minimum data set were generated. Care was taken to balance the value of several meaningful and relevant data points against the burden of creating an excessively difficult or restrictive standard that would disincentivize widespread adoption and negatively impact manuscript production and acceptance. RESULTS: The WG outlined standardization in four major areas: (1) study design, (2) pretreatment demographics and characterization of the study population, (3) intraoperative events, and (4) posttreatment evaluation and complications. Forty-two items were evaluated and graded as: STANDARD-must be included; ADDITIONAL-may be included for a specific study and is inclusive of the Standard items; OPTIMAL-may be included for a comprehensive study and is inclusive of the Standard and Additional items; UNNECESSARY/LEGACY-not relevant. 
CONCLUSIONS: A reasonable, achievable, and clinically meaningful minimum data set has been constructed. A structured framework will allow future surgical interventions for female SUI to be objectively scrutinized and compared in a clinically significant manner. Ultimately, such a data set, if adopted by the academic community, will enhance the quality of the scientific literature and improve short- and long-term outcomes for female patients undergoing surgery to correct SUI.

19.
JMIR AI ; 3: e44185, 2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38875533

ABSTRACT

BACKGROUND: Machine learning techniques are starting to be used in various health care data sets to identify frail persons who may benefit from interventions. However, evidence about the performance of machine learning techniques compared to conventional regression is mixed. It is also unclear what methodological and database factors are associated with performance. OBJECTIVE: This study aimed to compare the mortality prediction accuracy of various machine learning classifiers for identifying frail older adults in different scenarios. METHODS: We used deidentified data collected from older adults (65 years of age and older) assessed with the interRAI-Home Care instrument in New Zealand between January 1, 2012, and December 31, 2016. A total of 138 interRAI assessment items were used to predict 6-month and 12-month mortality, using 3 machine learning classifiers (random forest [RF], extreme gradient boosting [XGBoost], and multilayer perceptron [MLP]) and regularized logistic regression. We conducted a simulation study comparing the performance of machine learning models with logistic regression and the interRAI Home Care Frailty Scale and examined the effects of sample sizes, the number of features, and train-test split ratios. RESULTS: A total of 95,042 older adults (median age 82.66 years, IQR 77.92-88.76; n=37,462, 39.42% male) receiving home care were analyzed. The average area under the curve (AUC) and sensitivities of 6-month mortality prediction showed that machine learning classifiers did not outperform regularized logistic regression. In terms of AUC, regularized logistic regression had better performance than XGBoost, MLP, and RF when the number of features was ≤80 and the sample size ≤16,000; MLP outperformed regularized logistic regression in terms of sensitivities when the number of features was ≥40 and the sample size ≥4000. Conversely, RF and XGBoost demonstrated higher specificities than regularized logistic regression in all scenarios. 
CONCLUSIONS: The study revealed that machine learning models exhibited significant variation in prediction performance when evaluated using different metrics. Regularized logistic regression was an effective model for identifying frail older adults receiving home care, as indicated by the AUC, particularly when the number of features and sample sizes were not excessively large. Conversely, MLP displayed superior sensitivity, while RF exhibited superior specificity when the number of features and sample sizes were large.
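
The conclusion that model rankings change with the metric is easier to follow with the three metrics written out. The sketch below is illustrative only: the function names, the 0.5 threshold, and the toy labels and scores are assumptions, not values from the study. AUC is computed here as the fraction of positive-negative pairs ranked correctly, which is one standard way to define it.

```python
def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity and specificity after thresholding predicted scores."""
    pairs = list(zip(y_true, y_score))
    tp = sum(1 for t, s in pairs if t == 1 and s >= threshold)
    fn = sum(1 for t, s in pairs if t == 1 and s < threshold)
    tn = sum(1 for t, s in pairs if t == 0 and s < threshold)
    fp = sum(1 for t, s in pairs if t == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, y_score):
    """Share of (positive, negative) pairs where the positive scores higher; ties count half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = died) and predicted risks.
y_true = [1, 1, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.1]
sens, spec = sensitivity_specificity(y_true, y_score)
area = auc(y_true, y_score)
```

AUC does not depend on a threshold, whereas sensitivity and specificity do, so a model can lead on one metric and trail on another.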

20.
Environ Monit Assess ; 196(6): 567, 2024 May 22.
Article in English | MEDLINE | ID: mdl-38775991

ABSTRACT

The study attempted to evaluate the agricultural soil quality using the Soil Quality Index (SQI) model in two Community Development Blocks, Ausgram-II and Memari-II of Purba Bardhaman District. A total of 104 soil samples were collected (0-20 cm depth) from each Block to analyse 13 parameters (bulk density [BD], soil porosity, soil aggregate stability [SAS], water holding capacity [WHC], infiltration rate, available nitrogen, available phosphorus, available potassium, soil pH, soil organic carbon, electrical conductivity, soil respiration [SR] and microbial biomass carbon) in this study. The Integrated Quality Index (IQI) was applied using the weighted additive approach and non-linear scoring technique to retain the Minimum Data Set (MDS). Principal Component Analysis (PCA) identified that SAS, BD, available K, pH, available N, and available P were the key contributing parameters to SQI in Ausgram-II. In contrast, WHC, SR, available N, pH, and SAS contributed the most to SQI in Memari-II. Results revealed that Ausgram-II (0.97) has a notably higher SQI than Memari-II (0.69). In Ausgram-II, 99.72% of agricultural lands showed very high SQI (Grade I), whereas, in Memari-II, 49.95% of lands exhibited a moderate SQI (Grade III) and 49.90% showed a high SQI (Grade II). Sustainable Yield Index (SYI), Sensitivity Index (SI) and Efficiency Ratio (ER) were used to validate the SQIs. A positive correlation was observed between SQI and paddy yield (R² = 0.82 and 0.72) and potato yield (R² = 0.71 and 0.78) in Ausgram-II and Memari-II Block, respectively. This study could evaluate the agricultural soil quality and provide insights for decision-making in fertiliser management practices to promote agricultural sustainability.
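
The weighted additive approach named above can be sketched as a weighted mean of indicator scores. Everything specific in this example is an assumption: the indicator names, the scores (taken as already scaled to 0-1 by some scoring function), and the weights are made up for illustration, and the abstract does not give the study's scoring curve or its PCA-derived weights.

```python
def weighted_additive_index(scores, weights):
    """Weighted mean of indicator scores; weights are normalized to sum to 1."""
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Hypothetical 0-1 indicator scores and weights, not values from the study.
scores = {"pH": 0.8, "available_N": 0.6, "bulk_density": 0.9}
weights = {"pH": 0.5, "available_N": 0.3, "bulk_density": 0.2}
sqi = weighted_additive_index(scores, weights)
```

With these toy inputs the index is about 0.76; indicators with larger weights move the result more.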


Subject(s)
Agriculture , Environmental Monitoring , Oryza , Soil , India , Soil/chemistry , Environmental Monitoring/methods , Oryza/growth & development , Nitrogen/analysis , Soil Pollutants/analysis , Phosphorus/analysis