ABSTRACT
BACKGROUND: Global longitudinal strain (GLS) is recognized as a powerful predictor of heart failure (HF). However, the entire strain curve may entail important prognostic information regarding HF risk that might be undiscovered by only focusing on the peak strain value. OBJECTIVE: The hypothesis of the present study was that analysis of the entire strain curve using unsupervised machine learning (uML) would reveal novel ventricular deformation patterns capable of predicting incident HF independently of GLS. METHODS: Longitudinal strain curves from 3710 subjects from the general population without prevalent HF were analyzed using uML. RESULTS: Mean age was 56 years and 43 % were male. During a median follow-up of 5.3 years, 92 subjects (2.5 %) developed HF. The uML algorithm generated a hierarchical clustering tree (HCT) resulting in 10 different clusters. Generally, the strain curves displayed a reduced early diastolic strain to peak strain ratio with an increasing incidence rate of HF. In multivariable Cox regressions, cluster 9 was significantly associated with increased risk of HF when compared to clusters 2-5 and 7-8 [for cluster 3: HR 8.95, 95 %CI: 2.08;38.48, P = 0.003] even though the subjects of cluster 9 were younger, displayed healthier clinical baseline characteristics, and only had slightly reduced GLS. The mean strain curve of cluster 9 displayed an early systolic lengthening followed by a late and reduced contraction specifically related to the basal lateral segment. CONCLUSION: The unsupervised machine learning algorithm identified unknown strain patterns beyond GLS presumably related to increased risk of HF.
ABSTRACT
OBJECTIVE: Computational drug re-purposing has received a lot of attention in the past decade. However, methods developed to date focused on established compounds for which information on both successfully treated patients and chemical and genomic impact was known. Such information does not always exist for first-in-class drugs under development. METHODS: To identify indications (diseases) for drugs under development we extended and tested several unsupervised computational methods that utilize Electronic Health Record (EHR) data. RESULTS: We tested the methods on known drugs with multiple indications and show that a variant of matrix factorization leads to the best performance for first-in-class drugs, improving upon prior methods that were developed for established drugs. The method also identifies novel predictions for key immunology and oncology drugs. Our results show that the performance of re-purposing methods differs greatly between oncology and inflammation/immunology. We hypothesize that the lower performance in oncology can be explained by the fact that many chemotherapies are not targeted therapies. CONCLUSION: Finding new indications for drugs is extremely valuable. Our results explore how to best use EHR data for finding new indications for first-in-class drugs using a phenotypical-similarity driven approach. Our methods can be integrated with other methods using multiple data modalities such as chemical, molecular, and genetic data.
ABSTRACT
OBJECTIVES: The aim of the study was to analyze the data of diabetic patients regarding warning signs of hypoglycemia to predict it at an early stage using various novel machine learning (ML) algorithms. Individual interviews with diabetic patients were conducted over 6 months to acquire information regarding their experience with hypoglycemic episodes. DESIGN: This information included warning signs of hypoglycemia, such as incoherent speech, exhaustion, weakness, and other clinically relevant cases of low blood sugar. Researchers used supervised, unsupervised, and hybrid techniques. In supervised techniques, researchers applied regression, while in hybrid classification ML techniques were used. In a 5-fold cross-validation approach, the prediction performance of seven models was examined using the area under the receiver operating characteristic curve (AUROC). We analyzed the data of 290 diabetic patients with low blood sugar episodes. RESULTS: Our investigation discovered that gradient boosting and neural networks performed better in regression, with accuracies of 0.416 and 0.417, respectively. In classification models, gradient boosting, AdaBoost, and random forest performed better overall, with AUC scores of 0.821, 0.814, and 0.821, respectively. Precision values were 0.779, 0.775, and 0.776 for gradient boosting, AdaBoost, and random forest, respectively. CONCLUSION: AdaBoost and Gradient Boosting models, in particular, outperformed all others in predicting the probability of clinically severe hypoglycemia. These techniques enable community health nurses to predict hypoglycemia at an early stage and provide the necessary therapies to patients to prevent complications resulting from hypoglycemia.
ABSTRACT
Multistable perceptual phenomena provide insights into the mind's dynamic states within a stable external environment, and the neural underpinnings of these consciousness changes are often studied with binocular rivalry. Conventional methods to study binocular rivalry suffer from biases and assumptions that limit their ability to describe the continuous nature of these perceptual transitions and to discover what kind of percept was perceived across time. In this study, we propose a novel way to avoid those shortcomings by combining a continuous psychophysical method that estimates introspection during binocular rivalry with machine learning clustering and transition probability analysis. This combination of techniques reveals individual variability and complexity of perceptual experience in 28 normally sighted participants. Also, the analysis of transition probabilities between perceptual categories, i.e., exclusive and different kinds of mixed percepts, suggests that interocular perceptual competition, triggered by low-level stimuli, involves conflict between monocular and binocular neural processing sites rather than mutual inhibition of monocular sites.
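Illustrative sketch (not taken from the study): one simple way to estimate transition probabilities between perceptual categories is to count how often each category is followed by a different one and normalize the counts per source category. The category names and the example sequence below are hypothetical, and the choice to ignore self-transitions is an assumption made for this sketch.

```python
from collections import Counter, defaultdict


def transition_probabilities(states):
    """Estimate first-order transition probabilities between distinct states.

    Consecutive repeats of the same state are ignored, so only actual
    changes of percept contribute (an assumption of this sketch).
    """
    counts = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        if current != following:
            counts[current][following] += 1
    # Normalize each row of counts so that it sums to 1.
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical sequence of percept categories for one participant.
sequence = ["left", "mixed", "right", "mixed", "left",
            "mixed", "right", "right", "left"]
probs = transition_probabilities(sequence)
```

In this toy sequence, every departure from "left" passes through "mixed", which is the kind of asymmetry a transition analysis is meant to expose.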
ABSTRACT
BACKGROUND: Septic patients who develop acute respiratory failure (ARF) requiring mechanical ventilation represent a heterogeneous subgroup of critically ill patients with widely variable clinical characteristics. Identifying distinct phenotypes of these patients may reveal insights about the broader heterogeneity in the clinical course of sepsis, considering multi-organ dynamics. We aimed to derive novel phenotypes of sepsis-induced ARF using observational clinical data and investigate the generalizability of the derived phenotypes. METHODS: We performed a multi-center retrospective study of ICU patients with sepsis who required mechanical ventilation for ≥ 24 h. Data from two different high-volume academic hospital centers were used, where all phenotypes were derived in the MICU of Hospital-I (N = 3225). The derived phenotypes were validated in the MICU of Hospital-II (N = 848), the SICU of Hospital-I (N = 1112), and the SICU of Hospital-II (N = 465). Clinical data from the 24 h preceding intubation were used to derive distinct phenotypes using an explainable machine learning-based clustering model interpreted by clinical experts. RESULTS: Four distinct ARF phenotypes were identified: A (severe multi-organ dysfunction (MOD) with a high likelihood of kidney injury and heart failure), B (severe hypoxemic respiratory failure [median P/F = 123]), C (mild hypoxia [median P/F = 240]), and D (severe MOD with a high likelihood of hepatic injury, coagulopathy, and lactic acidosis). Patients in each phenotype showed differences in clinical course and mortality rates despite similarities in demographics and admission co-morbidities. The phenotypes were reproduced in external validation utilizing the MICU of Hospital-II and SICUs from Hospital-I and -II. Kaplan-Meier analysis showed a significant difference in 28-day mortality across the phenotypes (p < 0.01), which was consistent across the MICU and SICU of both Hospital-I and -II.
The phenotypes demonstrated differences in treatment effects associated with high positive end-expiratory pressure (PEEP) strategy. CONCLUSION: The phenotypes demonstrated unique patterns of organ injury and differences in clinical outcomes, which may help inform future research and clinical trial design for tailored management strategies.
Subjects
Critical Illness; Phenotype; Respiratory Insufficiency; Sepsis; Humans; Retrospective Studies; Male; Female; Middle Aged; Aged; Sepsis/complications; Sepsis/physiopathology; Critical Illness/therapy; Respiratory Insufficiency/therapy; Respiratory Insufficiency/etiology; Intensive Care Units/organization & administration; Intensive Care Units/statistics & numerical data; Respiration, Artificial/methods; Respiration, Artificial/statistics & numerical data
ABSTRACT
BACKGROUND: Specific food preferences can determine an individual's dietary patterns and therefore may be associated with certain health risks and benefits. METHODS: Using food preference questionnaire (FPQ) data from a subset comprising over 180,000 UK Biobank participants, we employed a Latent Profile Analysis (LPA) approach to identify the main patterns or profiles among participants. Blood biochemistry across groups/profiles was compared using the non-parametric Kruskal-Wallis test. We applied the Limma algorithm for differential abundance analysis on 168 metabolites and 2923 proteins, and utilized the Database for Annotation, Visualization and Integrated Discovery (DAVID) to identify enriched biological processes and pathways. Relative risks (RR) were calculated for chronic diseases and mental conditions per group, adjusting for sociodemographic factors. RESULTS: Based on their food preferences, three profiles were termed: the putative Health-conscious group (low preference for animal-based or sweet foods, and high preference for vegetables and fruits), the Omnivore group (high preference for all foods), and the putative Sweet-tooth group (high preference for sweet foods and sweetened beverages). The Health-conscious group exhibited lower risk of heart failure (RR = 0.86, 95%CI 0.79-0.93) and chronic kidney disease (RR = 0.69, 95%CI 0.65-0.74) compared to the two other groups. The Sweet-tooth group had greater risk of depression (RR = 1.27, 95%CI 1.21-1.34), diabetes (RR = 1.15, 95%CI 1.01-1.31), and stroke (RR = 1.22, 95%CI 1.15-1.31) compared to the other two groups. Cancer (overall) relative risk showed little difference across the Health-conscious, Omnivore, and Sweet-tooth groups with RR of 0.98 (95%CI 0.96-1.01), 1.00 (95%CI 0.98-1.03), and 1.01 (95%CI 0.98-1.04), respectively.
The Health-conscious group was associated with lower levels of inflammatory biomarkers (e.g., C-reactive Protein), which are also known to be elevated in those with common metabolic diseases (e.g., cardiovascular disease). Among other markers modulated in the Health-conscious group, ketone bodies, insulin-like growth factor-binding protein (IGFBP), and Growth Hormone 1 were more abundant, while leptin was less abundant. Further, the IGFBP pathway, which influences IGF1 activity, may be significantly enhanced by dietary choices. CONCLUSIONS: These observations align with previous findings from studies focusing on weight loss interventions, which include a reduction in leptin levels. Overall, the Health-conscious group, with its preference for healthier food options, has better health outcomes compared to the Sweet-tooth and Omnivore groups.
Subjects
Artificial Intelligence; Biological Specimen Banks; Food Preferences; Metabolomics; Proteomics; Humans; United Kingdom; Male; Female; Middle Aged; Proteomics/methods; Metabolome; Adult; Aged; Surveys and Questionnaires; Health; UK Biobank
ABSTRACT
Obstructive sleep apnea is a heterogeneous sleep disorder with varying phenotypes. Several studies have already performed cluster analyses to discover various obstructive sleep apnea phenotypic clusters. However, the selection of the clustering method might affect the outputs. Consequently, it is unclear whether similar obstructive sleep apnea clusters can be reproduced using different clustering methods. In this study, we applied four well-known clustering methods: Agglomerative Hierarchical Clustering; K-means; Fuzzy C-means; and Gaussian Mixture Model to a population of 865 suspected obstructive sleep apnea patients. By creating five clusters with each method, we examined the effect of clustering methods on forming obstructive sleep apnea clusters and the differences in their physiological characteristics. We utilized a visualization technique to indicate the cluster formations, Cohen's kappa statistics to find the similarity and agreement between clustering methods, and performance evaluation to compare the clustering performance. As a result, two out of five clusters were distinctly different with all four methods, while three other clusters exhibited overlapping features across all methods. In terms of agreement, Fuzzy C-means and K-means had the strongest (κ = 0.87), and Agglomerative hierarchical clustering and Gaussian Mixture Model had the weakest agreement (κ = 0.51) between each other. The K-means showed the best clustering performance, followed by the Fuzzy C-means in most evaluation criteria. Moreover, Fuzzy C-means showed the greatest potential in handling overlapping clusters compared with other methods. In conclusion, we revealed a direct impact of clustering method selection on the formation and physiological characteristics of obstructive sleep apnea clusters. In addition, we highlighted the capability of soft clustering methods, particularly Fuzzy C-means, in the application of obstructive sleep apnea phenotyping.
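Illustrative sketch (not the authors' code): agreement between two clustering methods can be summarized with Cohen's kappa, which compares the observed agreement with the agreement expected by chance from each method's label frequencies. The label vectors below are hypothetical, and the sketch assumes the cluster labels of the two methods have already been matched to each other (cluster 0 of one method corresponds to cluster 0 of the other), which real pipelines have to do as a separate step.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of marginal label frequencies, summed.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    if expected == 1.0:  # both methods use a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical, already-aligned cluster assignments for eight patients.
kmeans_labels = [0, 0, 1, 1, 2, 2, 0, 1]
fuzzy_labels = [0, 0, 1, 1, 2, 1, 0, 1]
kappa = cohens_kappa(kmeans_labels, fuzzy_labels)
```

A kappa of 1 means the two methods assign every patient to the same cluster, while values near 0 mean agreement is no better than chance.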
ABSTRACT
The clinical manifestation of Parkinson's disease exhibits significant heterogeneity in the prevalence of non-motor symptoms and the rate of progression of motor symptoms, suggesting that Parkinson's disease can be classified into distinct subtypes. In this study, we aimed to explore this heterogeneity by identifying a set of subtypes with distinct patterns of spatiotemporal trajectories of neurodegeneration. We applied Subtype and Stage Inference (SuStaIn), an unsupervised machine learning algorithm that combined disease progression modelling with clustering methods, to cortical and subcortical neurodegeneration visible on 3 T structural MRI of a large cross-sectional sample of 504 patients and 279 healthy controls. Serial longitudinal data were available for a subset of 178 patients at the 2-year follow-up and for 140 patients at the 4-year follow-up. In a subset of 210 patients, concomitant Alzheimer's disease pathology was assessed by evaluating amyloid-β concentrations in the CSF or via the amyloid-specific radiotracer 18F-flutemetamol with PET. The SuStaIn analysis revealed three distinct subtypes, each characterized by unique patterns of spatiotemporal evolution of brain atrophy: neocortical, limbic and brainstem. In the neocortical subtype, a reduction in brain volume occurred in the frontal and parietal cortices in the earliest disease stage and progressed across the entire neocortex during the early stage, although with relative sparing of the striatum, pallidum, accumbens area and brainstem. The limbic subtype represented comparative regional vulnerability, which was characterized by early volume loss in the amygdala, accumbens area, striatum and temporal cortex, subsequently spreading to the parietal and frontal cortices across disease stage. The brainstem subtype showed gradual rostral progression from the brainstem extending to the amygdala and hippocampus, followed by the temporal and other cortices.
Longitudinal MRI data confirmed that 77.8% of participants at the 2-year follow-up and 84.0% at the 4-year follow-up were assigned to subtypes consistent with estimates from the cross-sectional data. This three-subtype model aligned with empirically proposed subtypes based on age at onset, because the neocortical subtype demonstrated characteristics similar to those found in the old-onset phenotype, including older onset and cognitive decline symptoms (P < 0.05). Moreover, the subtypes correspond to the three categories of the neuropathological consensus criteria for symptomatic patients with Lewy pathology, proposing neocortex-, limbic- and brainstem-predominant patterns as different subgroups of α-synuclein distributions. Among the subtypes, the prevalence of biomarker evidence of amyloid-β pathology was comparable. Upon validation, the subtype model might be applied to individual cases, potentially serving as a biomarker to track disease progression and predict temporal evolution.
ABSTRACT
In nature, animal vocalizations can provide crucial information about identity, including kinship and hierarchy. However, lab-based vocal behavior is typically studied during brief interactions between animals with no prior social relationship, and under environmental conditions with limited ethological relevance. Here, we address this gap by establishing long-term acoustic recordings from Mongolian gerbil families, a core social group that uses an array of sonic and ultrasonic vocalizations. Three separate gerbil families were transferred to an enlarged environment and continuous 20-day audio recordings were obtained. Using a variational autoencoder (VAE) to quantify 583,237 vocalizations, we show that gerbils exhibit a more elaborate vocal repertoire than has been previously reported and that vocal repertoire usage differs significantly by family. By performing Gaussian mixture model clustering on the VAE latent space, we show that families preferentially use characteristic sets of vocal clusters and that these usage preferences remain stable over weeks. Furthermore, gerbils displayed family-specific transitions between vocal clusters. Since gerbils live naturally as extended families in complex underground burrows that are adjacent to other families, these results suggest the presence of a vocal dialect which could be exploited by animals to represent kinship. These findings position the Mongolian gerbil as a compelling animal model to study the neural basis of vocal communication and demonstrate the potential for using unsupervised machine learning with uninterrupted acoustic recordings to gain insights into naturalistic animal behavior.
ABSTRACT
The growing problem of unsolicited text messages (smishing) and data irregularities necessitates stronger spam detection solutions. This paper explores the development of a sophisticated model designed to identify smishing messages by understanding the complex relationships among words, images, and context-specific factors, areas that remain underexplored in existing research. To address this, we merge a UCI spam dataset of regular text messages with real-world spam data, leveraging OCR technology for comprehensive analysis. The study employs a combination of traditional machine learning models, including K-means, Non-Negative Matrix Factorization, and Gaussian Mixture Models, along with feature extraction techniques such as TF-IDF and PCA. Additionally, deep learning models like RNN-Flatten, LSTM, and Bi-LSTM are utilized. The selection of these models is driven by their complementary strengths in capturing both the linear and non-linear relationships inherent in smishing messages. Machine learning models are chosen for their efficiency in handling structured text data, while deep learning models are selected for their superior ability to capture sequential dependencies and contextual nuances. The performance of these models is rigorously evaluated using metrics like accuracy, precision, recall, and F1 score, enabling a comparative analysis between the machine learning and deep learning approaches. Notably, the K-means feature extraction with vectorizer achieved 91.01% accuracy, and the KNN-Flatten model reached 94.13% accuracy, emerging as the top performer. The rationale behind highlighting these models is their potential to significantly improve smishing detection rates. For instance, the high accuracy of the KNN-Flatten model suggests its applicability in real-time spam detection systems, but its computational complexity might limit scalability in large-scale deployments. 
Similarly, while K-means with vectorizer excels in accuracy, it may struggle with the dynamic and evolving nature of smishing attacks, necessitating continual retraining.
Subjects
Brain Neoplasms; Humans; Brain Neoplasms/cerebrospinal fluid; Brain Neoplasms/genetics; Brain Neoplasms/diagnosis; Brain Neoplasms/pathology; Nanopore Sequencing/methods; Male; Lymphoma/cerebrospinal fluid; Lymphoma/diagnosis; Lymphoma/pathology; Lymphoma/genetics; Female; Middle Aged; Aged; Cytology
ABSTRACT
Introduction: In adult patients with ureteropelvic junction obstruction (UPJO), little data exist on predicting pyeloplasty outcome, and there is no unified definition of pyeloplasty success. As such, defining pyeloplasty success retrospectively is particularly vulnerable to bias, allowing researchers to choose significant outcomes with the benefit of hindsight. To mitigate these biases, we performed an unsupervised machine learning cluster analysis on a dataset of 216 pyeloplasty patients between 2015 and 2023 from a multihospital system to identify the defining risk factors of patients that experience worse outcomes. Methods: A KPrototypes model was fitted with pre- and perioperative data and blinded to postoperative outcomes. T-test and chi-square tests were performed to look at significant differences of characteristics between clusters. SHapley Additive exPlanation values were calculated from a random forest classifier to determine the most predictive features of cluster membership. A logistic regression model identified which of the most predictive variables remained significant after adjusting for confounding effects. Results: Two distinct clusters were identified. One cluster (denoted as "high-risk") contained 111 (51.4%) patients and was identified by having more comorbidities, such as old age (62.7 vs 35.7), high body mass index (BMI) (26.9 vs 23.8), hypertension (66.7% vs 17.1%), and previous abdominal surgery (72.1% vs 37.1%) and was found to have worse outcomes, such as more frequent severe postoperative complications (7.2% vs 1.0%). After adjusting for confounding effects, the most predictive features of high-risk cluster membership were old age, low preoperative estimated glomerular filtration rate (eGFR), hypertension, greater BMI, previous abdominal surgery, and left-sided UPJO. 
Conclusions: Adult UPJO patients with older age, lower eGFR, hypertension, greater BMI, previous abdominal surgery, and left-sided UPJO naturally cluster into a group that more commonly suffers from perioperative complications and worse outcomes. Preoperative counseling and perioperative management for patients with these risk factors may need to be approached differently.
ABSTRACT
The aim of this study was to develop a machine learning-assisted rapid determination methodology for traditional Chinese Medicine Constitution. Based on the Constitution in Chinese Medicine Questionnaire (CCMQ), the most applied diagnostic instrument for assessing individuals' constitutions, we employed automated supervised machine learning algorithms (i.e., Tree-based Pipeline Optimization Tool; TPOT) on all the possible item combinations for each subscale and an unsupervised machine learning algorithm (i.e., variable clustering; varclus) on the whole scale to select items that can best predict body constitution (BC) classifications or BC scores. By utilizing subsets of items selected based on TPOT and corresponding machine learning algorithms, the accuracies of BC classifications prediction ranged from 0.819 to 0.936, with the root mean square errors of BC scores prediction stabilizing between 6.241 and 9.877. Overall, the results suggested that the automated machine learning algorithms performed better than the varclus algorithm for item selection. Additionally, based on an automated machine learning item selection procedure, we provided the top three ranked item combinations with each possible subscale length, along with their corresponding algorithms for predicting BC classification and severity. This approach could accommodate the needs of different practitioners in traditional Chinese medicine for rapid constitution determination.
ABSTRACT
Biopharmaceutical resins are pivotal inert matrices used across industry and academia, playing crucial roles in a myriad of applications. For biopharmaceutical process research and development applications, a deep understanding of the physical and chemical properties of the resin itself is frequently required, including for drug purification, drug delivery, and immobilized biocatalysis. Nevertheless, the prevailing methodologies currently employed for elucidating these important aspects of biopharmaceutical resins are often lacking, frequently require significant sample alteration, are destructive or ionizing in nature, and may not adequately provide representative information. In this work, we propose the use of unsupervised machine learning technologies, in the form of both non-negative matrix factorization (NMF) and k-means segmentation, in conjunction with Raman hyperspectral imaging to rapidly elucidate the molecular and spatial properties of biopharmaceutical resins. Leveraging our proposed technology, we offer a new approach to comprehensively understanding important resin-based systems for application across biopharmaceuticals and beyond. Specifically, focusing herein on a representative resin widely utilized across the industry (i.e., Immobead 150P), our findings showcase the ability of our machine learning-based technology to molecularly identify and spatially resolve all chemical species present. Further, we offer a comprehensive evaluation of optimal excitation for hyperspectral imaging data collection, demonstrating results across 532, 638, and 785 nm excitation. In all cases, our proposed technology deconvoluted, both spatially and spectrally, resin and glass substrates via NMF. After NMF deconvolution, image segmentation was also successfully accomplished in all data sets via k-means clustering.
To the best of our knowledge, this is the first report utilizing the combination of two unsupervised machine learning methodologies, combining NMF and k-means, for the rapid deconvolution and segmentation of biopharmaceutical resins. As such, we offer a powerful new data-rich experimentation tool for application across multidisciplinary fields for a deeper understanding of resins.
ABSTRACT
OBJECTIVES: Adherence to the American Diabetes Association (ADA) Standards of Medical Care is low. This study aimed to assist pharmacists in identifying patients for diabetes control interventions using unsupervised machine learning. METHODS: This study analyzed the 2021 Medical Expenditure Panel Survey and used a k-mode cluster analysis. Patient features analyzed were adherence to a select set of preventive measures from the ADA Standards of Medical Care (HbA1c test, foot examination, blood cholesterol test, dilated eye examination, and influenza vaccination) and some patient characteristics (age, gender, health insurance, insulin use, and diabetes-related complications). RESULTS: The study included 1,219 patients with self-reported diabetes, and the adherence rate to the ADA standards was 33.72%. Five distinct clusters emerged: (A) moderate-complexity, privately insured male; (B) moderate-complexity, publicly insured female; (C) low-complexity, privately insured female; (D) high-complexity, publicly insured female; (E) moderate-complexity, publicly insured male. Groups B, C, and E exhibited nonadherence. CONCLUSIONS: Pharmacists can target publicly insured elderly (Groups B and E) and privately insured middle-aged females (Group C) for interventions. For instance, pharmacists may help patients in Groups B and E locate existing resources in their insurance program and remind those in Group C of the importance of adequate diabetes care.
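Illustrative sketch (not the study's implementation): k-modes style clustering of categorical patient features typically replaces Euclidean distance with a simple matching dissimilarity (the number of attributes on which two records differ) and replaces cluster means with per-attribute modes. The records, feature order, and initial modes below are hypothetical and show a single assignment and update step only.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Number of categorical attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(rows):
    """Per-attribute most frequent value of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def assign(rows, modes):
    """Index of the closest mode for every record (ties go to the lower index)."""
    return [
        min(range(len(modes)), key=lambda i: matching_dissimilarity(r, modes[i]))
        for r in rows
    ]


# Hypothetical records: (insurance, sex, insulin use).
rows = [
    ("private", "male", "no"),
    ("private", "male", "yes"),
    ("public", "female", "yes"),
    ("public", "female", "no"),
    ("private", "male", "no"),
]
modes = [rows[0], rows[2]]            # hypothetical initial modes
labels = assign(rows, modes)          # assignment step
new_mode_0 = cluster_mode(            # update step for cluster 0
    [r for r, lab in zip(rows, labels) if lab == 0]
)
```

A full k-modes run would repeat the assignment and update steps until the labels stop changing; the resulting modes then read directly as cluster descriptions such as "privately insured male".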
ABSTRACT
BACKGROUND: Structural anomalies in the frontal lobe and basal ganglia have been reported in patients with attention-deficit/hyperactivity disorder (ADHD). However, these findings have not always been consistent because of ADHD diversity. This study aimed to identify ADHD subtypes based on cognitive function and find their distinct brain structural characteristics. METHODS: Using the data of 656 children with ADHD from the Adolescent Brain Cognitive Development (ABCD) Study, we applied unsupervised machine learning to identify ADHD subtypes using the National Institutes of Health Toolbox Tasks. Moreover, we compared the regional brain volumes between each ADHD subtype and 6601 children without ADHD (non-ADHD). RESULTS: Hierarchical cluster analysis automatically classified ADHD into three distinct subtypes: ADHD-A (n = 212, characterized by high-order cognitive ability), ADHD-B (n = 190, characterized by low cognitive control, processing speed, and episodic memory), and ADHD-C (n = 254, characterized by strikingly low cognitive control, working memory, episodic memory, and language ability). Structural analyses revealed that the ADHD-C type had significantly smaller volumes of the left inferior temporal gyrus and right lateral orbitofrontal cortex than the non-ADHD group, and the right lateral orbitofrontal cortex volume was positively correlated with language performance in the ADHD-C type. However, the volumes of the ADHD-A and ADHD-B types were not significantly different from those of the non-ADHD group. CONCLUSIONS: These results indicate the presence of anomalies in the lateral orbitofrontal cortex associated with language deficits in the ADHD-C type. Subtype specificity may explain previous inconsistencies in brain structural anomalies reported in ADHD.
ABSTRACT
BACKGROUND: Public health crises, such as the COVID-19 pandemic, have prompted a need for health agencies to improve their disease preparedness strategies, informing their communities of new information and promoting preventive behaviors to help curb the spread of the virus. METHODS: We ran unsupervised machine learning and emotion analysis, validated with manual coding, on posts of health agencies (N = 1588) and their associated public comments (N = 7813) during a crucial initial period of the COVID-19 pandemic (January 2020 to February 2021) among nine different counties with a higher proportion of vaccine-hesitant communities in Northern California. In addition, we explored differences in concerns and expressed emotions by two key group-level factors, county-level COVID-19 death rate and political party affiliation. RESULTS: We consistently find that while health agencies primarily disseminated information about COVID-19 and the vaccine, they failed to address the concerns of their communities as expressed in public comment sections. Topics among public audiences focused on concerns with the COVID-19 vaccine safety and rollout, state mandates, flu vaccination, and frustration with politicians, and they expressed more positive and more negative emotions than health agencies. Further, there were several differences in primary topics and emotions expressed among public audiences by county-level COVID-19 death rate and political party affiliation. CONCLUSION: While this research serves as a case study, findings indicate how local health agencies, and their audiences, discuss their perceptions and concerns regarding the COVID-19 pandemic and may inform health communication researchers and practitioners on how to prepare and manage for emerging health crises.
Subjects
COVID-19 Vaccines; COVID-19; Social Media; Humans; COVID-19/prevention & control; COVID-19/epidemiology; California/epidemiology; Social Media/statistics & numerical data; COVID-19 Vaccines/administration & dosage; Vaccination Hesitancy/psychology; Vaccination Hesitancy/statistics & numerical data; SARS-CoV-2; Vaccination/statistics & numerical data; Vaccination/psychology; Machine Learning
ABSTRACT
Background: Peripheral artery disease (PAD) is a common circulatory condition that carries a risk of critical limb ischemia and amputation. Critical lower extremity ischemia may require amputation, and outcomes vary. In this study, we developed an artificial intelligence (AI)-driven predictive model for PAD subtypes to stratify patient risk and predict disease progression more precisely. Methods: This retrospective study examined clinical data from PAD patients undergoing lower extremity amputation. The data were analyzed using an unsupervised machine learning algorithm (UMLA) for subgroup identification and risk stratification. The clustering results were validated against the follow-up data of each cluster. Finally, we built the prediction model with binary logistic regression. Results: In total, 507 cases were enrolled. UMLA identified two distinct subgroups, Clusters 1 and 2; patients in Cluster 1 showed markedly poorer conditions and prognostic outcomes than those in Cluster 2. For the new PAD subtypes, we established a nomogram with eight predictive factors: gender, age, smoking history, diabetes history, coronary heart disease history, albumin level, endovascular intervention, and amputation level. The nomogram accurately assigned patients to the two identified clusters, with an area under the receiver operating characteristic curve of 0.861 (95 % confidence interval: 0.830-0.893). Conclusion: In this study, UMLA was used to identify new phenotypic subgroups among PAD cases who showed different risks of amputation. Our AI-driven predictive model for PAD subtypes can be used for risk stratification and clinical management with high accuracy and reliability.
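Note: the area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen Cluster 1 patient receives a higher model score than a randomly chosen Cluster 2 patient. The following is a minimal sketch of that calculation, not the authors' code; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over every (positive, negative) pair, how often the positive
    case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = assigned to Cluster 1, 0 = assigned to Cluster 2,
# with made-up predicted probabilities from a logistic regression model.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

In the toy data, eight of the nine positive-negative pairs are ordered correctly, so the sketch yields 8/9. The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients.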
ABSTRACT
BACKGROUND: Unsupervised machine learning describes a collection of powerful techniques that seek to identify hidden patterns in unlabeled data. These techniques can be broadly categorized into dimension reduction, which transforms and combines the original set of measurements to simplify data, and cluster analysis, which seeks to group subjects based on some measure of similarity. Unsupervised machine learning can be used to explore alternative subtyping of disorders of gut-brain interaction (DGBI) compared to the existing gastrointestinal symptom-based definitions of Rome IV. PURPOSE: The present review aims to familiarize the reader with fundamental concepts of unsupervised machine learning using accessible definitions and to provide a critical summary of their application to the evaluation of DGBI subtyping. By considering the overlap between Rome IV clinical definitions and identified clusters, along with clinical and physiological insights, this paper speculates on the possible implications for DGBI. Also considered are algorithmic developments in the unsupervised machine learning community that may help leverage increasingly available omics data to explore biologically informed definitions. Unsupervised machine learning challenges the modern subtyping of DGBI and, with the necessary clinical validation, has the potential to enhance future iterations of the Rome criteria to identify more homogeneous, diagnosable, and treatable patient populations.
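Note: the two families of techniques named above can be illustrated on a toy table of symptom scores. The sketch below is only an illustration under stated assumptions, not a method from the review: it uses principal component analysis (via power iteration) as one example of dimension reduction and two-cluster k-means as one example of cluster analysis. The function names and the score table are hypothetical.

```python
import math


def first_principal_component(rows, iters=200):
    """Dimension reduction: score each row on the first principal component.

    The component is found by power iteration on the sample covariance
    matrix, starting from the all-ones vector (assumes the data vary and
    that this start is not orthogonal to the leading component).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(c[k] * v[k] for k in range(d)) for c in centered]


def two_means_1d(values, iters=50):
    """Cluster analysis: Lloyd's k-means with k = 2 on one-dimensional
    values, initialised at the minimum and maximum value."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [
            0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            for x in values
        ]
        for c in (0, 1):
            members = [x for x, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


# Hypothetical severity scores for three symptoms in six subjects.
toy_rows = [
    (1.0, 2.0, 1.5),
    (1.2, 1.8, 1.4),
    (0.9, 2.1, 1.6),
    (4.0, 5.0, 4.5),
    (4.2, 4.8, 4.4),
    (3.9, 5.1, 4.6),
]
toy_scores = first_principal_component(toy_rows)  # three columns -> one score
toy_labels = two_means_1d(toy_scores)             # one score -> two groups
print(toy_labels)
```

Here three correlated measurements are first combined into a single score per subject, and the subjects are then grouped by similarity on that score. Real applications would choose the number of components and clusters from the data rather than fixing them in advance.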
ABSTRACT
BACKGROUND: The objective of this study was to define clinically meaningful phenotypes of intracerebral hemorrhage (ICH) using machine learning. METHODS: We used patient data from two US medical centers and the Antihypertensive Treatment of Acute Cerebral Hemorrhage-II clinical trial. We used k-prototypes to partition patient admission data. We then used silhouette method calculations and elbow method heuristics to optimize the clusters. Associations between phenotypes, complications (e.g., seizures), and functional outcomes were assessed using the Kruskal-Wallis H-test or χ² test. RESULTS: There were 916 patients; the mean age was 63.8 ± 14.1 years, and 426 patients were female (46.5%). Three distinct clinical phenotypes emerged: patients with small hematomas, elevated blood pressure, and Glasgow Coma Scale scores > 12 (n = 141, 26.6%); patients with hematoma expansion and elevated international normalized ratio (n = 204, 38.4%); and patients with median hematoma volumes of 24 (interquartile range 8.2-59.5) mL, who were more frequently Black or African American, and who were likely to have intraventricular hemorrhage (n = 186, 35.0%). There were associations between clinical phenotype and seizure (P = 0.024), length of stay (P = 0.001), discharge disposition (P < 0.001), and death or disability (modified Rankin Scale scores 4-6) at 3-month follow-up (P < 0.001). We reproduced these three clinical phenotypes of ICH in an independent cohort (n = 385) for external validation. CONCLUSIONS: Machine learning identified three phenotypes of ICH that are clinically significant, associated with patient complications, and associated with functional outcomes. Cerebellar hematomas are an additional phenotype underrepresented in our data sources.
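Note: k-prototypes handles mixed admission data by combining a numeric distance with a count of categorical mismatches, and the silhouette method then scores how well separated a candidate partition is. The sketch below illustrates both ideas under stated assumptions and is not the study's pipeline: the weight `gamma`, the function names, and the toy records are hypothetical, and the silhouette is computed on the k-prototypes-style dissimilarity rather than on a true metric.

```python
from statistics import mean


def mixed_distance(a, b, n_numeric, gamma=1.0):
    """k-prototypes-style dissimilarity between two records.

    Squared Euclidean distance on the first n_numeric fields plus gamma
    times the number of mismatched categorical fields.
    """
    numeric = sum((x - y) ** 2 for x, y in zip(a[:n_numeric], b[:n_numeric]))
    categorical = sum(x != y for x, y in zip(a[n_numeric:], b[n_numeric:]))
    return numeric + gamma * categorical


def silhouette(records, labels, n_numeric, gamma=1.0):
    """Mean silhouette coefficient of a partition (needs >= 2 clusters)."""
    clusters = set(labels)
    scores = []
    for i, rec in enumerate(records):
        dists = {c: [] for c in clusters}
        for j, other in enumerate(records):
            if i != j:
                dists[labels[j]].append(
                    mixed_distance(rec, other, n_numeric, gamma)
                )
        own = dists[labels[i]]
        if not own:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = mean(own)  # mean dissimilarity to the record's own cluster
        b = min(mean(d) for c, d in dists.items() if c != labels[i] and d)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


# Hypothetical records: two standardized numeric fields followed by one
# categorical field (for example, an anticoagulant-use flag).
toy_records = [
    (0.1, 0.9, "no"),
    (0.2, 1.0, "no"),
    (2.0, 0.1, "yes"),
    (2.1, 0.0, "yes"),
]
good_partition = [0, 0, 1, 1]
poor_partition = [0, 1, 0, 1]
good_score = silhouette(toy_records, good_partition, n_numeric=2)
poor_score = silhouette(toy_records, poor_partition, n_numeric=2)
print(good_score > poor_score)
```

Comparing silhouette values across partitions produced with different numbers of clusters is one way to choose the cluster count; the elbow heuristic mentioned above instead looks at how the within-cluster cost falls as clusters are added.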