1.
Clin Ter ; 175(3): 98-116, 2024.
Article in English | MEDLINE | ID: mdl-38767067

ABSTRACT

Background: The human microbiome, consisting of diverse bacterial, fungal, protozoan and viral species, exerts a profound influence on various physiological processes and disease susceptibility. However, the complexity of microbiome data has presented significant challenges in the analysis and interpretation of these intricate datasets, leading to the development of specialized software that employs machine learning algorithms for these aims. Methods: In this paper, we analyze raw data taken from 16S rRNA gene sequencing from three studies, including stool samples from healthy controls, patients with adenoma, and patients with colorectal cancer. First, we use network-based methods to reduce the dimensions of the dataset and consider only the most important features. In addition, we employ supervised machine learning algorithms to make predictions. Results: Graph-based techniques reduce the dimension from 255 to 78 features, with a modularity score of 0.73, based on different centrality measures. On the other hand, projection methods (non-negative matrix factorization and principal component analysis) reduce the dimension to 7 features. Furthermore, we apply supervised machine learning algorithms to the most important features obtained from centrality measures and to those obtained from projection methods, finding that the evaluation metrics have approximately the same scores when the algorithms are applied to the entire dataset, to 78 features, and to 7 features. Conclusions: This study demonstrates the efficacy of graph-based and projection methods in the interpretation of 16S rRNA gene sequencing data. Supervised machine learning on refined features from both approaches yields comparable predictive performance, emphasizing specific microbial features (Bacteroides, Prevotella, Fusobacterium, Lysinibacillus, Blautia, Sphingomonas, and Faecalibacterium) as key in predicting patient conditions from raw data.


Subject(s)
Microbiota , RNA, Ribosomal, 16S , Supervised Machine Learning , Unsupervised Machine Learning , Humans , Microbiota/genetics , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 16S/analysis , Colorectal Neoplasms/microbiology , Gastrointestinal Microbiome/genetics , Algorithms , Feces/microbiology , Adenoma/microbiology
2.
Sci Rep ; 14(1): 11455, 2024 05 20.
Article in English | MEDLINE | ID: mdl-38769329

ABSTRACT

Cone-beam computed tomography (CBCT) is a crucial component of adaptive radiation therapy; however, it frequently suffers from artifacts and noise, which significantly constrain its clinical utility. While CycleGAN is a widely employed method for CT image synthesis, it has a notable limitation: it captures global features inadequately. To tackle these challenges, we introduce a refined unsupervised learning model called improved vision transformer CycleGAN (IViT-CycleGAN). First, we integrate a U-net framework that builds upon ViT. Next, we augment the feed-forward neural network by incorporating deep convolutional networks. Lastly, we enhance the stability of the model training process by introducing a gradient penalty and integrating an additional loss term into the generator loss. Experiments demonstrate from multiple perspectives that the synthetic CT (sCT) generated by our model has significant advantages compared to that of other unsupervised learning models, thereby validating the clinical applicability and robustness of our model. In future clinical practice, our model has the potential to assist clinical practitioners in formulating precise radiotherapy plans.


Subject(s)
Cone-Beam Computed Tomography , Neural Networks, Computer , Humans , Cone-Beam Computed Tomography/methods , Image Processing, Computer-Assisted/methods , Tomography, X-Ray Computed/methods , Algorithms , Unsupervised Machine Learning
3.
Front Public Health ; 12: 1337432, 2024.
Article in English | MEDLINE | ID: mdl-38699419

ABSTRACT

Introduction: Obesity and gender play a critical role in shaping the outcomes of COVID-19 disease. These two factors have a dynamic relationship with each other, as well as other risk factors, which hinders interpretation of how they influence severity and disease progression. This work aimed to study differences in COVID-19 disease outcomes through analysis of risk profiles stratified by gender and obesity status. Methods: This study employed an unsupervised clustering analysis, using Mexico's national COVID-19 hospitalization dataset, which contains demographic information and health outcomes of patients hospitalized due to COVID-19. Patients were segmented into four groups by obesity and gender, with participants' attributes and clinical outcome data described for each. Then, Consensus and PAM clustering methods were used to identify distinct risk profiles based on underlying patient characteristics. Risk profile discovery was completed on 70% of records, with the remaining 30% available for validation. Results: Data from 88,536 hospitalized patients were analyzed. Obesity, regardless of gender, was linked with higher odds of hypertension, diabetes, cardiovascular diseases, pneumonia, and Intensive Care Unit (ICU) admissions. Men tended to have higher frequencies of ICU admissions and pneumonia and higher mortality rates than women. Within each of the four analysis groups (divided based on gender and obesity status), clustering analyses identified four to five distinct risk profiles. For example, among women with obesity, there were four profiles; those with a hypertensive profile were more likely to have pneumonia, and those with a diabetic profile were most likely to be admitted to the ICU. Conclusion: Our analysis emphasizes the complex interplay between obesity, gender, and health outcomes in COVID-19 hospitalizations. 
The identified risk profiles highlight the need for personalized treatment strategies for COVID-19 patients and can assist in planning for patterns of deterioration in future waves of SARS-CoV-2 virus transmission. This research underscores the importance of tackling obesity as a major public health concern, given its interplay with many other health conditions, including infectious diseases such as COVID-19.


Subject(s)
COVID-19 , Hospitalization , Obesity , Unsupervised Machine Learning , Humans , COVID-19/epidemiology , COVID-19/mortality , Male , Female , Obesity/epidemiology , Mexico/epidemiology , Middle Aged , Hospitalization/statistics & numerical data , Risk Factors , Adult , Sex Factors , Aged , SARS-CoV-2 , Cluster Analysis
4.
Radiat Oncol ; 19(1): 61, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38773620

ABSTRACT

PURPOSE: Accurate deformable registration of magnetic resonance imaging (MRI) scans containing pathologies is challenging due to changes in tissue appearance. In this paper, we developed a novel automated three-dimensional (3D) convolutional U-Net based deformable image registration (ConvUNet-DIR) method using unsupervised learning to establish correspondence between baseline pre-operative and follow-up MRI scans of patients with brain glioma. METHODS: This study involved multi-parametric brain MRI scans (T1, T1-contrast enhanced, T2, FLAIR) acquired at pre-operative and follow-up time for 160 patients diagnosed with glioma, representing the BraTS-Reg 2022 challenge dataset. ConvUNet-DIR, a deep learning-based deformable registration workflow using 3D U-Net style architecture as a core, was developed to establish correspondence between the MRI scans. The workflow consists of three components: (1) the U-Net learns features from pairs of MRI scans and estimates a mapping between them, (2) the grid generator computes the sampling grid based on the derived transformation parameters, and (3) the spatial transformation layer generates a warped image by applying the sampling operation using interpolation. A similarity measure was used as a loss function for the network with a regularization parameter limiting the deformation. The model was trained via unsupervised learning using pairs of MRI scans on a training data set (n = 102) and validated on a validation data set (n = 26) to assess its generalizability. Its performance was evaluated on a test set (n = 32) by computing the Dice score and structural similarity index (SSIM) quantitative metrics. The model's performance also was compared with the baseline state-of-the-art VoxelMorph (VM1 and VM2) learning-based algorithms. RESULTS: The ConvUNet-DIR model showed promising competency in performing accurate 3D deformable registration. 
It achieved a mean Dice score of 0.975 ± 0.003 and SSIM of 0.908 ± 0.011 on the test set (n = 32). Experimental results also demonstrated that ConvUNet-DIR outperformed the VoxelMorph algorithms concerning Dice (VM1: 0.969 ± 0.006 and VM2: 0.957 ± 0.008) and SSIM (VM1: 0.893 ± 0.012 and VM2: 0.857 ± 0.017) metrics. The time required to perform a registration for a pair of MRI scans is about 1 s on the CPU. CONCLUSIONS: The developed deep learning-based model can perform an end-to-end deformable registration of a pair of 3D MRI scans for glioma patients without human intervention. The model could provide accurate, efficient, and robust deformable registration without needing pre-alignment and labeling. It outperformed the state-of-the-art VoxelMorph learning-based deformable registration algorithms and other supervised/unsupervised deep learning-based methods reported in the literature.
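The Dice score reported in this abstract measures the overlap between two segmentations. As a rough illustration only, here is a minimal Python sketch that assumes binary masks stored as flat lists of 0/1 values; the function name `dice_score` and the toy masks are hypothetical and are not taken from the authors' code.

```python
def dice_score(mask_a, mask_b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Sum of the foreground sizes of the two masks.
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks (made up for illustration): 3 shared foreground voxels, 4 + 4 in total.
warped = [1, 1, 1, 0, 0, 1]
fixed = [1, 1, 0, 0, 1, 1]
example_dice = dice_score(warped, fixed)  # 2 * 3 / 8 = 0.75
```

A value of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground voxel.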


Subject(s)
Brain Neoplasms , Deep Learning , Glioma , Magnetic Resonance Imaging , Unsupervised Machine Learning , Humans , Magnetic Resonance Imaging/methods , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/radiotherapy , Glioma/diagnostic imaging , Glioma/radiotherapy , Glioma/pathology , Radiation Oncology/methods , Image Processing, Computer-Assisted/methods , Imaging, Three-Dimensional/methods
5.
Cereb Cortex ; 34(13): 72-83, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38696605

ABSTRACT

Autism spectrum disorder has been emerging as a growing public health threat. Early diagnosis of autism spectrum disorder is crucial for timely, effective intervention and treatment. However, conventional diagnosis methods based on communication and behavioral patterns are unreliable for children younger than 2 years of age. Given evidence of neurodevelopmental abnormalities in infants with autism spectrum disorder, we resort to a novel deep learning-based method to extract key features from the inherently scarce, class-imbalanced, and heterogeneous structural MR images for early autism diagnosis. Specifically, we propose a Siamese verification framework to extend the scarce data, and an unsupervised compressor to alleviate data imbalance by extracting key features. We also propose weight constraints to cope with sample heterogeneity by giving different samples different voting weights during validation, and use Path Signature to unravel meaningful developmental features from the two-time-point data longitudinally. We further extract the brain regions on which the machine learning model focuses for autism diagnosis. Extensive experiments have shown that our method performs well under practical scenarios, outperforming existing machine learning methods and providing anatomical insights for early autism diagnosis.


Subject(s)
Autism Spectrum Disorder , Brain , Deep Learning , Early Diagnosis , Humans , Autism Spectrum Disorder/diagnostic imaging , Autism Spectrum Disorder/diagnosis , Infant , Brain/diagnostic imaging , Brain/pathology , Magnetic Resonance Imaging/methods , Child, Preschool , Male , Female , Autistic Disorder/diagnosis , Autistic Disorder/diagnostic imaging , Autistic Disorder/pathology , Unsupervised Machine Learning
6.
PeerJ ; 12: e17340, 2024.
Article in English | MEDLINE | ID: mdl-38756444

ABSTRACT

Introduction: This study aimed to evaluate the prognosis of patients with COVID-19 and hypertension who were treated with angiotensin-converting enzyme inhibitor (ACEI)/angiotensin receptor blocker (ARB) drugs and to identify key features affecting patient prognosis using an unsupervised learning method. Methods: A large-scale clinical dataset, including patient information, medical history, and laboratory test results, was collected. Two hundred patients with COVID-19 and hypertension were included. After cluster analysis, patients were divided into good and poor prognosis groups. The unsupervised learning method was used to evaluate clinical characteristics and prognosis, and patients were divided into different prognosis groups. The improved wild dog optimization algorithm (IDOA) was used for feature selection and cluster analysis, followed by the IDOA-k-means algorithm. The impact of ACEI/ARB drugs on patient prognosis and key characteristics affecting patient prognosis were also analysed. Results: Key features related to prognosis included baseline information and laboratory test results, while clinical symptoms and imaging results had low predictive power. The top six important features were age, hypertension grade, MuLBSTA, ACEI/ARB, NT-proBNP, and high-sensitivity troponin I. These features were consistent with the results of the unsupervised prediction model. A visualization system was developed based on these key features. Conclusion: Using unsupervised learning and the improved k-means algorithm, this study accurately analysed the prognosis of patients with COVID-19 and hypertension. The use of ACEI/ARB drugs was found to be a protective factor against poor clinical prognosis. Unsupervised learning methods can be used to differentiate patient populations and assess treatment effects. 
This study identified important features affecting patient prognosis and developed a visualization system with clinical significance for prognosis assessment and treatment decision-making.


Subject(s)
Angiotensin Receptor Antagonists , Angiotensin-Converting Enzyme Inhibitors , COVID-19 , Hypertension , SARS-CoV-2 , Unsupervised Machine Learning , Humans , Hypertension/drug therapy , Angiotensin-Converting Enzyme Inhibitors/therapeutic use , Male , Prognosis , Retrospective Studies , Female , Middle Aged , Angiotensin Receptor Antagonists/therapeutic use , Aged , COVID-19 Drug Treatment , Algorithms , Cluster Analysis
7.
PLoS One ; 19(5): e0302502, 2024.
Article in English | MEDLINE | ID: mdl-38743773

ABSTRACT

ChatGPT has demonstrated impressive abilities and impacted various aspects of human society since its creation, gaining widespread attention from different social spheres. This study aims to comprehensively assess public perception of ChatGPT on Reddit. The dataset was collected via Reddit, a social media platform, and includes 23,733 posts and comments related to ChatGPT. First, to examine public attitudes, this study conducts content analysis utilizing topic modeling with the Latent Dirichlet Allocation (LDA) algorithm to extract pertinent topics. Furthermore, sentiment analysis categorizes user posts and comments as positive, negative, or neutral using TextBlob and VADER in natural language processing. Topic modeling identifies seven topics regarding ChatGPT, which can be grouped into three themes: user perception, technical methods, and impacts on society. Results from the sentiment analysis show that 61.6% of the posts and comments hold favorable opinions on ChatGPT. They emphasize ChatGPT's ability to prompt and engage in natural conversations with users, without relying on complex natural language processing. This study provides suggestions for ChatGPT developers to enhance its usability design and functionality. Meanwhile, stakeholders, including users, should comprehend the advantages and disadvantages of ChatGPT in human society to promote ethical and regulated implementation of the system.


Subject(s)
Public Opinion , Social Media , Humans , Natural Language Processing , Unsupervised Machine Learning , Attitude , Algorithms
8.
BMC Musculoskelet Disord ; 25(1): 376, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38741076

ABSTRACT

OBJECTIVES: The traditional understanding of craniocervical alignment emphasizes specific anatomical landmarks. However, recent research has challenged the reliance on forward head posture as the primary diagnostic criterion for neck pain. A complex relationship exists between neck pain and craniocervical alignment, which requires a deeper exploration of diverse postures and movement patterns using advanced techniques, such as clustering analysis. We aimed to explore the complex relationship between craniocervical alignment and neck pain and to categorize alignment patterns in individuals with nonspecific neck pain using the K-means algorithm. METHODS: This study applied unsupervised machine learning techniques to data from 229 office workers with nonspecific neck pain. The craniocervical angles (CCA) during rest, protraction, and retraction were measured using two-dimensional video analysis, and neck pain severity was assessed using the Northwick Park Neck Pain Questionnaire (NPQ). The resting CCA was assessed while participants sat upright in a comfortable position. The average of midpoints between repeated protraction and retraction measures was considered as the midpoint CCA. The K-means algorithm helped categorize participants into alignment clusters based on age, sex and CCA data. RESULTS: We found no significant correlation between NPQ scores and CCA data, challenging the traditional understanding of neck pain and alignment. We observed a significant difference in age (F = 140.14, p < 0.001), NPQ total score (F = 115.83, p < 0.001), resting CCA (F = 79.22, p < 0.001), CCA during protraction (F = 33.98, p < 0.001), CCA during retraction (F = 40.40, p < 0.001), and midpoint CCA (F = 66.92, p < 0.001) among the three clusters and healthy controls. Cluster 1 was characterized by the lowest resting and midpoint CCA, and the lowest CCA during protraction and retraction, indicating a significant forward head posture and a pattern of retraction restriction. 
Cluster 2, the oldest group, showed CCA measurements similar to healthy controls, yet reported the highest NPQ scores. Cluster 3 exhibited the highest CCA during protraction and retraction, suggesting a limitation in protraction movement. DISCUSSION: Analysis of 229 office workers identified three distinct alignment patterns, each with unique postural characteristics; therefore, treatments addressing posture should be individualized and not generalized across the population.
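The K-means step named in this abstract alternates between assigning each sample to its nearest centroid and moving each centroid to the mean of its samples. The following is a minimal sketch of that loop (Lloyd's algorithm) in plain Python, not the authors' implementation; the function name `kmeans`, the explicit starting centroids, and the toy (age, resting CCA) pairs are all made up for illustration, and a real analysis would normally standardize the features first.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid (squared Euclidean).
        labels = [
            min(range(len(centroids)),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: (age in years, resting CCA in degrees), three obvious groups.
toy_points = [(30, 40.0), (32, 42.0), (55, 50.0), (57, 52.0), (31, 60.0), (33, 62.0)]
# Starting centroids are chosen by hand here so the run is deterministic.
toy_labels, toy_centroids = kmeans(toy_points, [toy_points[0], toy_points[2], toy_points[4]])
```

Because the result depends on the starting centroids, library implementations typically repeat the procedure from several random initializations and keep the best run.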


Subject(s)
Neck Pain , Posture , Unsupervised Machine Learning , Humans , Neck Pain/physiopathology , Male , Female , Adult , Posture/physiology , Middle Aged , Cluster Analysis , Head , Cervical Vertebrae/physiopathology , Cervical Vertebrae/diagnostic imaging , Movement/physiology , Pain Measurement/methods , Young Adult , Head Movements/physiology
9.
Neural Netw ; 175: 106295, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38614023

ABSTRACT

Multi-view unsupervised feature selection (MUFS) is an efficient approach for dimensional reduction of heterogeneous data. However, existing MUFS approaches mostly assign the samples the same weight, so the diversity of samples is not utilized efficiently. Additionally, due to the presence of various regularizations, the resulting MUFS problems are often non-convex, making it difficult to find the optimal solutions. To address this issue, a novel MUFS method named Self-paced Regularized Adaptive Multi-view Unsupervised Feature Selection (SPAMUFS) is proposed. Specifically, the proposed approach first trains the MUFS model with simple samples, and gradually learns complex samples by using a self-paced regularizer. l2,p-norm (0

Subject(s)
Algorithms , Unsupervised Machine Learning , Humans , Neural Networks, Computer
10.
Neural Netw ; 175: 106315, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38626618

ABSTRACT

Pre-trained language models (PLMs) are now the mainstay of Unsupervised Sentence Representation Learning (USRL). However, PLMs are sensitive to the frequency information of words from their pre-training corpora, resulting in an anisotropic embedding space, where the embeddings of high-frequency words are clustered but those of low-frequency words are sparsely dispersed. This anisotropy causes two problems, similarity bias and information bias, which lower the quality of sentence embeddings. To solve these problems, we fine-tune PLMs by leveraging the frequency information of words and propose a novel USRL framework, namely Sentence Representation Learning with Frequency-induced Adversarial tuning and Incomplete sentence filtering (Slt-fai). We calculate the word frequencies over the pre-training corpora of PLMs and assign frequency labels to words by thresholding. With them, (1) we incorporate a similarity discriminator used to distinguish the embeddings of high-frequency and low-frequency words, and adversarially tune the PLM with it, yielding a uniformly frequency-invariant embedding space; and (2) we propose a novel incomplete sentence detection task, where we incorporate an information discriminator to distinguish the embeddings of original sentences from those of incomplete sentences created by randomly masking several low-frequency words, which emphasizes the more informative low-frequency words. Our Slt-fai is a flexible and plug-and-play framework, and it can be integrated with existing USRL techniques. We evaluate Slt-fai with various backbones on benchmark datasets. Empirical results indicate that Slt-fai can be superior to the existing USRL baselines.


Subject(s)
Language , Unsupervised Machine Learning , Humans , Neural Networks, Computer , Natural Language Processing , Algorithms
11.
Phys Med Biol ; 69(10)2024 May 10.
Article in English | MEDLINE | ID: mdl-38604186

ABSTRACT

Objective. Recently, deep learning models have been used to reconstruct parallel magnetic resonance (MR) images from undersampled k-space data. However, most existing approaches depend on large databases of fully sampled MR data for training, which can be challenging or sometimes infeasible to acquire in certain scenarios. The goal is to develop an effective alternative for improved reconstruction quality that does not rely on external training datasets. Approach. We introduce a novel zero-shot dual-domain fusion unsupervised neural network (DFUSNN) for parallel MR imaging reconstruction without any external training datasets. We employ the Noise2Noise (N2N) network for the reconstruction in the k-space domain, integrate phase and coil sensitivity smoothness priors into the k-space N2N network, and use an early stopping criterion to prevent overfitting. Additionally, we propose a dual-domain fusion method based on Bayesian optimization to enhance reconstruction quality efficiently. Results. Simulation experiments conducted on three datasets with different undersampling patterns showed that the DFUSNN outperforms all other competing unsupervised methods and the one-shot Hankel-k-space generative model (HKGM). The DFUSNN also achieves comparable results to the supervised Deep-SLR method. Significance. The novel DFUSNN model offers a viable solution for reconstructing high-quality MR images without the need for external training datasets, thereby overcoming a major hurdle in scenarios where acquiring fully sampled MR data is difficult.


Subject(s)
Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Neural Networks, Computer , Magnetic Resonance Imaging/methods , Image Processing, Computer-Assisted/methods , Unsupervised Machine Learning , Humans
12.
Water Sci Technol ; 89(7): 1757-1770, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38619901

ABSTRACT

The water reuse facilities of industrial parks face the challenge of managing a growing variety of wastewater sources as their inlet water. Typically, the clustering of these sources is designed by engineers with extensive expertise. This paper presents an innovative application of unsupervised learning methods to classify inlet water in Chinese water reuse stations, aiming to reduce reliance on engineer experience. The concept of 'water quality distance' was incorporated into three unsupervised learning clustering algorithms (K-means, DBSCAN, and AGNES), which were validated through six case studies. Of the six cases, three were employed to illustrate the feasibility of the unsupervised learning clustering algorithms. The results indicated that the clustering algorithms exhibited greater stability and better performance than both manual clustering and ChatGPT-based clustering. The remaining three cases were utilized to showcase the reliability of the three clustering algorithms. The findings revealed that the AGNES algorithm demonstrated the greatest application potential. The average purities of K-means, DBSCAN, and AGNES across the six cases were 0.947, 0.852, and 0.955, respectively.
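The abstract reports clustering purity without defining it. Under the common definition (assumed here, not confirmed by the source), purity is the fraction of samples that fall in the majority reference class of their cluster. A minimal Python sketch follows; the function name `purity` and the toy labels are hypothetical.

```python
from collections import Counter


def purity(cluster_labels, reference_labels):
    """Fraction of samples belonging to the majority reference class of their cluster."""
    if len(cluster_labels) != len(reference_labels):
        raise ValueError("label sequences must have the same length")
    if not cluster_labels:
        raise ValueError("at least one sample is required")
    # Count reference classes inside each cluster.
    by_cluster = {}
    for cluster, reference in zip(cluster_labels, reference_labels):
        by_cluster.setdefault(cluster, Counter())[reference] += 1
    # Each cluster contributes the size of its largest reference class.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_labels)


# Toy labels (made up): majority counts are 2, 3 and 1 out of 8 samples.
clusters = [0, 0, 0, 1, 1, 1, 2, 2]
reference = ["A", "A", "B", "B", "B", "B", "C", "A"]
example_purity = purity(clusters, reference)  # (2 + 3 + 1) / 8 = 0.75
```

Purity rises toward 1.0 as clusters become more homogeneous, but it also rises trivially with the number of clusters, so it is usually compared at a fixed cluster count.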


Subject(s)
Bays , Unsupervised Machine Learning , Reproducibility of Results , Algorithms , Cluster Analysis
13.
Acta Neuropathol Commun ; 12(1): 51, 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38576030

ABSTRACT

DNA methylation analysis based on supervised machine learning algorithms with static reference data, allowing diagnostic tumour typing with unprecedented precision, has quickly become a new standard of care. Whereas genome-wide diagnostic methylation profiling is mostly performed on microarrays, an increasing number of institutions additionally employ nanopore sequencing as a faster alternative. In addition, methylation-specific parallel sequencing can generate methylation and genomic copy number data. Given these diverse approaches to methylation profiling, to date, there is no single tool that allows (1) classification and interpretation of microarray, nanopore and parallel sequencing data, (2) direct control of nanopore sequencers, and (3) the integration of microarray-based methylation reference data. Furthermore, no software capable of entirely running in routine diagnostic laboratory environments lacking high-performance computing and network infrastructure exists. To overcome these shortcomings, we present EpiDiP/NanoDiP as an open-source DNA methylation and copy number profiling suite, which has been benchmarked against an established supervised machine learning approach using in-house routine diagnostics data obtained between 2019 and 2021. Running locally on portable, cost- and energy-saving system-on-chip as well as gpGPU-augmented edge computing devices, NanoDiP works in offline mode, ensuring data privacy. It does not require the rigid training data annotation of supervised approaches. Furthermore, NanoDiP is the core of our public, free-of-charge EpiDiP web service which enables comparative methylation data analysis against an extensive reference data collection. We envision this versatile platform as a useful resource not only for neuropathologists and surgical pathologists but also for the tumour epigenetics research community. 
In daily diagnostic routine, analysis of native, unfixed biopsies by NanoDiP delivers molecular tumour classification in an intraoperative time frame.


Subject(s)
Epigenomics , Neoplasms , Humans , Unsupervised Machine Learning , Cloud Computing , Neoplasms/diagnosis , Neoplasms/genetics , DNA Methylation
14.
Rev Alerg Mex ; 71(1): 8-11, 2024 Feb 01.
Article in Spanish | MEDLINE | ID: mdl-38683063

ABSTRACT

OBJECTIVE: To analyze sentiment about allergen-specific immunotherapy on Twitter using the VADER (Valence Aware Dictionary and sEntiment Reasoner) model. METHODS: Tweets related to allergen-specific immunotherapy were obtained through the Twitter Application Programming Interface (API). The keywords "allergy shot" were used between January 1, 2012, and December 31, 2022. The data were processed by removing URLs, usernames, hashtags, multiple spaces, and duplicate tweets. Subsequently, a sentiment analysis was performed using the VADER model. RESULTS: A total of 34,711 tweets were retrieved, of which 1928 were eliminated. Of the remaining 32,783 tweets, 32.41% expressed a negative sentiment, 31.11% expressed a neutral sentiment, and 36.47% expressed a positive sentiment, with an average polarity of 0.02751 (neutral) over the 11-year period. CONCLUSIONS: The average polarity of tweets about allergen-specific immunotherapy is neutral over the 11 years analyzed. There was an annual increase in the average polarity over the years, with 2017, 2018, and 2022 having positive polarity averages. Additionally, the number of tweets decreased over time.


OBJETIVO: Analizar los sentimientos acerca de la inmunoterapia alérgeno-específica en Twitter mediante el modelo VADER (Valence Aware Dictionary and sEntiment Reasoner). MÉTODOS: Se utilizaron tweets relacionados con la inmunoterapia alérgeno-específica obtenidos a través del API (Application Programming Interface) de Twitter. Se incorporaron las palabras clave "allergy shot" en el período comprendido entre el 1 de enero de 2012 y el 31 de diciembre de 2022. Los datos obtenidos fueron procesados, eliminando las URL, nombres de usuarios, hashtags, espacios múltiples y tweets duplicados. Posteriormente, se realizó un análisis de sentimientos utilizando el modelo VADER. RESULTADOS: Se recolectaron 34,711 tweets, de los que se eliminaron 1928. De los 32,783 tweets restantes, se encontró que el 32.41% de los usuarios expresó un sentimiento negativo, el 31.11% un sentimiento neutral y el 36.47% un sentimiento positivo, con una media de polaridad de 0.02751 (neutral) a lo largo de los 11 años. CONCLUSIONES: La polaridad media de los tweets acerca de la inmunoterapia alérgeno-específica es neutral a lo largo de los 11 años analizados. Existe un aumento anual en la polaridad media positiva a lo largo de los años, sobre todo entre 2017, 2018 y 2022. La cantidad de tweets disminuyó con el tiempo.
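The sentiment shares and mean polarity in this abstract can be read as a thresholding of per-tweet compound scores. The sketch below assumes the cutoffs of ±0.05 conventionally used with VADER compound scores; the abstract does not state which cutoffs the authors applied. The helper names `label_polarity` and `summarize` and the toy scores are hypothetical, and no sentiment library is called.

```python
def label_polarity(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment label (assumed ±0.05 cutoffs)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def summarize(compounds):
    """Return the percentage share of each label and the mean compound score."""
    labels = [label_polarity(c) for c in compounds]
    shares = {
        name: 100.0 * labels.count(name) / len(labels)
        for name in ("negative", "neutral", "positive")
    }
    mean_polarity = sum(compounds) / len(compounds)
    return shares, mean_polarity


# Toy compound scores (made up), not data from the study.
toy_shares, toy_mean = summarize([0.6, -0.4, 0.0, 0.02])
# The mean polarity reported in the abstract falls inside the neutral band.
reported_label = label_polarity(0.02751)
```

Under these assumed cutoffs, a corpus can contain more positive than negative tweets and still have a mean polarity that is labelled neutral, which matches the pattern the abstract describes.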


Subject(s)
Desensitization, Immunologic , Social Media , Unsupervised Machine Learning , Humans , Desensitization, Immunologic/methods , Emotions
15.
Biophys J ; 123(9): 1152-1163, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38571310

ABSTRACT

Conformational dynamics of RNA plays important roles in a variety of cellular functions such as transcriptional regulation, catalysis, scaffolding, and sensing. Recently, RNAs with low-complexity sequences have been shown to phase separate and form condensate phases similar to low-complexity protein domains. The affinity for phase separation and the material characteristics of RNA condensates are strongly dependent on sequence composition and patterning. We hypothesize that differences in the affinities for RNA phase separation can be uncovered by studying sequence-dependent conformational dynamics of single RNA chains. To this end, we have employed atomistic simulations and deep dimensionality reduction techniques to map temperature-dependent conformational free energy landscapes for 20 base-long homopolymeric RNA sequences: poly(U), poly(G), poly(C), and poly(A). The energy landscapes of homopolymeric RNAs reveal a plethora of metastable states with qualitatively different populations stemming from differences in base chemistry. Through detailed analysis of base, phosphate, and sugar interactions, we show that experimentally observed temperature-driven shifts in metastable state populations align with experiments on RNA phase transitions. Specifically, we find that the thermodynamics of unfolding of homopolymeric RNA follows the poly(G) > poly(A) > poly(C) > poly(U) order of stability, mirroring the propensity of RNA to form condensates. To conclude, this work shows that at least for homopolymeric RNA sequences the single-chain conformational dynamics contains sufficient information for predicting and quantifying condensate forming affinities of RNAs. Thus, we anticipate that atomically detailed studies of temperature-dependent energy landscapes of RNAs will be a useful guide for understanding the propensity of various RNA molecules to form condensates.


Subject(s)
Nucleic Acid Conformation , RNA , Thermodynamics , RNA/chemistry , RNA/metabolism , Molecular Dynamics Simulation , Unsupervised Machine Learning , Deep Learning , Temperature
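
Free energy landscapes of the kind described in this record are conventionally related to state populations through the Boltzmann relation F_i = k_B T ln(p_max / p_i). The sketch below illustrates that conversion only; it is not the authors' pipeline, and the state names and populations are hypothetical values chosen for illustration.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def relative_free_energies(populations, temperature_k):
    """Return free energies (kcal/mol) relative to the most populated state."""
    p_max = max(populations.values())
    return {
        state: K_B * temperature_k * math.log(p_max / p)
        for state, p in populations.items()
    }

# Hypothetical metastable-state populations at 300 K (illustrative only).
populations = {"compact": 0.60, "hairpin-like": 0.30, "extended": 0.10}
free_energies = relative_free_energies(populations, 300.0)
for state, f in free_energies.items():
    print(f"{state}: {f:.2f} kcal/mol")
```

Repeating the calculation at several temperatures shows how the relative stability of each state shifts, which is the type of comparison the abstract refers to.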
16.
Eur J Radiol ; 175: 111459, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38636408

ABSTRACT

OBJECTIVES: This study aimed to investigate tumor heterogeneity of colorectal liver metastases (CRLM) and stratify the patients into different prognostic risk groups following liver resection by applying an unsupervised radiomics machine-learning approach to preoperative CT images. METHODS: This retrospective study retrieved clinical information and CT images of 197 patients with CRLM from The Cancer Imaging Archive (TCIA) database. Radiomics features were extracted from a segmented liver lesion identified at the portal venous phase. Features that showed high stability, non-redundancy, and indicative information were selected. An unsupervised consensus clustering analysis on these features was adopted to identify subgroups of CRLM patients. Overall survival (OS), disease-free survival (DFS), and liver-specific DFS were compared between the identified subgroups. Cox regression analysis was applied to evaluate prognostic risk factors. RESULTS: A total of 851 radiomics features were extracted, and 56 robust features were finally selected for unsupervised clustering analysis, which identified two distinct subgroups (96 and 101 patients, respectively). There were significant differences in the OS, DFS, and liver-specific DFS between the subgroups (all log-rank p < 0.05). The subgroup with worse outcome using the proposed radiomics model was consistently associated with shorter OS, DFS, and liver-specific DFS, with hazard ratios of 1.78 (95% CI: 1.12-2.83), 1.72 (95% CI: 1.16-2.54), and 1.59 (95% CI: 1.10-2.31), respectively. The general performance of this radiomics model outperformed the traditional Clinical Risk Score and Tumor Burden Score in the prognosis prediction after surgery for CRLM. CONCLUSION: Radiomics features derived from preoperative CT images can reveal the heterogeneity of CRLM and stratify the patients with CRLM into subgroups with significantly different clinical outcomes.


Subject(s)
Colorectal Neoplasms , Liver Neoplasms , Tomography, X-Ray Computed , Unsupervised Machine Learning , Humans , Male , Female , Colorectal Neoplasms/pathology , Colorectal Neoplasms/diagnostic imaging , Liver Neoplasms/diagnostic imaging , Liver Neoplasms/secondary , Middle Aged , Tomography, X-Ray Computed/methods , Prognosis , Retrospective Studies , Aged , Adult , Survival Rate , Aged, 80 and over , Machine Learning , Radiomics
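
The hazard ratios and confidence intervals quoted in this record can be sanity-checked with a small calculation. The sketch below assumes the intervals are Wald-type, that is, symmetric on the log scale, and recovers an approximate standard error, z statistic, and two-sided p-value from them. These are approximations for illustration and are not the log-rank p-values reported by the authors; the function name is hypothetical.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Recover SE of log(HR), the z statistic, and a two-sided p-value from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return se, z, p

# Hazard ratios and 95% CIs quoted in the abstract.
reported = {
    "OS": (1.78, 1.12, 2.83),
    "DFS": (1.72, 1.16, 2.54),
    "liver-specific DFS": (1.59, 1.10, 2.31),
}
for endpoint, (hr, lo, hi) in reported.items():
    se, z, p = wald_from_ci(hr, lo, hi)
    print(f"{endpoint}: SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

All three intervals exclude 1, so the approximate p-values fall below 0.05, consistent with the significance stated in the abstract.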
17.
Comput Biol Med ; 174: 108413, 2024 May.
Article in English | MEDLINE | ID: mdl-38608323

ABSTRACT

BACKGROUND AND OBJECTIVES: Lifestyle-related diseases (LSDs) impose a substantial economic burden on patients and health care services. LSDs are chronic in nature and can directly affect the heart and lungs. Therapeutic interventions only based on symptoms can be crucial for prompt treatment initiation in LSDs, as symptoms are the first information available to clinicians. So, this work aims to apply unsupervised machine learning (ML) techniques for developing models to predict drugs from symptoms for LSDs, with a specific focus on pulmonary and heart diseases. METHODS: The drug-disease and disease-symptom associations of 143 LSDs, 1271 drugs, and 305 symptoms were used to compute direct associations between drugs and symptoms. ML models with four different algorithms - K-Means, Bisecting K-Means, Mean Shift, and Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) - were developed to cluster the drugs using symptoms as features. The optimal model was saved in a server for the development of a web application. A web application was developed to perform the prediction based on the optimal model. RESULTS: The Bisecting K-means model showed the best performance with a silhouette coefficient of 0.647 and generated 138 drug clusters. The drugs within the optimal clusters showed good similarity based on i) gene ontology annotations of the gene targets, ii) chemical ontology annotations, and iii) maximum common substructure of the drugs. In the web application, the model also provides a confidence score for each predicted drug while predicting from a new set of input symptoms. CONCLUSION: In summary, direct associations between drugs and symptoms were computed, and those were used to develop a symptom-based drug prediction tool for LSDs with unsupervised ML models. The ML-based prediction can provide a second opinion to clinicians to aid their decision-making for early treatment of LSD patients. 
The web application (URL - http://bicresources.jcbose.ac.in/ssaha4/sdldpred) can provide a simple interface for all end-users to perform the ML-based prediction.


Subject(s)
Unsupervised Machine Learning , Humans , Chronic Disease , Life Style , Algorithms
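
The silhouette coefficient used in this record to rank the clustering models can be computed from scratch in a few lines. The sketch below is a plain Euclidean implementation for illustration, not the authors' code; the toy points stand in for drug feature vectors and are hypothetical. It assumes at least two clusters are present.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance (needs >= 2 clusters)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical feature vectors: two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good_labels = [0, 0, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1]   # mixes the two groups
print(silhouette_score(points, good_labels), silhouette_score(points, bad_labels))
```

A partition that matches the natural grouping scores close to 1, while a partition that mixes groups scores below 0, which is why the coefficient is a reasonable criterion for choosing among candidate clusterings.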
18.
Sci Rep ; 14(1): 9782, 2024 04 29.
Article in English | MEDLINE | ID: mdl-38684770

ABSTRACT

Although COVID-19 is now considered endemic rather than pandemic, the epidemiological situation related to the SARS-CoV-2 virus is developing at an alarming rate, impacting every corner of the world. The rapid escalation of the coronavirus has engaged the scientific community, which continually seeks solutions to ensure the comfort and safety of society. Understanding the joint impact of medical and non-medical interventions on COVID-19 spread is essential for making public health decisions that control the pandemic. This paper introduces two novel hybrid machine-learning ensembles that combine supervised and unsupervised learning for COVID-19 data classification and regression. The study utilizes the publicly available "COVID-19 outbreak and potential predictive features in the USA" dataset, which provides information related to the outbreak of COVID-19 disease in the US, including data from each of 3142 US counties from the beginning of the epidemic (January 2020) until June 2021. The developed hybrid hierarchical classifiers outperform single classification algorithms. The best-achieved performance metrics for the classification task were Accuracy = 0.912, ROC-AUC = 0.916, and F1-score = 0.916. The proposed hybrid hierarchical ensemble combining both supervised and unsupervised learning allows us to increase the accuracy of the regression task by 11% in terms of MSE, 29% in terms of the area under the ROC, and 43% in terms of the MPP metric. Thus, using the proposed approach, it is possible to predict the number of COVID-19 cases and deaths based on demographic, geographic, climatic, traffic, public health, social-distancing-policy adherence, and political characteristics with sufficiently high accuracy. The study reveals that virus pressure is the most important feature in COVID-19 spread for classification and regression analysis. Five other significant features were identified to have the most influence on COVID-19 spread.
The combined ensembling approach introduced in this study can help policymakers design prevention and control measures to avoid or minimize public health threats in the future.


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , COVID-19/mortality , COVID-19/prevention & control , Humans , SARS-CoV-2/isolation & purification , Supervised Machine Learning , Pandemics , Algorithms , Unsupervised Machine Learning , United States/epidemiology , Machine Learning
19.
J Chem Inf Model ; 64(8): 3059-3079, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38498942

ABSTRACT

Condensing the many physical variables defining a chemical system into a fixed-size array poses a significant challenge in the development of chemical Machine Learning (ML). Atom Centered Symmetry Functions (ACSFs) offer an intuitive featurization approach by means of a tedious and labor-intensive selection of tunable parameters. In this work, we implement an unsupervised ML strategy relying on a Gaussian Mixture Model (GMM) to automatically optimize the ACSF parameters. GMMs effortlessly decompose the vastness of the chemical and conformational spaces into well-defined radial and angular clusters, which are then used to build tailor-made ACSFs. The unsupervised exploration of the space has demonstrated general applicability across a diverse range of systems, spanning from various unimolecular landscapes to heterogeneous databases. The impact of the sampling technique and temperature on space exploration is also addressed, highlighting the particularly advantageous role of high-temperature Molecular Dynamics (MD) simulations. The reliability of the resulting features is assessed through the estimation of the atomic charges of a prototypical capped amino acid and a heterogeneous collection of CHON molecules. The automatically constructed ACSFs serve as high-quality descriptors, consistently yielding typical prediction errors below 0.010 electrons bound for the reported atomic charges. Altering the spatial distribution of the functions with respect to the cluster highlights the critical role of symmetry rupture in achieving significantly improved features. More specifically, using two separate functions to describe the lower and upper tails of the cluster results in the best performing models with errors as low as 0.006 electrons. 
Finally, the effectiveness of finely tuned features was checked across different architectures, unveiling the superior performance of Gaussian Process (GP) models over Feed Forward Neural Networks (FFNNs), particularly in low-data regimes, with nearly a 2-fold increase in prediction quality. Altogether, this approach paves the way toward an easier construction of local chemical descriptors, while providing valuable insights into how radial and angular spaces should be mapped. Finally, this work opens the possibility of encoding many-body information beyond angular terms into upcoming ML features.


Subject(s)
Molecular Dynamics Simulation , Unsupervised Machine Learning , Normal Distribution , Automation
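
To make the idea in this record concrete, the sketch below evaluates a radial atom-centered symmetry function in the standard Behler-Parrinello form, with its parameters taken from a Gaussian cluster. The mapping from cluster mean and standard deviation to the shift and width (R_s = mean, eta = 1/(2 sigma^2)) is an assumption made for illustration, not necessarily the authors' construction; the cluster parameters, neighbour distances, and cutoff are hypothetical.

```python
import math

def cosine_cutoff(r, r_cut):
    """Behler-Parrinello cosine cutoff: smooth decay to zero at r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)

def radial_acsf(distances, r_shift, eta, r_cut):
    """Radial symmetry function: sum_j exp(-eta * (r_ij - r_shift)^2) * fc(r_ij)."""
    return sum(
        math.exp(-eta * (r - r_shift) ** 2) * cosine_cutoff(r, r_cut)
        for r in distances
    )

def params_from_gaussian(mean, std):
    """Assumed mapping: centre the function on the cluster mean and match its width."""
    return mean, 1.0 / (2.0 * std ** 2)

# Hypothetical radial clusters (mean, std in angstrom), e.g. from a fitted GMM.
clusters = [(1.10, 0.05), (2.20, 0.15)]
neighbour_distances = [1.09, 1.12, 2.15, 2.40]  # illustrative values
features = [
    radial_acsf(neighbour_distances, *params_from_gaussian(m, s), r_cut=6.0)
    for m, s in clusters
]
print(features)
```

Each cluster yields one feature, so the length of the descriptor follows directly from the number of mixture components rather than from hand-tuned parameter grids.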
20.
An Acad Bras Cienc ; 96(1): e20230409, 2024.
Article in English | MEDLINE | ID: mdl-38451625

ABSTRACT

This study utilizes Fourier transform infrared (FTIR) data from honey samples to cluster and categorize them based on their spectral characteristics. The aim is to group similar samples together, revealing patterns and aiding in classification. The process begins by determining the number of clusters using the elbow method, resulting in five distinct clusters. Principal Component Analysis (PCA) is then applied to reduce the dataset's dimensionality by capturing its significant variances. Hierarchical Cluster Analysis (HCA) further refines the sample clusters. 20% of the data, representing identified clusters, is randomly selected for testing, while the remainder serves as training data for a deep learning algorithm employing a multilayer perceptron (MLP). Following training, the test data are evaluated, revealing an impressive 96.15% accuracy. Accuracy measures the machine learning model's ability to predict class labels for new data accurately. This approach offers reliable honey sample clustering without necessitating extensive preprocessing. Moreover, its swiftness and cost-effectiveness enhance its practicality. Ultimately, by leveraging FTIR spectral data, this method successfully identifies similarities among honey samples, enabling efficient categorization and demonstrating promise in the field of spectral analysis in food science.


Subject(s)
Honey , Unsupervised Machine Learning , Fourier Analysis , Spectroscopy, Fourier Transform Infrared , Cluster Analysis
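
The elbow method mentioned in this record chooses the number of clusters by looking for the point where the within-cluster sum of squares stops dropping sharply. The sketch below shows only that step, using a small k-means with a deterministic initialisation; it is not the authors' workflow, and the two-dimensional points are hypothetical stand-ins for FTIR spectra.

```python
import math

def kmeans_inertia(points, k, iterations=50):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    # Deterministic initialisation: evenly spaced picks from the sorted points.
    ordered = sorted(points)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)

# Hypothetical data: three compact groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
inertia = {k: kmeans_inertia(points, k) for k in range(1, 5)}
for k, value in inertia.items():
    print(k, round(value, 3))
```

On this toy data the curve flattens after three clusters, which is the "elbow"; on real spectra the same inspection would be applied to the inertia values computed for a range of k.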