ABSTRACT
The novel object recognition test (NORT) is one of the most commonly employed behavioral tests in experimental animals and is designed to evaluate an animal's interest in and recognition of novelty. However, manual procedures, which rely on researchers' observations, prevent high-throughput analysis. In this study, we developed an automated analysis method for NORT utilizing machine learning-assisted exploratory behavior detection. We recorded the exploratory behavior of the mice using a video camera. The coordinates of the mouse nose and tail base in recorded video files were detected using a pre-trained machine learning model, DeepLabCut. Each video was then segmented into frame images, which were categorized into "exploratory" or "non-exploratory" frames based on manual observation. Mouse feature vectors were calculated as vectors from the nose to the vertices of the object and were utilized for SVM training. The trained SVM effectively detected exploratory behaviors, showing a strong correlation with human observer assessments. Upon application to NORT, the duration of mouse exploratory behavior towards objects predicted by the SVM exhibited a significant correlation with the assessments made by human observers. The novelty discrimination index derived from the SVM predictions also aligned well with that from human observations.
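The feature construction and the discrimination index described above can be sketched as follows. This is only an illustration, not the authors' code: the function names, the flattened (dx, dy) layout of the feature vector, and the index definition (T_novel - T_familiar) / (T_novel + T_familiar) are assumptions, since the abstract does not state the exact formula.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nose_to_vertex_features(nose: Point, vertices: Sequence[Point]) -> List[float]:
    """Flatten the vectors from the nose to each object vertex into one list.

    Assumed layout: [dx1, dy1, dx2, dy2, ...], one (dx, dy) pair per vertex.
    """
    features: List[float] = []
    for vx, vy in vertices:
        features.extend([vx - nose[0], vy - nose[1]])
    return features


def discrimination_index(t_novel: float, t_familiar: float) -> float:
    """Assumed definition: (novel - familiar) / (novel + familiar)."""
    total = t_novel + t_familiar
    if total == 0:
        return 0.0  # no exploration at all: treat as no preference
    return (t_novel - t_familiar) / total


# Hypothetical frame: nose at (1, 2), object with two labelled vertices.
frame_features = nose_to_vertex_features((1.0, 2.0), [(3.0, 2.0), (1.0, 5.0)])
index = discrimination_index(t_novel=30.0, t_familiar=10.0)
```

Per-frame feature lists of this kind would be the rows passed to an SVM trainer, with the manual "exploratory" / "non-exploratory" labels as targets.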
Subject(s)
Behavior, Animal , Exploratory Behavior , Machine Learning , Recognition, Psychology , Animals , Exploratory Behavior/physiology , Mice , Recognition, Psychology/physiology , Male , Behavior, Animal/physiology , Image Processing, Computer-Assisted/methods , Support Vector Machine , Mice, Inbred C57BL
ABSTRACT
This research focused on distinguishing distinct matrix assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) spectral signatures of three Enterococcus species. We evaluated and compared the predictive performance of four supervised machine learning algorithms, K-nearest neighbor (KNN), support vector machine (SVM), and random forest (RF), to accurately classify Enterococcus species. This study involved a comprehensive dataset of 410 strains, generating 1640 individual spectra through on-plate and off-plate protein extraction methods. Although the commercial database correctly identified 76.9% of the strains, machine learning classifiers demonstrated superior performance (accuracy 0.991). In the RF model, top informative peaks played a significant role in the classification. Whole-genome sequencing showed that the most informative peaks are biomarkers connected to proteins, which are essential for understanding bacterial classification and evolution. The integration of MALDI-TOF MS and machine learning provides a rapid and accurate method for identifying Enterococcus species, improving healthcare and food safety.
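As a rough illustration of one of the classifiers named above, the sketch below implements a plain k-nearest-neighbor vote over peak-intensity vectors. The toy training data, the two-peak feature layout, and the labels `species_A` / `species_B` are hypothetical; the abstract does not list the species, the peaks, or the distance metric used.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]

# Hypothetical normalized intensities at two informative peaks.
TRAIN: List[Sample] = [
    ((0.9, 0.1), "species_A"),
    ((0.8, 0.2), "species_A"),
    ((0.1, 0.9), "species_B"),
    ((0.2, 0.8), "species_B"),
]


def knn_predict(train: List[Sample], query: Sequence[float], k: int = 3) -> str:
    """Return the majority label among the k training spectra closest to query."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


predicted = knn_predict(TRAIN, (0.85, 0.15))
```

Euclidean distance is used here only because it is the simplest choice; a real spectral pipeline would also need peak alignment and intensity normalization before any distance is meaningful.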
Subject(s)
Enterococcus , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Supervised Machine Learning , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Enterococcus/classification , Enterococcus/chemistry , Enterococcus/isolation & purification , Enterococcus/genetics , Algorithms , Support Vector Machine , Bacterial Typing Techniques/methods , Machine Learning
ABSTRACT
Objective: To identify HBV-related genes (HRGs) implicated in osteoporosis (OP) pathogenesis and develop a diagnostic model for early OP detection in chronic HBV infection (CBI) patients. Methods: Five public sequencing datasets were collected from the GEO database. Gene differential expression and LASSO analyses identified genes linked to OP and CBI. Machine learning algorithms (random forests, support vector machines, and gradient boosting machines) further filtered these genes. The best diagnostic model was chosen based on accuracy and Kappa values. A nomogram model based on HRGs was constructed and assessed for reliability. OP patients were divided into two chronic HBV-related clusters using non-negative matrix factorization. Differential gene expression analysis, Gene Ontology, and KEGG enrichment analyses explored the roles of these genes in OP progression, using ssGSEA and GSVA. Differences in immune cell infiltration between clusters and the correlation between HRGs and immune cells were examined using ssGSEA and the Pearson method. Results: Differential gene expression analysis of CBI and combined OP dataset identified 822 and 776 differentially expressed genes, respectively, with 43 genes intersecting. Following LASSO analysis and various machine learning recursive feature elimination algorithms, 16 HRGs were identified. The support vector machine emerged as the best predictive model based on accuracy and Kappa values, with AUC values of 0.92, 0.83, 0.74, and 0.7 for the training set, validation set, GSE7429, and GSE7158, respectively. The nomogram model exhibited AUC values of 0.91, 0.79, and 0.68 in the training set, GSE7429, and GSE7158, respectively. Non-negative matrix factorization divided OP patients into two clusters, revealing statistically significant differences in 11 types of immune cell infiltration between clusters. Finally, intersecting the HRGs obtained from LASSO analysis with the HRGs identified three genes. 
Conclusion: This study successfully identified HRGs and developed an efficient diagnostic model based on HRGs, demonstrating high accuracy and strong predictive performance across multiple datasets. This research not only offers new insights into the complex relationship between OP and CBI but also establishes a foundation for the development of early diagnostic and personalized treatment strategies for chronic HBV-related OP.
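The model-selection criteria named in the methods, accuracy and the Kappa value, can be computed as in the sketch below. It assumes that "Kappa" refers to Cohen's kappa, which the abstract does not state explicitly, and the labels are made up for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed - chance agreement) / (1 - chance agreement)."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    labels = set(y_true) | set(y_pred)
    chance = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    if chance == 1.0:
        return 0.0  # degenerate case: only one label present on both sides
    return (observed - chance) / (1.0 - chance)


# Hypothetical labels: 1 = osteoporosis, 0 = control.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
kappa = cohen_kappa(y_true, y_pred)
```

Kappa corrects accuracy for agreement expected by chance, which is why the two are often reported together when class sizes are unequal.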
Subject(s)
Computational Biology , Hepatitis B virus , Hepatitis B, Chronic , Machine Learning , Osteoporosis , Humans , Hepatitis B, Chronic/genetics , Hepatitis B, Chronic/immunology , Hepatitis B, Chronic/virology , Computational Biology/methods , Osteoporosis/genetics , Osteoporosis/diagnosis , Hepatitis B virus/immunology , Hepatitis B virus/genetics , Gene Expression Profiling , Nomograms , Transcriptome , Databases, Genetic , Support Vector Machine , Genetic Predisposition to Disease
ABSTRACT
One of the critical issues in medical data analysis is accurately predicting a patient's risk of heart disease (HD), which is vital for early intervention and reducing mortality rates. Early detection allows for timely treatment and continuous monitoring by healthcare providers, which is essential but often limited by the inability of medical professionals to provide constant patient supervision. In addition, heart disease detection is not always accurate. By offering a more solid foundation for prediction and decision-making based on data provided by healthcare sectors worldwide, machine learning (ML) could help physicians with the prediction and detection of HD. This study aims to use different feature selection strategies to produce an accurate ML algorithm for early heart disease prediction. We selected features using chi-square, ANOVA, and mutual information methods. The three feature subsets chosen were SF-1, SF-2, and SF-3. The study employed ten machine learning algorithms to determine the most accurate technique and the best-fitting feature subset. The classification algorithms used include support vector machines (SVM), XGBoost, bagging, decision trees (DT), and random forests (RF). We evaluated the proposed heart disease prediction technique using a private dataset, a public dataset, and different cross-validation methods. We used the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance in the data and then identified the machine learning algorithm that achieves the most accurate heart disease predictions. Healthcare providers might identify early-stage heart disease quickly and cheaply with the proposed method. We have used the most effective ML algorithm to create a mobile app that instantly predicts heart disease based on the input symptoms.
The experimental results demonstrated that the XGBoost algorithm performed optimally when applied to the combined datasets and the SF-2 feature subset. It had 97.57% accuracy, 96.61% sensitivity, 90.48% specificity, 95.00% precision, a 92.68% F1 score, and a 98% AUC. We have developed an explainable AI method based on SHAP approaches to understand how the system makes its final predictions.
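The core idea of SMOTE, mentioned in the abstract, is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below shows only that interpolation step; the function name `smote_like`, its parameters, and the toy points are assumptions, and it is not a substitute for a tested library implementation.

```python
import random
from math import dist
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def smote_like(minority: Sequence[Vector], n_new: int, k: int = 2, seed: int = 0) -> List[Vector]:
    """Generate n_new synthetic points on segments between minority neighbors."""
    rng = random.Random(seed)
    synthetic: List[Vector] = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority-class neighbors of the chosen sample (itself excluded)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(p, base),
        )[:k]
        other = rng.choice(neighbors)
        u = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + u * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical minority-class samples with two features each.
minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_samples = smote_like(minority_samples, n_new=5)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by that class.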
Subject(s)
Algorithms , Heart Diseases , Machine Learning , Humans , Heart Diseases/diagnosis , Support Vector Machine , Artificial Intelligence
ABSTRACT
BACKGROUND: To investigate whether radiomics models derived from pretreatment CT could help to predict response to immunotherapy in oral squamous cell carcinoma (OSCC). METHODS: Retrospectively, a total of 40 patients with measurable OSCC were included. The patients were divided into responder group and non-responder group according to the comparison of pre-treatment and post-treatment CT findings. Radiomics features were extracted from pre-treatment CT images, and optimal features were selected by univariate analysis and the least absolute shrinkage and selection operator (LASSO) regression analysis. Neural network, support vector machine, random forest and logistic regression models were used to predict response to immunotherapy in OSCC, and leave-one-out cross validation was employed to assess the performance of the classifiers. The area under the curve (AUC), accuracy, sensitivity and specificity were calculated to quantify the predictive efficacy. RESULTS: A total of 7 features were selected to build models upon machine learning methods. By comparing different machine learning based models, the neural network model achieved the best predictive ability, with an AUC of 0.864, an accuracy of 82.5%, a sensitivity of 82.5%, and a specificity of 82.5%. CONCLUSIONS: The pretreatment CT-based radiomics model showed good performance in predicting response to immunotherapy in OSCC. Pretreatment CT-based radiomics model might provide an alternative approach for the selection of patients who benefit from immunotherapy.
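Leave-one-out cross-validation, used above because the cohort is small, trains on all patients but one and tests on the held-out patient, repeating this for every patient. The sketch below shows the loop with a one-feature nearest-centroid classifier as a stand-in; the classifier, the single feature, and the toy values are assumptions, not the models or radiomics features of the study.

```python
from statistics import mean
from typing import Dict, List


def fit_centroids(xs: List[float], ys: List[int]) -> Dict[int, float]:
    """Stand-in model: one mean feature value per class."""
    return {label: mean(x for x, y in zip(xs, ys) if y == label) for label in set(ys)}


def predict(centroids: Dict[int, float], x: float) -> int:
    """Assign the class whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def leave_one_out_accuracy(xs: List[float], ys: List[int]) -> float:
    """Hold out each sample once, train on the rest, and score the held-out one."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit_centroids(train_x, train_y)
        hits += predict(model, xs[i]) == ys[i]
    return hits / len(xs)


# Hypothetical feature values: 0 = non-responder, 1 = responder.
features = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
labels = [0, 0, 0, 1, 1, 1]
loo_accuracy = leave_one_out_accuracy(features, labels)
```

Any feature selection step would have to be repeated inside each fold to keep the held-out sample truly unseen.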
Subject(s)
Immunotherapy , Mouth Neoplasms , Tomography, X-Ray Computed , Humans , Male , Female , Mouth Neoplasms/diagnostic imaging , Mouth Neoplasms/therapy , Mouth Neoplasms/pathology , Tomography, X-Ray Computed/methods , Middle Aged , Retrospective Studies , Immunotherapy/methods , Carcinoma, Squamous Cell/diagnostic imaging , Carcinoma, Squamous Cell/therapy , Machine Learning , Neural Networks, Computer , Aged , Sensitivity and Specificity , Treatment Outcome , Adult , Support Vector Machine , Radiomics
ABSTRACT
This study investigates simulation of pharmaceutical separation via membrane distillation process by computational simulation and machine learning modeling strategy. The efficacy of three regression models, i.e., Multi-layer Perceptron (MLP), Gamma Regression, and Support Vector Regression (SVR) in predicting the solute concentration, C (mol/m³), was evaluated. The hyper-parameters were optimized by fine-tuning the models using the Red Deer Algorithm (RDA). Computational analyses were carried out for removal of pharmaceuticals from solution by membrane distillation in continuous mode. Mass transfer and machine learning models were implemented focusing on concentration of solute in the feed section of membrane. Results indicate that the Multi-layer Perceptron model achieved great accuracy with an R² of 0.9955, an MAE of 0.0084, and an RMSE of 0.0148, effectively capturing complex nonlinear relationships in the data. Gamma Regression also performed acceptably, with fitting R² of 0.9214, showing its suitability for positively skewed data. The Support Vector Regression model, while capturing the general trend, showed the lowest performance with an R² of 0.8710. These findings suggest that the Multi-layer Perceptron is the most accurate model for this dataset, followed by Gamma Regression and Support Vector Regression. This underscores the importance of careful model selection and optimization in regression analysis in combination with computational simulation of membrane processes.
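The three regression metrics reported above have standard definitions, sketched below in plain Python. The concentration values are made up for illustration and are not data from the study.

```python
from math import sqrt
from statistics import mean
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical true and predicted concentrations (mol/m^3).
c_true = [1.0, 2.0, 3.0]
c_pred = [1.0, 2.0, 4.0]
scores = (r_squared(c_true, c_pred), mae(c_true, c_pred), rmse(c_true, c_pred))
```

MAE and RMSE carry the units of the target, so they are only comparable across models fitted to the same concentration scale.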
Subject(s)
Distillation , Machine Learning , Distillation/methods , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/analysis , Algorithms , Computer Simulation , Support Vector Machine , Membranes, Artificial
ABSTRACT
An important abnormality in Optical Coherence Tomography (OCT) images is Hyper-Reflective Foci (HRF). This anomaly can be interpreted as a biomarker of serious retinal diseases such as Age-related Macular Degeneration (AMD) and Diabetic Macular Edema (DME) or the progression of disease from an early stage to a late one. In this paper, a new method is proposed for the identification of HRFs. The new method divides the OCT B-scan into patches and separately verifies each patch to determine whether or not the patch contains an HRF. The procedure of patch verification contains a texture-based framework which assigns appropriate labels according to intensity changes to each column and row. Then, a feature vector is extracted for each patch based on the assigned labels. The feature vectors are utilized in the training step of well-known classifiers like Support Vector Machine (SVM). Then, the classifiers are used to produce the labels for the test OCT images. The new method is evaluated on a public dataset including HRF labels. The experimental results show that the new method is capable of providing outstanding results in terms of speed and accuracy.
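The first step of the method, dividing a B-scan into patches that are verified separately, might look like the sketch below. Non-overlapping patches, the patch size, and the tiny integer "image" are assumptions; the abstract does not say whether patches overlap or how large they are, and the texture-based labeling step is not shown.

```python
from typing import List, Tuple

Image = List[List[int]]
Patch = Tuple[Tuple[int, int], Image]


def split_into_patches(image: Image, patch_h: int, patch_w: int) -> List[Patch]:
    """Cut an image into non-overlapping patches, keeping each top-left corner.

    Rows or columns that do not fill a whole patch are dropped.
    """
    rows, cols = len(image), len(image[0])
    patches: List[Patch] = []
    for top in range(0, rows - patch_h + 1, patch_h):
        for left in range(0, cols - patch_w + 1, patch_w):
            patch = [row[left:left + patch_w] for row in image[top:top + patch_h]]
            patches.append(((top, left), patch))
    return patches


# Hypothetical 4 x 6 grayscale B-scan with pixel values 0..23.
b_scan = [[r * 6 + c for c in range(6)] for r in range(4)]
patches = split_into_patches(b_scan, patch_h=2, patch_w=3)
```

Keeping the top-left coordinate with each patch allows a positive patch to be mapped back to its location in the B-scan.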
Subject(s)
Retina , Support Vector Machine , Tomography, Optical Coherence , Tomography, Optical Coherence/methods , Humans , Retina/diagnostic imaging , Retina/pathology , Macular Degeneration/diagnostic imaging , Macular Degeneration/pathology , Diabetic Retinopathy/diagnostic imaging , Macular Edema/diagnostic imaging , Algorithms , Image Processing, Computer-Assisted/methods , Retinal Diseases/diagnostic imaging , Retinal Diseases/pathology
ABSTRACT
BACKGROUND: Currently, there is still a lack of valuable neuroimaging markers to assess the clinical severity of stroke patients with small artery occlusion (SAO). Quantitative susceptibility mapping (QSM) is a quantitative processing method for neuroradiological diagnostics. Gray matter (GM) volume changes in stroke patients have also been shown to be associated with neurological deficits. This study aims to explore the predictive value of QSM and GM volume in neurological deficits of patients with SAO. METHODS: Neurological deficits were assessed using the National Institutes of Health Stroke Scale (NIHSS). Sixty-six SAO participants within 24 h of first onset were enrolled and divided into mild and moderate groups based on NIHSS. QSM values of infarct area and GM volume were calculated from magnetic resonance imaging (MRI) data. Two-sample t-tests were used to compare differences in QSM value and GM volume between the two groups, and the diagnostic efficacy of the combination of QSM value and GM volume was evaluated. RESULTS: Both the QSM value and the GM volume within the infarct area were lower in the moderate group than in the mild group. In the whole-brain voxel-wise analysis, the moderate group also exhibited lower GM volume in specific gyri than the mild group. The support vector machine (SVM) classifier showed high classification power for the combination of QSM value, GM volume within the infarct area, and voxel-wise GM volume. CONCLUSION: This study is the first to report that the combination of QSM value, GM volume within the infarct area, and voxel-wise GM volume could be used to predict neurological impairment of patients with SAO, which provides new insights for further understanding SAO stroke.
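The group comparison described in the methods relies on a two-sample t-test. The sketch below computes the t statistic in its Welch (unequal-variance) form, which is an assumption, since the abstract does not say which variant was used. The values are made up, and no p-value is computed.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample t statistic without assuming equal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical QSM values for the moderate and mild groups.
moderate_group = [1.0, 2.0, 3.0]
mild_group = [2.0, 3.0, 4.0]
t_stat = welch_t(moderate_group, mild_group)
```

A negative statistic here means the first group has the lower mean, matching the direction reported for the moderate group.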
Subject(s)
Gray Matter , Magnetic Resonance Imaging , Humans , Gray Matter/diagnostic imaging , Gray Matter/pathology , Male , Female , Magnetic Resonance Imaging/methods , Middle Aged , Aged , Severity of Illness Index , Stroke/diagnostic imaging , Support Vector Machine , Arterial Occlusive Diseases/diagnostic imaging , Arterial Occlusive Diseases/pathology
ABSTRACT
INTRODUCTION: Primary refractory disease affects 30-40% of patients diagnosed with diffuse large B-cell lymphoma (DLBCL) and is a significant challenge in disease management due to its poor prognosis. Predicting refractory status could greatly inform treatment strategies, enabling early intervention. Various options are now available based on patient and disease characteristics. Supervised machine-learning techniques, which can predict outcomes in a medical context, appear highly suitable for this purpose. DESIGN: Retrospective monocentric cohort study. PATIENT POPULATION: Adult patients with a first diagnosis of DLBCL admitted to the hematology unit from 2017 to 2022. AIM: We evaluated five supervised machine-learning (ML) models at our center as a tool for the prediction of primary refractory DLBCL. MAIN RESULTS: One hundred and thirty patients with DLBCL were included in this study between January 2017 and December 2022. The variables used for analysis included demographic characteristics, clinical condition, disease characteristics, first-line therapy, and whether a PET-CT scan was performed after 2 cycles of treatment. We compared five supervised ML models: support vector machine (SVM), Random Forest Classifier (RFC), Logistic Regression (LR), Naïve Bayes (NB) Categorical classifier, and eXtreme Gradient Boost (XGBoost), to predict primary refractory disease. The performance of these models was evaluated using the area under the receiver operating characteristic curve (ROC-AUC), accuracy, false positive rate, sensitivity, and F1-score to identify the best model. After a median follow-up of 19.5 months, the overall survival rate was 60% in the cohort. The overall survival at 3 years was 58.5% (95% CI, 51-68.5) and the 3-year progression-free survival was 63% (95% CI, 54-71) using the Kaplan-Meier method. Of the 124 patients who received a first-line treatment, primary refractory disease occurred in 42 patients (33.8%) and 2 patients (1.6%) experienced relapse within 6 months.
The univariate analysis of refractory disease status identified age (p = 0.009), Ann Arbor stage (p = 0.013), CMV infection (p = 0.012), comorbidity (p = 0.019), IPI score (p < 0.001), first line of treatment (p < 0.001), EBV infection (p = 0.008), and socioeconomic status (p = 0.02) as influencing factors. The NB Categorical classifier emerged as the top-performing model, with a ROC-AUC of 0.81 (95% CI, 0.64-0.96), an accuracy of 83%, an F1-score of 0.82, and a low false positive rate of 10% on the validation set. The eXtreme Gradient Boost (XGBoost) model and the Random Forest Classifier (RFC) followed with a ROC-AUC of 0.74 (95% CI, 0.52-0.93) and 0.67 (95% CI, 0.46-0.88) respectively, an accuracy of 78% and 72% respectively, an F1-score of 0.75 and 0.67 respectively, and a false positive rate of 10% for both. The other two models performed worse, with a ROC-AUC of 0.65 (95% CI, 0.40-0.87) and 0.45 (95% CI, 0.29-0.64) for SVM and LR respectively, an accuracy of 67% and 50% respectively, an F1-score of 0.64 and 0.43 respectively, and a false positive rate of 28% and 37% respectively. CONCLUSION: Machine learning algorithms, particularly the NB Categorical classifier, have the potential to improve the prediction of primary refractory disease in DLBCL patients, thereby providing a novel decision-making tool for managing this condition. Multicenter studies in larger cohorts are needed to validate these results.
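The validation metrics compared above (accuracy, sensitivity, false positive rate, F1-score) all derive from the four cells of a binary confusion matrix. A minimal sketch follows, using made-up counts rather than the study's data; the function name and dictionary keys are assumptions.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics from confusion-matrix counts (positive = refractory)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),           # true positive rate (recall)
        "false_positive_rate": fp / (fp + tn),   # equals 1 - specificity
        "f1": 2 * tp / (2 * tp + fp + fn),       # harmonic mean of precision and recall
    }


# Hypothetical validation-set counts.
metrics = binary_metrics(tp=8, fp=1, tn=9, fn=2)
```

Reporting the false positive rate alongside sensitivity shows how many non-refractory patients a model would wrongly flag.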
Subject(s)
Lymphoma, Large B-Cell, Diffuse , Machine Learning , Humans , Male , Lymphoma, Large B-Cell, Diffuse/diagnosis , Lymphoma, Large B-Cell, Diffuse/pathology , Lymphoma, Large B-Cell, Diffuse/mortality , Female , Middle Aged , Aged , Retrospective Studies , Adult , Prognosis , Aged, 80 and over , Support Vector Machine , ROC Curve , Cohort Studies , Positron Emission Tomography Computed Tomography
ABSTRACT
Phospholipids are the main building components of cell membranes and are also used for cell signaling and as energy storages. Cancer cells alter their lipid metabolism, which ultimately leads to an increase in phospholipids in cancer tissue. Surgical energy instruments use electrical or vibrational energy to heat tissues, which causes intra- and extracellular water to expand rapidly and degrade cell structures, bursting the cells, which causes the formation of a tissue aerosol or smoke depending on the amount of energy used. This gas phase analyte can then be analyzed via gas analysis methods. Differential mobility spectrometry (DMS) is a method that can be used to differentiate malignant tissue from benign tissues in real time via the analysis of surgical smoke produced by energy instruments. Previously, the DMS identification of cancer tissue was based on a 'black box method' by differentiating the 2D dispersion plots of samples. This study sets out to find datapoints from the DMS dispersion plots that represent relevant target molecules. We studied the ability of DMS to differentiate three subclasses of phospholipids (phosphatidylcholine, phosphatidylinositol, and phosphatidylethanolamine) from a control sample using a bovine skeletal muscle matrix with a 5 mg addition of each phospholipid subclass to the sample matrix. We trained binary classifiers using linear discriminant analysis (LDA) and support vector machines (SVM) for sample classification. We were able to identify phosphatidylcholine, -inositol, and -ethanolamine with SVM binary classification accuracies of 91%, 73%, and 66% and with LDA binary classification accuracies of 82%, 74%, and 72%, respectively. 
Phosphatidylcholine was detected with a reliable classification accuracy. In future studies, ion separation setups should be adjusted to reliably detect other relevant phospholipids, such as phosphatidylinositol and phosphatidylethanolamine, to improve DMS as a microanalysis method, and to identify other phospholipids relevant to cancer tissue.
Subject(s)
Ion Mobility Spectrometry , Neoplasms , Phospholipids , Ion Mobility Spectrometry/methods , Phospholipids/metabolism , Phospholipids/analysis , Neoplasms/metabolism , Animals , Support Vector Machine , Cattle , Discriminant Analysis , Humans , Muscle, Skeletal/metabolism , Phosphatidylethanolamines/metabolism , Phosphatidylethanolamines/analysis
ABSTRACT
A core function of the olfactory system is to determine the valence of odors. In humans, central processing of odor valence perception has been shown to take form already within the olfactory bulb (OB), but the neural mechanisms by which this important information is communicated to, and from, the olfactory cortex (piriform cortex, PC) are not known. To assess communication between the 2 nodes, we simultaneously measured odor-dependent neural activity in the OB and PC from human participants while obtaining trial-by-trial valence ratings. By doing so, we could determine when subjective valence information was communicated, what kind of information was transferred, and how the information was transferred (i.e., in which frequency band). Support vector machine (SVM) learning was used on the coherence spectrum and frequency-resolved Granger causality to identify valence-dependent differences in functional and effective connectivity between the OB and PC. We found that the OB communicates subjective odor valence to the PC in the gamma band shortly after odor onset, while the PC subsequently feeds broader valence-related information back to the OB in the beta band. Decoding accuracy was better for negative than positive valence, suggesting a focus on negative valence. Critically, we replicated these findings in an independent data set using additional odors across a larger perceived valence range. Combined, these results demonstrate that the OB and PC communicate levels of subjective odor pleasantness across multiple frequencies, at specific time points, in a direction-dependent pattern in accordance with a two-stage model of odor processing.
Subject(s)
Odorants , Olfactory Bulb , Olfactory Perception , Piriform Cortex , Humans , Male , Piriform Cortex/physiology , Olfactory Bulb/physiology , Female , Adult , Young Adult , Olfactory Perception/physiology , Beta Rhythm/physiology , Gamma Rhythm/physiology , Support Vector Machine , Smell/physiology
ABSTRACT
PURPOSE: To predict bone marrow metastasis in neuroblastoma using contrast-enhanced computed tomography (CECT) radiomics features and explainable machine learning. METHODS: This cohort study retrospectively included a total of 345 neuroblastoma patients who underwent testing for bone marrow metastatic status. Tumor lesions on CECT images were delineated by two radiologists, and 1409 radiomics features were extracted. Correlation analysis, Least Absolute Shrinkage and Selection Operator regression, and one-way analysis of variance were used to identify radiomics features associated with bone marrow metastasis. A predictive model for bone marrow metastasis was then developed using the support vector machine algorithm based on the selected radiomics features. The performance of the radiomics model was evaluated using the area under the curve (AUC), 95% confidence interval (CI), accuracy, sensitivity, and specificity. RESULTS: The radiomics model included 16 features, with a predominant focus on texture features (12/16, 75%). In the training set, the model demonstrated an AUC of 0.891 (95% CI: 0.848-0.933), an accuracy of 0.831 (95% CI: 0.829-0.832), a sensitivity of 0.893 (95% CI: 0.840-0.946), and a specificity of 0.757 (95% CI: 0.677-0.837). In the test set, the AUC, accuracy, sensitivity, and specificity were 0.807 (95% CI: 0.720-0.893), 0.767 (95% CI: 0.764-0.770), 0.696 (95% CI: 0.576-0.817), and 0.851 (95% CI: 0.749-0.953), respectively. CONCLUSION: Radiomics features extracted from CECT images are associated with the presence of bone marrow metastasis in neuroblastoma, providing potential new imaging biomarkers for predicting bone marrow metastasis in this disease.
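The AUC reported above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition; the scores are made up, and the confidence intervals given in the abstract are not reproduced here.

```python
from typing import Sequence


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores for patients with and without bone marrow metastasis.
with_metastasis = [0.9, 0.8]
without_metastasis = [0.1, 0.85]
auc = roc_auc(with_metastasis, without_metastasis)
```

The pairwise form is quadratic in the number of samples, which is fine for a cohort of a few hundred patients; rank-based formulas give the same value faster.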
Subject(s)
Bone Marrow Neoplasms , Contrast Media , Machine Learning , Neuroblastoma , Tomography, X-Ray Computed , Humans , Neuroblastoma/diagnostic imaging , Neuroblastoma/pathology , Female , Tomography, X-Ray Computed/methods , Male , Bone Marrow Neoplasms/secondary , Bone Marrow Neoplasms/diagnostic imaging , Child , Child, Preschool , Infant , Retrospective Studies , ROC Curve , Algorithms , Adolescent , Support Vector Machine , Bone Marrow/pathology , Bone Marrow/diagnostic imaging , Radiomics
ABSTRACT
In recent years, statistics and machine learning methods have been widely used to analyze the relationship between human gut microbial metagenome and metabolic diseases, which is of great significance for the functional annotation and development of microbial communities. In this study, we proposed a new and scalable framework for image enhancement and deep learning of gut metagenome, which could be used in the classification of human metabolic diseases. Each data sample in three representative human gut metagenome datasets was transformed into an image, enhanced, and fed into the machine learning models of logistic regression (LR), support vector machine (SVM), Bayesian network (BN) and random forest (RF), and the deep learning models of multilayer perceptron (MLP) and convolutional neural network (CNN). Disease-prediction performance was evaluated by accuracy (A), precision (P), recall (R), F1 score (F1), area under the ROC curve (AUC) and 10-fold cross-validation. The results showed that the overall performance of the MLP model was better than that of CNN, LR, SVM, BN, RF and PopPhy-CNN, and the performance of the MLP and CNN models was further improved after data enhancement (random rotation and adding salt-and-pepper noise). The accuracy of the MLP model in disease prediction was further improved by 4%-11%, F1 by 1%-6% and AUC by 5%-10%. The above results showed that human gut metagenome image enhancement and deep learning could accurately extract microbial characteristics and effectively predict the host disease phenotype. The source code and datasets used in this study can be publicly accessed at https://github.com/HuaXWu/GM_ML_Classification.git.
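The two enhancement operations named above, random rotation and salt-and-pepper noise, can be sketched on a small grayscale image as follows. Restricting rotation to multiples of 90 degrees, the 0/255 noise values, and the function names are simplifying assumptions; this is not the code from the linked repository.

```python
import random
from typing import List

Image = List[List[int]]


def rotate_90(image: Image, turns: int) -> Image:
    """Rotate clockwise by 90 degrees, repeated turns times."""
    out = [list(row) for row in image]
    for _ in range(turns % 4):
        out = [list(row) for row in zip(*out[::-1])]
    return out


def salt_and_pepper(image: Image, amount: float, rng: random.Random,
                    low: int = 0, high: int = 255) -> Image:
    """Replace each pixel with low or high with probability amount."""
    noisy: Image = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < amount:
                new_row.append(rng.choice((low, high)))
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy


# Hypothetical 2 x 2 image derived from one metagenome sample.
rng = random.Random(0)
sample_image = [[1, 2], [3, 4]]
augmented = salt_and_pepper(rotate_90(sample_image, rng.randrange(4)), 0.05, rng)
```

Augmented copies should be generated from the training folds only, so that the cross-validation estimate is not inflated by near-duplicates of test samples.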
Subject(s)
Deep Learning , Gastrointestinal Microbiome , Metabolic Diseases , Metagenome , Support Vector Machine , Humans , Gastrointestinal Microbiome/genetics , Metabolic Diseases/genetics , Metabolic Diseases/microbiology , Neural Networks, Computer , Bayes Theorem
ABSTRACT
Anti-nutrient factors are inherently present in almost all major crops and impede the absorption of crucial vitamins and minerals upon human consumption. The anti-nutrients commonly found in food crops include saponins, tannins, lectins, and phytates. Currently, there is no computational server for identifying proteins that encode anti-nutritional factors in plants. Consequently, this study presents a computational approach aimed at distinguishing between proteins encoding anti-nutritional factors and those providing essential nutrients. In this work, machine learning algorithms have been employed to identify plant-specific anti-nutrient factor proteins from protein sequences by using compositional features. Extreme gradient boosting achieved a five-fold cross-validation training performance of 94.34% AUC-ROC and 94.13% AUC-PR, surpassing other methods such as support vector machine, random forest, and adaptive boosting. These results suggest the proposed approach is highly reliable in predicting plant-specific anti-nutritional factor proteins. The resulting prediction models have led to the development of an online server named ANPS, freely available at https://nipb-bi.icar.gov.in.
Subject(s)
Machine Learning , Plant Proteins , Plant Proteins/metabolism , Plant Proteins/genetics , Support Vector Machine , Software , Algorithms
ABSTRACT
Emotions play a vital role in recognizing a person's thoughts and vary significantly with stress levels. Emotion and stress classification have gained considerable attention in robotics and artificial intelligence applications. While numerous methods based on machine learning techniques provide average classification performance, recent deep learning approaches offer enhanced results. This research presents a hybrid deep learning model that extracts features using AlexNet and DenseNet models, followed by feature fusion and dimensionality reduction via Principal Component Analysis (PCA). The reduced features are then classified using a multi-class Support Vector Machine (SVM) to categorize different types of emotions. The proposed model was evaluated using the DEAP and EEG Brainwave datasets, both well-suited for emotion analysis due to their comprehensive EEG signal recordings and diverse emotional stimuli. The DEAP dataset includes EEG signals from 32 participants who watched 40 one-minute music videos, while the EEG Brainwave dataset categorizes emotions into positive, negative, and neutral based on EEG recordings from participants exposed to six different film clips. The proposed model achieved an accuracy of 95.54% and 97.26% for valence and arousal categories in the DEAP dataset, respectively, and 98.42% for the EEG Brainwave dataset. These results significantly outperform existing methods, demonstrating the model's superior performance in terms of precision, recall, F1-score, specificity, and Matthews correlation coefficient. The integration of AlexNet and DenseNet, combined with PCA and multi-class SVM, makes this approach particularly effective for capturing the intricate patterns in EEG data, highlighting its potential for applications in human-computer interaction and mental health monitoring, marking a significant advancement over traditional methods.
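Of the evaluation metrics listed above, the Matthews correlation coefficient is the least familiar, so a short sketch of its binary form follows. The study is multi-class and the abstract does not say how the coefficient was generalized, so the binary formula and the example counts are assumptions for illustration only.

```python
from math import sqrt


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Binary Matthews correlation coefficient from confusion-matrix counts."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for one class treated as positive versus the rest.
perfect = matthews_corrcoef(tp=2, fp=0, tn=2, fn=0)
chance = matthews_corrcoef(tp=1, fp=1, tn=1, fn=1)
```

The coefficient ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance than accuracy alone.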
Subject(s)
Deep Learning , Electroencephalography , Emotions , Support Vector Machine , Humans , Emotions/physiology , Electroencephalography/methods , Principal Component Analysis , Male , Female , Adult , Neural Networks, Computer
ABSTRACT
In previous real-time functional magnetic resonance imaging neurofeedback (rtfMRI-NF) studies on smoking craving, the focus has been on within-region activity or between-region connectivity, neglecting the potential predictive utility of broader network activity. Moreover, there is debate over the use and relative predictive power of individual-specific and group-level classifiers. This study aims to further advance rtfMRI-NF for substance use disorders by using whole-brain rtfMRI-NF to assess smoking craving-related brain patterns, evaluate the performance of group-level or individual-level classification (n = 31) and evaluate the performance of an optimized classifier across repeated NF runs. Using real-time individual-level classifiers derived from whole-brain support vector machines, we found that classification accuracy between crave and no-crave conditions and between repeated NF runs increased across repeated runs at both individual and group levels. In addition, individual-level accuracy was significantly greater than group-level accuracy, highlighting the potential increased utility of an individually trained whole-brain classifier for volitional control over brain patterns to regulate smoking craving. This study provides evidence supporting the feasibility of using whole-brain rtfMRI-NF to modulate smoking craving-related brain responses and the potential for learning individual strategies through optimization across repeated feedback runs. This article is part of the theme issue 'Neurofeedback: new territories and neurocognitive mechanisms of endogenous neuromodulation'.
Subject(s)
Brain , Craving , Magnetic Resonance Imaging , Neurofeedback , Neurofeedback/methods , Humans , Magnetic Resonance Imaging/methods , Craving/physiology , Adult , Male , Female , Brain/diagnostic imaging , Brain/physiology , Young Adult , Support Vector Machine , Smoking
ABSTRACT
The prevalence of depression has increased dramatically over the last several decades: it is frequently overlooked and can have a significant impact on both physical and mental health. Therefore, it is crucial to develop an automated detection system that can instantly identify whether a person is depressed. Currently, machine learning (ML) and artificial neural networks (ANNs) are among the most promising approaches for developing automated computer-based systems to predict several mental health issues, such as depression. This study proposes an ensemble of hybrid model-based techniques that aims to build a strong detection model that considers many psychological and sociodemographic characteristics of an individual to detect whether a person is depressed. Support vector machines (SVM) and multilayer perceptrons (MLP) are the two fundamental methods used to construct the suggested ensemble approach. The hybrid DeprMVM served as a meta-learner. In this study, the hybrid DeprMVM is a level-1 learner, whereas the SVM and MLP networks are level-0 learners. After the classifiers are trained and tested at level 0, their outputs are based on both the independent and dependent variables in the new data set that was used to train the meta-classifier. The training data class imbalance was reduced by applying the synthetic minority oversampling technique (SMOTE) and cluster sampling together, which improved the accuracy for detecting depression. Additionally, this combination can effectively reduce the risk of over-fitting that comes from simply duplicating data points. To further confirm the effectiveness of the proposed method, various performance evaluation metrics were calculated and compared with previous studies conducted on this specific dataset. In conclusion, among all the techniques for identifying depression, the suggested ensemble approach had the best accuracy, at 99.39%, and an F1-score of 99.51%.
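The over-fitting remark above rests on the fact that SMOTE creates new minority samples by interpolation rather than by copying existing ones. The following is a minimal sketch of that interpolation step only. The function name, parameters, and data points are hypothetical, and the sketch does not reproduce the cluster sampling or any other detail of the study's pipeline.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 1.8), (1.2, 2.0)]
new_points = smote_like_oversample(minority, n_new=4)
print(new_points)
```

Because each synthetic point is a blend of two real samples, the classifier sees varied minority examples instead of exact repeats.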
Subject(s)
Depression , Neural Networks, Computer , Support Vector Machine , Humans , Depression/diagnosis , Female , Male , Early Diagnosis , Adult , Middle Aged , Machine Learning
ABSTRACT
This study aims to develop machine learning (ML)-assisted models for analyzing datasets related to Gleason scores in prostate cancer, conducting statistical analyses on the datasets, and identifying meaningful features. We retrospectively collected data from 717 hormone-sensitive prostate cancer (HSPC) patients at Yunnan Cancer Hospital. Of these, data from 526 patients were used for modeling. Seven auxiliary models were established using Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), Extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), and artificial neural network (ANN) based on 21 clinical biochemical indicators and features. Evaluation metrics for the models primarily included accuracy (ACC), precision (PRE), specificity (SPE), sensitivity (SEN, also called recall), F1 score, and area under the curve (AUC). Evaluation metrics were visualized using confusion matrices and ROC curves. Among the ensemble learning methods, RF, XGBoost, and AdaBoost performed best. RF achieved a training dataset score of 0.769 (95% CI: 0.759-0.835) and a testing dataset score of 0.755 (95% CI: 0.660-0.760) (AUC: 0.786, 95% CI: 0.722-0.803), while XGBoost achieved a training dataset score of 0.755 (95% CI: 0.711-0.809) and a testing dataset score of 0.745 (95% CI: 0.660-0.764) (AUC: 0.777, 95% CI: 0.726-0.798). AdaBoost scored 0.789 on the training dataset (95% CI: 0.782-0.857) and 0.774 on the testing dataset (95% CI: 0.651-0.774) (AUC: 0.799, 95% CI: 0.703-0.802). In terms of feature importance (FI) in ensemble learning, bone metastases at first visit, prostatic volume, age, and T1-T2 stage account for significant proportions of RF's FI. fPSA, TPSA, and tumor burden account for significant proportions of AdaBoost's FI, while f/TPSA, LDH, and testosterone have the highest proportions in XGBoost.
Our findings indicate that ensemble learning methods demonstrate good performance in classifying HSPC patient data, with TNM staging and fPSA being important classification indicators. These discoveries provide valuable references for distinguishing different Gleason scores, facilitating more accurate patient assessments and personalized treatment plans.
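The AUC values reported above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition, assuming a binary split of the Gleason groups. The function name, labels, and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = positive class) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients. A rank-based formulation gives the same value more efficiently on larger datasets.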
Subject(s)
Machine Learning , Neoplasm Grading , Prostatic Neoplasms , Humans , Male , Prostatic Neoplasms/pathology , Prostatic Neoplasms/diagnosis , Retrospective Studies , Middle Aged , Aged , ROC Curve , Support Vector Machine , Neural Networks, Computer
ABSTRACT
Visible imaging is a fast, cheap, and accurate technique for assessing food quality and safety. The technique was used in the present research to detect sea foam adulterant levels in black and red peppers. The fraud levels included 0, 5, 15, 30, and 50%. The workflow comprised sample preparation, image acquisition and preprocessing, and feature engineering (feature extraction, selection, and classification). The efficient features were classified using artificial neural networks and support vector machine methods. The classifiers were evaluated using the specificity, sensitivity, precision, and accuracy metrics. The artificial neural networks gave better results than the support vector machine method for classifying adulterant levels in black pepper, with metric values of 98.89, 95.67, 95.56, and 98.22%, respectively. Conversely, the support vector machine method gave higher metric values (99.46, 98.00, 97.78, and 99.11%, respectively) for red pepper. The results showed the ability of visible imaging and machine learning methods to detect fraud levels in black and red pepper.
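With five adulterant levels, specificity, sensitivity, precision, and accuracy are usually derived per class in a one-vs-rest fashion and then averaged. The abstract does not state the averaging scheme, so the following is a minimal sketch under that assumption. The function names and the confusion matrix are hypothetical and do not come from the study.

```python
def _ratio(num, den):
    """Safe division that returns 0.0 for an empty denominator."""
    return num / den if den else 0.0


def one_vs_rest_metrics(cm):
    """Macro-averaged metrics from a square confusion matrix, where
    cm[i][j] counts samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    per_class = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        tn = total - tp - fn - fp
        per_class.append({
            "specificity": _ratio(tn, tn + fp),
            "sensitivity": _ratio(tp, tp + fn),
            "precision": _ratio(tp, tp + fp),
            "accuracy": _ratio(tp + tn, total),
        })
    # Macro average: unweighted mean over the classes.
    return {key: sum(m[key] for m in per_class) / n for key in per_class[0]}


# Hypothetical matrix for the five levels (0, 5, 15, 30, 50%), 10 samples each.
cm = [
    [10, 0, 0, 0, 0],
    [1, 9, 0, 0, 0],
    [0, 0, 10, 0, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 10],
]
metrics = one_vs_rest_metrics(cm)
print({key: round(value, 4) for key, value in metrics.items()})
```

With balanced classes, true negatives dominate each one-vs-rest table, so averaged accuracy and specificity sit above averaged sensitivity. The reported values show the same ordering.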
Subject(s)
Capsicum , Neural Networks, Computer , Piper nigrum , Support Vector Machine , Fraud/prevention & control , Food Contamination/analysis , Food Quality
ABSTRACT
Background and Objectives: Intra/postpartum hemorrhage stands as a significant obstetric emergency, ranking among the top five leading causes of maternal mortality. The aim of this study was to assess the predictive performance of four machine learning algorithms for the prediction of postpartum and intrapartum hemorrhage. Materials and Methods: A prospective multicenter study was conducted, involving 203 patients with or without intra/postpartum hemorrhage within the initial 24 h postpartum. The participants were categorized into two groups: those with intra/postpartum hemorrhage (PPH) and those without PPH (control group). The PPH group was further stratified into four classes following the Advanced Trauma Life Support guidelines. Clinical data collected from these patients were used as input to four machine learning-based algorithms whose predictive performance was assessed. Results: The Naïve Bayes (NB) algorithm exhibited the highest accuracy in predicting PPH, with a sensitivity of 96.3%, an accuracy of 98.6%, and a false negative rate of 3.7%. Following closely were the Decision Tree (DT) and Random Forest (RF) algorithms, each achieving sensitivities exceeding 94% with a false negative rate of 5.9%. Regarding severity classification I, the NB and Support Vector Machine (SVM) algorithms demonstrated superior predictive capabilities, achieving a sensitivity of 96.4%, an accuracy of 92.1%, and a false negative rate of 3.6%. The most severe manifestations of PPH were most accurately predicted by the NB algorithm, with a sensitivity of 89.3%, an accuracy of 82.4%, and a false negative rate of 10.7%. Conclusions: The NB algorithm demonstrated the highest accuracy in predicting PPH. A notable discrepancy in algorithm performance was observed between mild and severe forms, with the NB and SVM algorithms displaying superior sensitivity and lower rates of false negatives, particularly for mild forms.
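Sensitivity and false negative rate are complements: each false negative rate quoted above equals 100% minus the matching sensitivity. The following short sketch checks that relationship against the pairs stated in the abstract. The function name and dictionary labels are hypothetical, while the percentages are the ones reported above.

```python
import math


def false_negative_rate(sensitivity_pct: float) -> float:
    """False negative rate in percent, given sensitivity in percent."""
    return 100.0 - sensitivity_pct


# (sensitivity %, false negative rate %) pairs quoted in the abstract.
reported = {
    "PPH, Naive Bayes": (96.3, 3.7),
    "severity class I, NB and SVM": (96.4, 3.6),
    "most severe forms, Naive Bayes": (89.3, 10.7),
}

consistent = {
    name: math.isclose(false_negative_rate(sens), fnr, abs_tol=0.05)
    for name, (sens, fnr) in reported.items()
}
print(consistent)
```

The same relation implies a sensitivity of 94.1% for the DT and RF models, which agrees with the abstract's "exceeding 94%".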