1.
Sci Rep ; 14(1): 16328, 2024 07 15.
Article in English | MEDLINE | ID: mdl-39009760

ABSTRACT

This study employs machine learning to detect the severity of major depressive disorder (MDD) through binary and multiclass classifications. We compared models that used only biomarkers of oxidative stress with those that incorporate sociodemographic and health-related factors. Data collected from 830 participants, based on the Patient Health Questionnaire (PHQ-9) score, inform our analysis. In binary classification, the Random Forest (RF) classifier achieved the highest Area Under the Curve (AUC) of 0.84 when all features were included. In multiclass classification, the AUC improved from 0.84 with only oxidative stress biomarkers to 0.88 when all characteristics were included. To address data imbalance, weighted classifiers and the Synthetic Minority Over-sampling Technique (SMOTE) were applied. Weighted random forest (WRF) improved multiclass classification, achieving an AUC of 0.91. Statistical tests, including the Friedman test and the Conover post-hoc test, confirmed significant differences between model performances, with WRF using all features outperforming others. Feature importance analysis shows that oxidative stress biomarkers, particularly GSH, are top ranked among all features. Clinicians can leverage the results of this study to improve their decision-making processes by incorporating oxidative stress biomarkers in addition to the standard criteria for depression diagnosis.


Subject(s)
Biomarkers , Depressive Disorder, Major , Machine Learning , Oxidative Stress , Humans , Female , Depressive Disorder, Major/diagnosis , Male , Adult , Middle Aged , Severity of Illness Index , Area Under Curve , Depression/diagnosis , Random Forest
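
The abstract above mentions weighted classifiers for imbalanced severity classes but does not say how the weights were chosen. As a rough illustration only, the sketch below computes inverse-frequency ("balanced") class weights, a common heuristic. The function name and the label counts are hypothetical and are not taken from the study.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes get larger weights, so their errors count more in training.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical severity labels (assumption: illustrative counts, not study data).
severity_labels = ["none"] * 6 + ["mild"] * 3 + ["severe"] * 1
class_weights = balanced_class_weights(severity_labels)

for cls, weight in sorted(class_weights.items(), key=lambda item: item[1]):
    print(f"{cls}: {weight:.3f}")
```

A weighted classifier would multiply each sample's contribution to the training loss (or split criterion) by the weight of its class. SMOTE, by contrast, rebalances the data itself by synthesizing minority-class samples.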
2.
Article in English | MEDLINE | ID: mdl-38691429

ABSTRACT

DNA damage is a critical factor in the onset and progression of cancer. When DNA is damaged, the number of genetic mutations increases, making it necessary to activate DNA repair mechanisms. A crucial factor in the base excision repair process, which helps maintain the stability of the genome, is an enzyme called DNA polymerase β (Pol β) encoded by the POLB gene. It plays a vital role in the repair of damaged DNA. Additionally, variations known as Single Nucleotide Polymorphisms (SNPs) in the POLB gene can potentially affect the ability to repair DNA. This study uses bioinformatics tools that extract important features from SNPs to construct a feature matrix, which is then used in combination with machine learning algorithms to predict the likelihood of developing cancer associated with a specific mutation. Eight different machine learning algorithms were used to investigate the relationship between POLB gene variations and their potential role in cancer onset. This study not only highlights the complex link between POLB gene SNPs and cancer, but also underscores the effectiveness of machine learning approaches in genomic studies, paving the way for advanced predictive models in genetic and cancer research.

3.
Sci Rep ; 14(1): 12269, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38806584

ABSTRACT

Solar power is a promising source of energy that is environmentally friendly, sustainable, and renewable. Solar photovoltaic (PV) panels are the most common and mature technology used to harness solar energy. Unfortunately, these panels are prone to dust accumulation, which can have a significant impact on their efficiency. To maintain their effectiveness, solar photovoltaic panels must be cleaned regularly. Eight main techniques are used to clean solar panels: natural, manual, mechanical, robotic, drone, coating, electrical, and acoustic. This study aims to identify the best cleaning method using multiple criteria decision-making (MCDM) techniques. Using the Analytical Hierarchy Process (AHP), Quality Function Deployment (QFD), Fuzzy Technique for Order of Preference by Similarities to Ideal Solution (FTOPSIS), and Preference Selection Index (PSI), this research evaluates all eight cleaning methods based on several criteria that are categorized under cost, performance, resource requirement, and safety in Abu Dhabi. The data are collected from surveys completed by experts in solar and sustainable energy. The AHP, QFD, and PSI results identified natural, manual, and surface coating as the best and most effective cleaning methods. Natural cleaning involves using rainwater primarily to remove dirt and dust; manual cleaning requires cleaning agents and wiping cloths; and surface coatings involve applying a layer of hydrophobic material to the panels to repel dust. Identifying the most effective cleaning method for dust removal from solar panels can ensure optimal efficiency recovery at minimal costs and resources.
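
To make the AHP step mentioned above more concrete, here is a minimal sketch of deriving criterion weights from a pairwise comparison matrix with the row geometric-mean approximation. The comparison values are invented for illustration and are not the experts' survey responses. The abstract does not say which AHP variant the study used, so the geometric-mean method is an assumption.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights via normalized row geometric means."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# The four criterion categories named in the abstract.
criteria = ["cost", "performance", "resource requirement", "safety"]

# Hypothetical judgments: entry [i][j] says how much more important
# criterion i is than criterion j; entry [j][i] is its reciprocal.
pairwise_matrix = [
    [1.0, 2.0, 4.0, 4.0],
    [0.5, 1.0, 2.0, 2.0],
    [0.25, 0.5, 1.0, 1.0],
    [0.25, 0.5, 1.0, 1.0],
]

criteria_weights = dict(zip(criteria, ahp_weights(pairwise_matrix)))
for name, weight in criteria_weights.items():
    print(f"{name}: {weight:.3f}")
```

In a full AHP analysis the matrix would also be checked for consistency before the weights are used to score the eight cleaning methods.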

4.
Article in English | MEDLINE | ID: mdl-38082683

ABSTRACT

Major depressive disorder is one of the major contributors to disability worldwide with an estimated prevalence of 4%. Depression is a heterogeneous disease often characterized by an undefined pathogenesis and multifactorial phenotype that complicate diagnosis and follow-up. Translational research and identification of objective biomarkers including inflammation can assist clinicians in diagnosing depression and disease progression. Investigating inflammation markers using machine learning methods combines recent understanding of the pathogenesis of depression associated with inflammatory changes as part of chronic disease progression that aims to highlight complex interactions. In this paper, 721 patients attending a diabetes health screening clinic (DiabHealth) were classified into no depression (none) to minimal depression (none-minimal), mild depression, and moderate to severe depression (moderate-severe) based on the Patient Health Questionnaire (PHQ-9). Logistic Regression, K-nearest Neighbors, Support Vector Machine, Random Forest, Multi-layer Perceptron, and Extreme Gradient Boosting were applied and compared to predict depression level from inflammatory marker data that included C-reactive protein (CRP), Interleukin (IL)-6, IL-1β, IL-10, Complement Component 5a (C5a), D-Dimer, Monocyte Chemoattractant Protein (MCP)-1, and Insulin-like Growth Factor (IGF)-1. MCP-1 and IL-1β were the most significant inflammatory markers for the classification performance of depression level. Extreme Gradient Boosting outperformed the other models, achieving the highest accuracy and Area Under the Receiver Operator Curve (AUC) of 0.89 and 0.95, respectively. Clinical Relevance: The findings of this study show the potential of machine learning models to aid in clinical practice, leading to a more objective assessment of depression level based on the involvement of MCP-1 and IL-1β inflammatory markers with disease progression.


Subject(s)
Depressive Disorder, Major , Humans , Depression/diagnosis , Inflammation/diagnosis , Ambulatory Care Facilities , Disease Progression
5.
Article in English | MEDLINE | ID: mdl-38083345

ABSTRACT

In this study, depression severity was defined by the Patient Health Questionnaire (PHQ-9) and five machine learning algorithms were applied to classify depression severity in the presence of diabetes mellitus (DM), cardiovascular disease (CVD), and hypertension (HT) utilizing oxidative stress (OS) biomarkers (8-isoprostane, 8-hydroxydeoxyguanosine, reduced glutathione and oxidized glutathione), demographic details, and medication for eight hundred and thirty participants. The results show that the Random Forest (RF) outperformed other classifiers with the highest accuracy of 92% in a 4-class depression classification when considering all OS biomarkers along with DM, CVD and HT. RF also achieved the highest accuracy of 91% in 3-class classification when studying depression in the presence of DM only and an accuracy of 88% and 87% in 5-class classification when investigating depression with CVD and HT, respectively. Moreover, RF performed best in the 3-class depression model with an accuracy of 85% when examining depression severity in the presence of OS biomarkers only. Our findings suggest that depression severity can be accurately identified with RF as a base classifier and that OS is a major contributor to depression severity in the presence of comorbidities. Biomarker analysis can supplement DSM-5-based diagnostics as part of personalized medicine, especially as point-of-care testing has become available for many of the given OS biomarkers. Clinical Relevance: Depression is the most common form of psychiatric disorder that has an oxidative stress etiology. Current diagnosis relies primarily on the Diagnostic and Statistical Manual for Mental Disorders (DSM-5), which may be too general and not informative for optimal multi-comorbidity diagnostics and treatment. Understanding the role of oxidative stress associated with depression can provide additional information for timely detection, comprehensive assessment, and appropriate intervention of depression illness.


Subject(s)
Cardiovascular Diseases , Diabetes Mellitus , Hypertension , Humans , Depression/diagnosis , Cardiovascular Diseases/complications , Cardiovascular Diseases/diagnosis , Comorbidity , Diabetes Mellitus/diagnosis , Hypertension/complications , Hypertension/diagnosis
6.
Biology (Basel) ; 12(4)2023 Mar 29.
Article in English | MEDLINE | ID: mdl-37106719

ABSTRACT

Gene expression profiling is one of the most recognized techniques for inferring gene regulators and their potential targets in gene regulatory networks (GRN). The purpose of this study is to build a regulatory network for the budding yeast Saccharomyces cerevisiae genome by incorporating the use of RNA-seq and microarray data represented by a wide range of experimental conditions. We introduce a pipeline for data analysis, data preparation, and training models. Several kernel classification models, including one-class, two-class, and rare event classification methods, are used to categorize genes. We test the impact of the normalization techniques on the overall performance of RNA-seq. Our findings provide new insights into the interactions between genes in the yeast regulatory network. The conclusions of our study are significant because they highlight the effectiveness of classification and its contribution towards enhancing the present comprehension of the yeast regulatory network. When assessed, our pipeline demonstrates strong performance across different statistical metrics, such as a 99% recall rate and a 98% AUC score.

7.
PLoS One ; 18(1): e0278237, 2023.
Article in English | MEDLINE | ID: mdl-36662704

ABSTRACT

The COVID-19 pandemic has significantly affected all spheres of life, including the healthcare workforce. While the COVID-19 pandemic has started driving organizational and societal shifts, it is vital for healthcare organizations and decision-makers to analyze patterns in the changing workforce. In this study, we aim to identify patterns in healthcare job postings during the pandemic to understand which jobs and associated skills are trending after the advent of COVID-19. Content analysis of job postings was conducted using data-driven approaches over two time intervals in the pandemic. The proposed framework utilizes Latent Dirichlet Allocation (LDA) for topic modeling to evaluate the patterns in job postings in the US and the UK. The most demanded jobs, skills and tasks for the US job postings are presented based on job posting data from popular job posting websites. This is obtained by mapping the job postings to the jobs, skills and tasks defined in the O*NET database for the healthcare occupations in the US. The topic modeling results clearly show increased hiring for telehealth services in both the US and UK. This study also presents an increase in demand for specific occupations and skills in the US healthcare industry. The results and methods used in the study can help monitor rapid changes in the job market due to pandemics and guide decision-makers to make organizational shifts in a timely manner.


Subject(s)
COVID-19 , Health Care Sector , Humans , Pandemics , COVID-19/epidemiology , Delivery of Health Care , United Kingdom/epidemiology
8.
Evol Bioinform Online ; 16: 1176934320920310, 2020.
Article in English | MEDLINE | ID: mdl-35173404

ABSTRACT

Computational prediction of gene-gene associations is one of the productive directions in the study of bioinformatics. Many tools are developed to infer the relation between genes using different biological data sources. The association of a pair of genes deduced from the analysis of biological data becomes meaningful when it reflects the directionality and the type of reaction between genes. In this work, we follow another method to construct a causal gene co-expression network while identifying transcription factors in each pair of genes using microarray expression data. We adopt a machine learning technique based on a logistic regression model to tackle the sparsity of the network and to improve the quality of the prediction accuracy. The proposed system classifies each pair of genes into either connected or nonconnected class using the data of the correlation between these genes in the whole Saccharomyces cerevisiae genome. The accuracy of the classification model in predicting related genes was evaluated using several data sets for the yeast regulatory network. Our system achieves high performance in terms of several statistical measures.

9.
BMC Bioinformatics ; 20(1): 70, 2019 Feb 08.
Article in English | MEDLINE | ID: mdl-30736752

ABSTRACT

BACKGROUND: Understanding the genetic networks and their role in chronic diseases (e.g., cancer) is one of the important objectives of biological researchers. In this work, we present a text mining system that constructs a gene-gene-interaction network for the entire human genome and then performs network analysis to identify disease-related genes. We recognize the interacting genes based on their co-occurrence frequency within the biomedical literature and by employing linear and non-linear rare-event classification models. We analyze the constructed network of genes by using different network centrality measures to decide on the importance of each gene. Specifically, we apply betweenness, closeness, eigenvector, and degree centrality metrics to rank the central genes of the network and to identify possible cancer-related genes. RESULTS: We evaluated the top 15 ranked genes for different cancer types (i.e., Prostate, Breast, and Lung Cancer). The average precisions for identifying breast, prostate, and lung cancer genes vary between 80-100%. On a prostate case study, the system predicted an average of 80% prostate-related genes. CONCLUSIONS: The results show that our system has the potential for improving the prediction accuracy of identifying gene-gene interaction and disease-gene associations. We also conduct a prostate cancer case study by using the threshold property in logistic regression, and we compare our approach with some of the state-of-the-art methods.


Subject(s)
Epistasis, Genetic , Gene Regulatory Networks , Genetic Predisposition to Disease , Humans , Logistic Models , Male , Prostatic Neoplasms/genetics , ROC Curve
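
The abstract above ranks genes by network centrality. As a small illustration of two of the named measures (degree and closeness), the sketch below uses a toy undirected graph with placeholder node names. The graph and all identifiers are hypothetical, and the paper's actual network and implementation are not described in the abstract.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def closeness_centrality(graph):
    """(Reachable nodes) / (sum of shortest-path lengths), found by BFS."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


# Toy undirected gene network (assumption: placeholder names, not real genes).
toy_network = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}

degree = degree_centrality(toy_network)
closeness = closeness_centrality(toy_network)
ranked_by_degree = sorted(toy_network, key=lambda g: degree[g], reverse=True)
print(ranked_by_degree[0], degree["A"], closeness["A"])
```

Betweenness and eigenvector centrality follow the same idea of scoring nodes by their position in the network, but need more machinery than fits in a short sketch.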
10.
Sci Rep ; 7(1): 15784, 2017 Nov 17.
Article in English | MEDLINE | ID: mdl-29150626

ABSTRACT

Text mining has become an important tool in bioinformatics research with the massive growth in the biomedical literature over the past decade. Mining the biomedical literature has resulted in an incredible number of computational algorithms that assist many bioinformatics researchers. In this paper, we present a text mining system called Gene Interaction Rare Event Miner (GIREM) that constructs gene-gene-interaction networks for the human genome using information extracted from biomedical literature. GIREM identifies functionally related genes based on their co-occurrences in the abstracts of biomedical literature. For a given gene g, GIREM first extracts the set of genes found within the abstracts of biomedical literature associated with g. GIREM aims at enhancing biological text mining approaches by identifying the semantic relationship between each co-occurrence of a pair of genes in abstracts using the syntactic structures of sentences and linguistics theories. It uses a supervised learning algorithm, weighted logistic regression, to label pairs of genes as related or unrelated, and to reflect the population proportion using smaller samples. We evaluated GIREM by comparing it experimentally with other well-known approaches and a protein-protein interactions database. Results showed marked improvement.


Subject(s)
Data Mining , Gene Regulatory Networks , Publications , Genes , ROC Curve
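
The co-occurrence step described in the abstract above can be sketched as a simple count of how often two gene names appear in the same abstract. This is only an illustration of that first step. GIREM itself goes further, using sentence structure and weighted logistic regression, and its code is not shown in the source. The gene names, sample texts, and function name below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(abstracts, gene_names):
    """Count, for each gene pair, the number of abstracts mentioning both."""
    counts = Counter()
    for text in abstracts:
        # Crude tokenization: treat commas and periods as separators.
        tokens = set(text.replace(",", " ").replace(".", " ").split())
        present = sorted(g for g in gene_names if g in tokens)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Hypothetical abstracts and gene symbols, for illustration only.
sample_abstracts = [
    "GENE1 interacts with GENE2 in tumor cells.",
    "GENE2 and GENE3 are co-expressed, while GENE1 is silent.",
    "GENE1 regulates GENE2.",
]
gene_symbols = {"GENE1", "GENE2", "GENE3"}

pair_counts = cooccurrence_counts(sample_abstracts, gene_symbols)
for pair, count in pair_counts.most_common():
    print(pair, count)
```

Counts like these could then serve as input features for a classifier that decides whether a pair is related.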