Results 1 - 20 of 405
1.
J Neurosci ; 44(39)2024 Sep 25.
Article in English | MEDLINE | ID: mdl-39187379

ABSTRACT

Recording and analysis of neural activity are often biased toward detecting sparse subsets of highly active neurons, masking important signals carried in low-magnitude and variable responses. To investigate the contribution of seemingly noisy activity to odor encoding, we used mesoscale calcium imaging from mice of both sexes to record odor responses from the dorsal surface of bilateral olfactory bulbs (OBs). The outer layer of the mouse OB is comprised of dendrites organized into discrete "glomeruli," which are defined by odor receptor-specific sensory neuron input. We extracted activity from a large population of glomeruli and used logistic regression to classify odors from individual trials with high accuracy. We then used add-in and dropout analyses to determine subsets of glomeruli necessary and sufficient for odor classification. Classifiers successfully predicted odor identity even after excluding sparse, highly active glomeruli, indicating that odor information is redundantly represented across a large population of glomeruli. Additionally, we found that random forest (RF) feature selection informed by Gini inequality (RF Gini impurity, RFGI) reliably ranked glomeruli by their contribution to overall odor classification. RFGI provided a measure of "feature importance" for each glomerulus that correlated with intuitive features like response magnitude. Finally, in agreement with previous work, we found that odor information persists in glomerular activity after the odor offset. Together, our findings support a model of OB odor coding where sparse activity is sufficient for odor identification, but information is widely, redundantly available across a large population of glomeruli, with each glomerulus representing information about more than one odor.
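
The abstract above ranks glomeruli by random forest Gini impurity. As a minimal sketch of the quantity behind such a ranking (not the authors' pipeline), the Python below computes the Gini impurity of a set of labels and the impurity decrease produced by a single threshold split. The function names and the toy responses and odor labels are hypothetical, chosen only for illustration.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def impurity_decrease(values, labels, threshold):
    """Weighted drop in Gini impurity when splitting at values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted

# Hypothetical toy data: one glomerulus response per trial and the odor presented.
responses = [0.1, 0.2, 0.8, 0.9]
odors = ["A", "A", "B", "B"]
print(gini_impurity(odors))                      # 0.5
print(impurity_decrease(responses, odors, 0.5))  # 0.5
```

In a random forest, decreases of this kind are typically accumulated over every split that uses a feature and averaged across trees, which yields a per-feature importance score.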


Subjects
Mice, Inbred C57BL ; Odorants ; Olfactory Bulb ; Wakefulness ; Animals ; Olfactory Bulb/physiology ; Mice ; Male ; Female ; Wakefulness/physiology ; Smell/physiology ; Olfactory Receptor Neurons/physiology
2.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37405873

ABSTRACT

Nucleic acid-binding proteins are proteins that interact with DNA and RNA to regulate gene expression and transcriptional control. The pathogenesis of many human diseases is related to abnormal gene expression. Therefore, recognizing nucleic acid-binding proteins accurately and efficiently has important implications for disease research. To address this question, researchers have proposed methods that use sequence information to identify nucleic acid-binding proteins. However, different types of nucleic acid-binding proteins have different subfunctions, and these methods ignore their internal differences, so the performance of the predictors can be further improved. In this study, we proposed a new method, called iDRPro-SC, to predict the type of nucleic acid-binding proteins based on sequence information. iDRPro-SC considers the internal differences of nucleic acid-binding proteins and combines their subfunctions to build a complete dataset. Additionally, we used ensemble learning to characterize and predict nucleic acid-binding proteins. The results on the test dataset showed that iDRPro-SC achieved the best prediction performance and was superior to the other existing nucleic acid-binding protein prediction methods. We have established a web server that can be accessed online: http://bliulab.net/iDRPro-SC.


Subjects
DNA-Binding Proteins ; RNA-Binding Proteins ; Humans ; DNA-Binding Proteins/metabolism ; RNA-Binding Proteins/genetics ; DNA/chemistry ; Algorithms
3.
Lab Invest ; 104(3): 100304, 2024 03.
Article in English | MEDLINE | ID: mdl-38092179

ABSTRACT

Gene expression profiling from formalin-fixed paraffin-embedded (FFPE) renal allograft biopsies is a promising approach for feasibly providing a molecular diagnosis of rejection. However, large-scale studies evaluating the performance of models using NanoString platform data to define molecular archetypes of rejection are lacking. We tested a diverse retrospective cohort of over 1400 FFPE biopsy specimens, rescored according to Banff 2019 criteria and representing 10 of 11 United Network of Organ Sharing regions, using the Banff Human Organ Transplant panel from NanoString and developed a multiclass model from the gene expression data to assign relative probabilities of 4 molecular archetypes: No Rejection, Antibody-Mediated Rejection, T Cell-Mediated Rejection, and Mixed Rejection. Using Least Absolute Shrinkage and Selection Operator regularized regression with 10-fold cross-validation fitted to 1050 biopsies in the discovery cohort and technically validated on an additional 345 biopsies, our model achieved overall accuracy of 85% in the discovery cohort and 80% in the validation cohort, with ≥75% positive predictive value for each class, except for the Mixed Rejection class in the validation cohort (positive predictive value, 53%). This study represents the technical validation of the first model built from a large and diverse sample of diagnostic FFPE biopsy specimens to define and classify molecular archetypes of histologically defined diagnoses as derived from Banff Human Organ Transplant panel gene expression profiling data.


Subjects
Kidney Diseases ; Kidney Transplantation ; Organ Transplantation ; Humans ; Kidney Transplantation/adverse effects ; Cohort Studies ; Retrospective Studies ; Graft Rejection/diagnosis ; Graft Rejection/genetics ; Kidney Diseases/pathology ; Gene Expression ; Biopsy ; Kidney/pathology
4.
Int J Cancer ; 154(8): 1335-1339, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-37962056

ABSTRACT

The incidence of cancer in general, including breast and prostate cancer specifically, is increasing in India. Breast and prostate cancers have genomic classifiers developed to guide therapy decisions. However, these genomic classifiers are often inaccessible in India due to high cost. These classifiers may also be less suitable to the Indian population, as data primarily from patients in wealthy Western countries were used in developing these genomic classifiers. In addition to the limitations in using these existing genomic classifiers, developing and validating new genomic classifiers for breast and prostate cancer in India is challenging due to the heterogeneity in the Indian population. However, there are steps that can be taken to address the various barriers that currently exist for accurate, accessible genomic classifiers for cancer in India.


Subjects
Breast Neoplasms ; Prostatic Neoplasms ; Male ; Humans ; Breast Neoplasms/genetics ; Breast Neoplasms/epidemiology ; Prostatic Neoplasms/genetics ; Prostatic Neoplasms/epidemiology ; Genomics ; India/epidemiology ; Incidence
5.
Cancer ; 130(10): 1766-1772, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38280206

ABSTRACT

BACKGROUND: The challenge of distinguishing indolent from aggressive prostate cancer (PCa) complicates decision-making for men considering active surveillance (AS). Genomic classifiers (GCs) may improve risk stratification by predicting end points such as upgrading or upstaging (UG/US). The aim of this study was to assess the impact of GCs on UG/US risk prediction in a clinicopathologic model. METHODS: Participants had favorable-risk PCa (cT1-2, prostate-specific antigen [PSA] ≤15 ng/mL, and Gleason grade group 1 [GG1]/low-volume GG2). A prediction model was developed for 864 men at the University of California, San Francisco, with standard clinical variables (cohort 1), and the model was validated for 2267 participants from the Cancer of the Prostate Strategic Urologic Research Endeavor (CaPSURE) registry (cohort 2). Logistic regression was used to compute the area under the receiver operating characteristic curve (AUC) to develop a prediction model for UG/US at prostatectomy. A GC (Oncotype Dx Genomic Prostate Score [GPS] or Prolaris) was then assessed to improve risk prediction. RESULTS: The prediction model included biopsy GG1 versus GG2 (odds ratio [OR], 5.83; 95% confidence interval [CI], 3.73-9.10); PSA (OR, 1.10; 95% CI, 1.01-1.20; per 1 ng/mL), percent positive cores (OR, 1.01; 95% CI, 1.01-1.02; per 1%), prostate volume (OR, 0.98; 95% CI, 0.97-0.99; per mL), and age (OR, 1.05; 95% CI, 1.02-1.07; per year), with AUC 0.70 (cohort 1) and AUC 0.69 (cohort 2). GPS was associated with UG/US (OR, 1.03; 95% CI, 1.01-1.06; p < .01) and AUC 0.72, which indicates a comparable performance to the prediction model. CONCLUSIONS: GCs did not substantially improve a clinical prediction model for UG/US, a short-term and imperfect surrogate for clinically relevant disease outcomes.
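
The odds ratios above are reported per unit of each predictor (per 1 ng/mL of PSA, per year of age). As a minimal sketch of how such per-unit values scale to larger differences, the Python below uses the standard logistic-regression relation OR(delta) = exp(beta * delta) with beta = ln(OR per unit). It assumes the log-odds are linear in the predictor and that all other covariates are held fixed; the function name is hypothetical.

```python
import math

def odds_ratio_for_change(or_per_unit, delta):
    """Odds ratio implied by a change of `delta` units, given a per-unit OR."""
    beta = math.log(or_per_unit)  # logistic regression coefficient
    return math.exp(beta * delta)

# Per-unit odds ratios taken from the abstract.
print(round(odds_ratio_for_change(1.10, 5), 2))   # PSA, +5 ng/mL  -> 1.61
print(round(odds_ratio_for_change(1.05, 10), 2))  # age, +10 years -> 1.63
```

This is why a per-unit odds ratio close to 1 can still correspond to a sizeable effect across the clinically observed range of a continuous predictor.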


Subjects
Biomarkers, Tumor ; Neoplasm Grading ; Prostatic Neoplasms ; Humans ; Male ; Prostatic Neoplasms/genetics ; Prostatic Neoplasms/pathology ; Prostatic Neoplasms/blood ; Middle Aged ; Aged ; Biomarkers, Tumor/genetics ; Risk Assessment ; Prostate-Specific Antigen/blood ; Neoplasm Staging ; Prostatectomy ; Genomics/methods ; ROC Curve
6.
Small ; : e2405087, 2024 Aug 18.
Article in English | MEDLINE | ID: mdl-39155437

ABSTRACT

Metal-organic frameworks (MOFs) provide an extensive design landscape for nanoporous materials that drive innovation across energy and environmental fields. However, their practical applications are often hindered by water stability challenges. In this study, a machine learning (ML) approach is proposed to accelerate the discovery of water-stable MOFs and is validated through experimental testing. First, the largest database currently available that contains water stability information, covering 1133 synthesized MOFs, is constructed and categorized according to experimental stability. Then, structural and chemical descriptors are applied at various fragmental levels to develop ML classifiers for predicting the water stability of MOFs. The ML classifiers achieve high prediction accuracy and excellent transferability on out-of-sample validation. Next, two MOFs are experimentally synthesized and their water stability is tested to validate the ML predictions. Finally, the ML classifiers are applied to discover water-stable MOFs in the ab initio REPEAT charge MOF (ARC-MOF) database. Among ≈280 000 candidates, ≈130 000 (47%) MOFs are predicted to be water stable; furthermore, through multi-stability analysis, 461 (0.16%) MOFs are identified as not only water stable but also thermally stable and activation stable. The ML approach is anticipated to serve as a prerequisite filtering tool to streamline the exploration of water-stable MOFs for important practical applications.

7.
Mass Spectrom Rev ; 2023 May 04.
Article in English | MEDLINE | ID: mdl-37143314

ABSTRACT

With urinary proteomics profiling (UPP) as an exemplary omics technology, this review describes a workflow for the analysis of omics data in large study populations. The proposed workflow includes: (i) planning omics studies and sample size considerations; (ii) preparing the data for analysis; (iii) preprocessing the UPP data; (iv) the basic statistical steps required for data curation; (v) the selection of covariables; (vi) relating continuously distributed or categorical outcomes to a series of single markers (e.g., sequenced urinary peptide fragments identifying the parental proteins); (vii) showing the added diagnostic or prognostic value of the UPP markers over and beyond classical risk factors; and (viii) pathway analysis to identify targets for personalized intervention in disease prevention or treatment. Additionally, two short sections address multiomics studies and machine learning, respectively. In conclusion, the analysis of adverse health outcomes in relation to omics biomarkers rests on the same statistical principles as the analysis of any other data collected in large population or patient cohorts. The large number of biomarkers that have to be considered simultaneously requires planning ahead how the study database will be structured, curated, and imported into statistical software packages, and how analysis results will be triaged for clinical relevance and presented.

8.
Mol Pharm ; 21(2): 864-872, 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-38134445

ABSTRACT

Drug-induced phospholipidosis (PLD) involves the accumulation of phospholipids in cells of multiple tissues, particularly within lysosomes, and it is associated with prolonged exposure to druglike compounds, predominantly cationic amphiphilic drugs (CADs). PLD affects a significant portion of drugs currently in development and has recently been proven to be responsible for confounding antiviral data during drug repurposing for SARS-CoV-2. In these scenarios, it has become crucial to identify potential safe drug candidates in advance and distinguish them from those that may lead to false in vitro antiviral activity. In this work, we developed a series of machine learning classifiers with the aim of predicting the PLD-inducing potential of drug candidates. The models were built on a high-quality chemical collection comprising 545 curated small molecules extracted from ChEMBL v30. The most effective model, obtained using the balanced random forest algorithm, achieved high performance, including an AUC value computed in validation as high as 0.90. The model was made freely available through a user-friendly web platform named AMALPHI (https://www.ba.ic.cnr.it/softwareic/amalphiportal/), which can represent a valuable tool for medicinal chemists interested in conducting an early evaluation of PLD inducer potential.


Subjects
Lipidoses ; Phospholipids ; Humans ; Hep G2 Cells ; Lysosomes ; Machine Learning ; Antiviral Agents/adverse effects ; Lipidoses/chemically induced
9.
Addict Biol ; 29(2): e13362, 2024 02.
Article in English | MEDLINE | ID: mdl-38380772

ABSTRACT

Long-term use of methamphetamine (meth) causes cognitive and neuropsychological impairments. Analysing the impact of this substance on the human brain can aid prevention and treatment efforts. In this study, the electroencephalogram (EEG) signals of meth abusers in the abstinence period and of healthy subjects were recorded during eyes-closed and eyes-opened states to identify the brain regions that meth can significantly influence. In addition, a decision support system (DSS) was introduced as a complementary method to recognize substance users alongside biochemical tests. To these ends, the recorded EEG signals were pre-processed and decomposed into frequency bands using the discrete wavelet transform (DWT) method. For each frequency band, the energy, KS entropy, and Higuchi and Katz fractal dimensions of the signals were calculated. Statistical analysis was then applied to select features from channels with a p-value less than 0.05. These features were compared between the two groups, and the locations of the channels containing more of the selected features were identified as discriminative brain areas. To evaluate how well the features distinguish the two groups in each frequency band, the features were fed into k-nearest neighbour (KNN), support vector machine (SVM), multilayer perceptron neural network (MLP) and linear discriminant analysis (LDA) classifiers. The results indicated that prolonged consumption of meth has a considerable impact on the brain areas responsible for working memory, motor function, attention, visual interpretation, and speech processing. Furthermore, the best classification accuracy, almost 95.8%, was attained in the gamma band during the eyes-closed state.


Subjects
Algorithms ; Brain ; Humans ; Wavelet Analysis ; Electroencephalography/methods ; Support Vector Machine
10.
Sensors (Basel) ; 24(5)2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38475036

ABSTRACT

Gait disorder is common among people with neurological disease and musculoskeletal disorders. The detection of gait disorders plays an integral role in designing appropriate rehabilitation protocols. This study presents a clinical gait analysis of patients with polymyalgia rheumatica to determine impaired gait patterns using machine learning models. A clinical gait assessment was conducted at KATH hospital between August and September 2022, and the 25 recruited participants comprised 18 patients and 7 control subjects. Participant demographics were as follows: age 56 years ± 7, height 175 cm ± 8, and weight 82 kg ± 10. Electromyography data were collected from four strained hip muscles of patients: the rectus femoris, vastus lateralis, biceps femoris, and semitendinosus. Four classification models, namely support vector machine (SVM), rotation forest (RF), k-nearest neighbors (KNN), and decision tree (DT), were used to distinguish the gait patterns of the two groups. SVM recorded the highest accuracy of 85% among the classifiers, while RF had 80%, KNN had 75%, and DT had the lowest accuracy of 70%. Furthermore, the SVM classifier had the highest sensitivity of 92%, while DT had 90%, RF had 86%, and KNN had the lowest sensitivity of 84%. The classifiers achieved significant results in discriminating between the impaired gait pattern of patients with polymyalgia rheumatica and that of control subjects. This information could be useful for clinicians designing therapeutic exercises and may be used for developing a decision support system for diagnostic purposes.


Subjects
Polymyalgia Rheumatica ; Humans ; Middle Aged ; Gait/physiology ; Muscle, Skeletal/physiology ; Electromyography/methods ; Movement ; Support Vector Machine
11.
Sensors (Basel) ; 24(4)2024 Feb 11.
Article in English | MEDLINE | ID: mdl-38400338

ABSTRACT

To achieve the Sustainable Development Goals (SDG), it is imperative to ensure the safety of drinking water. The characteristics of drinking water, encompassing taste, aroma, and appearance, are unique to each source. Inadequate water infrastructure and treatment can affect these features and may also threaten public health. This study utilizes the Internet of Things (IoT) in developing a monitoring system, particularly for water quality, to reduce the risk of contracting diseases. Water quality data, such as water temperature, alkalinity or acidity, and contaminants, were obtained through a series of linked sensors. An Arduino microcontroller board acquired all the data, and Narrow Band-IoT (NB-IoT) transmitted them to the web server. Because human resources to observe the water quality physically were limited, the monitoring was complemented by real-time notification alerts via a telephone text messaging application. The water quality data were monitored using Grafana in web mode, and machine learning binary classifiers were applied to predict whether the water was drinkable or not based on the collected data, which were stored in a database. Both non-decision-tree and decision-tree models were evaluated based on the improvements of the artificial intelligence framework. With a split of 60% for training, 20% for validation, and 10% for testing, the decision tree (DT) model performed better than the Gradient Boosting (GB), Random Forest (RF), Neural Network (NN), and Support Vector Machine (SVM) modeling approaches. Through the monitoring and prediction of results, the authorities can sample the water sources every two weeks.


Subjects
Drinking Water ; Internet of Things ; Humans ; Artificial Intelligence ; Cloud Computing ; Data Accuracy
12.
Sensors (Basel) ; 24(10)2024 May 17.
Article in English | MEDLINE | ID: mdl-38794052

ABSTRACT

Recently, explainability in machine and deep learning has become an important area of research and interest, owing both to the increasing use of artificial intelligence (AI) methods and to the need to understand the decisions made by models. Interest in explainable artificial intelligence (XAI) stems from growing awareness of, among other things, data mining, error elimination, and the learning performance of various AI algorithms. Moreover, XAI will allow the decisions made by models to be more transparent as well as more effective. In this study, models from the 'glass box' group (Decision Tree, among others) and the 'black box' group (Random Forest, among others) were proposed to understand the identification of selected types of currant powders. The learning process of these models was carried out to determine performance indicators such as accuracy, precision, recall, and F1-score. It was visualized using Local Interpretable Model-agnostic Explanations (LIME) to predict the effectiveness of identifying specific types of blackcurrant powders based on texture descriptors such as entropy, contrast, correlation, dissimilarity, and homogeneity. Bagging (Bagging_100), Decision Tree (DT0), and Random Forest (RF7_gini) proved to be the most effective models in the framework of currant powder interpretability. The measures of classifier performance in terms of accuracy, precision, recall, and F1-score for Bagging_100 each reached approximately 0.979. In comparison, DT0 reached values of 0.968, 0.972, 0.968, and 0.969, and RF7_gini reached values of 0.963, 0.964, 0.963, and 0.963, respectively. These models achieved classifier performance measures of greater than 96%. In the future, model-agnostic XAI can be an additional important tool to help analyze data, including data on food products, even online.


Subjects
Algorithms ; Artificial Intelligence ; Machine Learning ; Powders ; Ribes ; Powders/chemistry ; Ribes/chemistry ; Decision Trees
13.
J Clin Monit Comput ; 2024 Sep 21.
Article in English | MEDLINE | ID: mdl-39305451

ABSTRACT

Measuring spontaneous swallowing frequencies (SSF), coughing frequencies (CF), and the temporal relationships between swallowing and coughing in patients could provide valuable clinical insights into swallowing function, dysphagia, and the risk of pneumonia development. Medical technology with these capabilities has potential applications in hospital settings. In the management of intensive care unit (ICU) patients, monitoring SSF and CF could contribute to predictive models for successful weaning from ventilatory support, extubation, or tracheal decannulation. Furthermore, the early prediction of pneumonia in hospitalized patients or home care residents could offer additional diagnostic value over current practices. However, existing technologies for measuring SSF and CF, such as electromyography and acoustic sensors, are often complex and challenging to implement in real-world settings. Therefore, there is a need for a simple, flexible, and robust method for these measurements. The primary objective of this study was to develop a system that is both low in complexity and sufficiently flexible to allow for wide clinical applicability. To construct this model, we recruited forty healthy volunteers. Each participant was equipped with two medical-grade sensors (Movesense MD), one attached to the cricoid cartilage and the other positioned in the epigastric region. Both sensors recorded tri-axial accelerometry and gyroscopic movements. Participants were instructed to perform various conscious actions on cue, including swallowing, talking, throat clearing, and coughing. The recorded signals were then processed to create a model capable of accurately identifying conscious swallowing and coughing, while effectively discriminating against other confounding actions. 
Training of the algorithm resulted in a model with a sensitivity of 70% (14/20), a specificity of 71% (20/28), and a precision of 66.7% (14/21) for the detection of swallowing, and a sensitivity of 100% (20/20), a specificity of 83.3% (25/30), and a precision of 80% (20/25) for the detection of coughing. SSF, CF and the temporal relationship between swallowing and coughing are parameters that could have value as predictive tools for diagnosis and therapeutic guidance. Based on two tri-axial accelerometry and gyroscopic sensors, a model was developed with an acceptable sensitivity and precision for the detection of swallowing and coughing movements. Owing also to the simplicity and robustness of the set-up, the model is promising for further scientific research in a wide range of clinical indications.
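
The reported percentages follow directly from confusion-matrix counts. As a minimal sketch (not the authors' code), the Python below recomputes the coughing figures from the counts implied by the reported fractions 20/20, 25/30, and 20/25, that is, 20 true positives, 0 false negatives, 25 true negatives, and 5 false positives. The function name is hypothetical.

```python
def detection_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, precision) from confusion counts."""
    sensitivity = tp / (tp + fn)  # share of true events that were detected
    specificity = tn / (tn + fp)  # share of non-events correctly rejected
    precision = tp / (tp + fp)    # share of detections that were true events
    return sensitivity, specificity, precision

# Coughing counts implied by the fractions reported in the abstract.
sens, spec, prec = detection_metrics(tp=20, fn=0, tn=25, fp=5)
print(f"{sens:.1%} {spec:.1%} {prec:.1%}")  # 100.0% 83.3% 80.0%
```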

14.
Int J Mol Sci ; 25(11)2024 May 29.
Article in English | MEDLINE | ID: mdl-38892144

ABSTRACT

In this study, we present an innovative approach to improve the prediction of protein-protein interactions (PPIs) through the utilization of an ensemble classifier, specifically focusing on distinguishing between native and non-native interactions. Leveraging the strengths of various base models, including random forest, gradient boosting, extreme gradient boosting, and light gradient boosting, our ensemble classifier integrates these diverse predictions using a logistic regression meta-classifier. Our model was evaluated using a comprehensive dataset generated from molecular dynamics simulations. While the gains in AUC and other metrics might seem modest, they contribute to a model that is more robust, consistent, and adaptable. To assess the effectiveness of various approaches, we compared the performance of logistic regression to four baseline models. Our results indicate that logistic regression consistently underperforms across all evaluated metrics. This suggests that it may not be well-suited to capture the complex relationships within this dataset. Tree-based models, on the other hand, appear to be more effective for problems involving molecular dynamics simulations. Extreme gradient boosting (XGBoost) and light gradient boosting (LightGBM) are optimized for performance and speed, handling datasets effectively and incorporating regularizations to avoid over-fitting. Our findings indicate that the ensemble method enhances the predictive capability of PPIs, offering a promising tool for computational biology and drug discovery by accurately identifying potential interaction sites and facilitating the understanding of complex protein functions within biological systems.


Subjects
Molecular Dynamics Simulation ; Protein Interaction Mapping ; Protein Interaction Mapping/methods ; Proteins/chemistry ; Proteins/metabolism ; Computational Biology/methods ; Algorithms ; Protein Binding ; Logistic Models
15.
Entropy (Basel) ; 26(4)2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38667853

ABSTRACT

In the signal analysis context, the entropy concept can characterize signal properties for detecting anomalies or non-representative behaviors in physical systems. In motor fault detection theory, entropy can measure disorder or uncertainty, aiding in detecting and classifying faults or abnormal operation conditions. This is especially relevant in industrial processes, where early motor fault detection can prevent progressive damage, operational interruptions, or potentially dangerous situations. The study of motor fault detection based on entropy theory holds significant academic relevance too, effectively bridging theoretical frameworks with industrial exigencies. As industrial sectors progress, applying entropy-based methodologies becomes indispensable for ensuring machinery integrity based on control and monitoring systems. This academic endeavor enhances the understanding of signal processing methodologies and accelerates progress in artificial intelligence and other modern knowledge areas. A wide variety of entropy-based methods have been employed for motor fault detection. This process involves assessing the complexity of measured signals from electrical motors, such as vibrations or stator currents, to form feature vectors. These vectors are then fed into artificial-intelligence-based classifiers to distinguish between healthy and faulty motor signals. This paper discusses some recent references to entropy methods and a summary of the most relevant results reported for fault detection over the last 10 years.

16.
Entropy (Basel) ; 26(7)2024 Jun 30.
Article in English | MEDLINE | ID: mdl-39056933

ABSTRACT

This paper highlights that metrics from the machine learning field (e.g., entropy and information gain) used to qualify a classifier model can be used to evaluate the effectiveness of separation systems. To evaluate the efficiency of separation systems and their operation units, entropy- and information gain-based metrics were developed. The receiver operating characteristic (ROC) curve is used to determine the optimal cut point in a separation system. The proposed metrics are verified by simulation experiments conducted on the stochastic model of a waste-sorting system.
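
To make the entropy and information-gain idea concrete, here is a minimal Python sketch that treats a separator as a classifier splitting a feed stream into output streams. It illustrates the standard Shannon entropy and information gain definitions, not the specific metrics developed in the paper; the material names and stream compositions are hypothetical.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (bits) of the composition of a stream."""
    n = len(items)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in Counter(items).values())

def information_gain(feed, outputs):
    """Entropy of the feed minus the size-weighted entropy of the outputs."""
    n = len(feed)
    remainder = sum(len(out) / n * entropy(out) for out in outputs)
    return entropy(feed) - remainder

# Hypothetical feed: equal parts glass and plastic.
feed = ["glass"] * 4 + ["plastic"] * 4
perfect = [["glass"] * 4, ["plastic"] * 4]     # ideal separation
useless = [["glass", "glass", "plastic", "plastic"]] * 2  # no separation

print(entropy(feed))                    # 1.0
print(information_gain(feed, perfect))  # 1.0
print(information_gain(feed, useless))  # 0.0
```

Under this reading, a gain equal to the feed entropy corresponds to complete separation, and a gain of zero corresponds to output streams with the same composition as the feed.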

17.
BMC Bioinformatics ; 24(1): 177, 2023 Apr 30.
Article in English | MEDLINE | ID: mdl-37122001

ABSTRACT

There is strong evidence that mutations and dysregulation of miRNAs are associated with a variety of diseases, including cancer. However, the experimental methods used to identify disease-related miRNAs are expensive and time-consuming. Effective computational approaches to identify disease-related miRNAs are in high demand and would aid in the detection of miRNA biomarkers for disease diagnosis, treatment, and prevention. In this study, we develop an ensemble learning framework to reveal the potential associations between miRNAs and diseases (ELMDA). The ELMDA framework does not rely on known associations when calculating miRNA and disease similarities and uses multi-classifier voting to predict disease-related miRNAs. The average AUC of the ELMDA framework was 0.9229 for the HMDD v2.0 database in a fivefold cross-validation. All potential associations in the HMDD v2.0 database were predicted, and 90% of the top 50 results were verified with the updated HMDD v3.2 database. The ELMDA framework was applied to gastric neoplasms, prostate neoplasms and colon neoplasms, and 100%, 94%, and 90%, respectively, of the top 50 potential miRNAs were validated by the HMDD v3.2 database. Moreover, the ELMDA framework can predict isolated disease-related miRNAs. In conclusion, ELMDA appears to be a reliable method to uncover disease-associated miRNAs.
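
The fivefold cross-validation mentioned above partitions the samples into five folds and holds each one out in turn. The Python below is a minimal sketch of that partitioning only, under the assumption of contiguous, unshuffled folds; real evaluations usually shuffle (and often stratify) first, and nothing here reflects how ELMDA itself builds its folds. The function name is hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# Each sample appears in exactly one test fold.
for train, test in k_fold_indices(10, k=5):
    print(test)
```

A metric such as AUC is then computed on each held-out fold and averaged over the five folds.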


Subjects
Colonic Neoplasms ; MicroRNAs ; Prostatic Neoplasms ; Male ; Humans ; MicroRNAs/genetics ; Genetic Predisposition to Disease ; Algorithms ; Prostatic Neoplasms/genetics ; Colonic Neoplasms/genetics ; Computational Biology/methods
18.
BMC Bioinformatics ; 24(1): 337, 2023 Sep 12.
Article in English | MEDLINE | ID: mdl-37697283

ABSTRACT

BACKGROUND AND OBJECTIVE: Diabetes is a life-threatening chronic disease with a growing global prevalence, necessitating early diagnosis and treatment to prevent severe complications. Machine learning has emerged as a promising approach for diabetes diagnosis, but challenges such as limited labeled data, frequent missing values, and dataset imbalance hinder the development of accurate prediction models. Therefore, a novel framework is required to address these challenges and improve performance. METHODS: In this study, we propose an innovative pipeline-based multi-classification framework to predict diabetes in three classes: diabetic, non-diabetic, and prediabetes, using the imbalanced Iraqi Patient Dataset of Diabetes. Our framework incorporates various pre-processing techniques, including duplicate sample removal, attribute conversion, missing value imputation, data normalization and standardization, feature selection, and k-fold cross-validation. Furthermore, we implement multiple machine learning models, such as k-NN, SVM, DT, RF, AdaBoost, and GNB, and introduce a weighted ensemble approach based on the Area Under the Receiver Operating Characteristic Curve (AUC) to address dataset imbalance. Performance optimization is achieved through grid search and Bayesian optimization for hyper-parameter tuning. RESULTS: Our proposed model outperforms other machine learning models, including k-NN, SVM, DT, RF, AdaBoost, and GNB, in predicting diabetes. The model achieves high average accuracy, precision, recall, F1-score, and AUC values of 0.9887, 0.9861, 0.9792, 0.9851, and 0.999, respectively. CONCLUSION: Our pipeline-based multi-classification framework demonstrates promising results in accurately predicting diabetes using an imbalanced dataset of Iraqi diabetic patients. The proposed framework addresses the challenges associated with limited labeled data, missing values, and dataset imbalance, leading to improved prediction performance. 
This study highlights the potential of machine learning techniques in diabetes diagnosis and management, and the proposed framework can serve as a valuable tool for accurate prediction and improved patient care. Further research can build upon our work to refine and optimize the framework and explore its applicability in diverse datasets and populations.
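
The AUC-weighted ensemble mentioned in the abstract above could look roughly like the following. This is a minimal sketch, not the authors' implementation: the abstract does not say how the weights are derived from AUC, so weights proportional to each model's macro one-vs-rest AUC are an assumption, and every function name here (`binary_auc`, `macro_ovr_auc`, `auc_weights`, `weighted_ensemble`, `predict`) is hypothetical.

```python
from typing import Dict, List, Sequence


def binary_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both positive and negative samples are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(proba: List[List[float]], y_true: Sequence[int]) -> float:
    """Unweighted mean of one-vs-rest AUCs over all classes."""
    n_classes = len(proba[0])
    per_class = []
    for k in range(n_classes):
        scores = [row[k] for row in proba]
        labels = [1 if y == k else 0 for y in y_true]
        per_class.append(binary_auc(scores, labels))
    return sum(per_class) / n_classes


def auc_weights(aucs: Dict[str, float]) -> Dict[str, float]:
    """Assumed weighting scheme: each model's weight is its share of total AUC."""
    total = sum(aucs.values())
    return {name: auc / total for name, auc in aucs.items()}


def weighted_ensemble(
    probas: Dict[str, List[List[float]]], weights: Dict[str, float]
) -> List[List[float]]:
    """Weighted average of per-model class-probability matrices."""
    names = list(probas)
    n_samples = len(probas[names[0]])
    n_classes = len(probas[names[0]][0])
    return [
        [sum(weights[m] * probas[m][i][k] for m in names) for k in range(n_classes)]
        for i in range(n_samples)
    ]


def predict(proba: List[List[float]]) -> List[int]:
    """Index of the highest-probability class for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in proba]
```

In practice the AUCs would be estimated on held-out folds rather than on the data being predicted, so that the weights do not reward overfitting.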


Subjects
Diabetes Mellitus, Humans, Bayes Theorem, Diabetes Mellitus/diagnosis, Computer Systems, Machine Learning, ROC Curve
19.
Hum Brain Mapp ; 44(2): 801-812, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36222055

ABSTRACT

Whether brain matter volume is correlated with cognitive functioning and higher intelligence is controversial. We explored this relationship by analysis of data collected on 193 healthy young and older adults through the "Leipzig Study for Mind-Body-Emotion Interactions" (LEMON) study. Our analysis involved four cognitive measures: fluid intelligence, crystallized intelligence, cognitive flexibility, and working memory. Brain subregion volumes were determined by magnetic resonance imaging. We normalized each subregion volume to the estimated total intracranial volume and conducted training simulations to compare the predictive power of normalized volumes of large regions of the brain (i.e., gray matter, cortical white matter, and cerebrospinal fluid), normalized subcortical volumes, and combined normalized volumes of large brain regions and normalized subcortical volumes. Statistical tests showed significant differences in the performance accuracy and feature importance of the subregion volumes in predicting cognitive skills for young and older adults. Random forest feature selection analysis showed that cortical white matter was the key feature in predicting fluid intelligence in both young and older adults. In young adults, crystallized intelligence was best predicted by caudate nucleus, thalamus, pallidum, and nucleus accumbens volumes, whereas putamen, amygdala, nucleus accumbens, and hippocampus volumes were selected for older adults. Cognitive flexibility was best predicted by the caudate, nucleus accumbens, and hippocampus in young adults and caudate and amygdala in older adults. Finally, working memory was best predicted by the putamen, pallidum, and nucleus accumbens in the younger group, whereas amygdala and hippocampus volumes were predictive in the older group. Thus, machine learning predictive models demonstrated an age-dependent association between subcortical volumes and cognitive measures. 
These approaches may be useful in predicting the likelihood of age-related cognitive decline and in testing of approaches for targeted improvement of cognitive functioning in older adults.
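
The normalization step described in the abstract above, dividing each subregion volume by the estimated total intracranial volume, can be sketched as follows. This is an illustration under stated assumptions only: the function name `normalize_by_icv`, the region keys, and the volumes in the usage note are hypothetical and are not taken from the study.

```python
from typing import Dict


def normalize_by_icv(volumes_mm3: Dict[str, float], icv_mm3: float) -> Dict[str, float]:
    """Express each regional volume as a fraction of total intracranial volume.

    Both inputs are assumed to be in the same unit (here cubic millimetres),
    so the result is dimensionless.
    """
    if icv_mm3 <= 0:
        raise ValueError("intracranial volume must be positive")
    return {region: volume / icv_mm3 for region, volume in volumes_mm3.items()}
```

For example, a hypothetical caudate volume of 3500 mm³ in a head with an intracranial volume of 1,400,000 mm³ gives a normalized value of 0.0025, which can then be compared across participants with different head sizes.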


Subjects
Brain, Gray Matter, Young Adult, Humans, Aged, Brain/diagnostic imaging, Brain/pathology, Gray Matter/diagnostic imaging, Gray Matter/pathology, Nucleus Accumbens/pathology, Caudate Nucleus, Magnetic Resonance Imaging/methods, Cognition
20.
Mem Cognit ; 51(3): 601-622, 2023 04.
Article in English | MEDLINE | ID: mdl-36542319

ABSTRACT

One of the central issues in cognition is identifying universal and culturally specific patterns of thought. In this study, we examined how one aspect of culture, a linguistic part of speech known as classifiers, is related to the categorization of solid objects. In Experiment 1, we used a numeral classifier elicitation task to examine the classifiers used by speakers of Hmong, Japanese, and Mandarin Chinese (N = 34) with 135 nouns that referred to solid objects. In Experiment 2, adult speakers of English, Japanese, Mandarin Chinese, and Hmong (N = 64) rated the similarity of 39 pictured objects that depicted a subset of the nouns. All groups classified the objects into natural kinds and artifacts, with the category of humans anchoring both divisions. The main difference that emerged from the study was that speakers of Japanese and English rated humans and animals as more similar to each other than Hmong speakers did; Mandarin speakers' ratings of the similarity between humans and animals fell in between those of Hmong and English speakers. However, the pattern of categorization of humans and animals found among speakers of the classifier languages contradicted their patterns of classifier use. The findings help to tease apart the effects of language from other cultural factors that impact cognition.


Subjects
Cross-Cultural Comparison, Language, Adult, Humans, Cognition, Speech