Results 1 - 10 of 10
1.
Biomed Eng Online ; 23(1): 60, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38909231

ABSTRACT

BACKGROUND: Left ventricular enlargement (LVE) is a common manifestation of cardiac remodeling that is closely associated with cardiac dysfunction, heart failure (HF), and arrhythmias. This study aimed to propose a machine learning (ML)-based strategy to identify LVE in HF patients by means of pulse wave signals. METHOD: We constructed two high-quality pulse wave datasets comprising a non-LVE group and an LVE group based on 264 HF patients. Fourier series calculations were employed to determine if significant frequency differences existed between the two datasets, thereby ensuring their validity. Then, the ML-based identification was undertaken by means of classification and regression models: a weighted random forest model was employed for binary classification of the datasets, and a densely connected convolutional network was utilized to directly estimate the left ventricular diastolic diameter index (LVDdI) through regression. Finally, the two models were validated by comparing their results with clinical measurements, using accuracy and the area under the receiver operating characteristic curve (AUC-ROC) to assess their capability for identifying LVE patients. RESULTS: The classification model exhibited superior performance with an accuracy of 0.91 and an AUC-ROC of 0.93. The regression model achieved an accuracy of 0.88 and an AUC-ROC of 0.89, indicating that both models can quickly and accurately identify LVE in HF patients. CONCLUSION: The proposed ML methods are verified to achieve effective classification and regression with good performance for identifying LVE in HF patients based on pulse wave signals. This study thus demonstrates the feasibility and potential of the ML-based strategy for clinical practice while offering an effective and robust tool for diagnosing and intervening in ventricular remodeling.
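
The two evaluation measures named in this abstract, accuracy and AUC-ROC, can be illustrated with a minimal sketch. The labels, scores, and the 0.5 decision threshold below are made-up assumptions for illustration and are not taken from the study; the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a positive case outranks a negative one.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels (1 = LVE, 0 = non-LVE) and hypothetical model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

acc = accuracy(labels, preds)
auc = auc_roc(labels, scores)
```

Accuracy depends on the chosen threshold, whereas AUC-ROC only depends on how the scores rank positives against negatives, which is one reason the two are often reported together.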


Subjects
Heart Failure , Machine Learning , Pulse Wave Analysis , Humans , Heart Failure/physiopathology , Female , Male , Middle Aged , Aged , Signal Processing, Computer-Assisted , Hypertrophy, Left Ventricular/physiopathology , Hypertrophy, Left Ventricular/diagnostic imaging
2.
Remote Sens Environ ; 271, 2022 Mar 15.
Article in English | MEDLINE | ID: mdl-37033879

ABSTRACT

Wildland fire smoke contains large amounts of PM2.5 that can traverse tens to hundreds of kilometers, resulting in significant deterioration of air quality and excess mortality and morbidity in downwind regions. Estimating PM2.5 levels while considering the impact of wildfire smoke has been challenging due to the lack of ground monitoring coverage near the smoke plumes. We aim to estimate total PM2.5 concentration during the Camp Fire episode, the deadliest wildland fire in California history. Our random forest (RF) model combines calibrated low-cost sensor data (PurpleAir) with regulatory monitor measurements (Air Quality System, AQS) to bolster ground observations, the high temporal resolution of Geostationary Operational Environmental Satellite-16 (GOES-16) to achieve hourly predictions, and oversampling techniques (Synthetic Minority Oversampling Technique, SMOTE) to reduce model underestimation at high PM2.5 levels. In addition, meteorological fields at 3 km resolution from the High-Resolution Rapid Refresh model and land use variables were included in the model. Our AQS-only model achieved an out-of-bag (OOB) R² (RMSE) of 0.84 (12.00 µg/m³) and spatial and temporal cross-validation (CV) R² (RMSE) of 0.74 (16.28 µg/m³) and 0.73 (16.58 µg/m³), respectively. Our AQS + Weighted PurpleAir Model achieved OOB R² (RMSE) of 0.86 (9.52 µg/m³) and spatial and temporal CV R² (RMSE) of 0.75 (14.93 µg/m³) and 0.79 (11.89 µg/m³), respectively. Our AQS + Weighted PurpleAir + SMOTE Model achieved OOB R² (RMSE) of 0.92 (10.44 µg/m³) and spatial and temporal CV R² (RMSE) of 0.84 (12.36 µg/m³) and 0.85 (14.88 µg/m³), respectively. Hourly predictions from our model may aid in epidemiological investigations of intense and acute exposure to PM2.5 during the Camp Fire episode.
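
The core idea behind SMOTE, as mentioned in this abstract, is to create synthetic samples by interpolating between an under-represented sample and one of its nearest neighbours. The sketch below is a simplified illustration under stated assumptions: the feature values are invented, the function name and parameters (`k`, `n_new`, `seed`) are hypothetical, and the abstract does not say how the authors adapted SMOTE to high PM2.5 observations.

```python
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical under-represented samples as (feature_1, feature_2); values are made up.
high_pm = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (3.0, 4.0)]
new_points = smote_sample(high_pm)
```

Because every synthetic point lies on a segment between two existing samples, the oversampled set stays inside the region already covered by the rare observations rather than extrapolating beyond it.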

3.
Sensors (Basel) ; 18(12), 2018 Dec 04.
Article in English | MEDLINE | ID: mdl-30518132

ABSTRACT

In recent years, researchers working on deep neural network (DNN)-based facial expression recognition (FER) have reported results showing that these approaches overcome the limitations of conventional machine learning-based FER approaches. However, as DNN-based FER approaches require an excessive amount of memory and incur high processing costs, their application in various fields is very limited and depends on the hardware specifications. In this paper, we propose a fast FER algorithm for monitoring a driver's emotions that is capable of operating on low-specification devices installed in vehicles. For this purpose, we employ a hierarchical weighted random forest (WRF) classifier that is trained based on the similarity of sample data in order to improve its accuracy. In the first step, facial landmarks are detected from input images and geometric features are extracted, considering the spatial position between landmarks. These feature vectors are then fed into the proposed hierarchical WRF classifier to classify facial expressions. Our method was evaluated experimentally using three databases, the extended Cohn-Kanade database (CK+), MMI, and the Keimyung University Facial Expression of Drivers (KMU-FED) database, and its performance was compared with that of state-of-the-art methods. The results show that our proposed method yields a performance similar to that of deep learning FER methods (92.6% for CK+ and 76.7% for MMI), with a significantly reduced processing cost, approximately 3731 times less than that of the DNN method. These results confirm that the proposed method is optimized for real-time embedded applications having limited computing resources.
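
Geometric features based on the spatial position between landmarks can be sketched as pairwise distances and angles. This is only one plausible reading: the abstract does not list the exact features used, so the function name, the feature layout, and the landmark coordinates below are assumptions.

```python
import math


def geometric_features(landmarks):
    """Pairwise distances and angles between 2-D landmark points.

    Returns a flat list [d_01, a_01, d_02, a_02, ...] with angles in radians.
    """
    features = []
    for i in range(len(landmarks)):
        for j in range(i + 1, len(landmarks)):
            dx = landmarks[j][0] - landmarks[i][0]
            dy = landmarks[j][1] - landmarks[i][1]
            features.append(math.hypot(dx, dy))   # distance between the pair
            features.append(math.atan2(dy, dx))   # direction from i to j
    return features


# Three made-up landmark positions (x, y) in pixels.
points = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
feats = geometric_features(points)
```

Features of this kind are cheap to compute compared with running a deep network on the full image, which is consistent with the low processing cost the abstract emphasizes.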


Subjects
Automobile Driving/psychology , Face/physiology , Facial Expression , Memory/physiology , Databases, Factual , Deep Learning , Emotions , Humans , Machine Learning , Neural Networks, Computer
4.
medRxiv ; 2024 Feb 28.
Article in English | MEDLINE | ID: mdl-37546990

ABSTRACT

In early 2020, Coronavirus Disease 2019 (COVID-19) spread rapidly across the United States (US), exhibiting significant geographic variability. While several studies have examined the predictive relationships of differing factors on COVID-19 deaths, few have looked at spatiotemporal variation at refined geographic scales. The objective of this analysis is to examine this spatiotemporal variation in COVID-19 deaths with respect to association with socioeconomic, health, demographic, and political factors. We use multivariate regression applied to Health and Human Services (HHS) regions as well as nationwide county-level geographically weighted random forest (GWRF) models. Analyses were performed on data from three separate time frames which correspond to the spread of distinct viral variants in the US: pandemic onset until May 2021, May 2021 through November 2021, and December 2021 until April 2022. Multivariate regression results for all regions across three time windows suggest that existing measures of social vulnerability for disaster preparedness (SVI) are predictive of a higher degree of mortality from COVID-19. In comparison, GWRF models provide a more robust evaluation of feature importance and prediction, exposing the value of local features for prediction, such as obesity, which is obscured by coarse-grained analysis. Overall, GWRF results indicate that this more nuanced modeling strategy is useful for determining the spatial variation in the importance of sociodemographic risk factors for predicting COVID-19 mortality.
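
Geographically weighted methods fit a local model around each location, giving nearby observations more influence than distant ones. The sketch below shows a bi-square distance kernel, a common choice in geographically weighted modelling; the abstract does not state which kernel or bandwidth the authors used, so the kernel, the bandwidth, and the coordinates here are assumptions.

```python
import math


def bisquare_weights(target, locations, bandwidth):
    """Bi-square kernel: w = (1 - (d/h)^2)^2 for d < h, otherwise 0."""
    weights = []
    for loc in locations:
        d = math.dist(target, loc)  # Euclidean distance to the target location
        if d < bandwidth:
            weights.append((1.0 - (d / bandwidth) ** 2) ** 2)
        else:
            weights.append(0.0)
    return weights


# Made-up planar coordinates for four counties; bandwidth is in the same units.
counties = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
w = bisquare_weights(counties[0], counties, bandwidth=4.0)
```

Weights like these would then be passed to the local model for the target county, so that each county gets its own feature-importance estimates.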

5.
BioData Min ; 17(1): 10, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38627770

ABSTRACT

BACKGROUND: Gene network information is believed to be beneficial for disease module and pathway identification, but has not been explicitly utilized in the standard random forest (RF) algorithm for gene expression data analysis. We investigate the performance of a network-guided RF where the network information is summarized into a sampling probability of predictor variables which is further used in the construction of the RF. RESULTS: Our simulation results suggest that network-guided RF does not provide better disease prediction than the standard RF. In terms of disease gene discovery, if disease genes form module(s), network-guided RF identifies them more accurately. In addition, when disease status is independent of genes in the given network, spurious gene selection results can occur when using network information, especially on hub genes. Our empirical analysis on two balanced microarray and RNA-Seq breast cancer datasets from The Cancer Genome Atlas (TCGA) for classification of progesterone receptor (PR) status also demonstrates that network-guided RF can identify genes from PGR-related pathways, which leads to a better connected module of identified genes. CONCLUSIONS: Gene networks can provide additional information to aid gene expression analysis for disease module and pathway identification. However, they need to be used with caution, and the results need to be validated to guard against spurious gene selection. More robust approaches to incorporating such information into RF construction also warrant further study.
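
The idea of summarizing network information into a sampling probability of predictor variables can be sketched as follows. The abstract does not say how the probability is derived, so this example assumes a simple degree-based rule (degree plus one, normalized); the gene names, the rule, and both function names are hypothetical.

```python
import random


def sampling_probabilities(edges, genes):
    """Turn node degree into a sampling probability (degree + 1, normalised)."""
    degree = {g: 0 for g in genes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    total = sum(d + 1 for d in degree.values())
    return {g: (degree[g] + 1) / total for g in genes}


def sample_features(probs, mtry, rng):
    """Draw mtry distinct genes, each draw proportional to remaining probability."""
    pool = dict(probs)
    chosen = []
    for _ in range(mtry):
        names = list(pool)
        pick = rng.choices(names, weights=[pool[n] for n in names], k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen


# Toy network: gene names are placeholders, not taken from the paper.
genes = ["g1", "g2", "g3", "g4"]
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4")]
probs = sampling_probabilities(edges, genes)
subset = sample_features(probs, mtry=2, rng=random.Random(0))
```

In this toy network the hub gene `g1` is twice as likely to be drawn as any other gene, which also illustrates the abstract's caution: a rule that favours hubs can select them even when they are unrelated to disease status.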

6.
Sci Total Environ ; 871: 162005, 2023 May 01.
Article in English | MEDLINE | ID: mdl-36758700

ABSTRACT

Environmental stressors including high temperature and air pollution cause health problems. However, understanding of how combined exposure to heat and air pollution affects both physical and mental health remains insufficient, because such effects are entangled with human society and with urban and natural environments. Our study is rooted in Social Ecological Theory and employs a tri-environmental conceptual framework (i.e., across social, built and natural environments) to examine how combined exposure to heat and air pollution affects self-reported physical and mental health via the first fine-grained nationwide investigation in Australia, and to highlight how such effects vary across inter- and intra-urban areas. We conducted an ecological study to explore the importance of heat and air quality to physical and mental health by considering 48 tri-environmental confounders through global and local random forest regression models, advanced machine learning methods that have the advantage of revealing the spatial heterogeneity of variables. Our key findings are threefold. First, social and built environmental factors are important to physical and mental health in both urban and rural areas, and even more important than exposure to heat and air pollution. Second, the relationship between temperature and air quality and health follows a V-shape, reflecting people's different adaptation and tolerance to temperature and air quality. Third, the important roles that heat and air pollution play in physical and mental health are most obvious in the inner-city and near inner-city areas of the major capital cities, as well as in the industrial zones in peri-urban regions and in the low-latitude city of Darwin. We draw several policy implications to minimise the inter- and intra-urban differences in healthcare access and service distribution to populations with different sensitivity to heat and air quality across urban and rural areas. Our conceptual framework can also be applied to examine the relationship between other environmental problems and health outcomes in the era of a warming climate.


Subjects
Air Pollutants , Air Pollution , Humans , Hot Temperature , Cities , Climate , Temperature , Air Pollutants/analysis
7.
Genes (Basel) ; 13(12), 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36553611

ABSTRACT

In studies of Alzheimer's disease (AD), jointly analyzing imaging data and genetic data provides an effective method to explore the potential biomarkers of AD. Subjects can be grouped into healthy controls (HC), early mild cognitive impairment (EMCI), late mild cognitive impairment (LMCI) and AD. Identifying the important biomarkers of AD progression and analyzing these biomarkers provides valuable insights into the mechanism of AD. In this paper, we present a novel data fusion method and a genetic weighted random forest method to mine important features. Specifically, we amplify the difference among AD, LMCI, EMCI and HC by introducing eigenvalues calculated from the gene p-value matrix for feature fusion. Furthermore, we construct the genetic weighted random forest using the resulting fused features. Genetic evolution is used to increase the diversity among decision trees, and the generated decision trees are assigned weights. After training, the genetic weighted random forest is analyzed further to detect the significant fused features. The validation experiments highlight the performance and generalization of our proposed model. We analyze the biological significance of the results and identify some significant genes (CSMD1, CDH13, PTPRD, MACROD2 and WWOX). Furthermore, the calcium signaling pathway, arrhythmogenic right ventricular cardiomyopathy and the glutamatergic synapse pathway were identified. The investigational findings demonstrate that our proposed model presents an accurate and efficient approach to identifying significant biomarkers in AD.


Subjects
Alzheimer Disease , Cognitive Dysfunction , Humans , Brain , Random Forest , Magnetic Resonance Imaging/methods , Cognitive Dysfunction/diagnostic imaging , Cognitive Dysfunction/genetics , Alzheimer Disease/diagnostic imaging , Alzheimer Disease/genetics , Biomarkers
8.
Int J Numer Method Biomed Eng ; 37(12): e3525, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34431606

ABSTRACT

Medical data mining models have recently become an important way to diagnose disease. The most challenging task in the healthcare field is handling the large amount of data involved in disease analysis and prediction. Once the data are transformed into valuable information by means of data mining models, prediction and decision making become easier. Existing studies have several shortcomings, including higher execution time, greater computational complexity, limited scalability, slow convergence, and failure to provide a solution. In this article, we propose an ensemble SVM-based sample weighted random forests (eSVM-swRF) with a novel improved colliding body optimization (NICBO) algorithm to predict liver diseases. Extraction, loading, transformation, and analysis (ELTA) is used to pre-process the patient data. The significant features for a suitable model are generated using a filter-based method. The eSVM-swRF parameter values, such as the penalty parameter (P), threshold (T), and mTry, are optimized via the NICBO algorithm. The UCI dataset provides liver disease data for this study. RapidMiner Studio version 7.6 is used as the implementation platform, with different evaluation measures, to validate the performance of eSVM-swRF with the NICBO method. The proposed method outperforms other existing methods such as Particle Swarm Optimization-based Support Vector Machine (PSO-SVM), fuzzy adaptive and neighbor weighted k-NN (FuzzyANWKNN), Naïve Bayes-based Support Vector Machine (NB-SVM), and neural network.


Subjects
Liver Diseases , Support Vector Machine , Algorithms , Bayes Theorem , Humans , Liver Diseases/diagnosis , Neural Networks, Computer
9.
Comput Methods Programs Biomed ; 211: 106420, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34555589

ABSTRACT

OBJECTIVE: Hepatic encephalopathy (HE) is among the most common complications of cirrhosis. Data for cirrhosis with HE is typically unbalanced. Traditional statistical methods and machine learning algorithms thus often fail to identify the minority class. In this paper, we use machine learning algorithms to construct a risk prediction model for liver cirrhosis complicated by HE to improve the efficiency of its prediction. METHOD: We collected medical data from 1,256 patients with cirrhosis and performed preprocessing to extract 81 features from these irregular data. To predict HE in cirrhotic patients, we compared several classification methods: logistic regression, weighted random forest (WRF), SVM, and weighted SVM (WSVM). We also used an additional 722 patients with cirrhosis for external validation of the model. RESULTS: The WRF, WSVM, and logistic regression models exhibited better recognition ability for patients with HE than traditional machine learning models (sensitivity > 0.70), but their ability to identify patients with uncomplicated HE was slightly lower (specificity approximately 85%). The comprehensive evaluation index of the traditional model was higher than those of other models (G-means > 0.80 and F-measure > 0.40). For the WRF, the G-means (0.82), F-measure (0.46), and AUC (0.82) were superior to those of the logistic regression and WSVM models, which means that it can better predict the incidence of HE in patients. CONCLUSION: The WRF model is more suitable for the classification of unbalanced medical data and can be used to construct a risk prediction and evaluation system for liver cirrhosis complicated by HE. The probabilistic prediction models of WRF can help clinicians identify high-risk patients with HE.
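
The imbalance-aware measures reported in this abstract (sensitivity, specificity, G-means, F-measure) all follow from a confusion matrix. The sketch below uses the standard definitions; the confusion counts are invented for illustration and do not come from the study, and the function name is hypothetical.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, G-mean and F-measure from confusion counts."""
    sensitivity = tp / (tp + fn)   # recall on the positive (HE) class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    g_mean = math.sqrt(sensitivity * specificity)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, g_mean, f_measure


# Made-up counts for an imbalanced test set (100 positives, 900 negatives).
sens, spec, g_mean, f_measure = imbalance_metrics(tp=80, fn=20, fp=135, tn=765)
```

With a rare positive class, the F-measure can stay modest even when sensitivity and specificity are both high, because a small false-positive rate applied to a large negative class still produces many false positives relative to true positives.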


Subjects
Hepatic Encephalopathy , Algorithms , Hepatic Encephalopathy/diagnosis , Hepatic Encephalopathy/etiology , Humans , Liver Cirrhosis/complications , Logistic Models , Machine Learning
10.
Electronics (Basel) ; 9(1), 2020 Jan.
Article in English | MEDLINE | ID: mdl-32051761

ABSTRACT

Construction of an ensemble model is a process of combining many diverse base predictive learners. It raises questions of how to weight each model and how to tune the parameters of the weighting process. The most straightforward approach is simply to average the base models. However, numerous studies have shown that a weighted ensemble can provide superior prediction results to a simple average of models. The main goals of this article are to propose a new weighting algorithm applicable to each tree in the Random Forest model and to comprehensively examine the optimal parameter tuning. Importantly, the approach is motivated by its flexibility, good performance, stability, and resistance to overfitting. The proposed scheme is examined and evaluated on the Physionet/Computing in Cardiology Challenge 2015 data set. It consists of signals (electrocardiograms and pulsatory waveforms) from intensive care patients which triggered an alarm for five cardiac arrhythmia types (Asystole, Bradycardia, Tachycardia, Ventricular Tachycardia, and Ventricular Flutter/Fibrillation). The classification problem concerns whether the alarm should or should not have been generated. It was shown that the proposed weighting approach improved classification accuracy for the three most challenging of the five investigated arrhythmias compared with the standard Random Forest model.
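
The difference between a simple average of trees and a weighted ensemble can be shown with a minimal weighted-vote sketch. The abstract does not describe the proposed weighting algorithm, so the per-tree weights below are arbitrary placeholders and the function name and labels are hypothetical.

```python
from collections import defaultdict


def weighted_vote(tree_predictions, tree_weights):
    """Return the label with the largest total weight across trees."""
    totals = defaultdict(float)
    for label, weight in zip(tree_predictions, tree_weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical per-tree votes for one alarm, with arbitrary per-tree weights.
votes = ["true_alarm", "false_alarm", "false_alarm"]
weights = [0.9, 0.3, 0.4]

weighted = weighted_vote(votes, weights)             # weights favour the first tree
unweighted = weighted_vote(votes, [1.0, 1.0, 1.0])   # plain majority vote
```

Here the two schemes disagree: the majority vote returns `false_alarm`, while the weighted vote returns `true_alarm` because the single highly weighted tree outweighs the other two.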
