Search | BVS - MINISTÉRIO DA SAÚDE

1.

Application of machine learning techniques in the diagnosis of endometriosis.

Zhao, Ningning; Hao, Ting; Zhang, Fengge; Ni, Qin; Zhu, Dan; Wang, Yanan; Shi, Yali; Mi, Xin.

BMC Womens Health ; 24(1): 491, 2024 Sep 05.

Article in English | MEDLINE | ID: mdl-39237940

ABSTRACT

OBJECTIVE: The aim of this study is to assess the use of machine learning methodologies in the diagnosis of endometriosis (EM). METHODS: This study included a total of 106 patients with EM and 203 patients with non-EM conditions (like simple cysts and simple uterine fibroids), all admitted to the Shunyi Women's and Children's Hospital of Beijing Children's Hospital between January 2017 and September 2022. All participants were free of comorbidities and their diagnoses were confirmed via postoperative pathology. Comparative analysis was conducted between the EM and non-EM groups. Baseline data were assessed, including white blood cell count, neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio, lymphocyte-to-monocyte ratio, mean platelet volume, hemoglobin, carbohydrate antigen 125 (CA125), carbohydrate antigen 19-9, coagulation parameters, and other serologic indicators. An optimal predictive model was developed using an artificial intelligence algorithm to determine the presence of EM. The objective is to provide new insights for the clinical diagnosis and treatment of EM. RESULTS: The random forest algorithm demonstrated superior performance when compared to decision trees, LogitBoost, artificial neural networks, naïve Bayes, support vector machines, and linear regression. Combining CA125 with the NLR yielded a better prediction of EM than using CA125 alone when applying the random forest algorithm. The accuracy of predicting EM with CA125 combined with NLR was 78.16%, with a sensitivity of 86.21% and an area under the curve (AUC) of 0.85 (P < 0.05). In contrast, using CA125 alone resulted in an EM prediction accuracy of 75.8%, with a sensitivity of 79.3% and an AUC of 0.82 (P < 0.05). CONCLUSION: The diagnostic value of serum CA125 combined with the NLR for EM is higher than that of serum CA125 alone. This finding indicates that NLR could serve as a new supplementary biomarker along with serum CA125 in the diagnosis of EM.
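The AUC values reported in this abstract can be read as the probability that a randomly chosen EM patient receives a higher model score than a randomly chosen non-EM patient. The following is a minimal sketch of that rank-based calculation. The function name, scores, and labels are hypothetical and chosen only for illustration; none of the numbers come from the study.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores (illustration only, not study data).
example_scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
example_auc = roc_auc(example_scores, example_labels)
print(f"AUC = {example_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a sort-based version would be preferable for larger data.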

Subjects

CA-125 Antigen; Endometriosis; Machine Learning; Humans; Female; Endometriosis/diagnosis; Endometriosis/blood; CA-125 Antigen/blood; Adult; Neutrophils; Algorithms

2.

Identifying Patient Subpopulations with Significant Race-Sex Differences in Emergency Department Disposition Decisions.

Lin, Peter; Argon, Nilay T; Cheng, Qian; Evans, Christopher S; Linthicum, Benjamin; Liu, Yufeng; Mehrotra, Abhishek; Murphy, Laura; Patel, Mehul D; Ziya, Serhan.

Health Serv Insights ; 17: 11786329241277724, 2024.

Article in English | MEDLINE | ID: mdl-39247491

ABSTRACT

Background/objectives: The race-sex differences in emergency department (ED) disposition decisions have been reported widely. Our objective is to identify demographic and clinical subgroups for which this difference is most pronounced, which will facilitate future targeted research on potential disparities and interventions. Methods: We performed a retrospective analysis of 93 987 White and African-American adults assigned an Emergency Severity Index of 3 at 3 large EDs from January 2019 to February 2020. Using random forests, we identified the Elixhauser comorbidity score, age, and insurance status as important variables to divide data into subpopulations. Logistic regression models were then fitted to test race-sex differences within each subpopulation while controlling for other patient characteristics and ED conditions. Results: In each subpopulation, African-American women were less likely to be admitted than White men with odds ratios as low as 0.304 (95% confidence interval (CI): [0.229, 0.404]). African-American men had smaller admission odds compared to White men in subpopulations of 41+ years of age or with very low/high Elixhauser scores, odds ratios being as low as 0.652 (CI: [0.590, 0.747]). White women were less likely to be admitted than White men in subpopulations of 18 to 40 or 41 to 64 years of age, with low Elixhauser scores, or with Self-Pay or Medicaid insurance status with odds ratios as low as 0.574 (CI: [0.421, 0.784]). Conclusions: While differences in likelihood of admission were lessened by younger age for African-American men, and by older age, higher Elixhauser score, and Medicare or Commercial insurance for White women, they persisted in all subgroups for African-American women. In general, patients of age 64 years or younger, with low comorbidity scores, or with Medicaid or no insurance appeared most prone to potential disparities in admissions.
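The odds ratios and confidence intervals in this abstract come from logistic regression, where a coefficient and its standard error are exponentiated to give an odds ratio with a Wald interval. The sketch below illustrates that conversion for the reported value 0.304 (95% CI: [0.229, 0.404]). The standard error is not given in the abstract; it is an assumption, back-calculated here from the reported interval, and the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Turn a logistic-regression coefficient and its standard error into
    an odds ratio with a Wald confidence interval (95% when z = 1.96)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Coefficient implied by the reported odds ratio of 0.304.
beta = math.log(0.304)
# Assumption: standard error recovered from the reported interval,
# not a value stated in the abstract.
se = (math.log(0.404) - math.log(0.229)) / (2 * 1.96)

odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta, se)
print(f"OR = {odds_ratio:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

The interval is symmetric on the log-odds scale, which is why the reported bounds are not equidistant from 0.304 on the odds-ratio scale.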

3.

Radio Signal Modulation Recognition Method Based on Hybrid Feature and Ensemble Learning: For Radar and Jamming Signals.

Zhou, Yu; Cao, Ronggang; Zhang, Anqi; Li, Ping.

Sensors (Basel) ; 24(15), 2024 Jul 24.

Article in English | MEDLINE | ID: mdl-39123855

ABSTRACT

The detection performance of radar is significantly impaired by active jamming and mutual interference from other radars. This paper proposes a radio signal modulation recognition method that accurately recognizes these signals, which supports jamming cancellation decisions. Based on an ensemble learning stacking algorithm improved by meta-feature enhancement, the proposed method adopts random forests, K-nearest neighbors, and Gaussian naive Bayes as the base-learners, with logistic regression serving as the meta-learner. It takes multi-domain features of the signals as input: time-domain features, including fuzzy entropy, slope entropy, and Hjorth parameters; frequency-domain features, including spectral entropy; and fractal-domain features, including fractal dimension. A simulation experiment covering seven common radar and active jamming signal types was performed to validate effectiveness and evaluate performance. The results showed that the proposed method outperforms other classification methods and can meet the requirements of low signal-to-noise ratio and few-shot learning.
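Of the time-domain features named in this abstract, the Hjorth parameters have a compact standard definition: activity is the signal variance, mobility is the square root of the ratio between the variance of the first difference and the variance of the signal, and complexity is the mobility of the first difference divided by the mobility of the signal. A minimal sketch follows, assuming a uniformly sampled signal and a hypothetical sine test input; it is not the authors' feature-extraction code.

```python
import math


def _variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def _diff(values):
    """First difference between consecutive samples."""
    return [b - a for a, b in zip(values, values[1:])]


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a sampled signal."""
    d1 = _diff(signal)
    d2 = _diff(d1)
    activity = _variance(signal)
    mobility = math.sqrt(_variance(d1) / activity)
    complexity = math.sqrt(_variance(d2) / _variance(d1)) / mobility
    return activity, mobility, complexity


# Hypothetical test signal: five periods of a sine, 1000 samples per period.
sine = [math.sin(2 * math.pi * n / 1000) for n in range(5000)]
activity, mobility, complexity = hjorth_parameters(sine)
print(activity, mobility, complexity)
```

For a pure sine the activity is close to 0.5, the mobility is close to the angular frequency per sample, and the complexity is close to 1, which makes it a convenient sanity check.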

4.

Accurate space-group prediction from composition.

Venkatraman, Vishwesh; Carvalho, Patricia Almeida.

J Appl Crystallogr ; 57(Pt 4): 975-985, 2024 Aug 01.

Article in English | MEDLINE | ID: mdl-39108811

ABSTRACT

Predicting crystal symmetry simply from chemical composition has remained challenging. Several machine-learning approaches can be employed, but the predictive value of popular crystallographic databases is relatively modest due to the paucity of data and uneven distribution across the 230 space groups. In this work, virtually all crystallographic information available to science has been compiled and used to train and test multiple machine-learning models. Composition-driven random-forest classification relying on a large set of descriptors showed the best performance. The predictive models for crystal system, Bravais lattice, point group and space group of inorganic compounds are made publicly available as easy-to-use software downloadable from https://gitlab.com/vishsoft/cosy.

5.

The accuracy of predicting maladaptation to new environments with genomic data.

Lind, Brandon M; Lotterhos, Katie E.

Mol Ecol Resour ; : e14008, 2024 Aug 30.

Article in English | MEDLINE | ID: mdl-39212146

ABSTRACT

Rapid environmental change poses unprecedented challenges to species persistence. To understand the extent that continued change could have, genomic offset methods have been used to forecast maladaptation of natural populations to future environmental change. However, while their use has become increasingly common, little is known regarding their predictive performance across a wide array of realistic and challenging scenarios. Here, we evaluate the performance of currently available offset methods (gradientForest, the Risk-Of-Non-Adaptedness, redundancy analysis with and without structure correction and LFMM2) using an extensive set of simulated data sets that vary demography, adaptive architecture and the number and spatial patterns of adaptive environments. For each data set, we train models using either all, adaptive or neutral marker sets and evaluate performance using in silico common gardens by correlating known fitness with projected offset. Using over 4,849,600 of such evaluations, we find that (1) method performance is largely due to the degree of local adaptation across the metapopulation (LA), (2) adaptive marker sets provide minimal performance advantages, (3) performance within the species range is variable across gardens and declines when offset models are trained using additional non-adaptive environments and (4) despite (1) performance declines more rapidly in globally novel climates (i.e. a climate without an analogue within the species range) for metapopulations with greater LA than lesser LA. We discuss the implications of these results for management, assisted gene flow and assisted migration.

6.

Identification of Dementia & Mild Cognitive Impairment in Chinese Elderly Using Machine Learning.

Ying, Tong-Tong; Zhuang, Li-Ying; Xu, Shan-Hu; Zhang, Shu-Feng; Huang, Li-Jun; Gao, Wei-Wei; Liu, Lu; Lai, Qi-Lun; Lou, Yue; Liu, Xiao-Li.

Am J Alzheimers Dis Other Demen ; 39: 15333175241275215, 2024.

Article in English | MEDLINE | ID: mdl-39133478

ABSTRACT

OBJECTIVE: To assess the role of Machine Learning (ML) in identifying critical factors of dementia and mild cognitive impairment. METHODS: 371 elderly individuals were ultimately included in the ML analysis. Demographic information (including gender, age, parity, visual acuity, auditory function, mobility, and medication history) and 35 features from 10 assessment scales were used for modeling. Five machine learning classifiers were used for evaluation, employing a procedure involving feature extraction, selection, model training, and performance assessment to identify key indicative factors. RESULTS: The Random Forest model, after data preprocessing, Information Gain, and Meta-analysis, utilized three training features and four meta-features, achieving an area under the curve of 0.961 and an accuracy of 0.894, showcasing exceptional accuracy for the identification of dementia and mild cognitive impairment. CONCLUSIONS: ML serves as an identification tool for dementia and mild cognitive impairment. Using Information Gain and Meta-feature analysis, Clinical Dementia Rating (CDR) and Neuropsychiatric Inventory (NPI) scale information emerged as crucial for training the Random Forest model.
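Information Gain, used in this abstract for feature selection, is the reduction in label entropy obtained by splitting the data on a feature. The sketch below shows the calculation for a discrete feature. The toy labels and feature values are hypothetical, and the abstract does not state how continuous scale scores were discretized, so this is only an illustration of the general technique.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy after splitting on a
    discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data (not from the study).
labels = ["impaired", "impaired", "normal", "normal"]
informative = ["high", "high", "low", "low"]  # separates the classes
uninformative = ["a", "b", "a", "b"]          # carries no class signal

print(information_gain(informative, labels))
print(information_gain(uninformative, labels))
```

Features are then typically ranked by this score, and the highest-ranked ones are kept for model training.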

Subjects

Cognitive Dysfunction; Dementia; Machine Learning; Humans; Cognitive Dysfunction/diagnosis; Female; Aged; Male; Dementia/diagnosis; China; Aged, 80 and over; Neuropsychological Tests/standards; Neuropsychological Tests/statistics & numerical data; East Asian People

7.

Identifying early language predictors: A replication of Gasparini et al. (2023) confirming applicability in a general population cohort.

Gasparini, Loretta; Shepherd, Daisy A; Wang, Jing; Wake, Melissa; Morgan, Angela T.

Int J Lang Commun Disord ; 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38948964

ABSTRACT

BACKGROUND: Identifying language disorders earlier can help children receive the support needed to improve developmental outcomes and quality of life. Despite the prevalence and impacts of persistent language disorder, there are surprisingly no robust predictor tools available. This makes it difficult for researchers to recruit young children into early intervention trials, which in turn impedes advances in providing effective early interventions to children who need it. AIMS: To validate externally a predictor set of six variables previously identified to be predictive of language at 11 years of age, using data from the Longitudinal Study of Australian Children (LSAC) birth cohort. Also, to examine whether additional LSAC variables arose as predictive of language outcome. METHODS & PROCEDURES: A total of 5107 children were recruited to LSAC with developmental measures collected from 0 to 3 years. At 11-12 years, children completed the Clinical Evaluation of Language Fundamentals, 4th Edition, Recalling Sentences subtest. We used SuperLearner to estimate the accuracy of six previously identified parent-reported variables from ages 2-3 years in predicting low language (sentence recall score ≥ 1.5 SD below the mean) at 11-12 years. Random forests were used to identify any additional variables predictive of language outcome. OUTCOMES & RESULTS: Complete data were available for 523 participants (52.20% girls), 27 (5.16%) of whom had a low language score. The six predictors yielded fair accuracy: 78% sensitivity (95% confidence interval (CI) = [58, 91]) and 71% specificity (95% CI = [67, 75]). These predictors relate to sentence complexity, vocabulary and behaviour. The random forests analysis identified similar predictors. CONCLUSIONS & IMPLICATIONS: We identified an ultra-short set of variables that predicts 11-12-year language outcome with 'fair' accuracy. 
In one of few replication studies of this scale in the field, these methods have now been conducted across two population-based cohorts, with consistent results. An imminent practical implication of these findings is using these predictors to aid recruitment into early language intervention studies. Future research can continue to refine the accuracy of early predictors to work towards earlier identification in a clinical context. WHAT THIS PAPER ADDS: What is already known on the subject There are no robust predictor sets of child language disorder despite its prevalence and far-reaching impacts. A previous study identified six variables collected at age 2-3 years that predicted 11-12-year language with 75% sensitivity and 81% specificity, which warranted replication in a separate cohort. What this study adds to the existing knowledge We used machine learning methods to identify a set of six questions asked at age 2-3 years with ≥ 71% sensitivity and specificity for predicting low language outcome at 11-12 years, now showing consistent results across two large-scale population-based cohort studies. What are the potential or clinical implications of this work? This predictor set is more accurate than existing feasible methods and can be translated into a low-resource and time-efficient recruitment tool for early language intervention studies, leading to improved clinical service provision for young children likely to have persisting language difficulties.
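The sensitivity and specificity figures in this abstract follow directly from a two-by-two confusion table. The sketch below works through the arithmetic. Only the totals (523 participants, 27 with a low language score) are taken from the abstract; the true-positive and true-negative counts are assumptions, chosen to be consistent with the reported 78% and 71% and not reported by the authors.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


total = 523         # participants with complete data (from the abstract)
low_language = 27   # participants with a low language score (from the abstract)

tp = 21             # assumption: illustrative count, not from the paper
fn = low_language - tp
tn = 352            # assumption: illustrative count, not from the paper
fp = (total - low_language) - tn

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With only 27 positive cases, each additional correctly identified child shifts sensitivity by almost four percentage points, which is consistent with the wide confidence interval reported for sensitivity.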

8.

Exploratory subgroup identification in the heterogeneous Cox model: A relatively simple procedure.

León, Larry F; Jemielita, Thomas; Guo, Zifang; Marceau West, Rachel; Anderson, Keaven M.

Stat Med ; 43(20): 3921-3942, 2024 Sep 10.

Article in English | MEDLINE | ID: mdl-38951867

ABSTRACT

For survival analysis applications we propose a novel procedure for identifying subgroups with large treatment effects, with focus on subgroups where treatment is potentially detrimental. The approach, termed forest search, is relatively simple and flexible. All possible subgroups are screened and selected based on hazard ratio thresholds indicative of harm with assessment according to the standard Cox model. By reversing the role of treatment one can seek to identify substantial benefit. We apply a splitting consistency criterion to identify a subgroup considered "maximally consistent with harm." The type-1 error and power for subgroup identification can be quickly approximated by numerical integration. To aid inference we describe a bootstrap bias-corrected Cox model estimator with variance estimated by a jackknife approximation. We provide a detailed evaluation of operating characteristics in simulations and compare to virtual twins and generalized random forests where we find the proposal to have favorable performance. In particular, in our simulation setting, we find the proposed approach favorably controls the type-1 error for falsely identifying heterogeneity with higher power and classification accuracy for substantial heterogeneous effects. Two real data applications are provided for publicly available datasets from a clinical trial in oncology, and HIV.

Subjects

Computer Simulation; HIV Infections; Proportional Hazards Models; Humans; Survival Analysis

9.

Analysis of the Correspondence of the Degree of Fragility with the Way to Exercise the Force of the Hand.

Guindal, E P; Parra, X; Musté, M; Pérez, C; Macho, O; Català, A.

J Frailty Aging ; 13(3): 248-253, 2024.

Article in English | MEDLINE | ID: mdl-39082769

ABSTRACT

BACKGROUND: Frailty is a geriatric syndrome characterized by increased individual vulnerability with an increase in both dependence and mortality when exposed to external stressors. The use of Frailty Indices in routine clinical practice is limited by several factors, such as the cognitive status of the patient, consultation time, or lack of prior information from the patient. OBJECTIVES: In this study, we propose the generation of an objective measure of frailty, based on the signal from hand grip strength (HGS). DESIGN AND MEASUREMENTS: This signal was recorded with a modified Deyard dynamometer and processed using machine learning strategies based on supervised learning methods to train classifiers. A database was generated from a cohort of 138 older adults in a cross-sectional pilot study that combined classical geriatric questionnaires with physiological data. PARTICIPANTS: Participants were patients selected by geriatricians of medical services provided by collaborating entities. SETTING AND RESULTS: To process the generated information, 20 selected significant features of the HGS dataset were filtered, cleaned, and extracted. A technique combining the Synthetic Minority Oversampling Technique (SMOTE), which generates new samples from the smallest group, with ENN (a technique based on K-nearest neighbors), which removes noisy samples, provided the best results, yielding a well-balanced distribution of data. CONCLUSION: A Random Forest Classifier was trained to predict the frailty label with 92.9% accuracy, achieving sensitivities higher than 90%.
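The oversampling half of the SMOTE plus ENN combination mentioned in this abstract creates each synthetic sample by interpolating between a minority-class point and one of its nearest minority-class neighbours. The sketch below illustrates only that interpolation step; it does not include the ENN cleaning step. The function name, the choice of k, and the two-dimensional toy points are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each placed on the segment between a
    random minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical minority-class feature vectors (illustration only).
minority_class = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority_class, n_new=6)
for point in new_points:
    print(point)
```

Because every synthetic point lies between two real minority samples, the new data stay inside the region already occupied by the minority class, which is also why a cleaning step such as ENN is often applied afterwards to remove points that end up close to the majority class.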

Subjects

Frailty; Geriatric Assessment; Hand Strength; Humans; Hand Strength/physiology; Aged; Female; Male; Frailty/diagnosis; Geriatric Assessment/methods; Aged, 80 and over; Pilot Projects; Frail Elderly; Machine Learning; Muscle Strength Dynamometer

10.

U-Net Convolutional Neural Network for Mapping Natural Vegetation and Forest Types from Landsat Imagery in Southeastern Australia.

Boston, Tony; Van Dijk, Albert; Thackway, Richard.

J Imaging ; 10(6), 2024 Jun 13.

Article in English | MEDLINE | ID: mdl-38921620

ABSTRACT

Accurate and comparable annual mapping is critical to understanding changing vegetation distribution and informing land use planning and management. A U-Net convolutional neural network (CNN) model was used to map natural vegetation and forest types based on annual Landsat geomedian reflectance composite images for a 500 km × 500 km study area in southeastern Australia. The CNN was developed using 2018 imagery. Label data were a ten-class natural vegetation and forest classification (i.e., Acacia, Callitris, Casuarina, Eucalyptus, Grassland, Mangrove, Melaleuca, Plantation, Rainforest and Non-Forest) derived by combining current best-available regional-scale maps of Australian forest types, natural vegetation and land use. The best CNN generated using six Landsat geomedian bands as input produced better results than a pixel-based random forest algorithm, with higher overall accuracy (OA) and weighted mean F1 score for all vegetation classes (93 vs. 87% in both cases) and a higher Kappa score (86 vs. 74%). The trained CNN was used to generate annual vegetation maps for 2000-2019 and evaluated for an independent test area of 100 km × 100 km using statistics describing accuracy regarding the label data and temporal stability. Seventy-six percent of pixels did not change over the 20 years (2000-2019), and year-on-year results were highly correlated (94-97% OA). The accuracy of the CNN model was further verified for the study area using 3456 independent vegetation survey plots where the species of interest had ≥ 50% crown cover. The CNN showed an 81% OA compared with the plot data. The model accuracy was also higher than the label data (76%), which suggests that imperfect training data may not be a major obstacle to CNN-based mapping. 
Applying the CNN to other regions would help to test the spatial transferability of these techniques and whether they can support the automated production of accurate and comparable annual maps of natural vegetation and forest types required for national reporting.
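Overall accuracy (OA) and the Kappa score quoted in this abstract are both derived from a confusion matrix: OA is the share of correctly classified samples, and Kappa rescales OA by the agreement expected by chance. A minimal sketch, using a hypothetical two-class matrix and not the study's ten-class results, is:

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of samples of true class i predicted as j.

    Returns (overall accuracy, Cohen's kappa).
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return observed, (observed - expected) / (1 - expected)


# Hypothetical confusion matrix (illustration only, not study data).
matrix = [
    [45, 5],
    [10, 40],
]
oa, kappa = overall_accuracy_and_kappa(matrix)
print(f"OA = {oa:.2f}, kappa = {kappa:.2f}")
```

Kappa is always lower than OA when chance agreement is non-zero, which matches the pattern in the abstract, where Kappa scores sit below the corresponding accuracy figures.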

11.

[Prediction Model of Groundwater Sulphate Based on Combined Multi-source Spatio-temporal Data].

Li, Ru-Yue; Zeng, Yan-Yan; Zhou, Jin-Long; Sun, Ying; Yan, Zhi-Yun.

Huan Jing Ke Xue ; 45(6): 3153-3164, 2024 Jun 08.

Article in Chinese | MEDLINE | ID: mdl-38897739

ABSTRACT

The accurate prediction of spatial variation trends in groundwater SO₄²⁻ is of great significance for improving groundwater quality and regional groundwater management. Multi-source spatio-temporal data, such as land cover data, soil parameter data, digital elevation data, and groundwater pH value in the plain area of the Yarkant River Basin in 2011, 2014, 2017, and 2020, were used as characteristic variables to analyze their correlation with groundwater SO₄²⁻ concentration. To enhance the prediction accuracy, the Bayesian optimization algorithm (BOA) was used to optimize the random forest regression (RFR). Based on the BOA-RFR model, the importance of the characteristic variables was analyzed, the prediction accuracy of the model was evaluated, and the groundwater SO₄²⁻ prediction map was generated. The results showed that pH value, ground elevation (GE), and percentage of bare land (BAR) in the contribution area were important parameters influencing groundwater hydrochemical composition, which were significantly negatively correlated with groundwater SO₄²⁻ concentration, and the importance of impact factors for predicting groundwater SO₄²⁻ concentration exceeded 25%. The geostatistical interpolation method was used as an auxiliary tool for the predictive modeling of spatial distribution. After adding auxiliary samples, the R² of groundwater SO₄²⁻ concentration prediction of the BOA-RFR model was greater than 0.96, and the maximum values of RMSE and MAE were reduced by 4.7% and 23.8%, respectively, compared with the minimum values of the model with fewer samples. The SO₄²⁻ concentration prediction map showed that high SO₄²⁻ groundwater was enriched in the northeast of the plain area of the Yarkant River Basin, an area that was expanding.

12.

AI-driven discovery of blood xenobiotic biomarkers in neovascular age-related macular degeneration using iterative random forests.

Künzel, Steffen E; Frentzel, Dominik P; Flesch, Leonie T M; Knecht, Vitus A; Rübsam, Anne; Dreher, Felix; Schütte, Moritz; Dubrac, Alexandre; Lange, Bodo; Yaspo, Marie-Laure; Lehrach, Hans; Joussen, Antonia M; Zeitz, Oliver.

Graefes Arch Clin Exp Ophthalmol ; 2024 Jun 06.

Article in English | MEDLINE | ID: mdl-38842593

ABSTRACT

PURPOSE: To investigate the xenobiotic profiles of patients with neovascular age-related macular degeneration (nAMD) undergoing anti-vascular endothelial growth factor (anti-VEGF) intravitreal therapy (IVT) to identify biomarkers indicative of clinical phenotypes through advanced AI methodologies. METHODS: In this cross-sectional observational study, we analyzed 156 peripheral blood xenobiotic features in a cohort of 46 nAMD patients stratified by choroidal neovascularization (CNV) control under anti-VEGF IVT. We employed Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS) for measurement and leveraged an AI-driven iterative Random Forests (iRF) approach for robust pattern recognition and feature selection, aligning molecular profiles with clinical phenotypes. RESULTS: AI-augmented iRF models effectively refined the metabolite spectrum by discarding non-predictive elements. Perfluorooctanesulfonate (PFOS) and Ethyl β-glucopyranoside were identified as significant biomarkers through this process, associated with various clinically relevant phenotypes. Unlike single metabolite classes, drug metabolites were distinctly correlated with subretinal fluid presence. CONCLUSIONS: This study underscores the enhanced capability of AI, particularly iRF, in dissecting complex metabolomic data to elucidate the xenobiotic landscape of nAMD and environmental impact on the disease. The preliminary biomarkers discovered offer promising directions for personalized treatment strategies, although further validation in broader cohorts is essential for clinical application.

13.

High-Resolution Spatiotemporal Modeling for PM2.5 Major Components in the Pearl River Delta and Its Implications for Epidemiological Studies.

Wen, Li; Kang, Ning; Wang, Lijie; Wei, Qiannan; Zhang, Hedi; Shen, Jianling; Yue, Dingli; Zhai, Yuhong; Lin, Weiwei.

Environ Sci Technol ; 58(25): 10920-10931, 2024 Jun 25.

Article in English | MEDLINE | ID: mdl-38861590

ABSTRACT

Distinguishing the effects of different fine particulate matter components (PMCs) is crucial for mitigating their effects on human health. However, the sparse distribution of locations where PM is collected for component analysis makes it challenging to investigate the relevant health effects. This study aimed to investigate the agreement between data-fusion-enhanced exposure assessment and site monitoring data in estimating the effects of PMCs on gestational diabetes mellitus (GDM). We first improved the spatial resolution and accuracy of exposure assessment for five major PMCs (EC, OM, NO₃⁻, NH₄⁺, and SO₄²⁻) in the Pearl River Delta region by a data fusion model that combined inputs from multiple sources using a random forest model (10-fold cross-validation R²: 0.52 to 0.61; root mean square error: 0.55 to 2.26 µg/m³). Next, we compared the associations between exposures to PMCs during pregnancy and GDM in a hospital-based cohort of 1148 pregnant women in Heshan, China, using both site monitoring data and data-fusion model estimates. The comparative analysis showed that the data-fusion-based exposure generated stronger estimates of identifying statistical disparities. This study suggests that data-fusion-enhanced estimates can improve exposure assessment and potentially mitigate the misclassification of population exposure arising from the utilization of site monitoring data.

Subjects

Particulate Matter; Particulate Matter/analysis; Humans; China; Female; Rivers/chemistry; Pregnancy; Air Pollutants/analysis; Environmental Monitoring/methods; Epidemiologic Studies; Environmental Exposure; Diabetes, Gestational/epidemiology

14.

Applying the Kolmogorov-Zurbenko filter followed by random forest models to ⁷Be observations in Spain (2006-2021).

Nafarrate, Ander; Petisco-Ferrero, Susana; Idoeta, Raquel; Herranz, Margarita; Sáenz, Jon; Ulazia, Alain; Ibarra-Berastegui, Gabriel.

Heliyon ; 10(9): e30820, 2024 May 15.

Article in English | MEDLINE | ID: mdl-38765117

ABSTRACT

In this study, we analysed ⁷Be weekly surface measurements from six Spanish laboratories from 2006 to 2021. The Kolmogorov-Zurbenko filter was applied to the six ⁷Be time series, and following an iterative process, the original data were divided into two fractions: one related to variations characterized by periods above 33 days (including, among others, the seasonal cycle) and the second noisier fraction related to mechanisms originating from variations with periods below 33 days. Both fractions were independent at the six locations. The second machine-based step using random forest models was applied with the aim of identifying the most influential inputs to the observed ⁷Be concentrations, and machine learning-inspired regression models were fitted. With respect to seasonal components, the results indicated that the memory of the system was the most influential input, as expected by the large fraction of variance explained by the seasonal cycle, followed by that of humidity and wind-related variables. For the fraction corresponding to periods below 33 days, precipitation-, humidity-, and radiation-related variables were the most influential. This methodology has made it possible to successfully describe the major mechanisms known to be involved in the generation of the surface ⁷Be concentrations observed in Spain.
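The Kolmogorov-Zurbenko filter used in this abstract is an iterated centred moving average: a window of odd length is applied several times in succession, and the difference between the original series and the smoothed series gives the short-period fraction. The sketch below illustrates that decomposition. The window length, the number of iterations, the edge handling (averaging over whatever samples fall inside a truncated window), and the synthetic weekly series are all assumptions; the abstract does not state the parameters the authors used.

```python
import math


def moving_average(series, window):
    """Centred moving average; windows are truncated at the series edges."""
    half = window // 2
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def kz_filter(series, window, iterations):
    """Kolmogorov-Zurbenko filter: `iterations` passes of a centred moving
    average of odd length `window`."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    smoothed = list(series)
    for _ in range(iterations):
        smoothed = moving_average(smoothed, window)
    return smoothed


# Hypothetical weekly series: a yearly cycle plus alternating noise.
weekly = [
    math.sin(2 * math.pi * t / 52) + (0.5 if t % 2 else -0.5)
    for t in range(208)
]
long_term = kz_filter(weekly, window=5, iterations=3)    # slow fraction
short_term = [x - s for x, s in zip(weekly, long_term)]  # fast fraction
```

By construction the two fractions add back up to the original series, so each can be modelled separately, as the abstract describes for the random forest step.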

15.

Prediction of jumbo drill penetration rate in underground mines using various machine learning approaches and traditional models.

Heydari, Sasan; Hoseinie, Seyed Hadi; Bagherpour, Raheb.

Sci Rep ; 14(1): 8928, 2024 Apr 18.

Article in English | MEDLINE | ID: mdl-38637673

ABSTRACT

Estimating penetration rates of Jumbo drills is crucial for optimizing underground mining drilling processes, aiming to reduce costs and time. This study investigates various regression and machine learning methods, including Multilayer Perceptron (MLP), Support Vector Regression (SVR), and Random Forests (RF), to predict the penetration rates (ROP) using multivariate inputs such as operation parameters and rock mass characteristics. The Rock Mass Drillability Index (RDi), incorporating both intact rock properties and structural parameters, was utilized to characterize the rock mass. The dataset was split into 80% for training and 20% for testing. Performance metrics including correlation coefficient (R²), variance accounted for (VAF), mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) were calculated for each method to evaluate the accuracy of the predictions. SVR exhibited the best prediction performance for ROP, achieving the highest R², lowest RMSE, MAE, and MAPE, as well as the largest VAF values of 0.94, 0.15, 0.11, 4.84, and 94.13 during training, and 0.91, 0.19, 0.13, 6.02, and 91.11 during testing, respectively. With this high accuracy, we conclude that the proposed machine learning algorithms are valuable and efficient predictors for estimating jumbo drill penetration rates in underground mining operations.
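The five performance metrics listed in this abstract can all be computed from paired lists of measured and predicted values. The sketch below uses their common textbook definitions, which is an assumption: the abstract does not give formulas, and R² is computed here as the coefficient of determination. The example values are hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return R2, RMSE, MAE, MAPE (%) and VAF (%) for paired values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    mean_error = sum(errors) / n
    var_error = sum((e - mean_error) ** 2 for e in errors) / n
    var_actual = ss_tot / n
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        # MAPE assumes no measured value is zero.
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "VAF": 100 * (1 - var_error / var_actual),
    }


# Hypothetical measured and predicted penetration rates (not study data).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

R² and VAF coincide when the prediction errors average to zero, as in this toy example; they diverge when a model is systematically biased.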

Subjects

Machine Learning; Support Vector Machine; Neural Networks, Computer; Algorithms

16.

Pattern analysis using lower body human walking data to identify the gaitprint.

Wiles, Tyler M; Kim, Seung Kyeom; Stergiou, Nick; Likens, Aaron D.

Comput Struct Biotechnol J ; 24: 281-291, 2024 Dec.

Article in English | MEDLINE | ID: mdl-38644928

ABSTRACT

All people have a fingerprint that is unique to them and persistent throughout life. Similarly, we propose that people have a gaitprint, a persistent walking pattern that contains unique information about an individual. To provide evidence of a unique gaitprint, we aimed to identify individuals based on basic spatiotemporal variables. 81 adults were recruited to walk overground on an indoor track at their own pace for four minutes wearing inertial measurement units. A total of 18 trials per participant were completed between two days, one week apart. Four methods of pattern analysis, a) Euclidean distance, b) cosine similarity, c) random forest, and d) support vector machine, were applied to our basic spatiotemporal variables such as step and stride lengths to accurately identify people. Our best accuracy (98.63%) was achieved by random forest, followed by support vector machine (98.40%), and the top 10 most similar trials from cosine similarity (98.40%). Our results clearly demonstrate a persistent walking pattern with sufficient information about the individual to make them identifiable, suggesting the existence of a gaitprint.
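Of the four pattern-analysis methods named in this abstract, cosine similarity is the simplest to illustrate: a new trial is attributed to the participant whose stored trial points in the most similar direction in feature space. The sketch below assumes one reference trial per participant and three hypothetical spatiotemporal features; the participant IDs, feature choice, and values are invented for illustration and are not the study's data or pipeline.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(trial, reference_trials):
    """Return the participant whose stored trial is most similar to `trial`."""
    return max(
        reference_trials,
        key=lambda pid: cosine_similarity(trial, reference_trials[pid]),
    )


# Hypothetical features per trial:
# [step length (m), stride length (m), cadence (steps/s)]
references = {
    "P01": [0.70, 1.40, 1.80],
    "P02": [0.60, 1.21, 2.10],
    "P03": [0.80, 1.58, 1.60],
}
new_trial = [0.61, 1.20, 2.08]
match = identify(new_trial, references)
print(match)
```

Cosine similarity ignores vector length, so in practice features with different units are usually standardized first; Euclidean distance, the other distance-based method in the abstract, is sensitive to both direction and magnitude.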

17.

Improving dengue fever predictions in Taiwan based on feature selection and random forests.

Kuo, Chao-Yang; Yang, Wei-Wen; Su, Emily Chia-Yu.

BMC Infect Dis ; 24(Suppl 2): 334, 2024 Mar 20.

Article in English | MEDLINE | ID: mdl-38509486

ABSTRACT

BACKGROUND: Dengue fever is a well-studied vector-borne disease in tropical and subtropical areas of the world. Several methods for predicting the occurrence of dengue fever in Taiwan have been proposed. However, to the best of our knowledge, no study has investigated the relationship between air quality indices (AQIs) and dengue fever in Taiwan. RESULTS: This study aimed to develop a dengue fever prediction model in which meteorological factors, a vector index, and AQIs were incorporated into different machine learning algorithms. A total of 805 meteorological records from 2013 to 2015 were collected from government open-source data after preprocessing. In addition to well-known dengue-related factors, we investigated the effects of novel variables, including particulate matter with an aerodynamic diameter < 10 µm (PM10), PM2.5, and an ultraviolet index, for predicting dengue fever occurrence. The collected dataset was randomly divided into an 80% training set and a 20% test set. The experimental results showed that the random forests achieved an area under the receiver operating characteristic curve of 0.9547 for the test set, which was the best compared with the other machine learning algorithms. In addition, the temperature was the most important factor in our variable importance analysis, and it showed a positive effect on dengue fever at < 30 °C but had less of an effect at > 30 °C. The AQIs were not as important as temperature, but one was selected in the process of filtering the variables and showed a certain influence on the final results. CONCLUSIONS: Our study is the first to demonstrate that AQI negatively affects dengue fever occurrence in Taiwan. The proposed prediction model can be used as an early warning system for public health to prevent dengue fever outbreaks.

Subjects

Dengue, Random Forest Algorithm, Humans, Dengue/epidemiology, Taiwan/epidemiology, Temperature, Disease Outbreaks

18.

Deciphering the molecular pathways underlying dopaminergic neuronal damage in Parkinson's disease associated with SARS-CoV-2 infection.

Xu, Qiuhan; Jiang, Sisi; Kang, Ruiqing; Wang, Yiling; Zhang, Baorong; Tian, Jun.

Comput Biol Med ; 171: 108200, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38428099

ABSTRACT

BACKGROUND: The COVID-19 pandemic caused by SARS-CoV-2 has led to significant global morbidity and mortality, with potential neurological consequences, such as Parkinson's disease (PD). However, the underlying mechanisms remain elusive. METHODS: To address this critical question, we conducted an in-depth transcriptome analysis of dopaminergic (DA) neurons in both COVID-19 and PD patients. We identified common pathways and differentially expressed genes (DEGs), performed enrichment analysis, constructed protein-protein interaction networks and gene regulatory networks, and employed machine learning methods to develop disease diagnosis and progression prediction models. To further substantiate our findings, we performed validation of hub genes using a single-cell sequencing dataset encompassing DA neurons from PD patients, as well as transcriptome sequencing of DA neurons from a mouse model of MPTP (1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine)-induced PD. Furthermore, a drug-protein interaction network was also created. RESULTS: We gained detailed insights into biological functions and signaling pathways, including ion transport and synaptic signaling pathways. CD38 was identified as a potential key biomarker. Disease diagnosis and progression prediction models were specifically tailored for PD. Molecular docking simulations and molecular dynamics simulations were employed to predict potential therapeutic drugs, revealing that genistein holds significant promise for exerting dual therapeutic effects on both PD and COVID-19. CONCLUSIONS: Our study provides innovative strategies for advancing PD-related research and treatment in the context of the ongoing COVID-19 pandemic by elucidating the common pathogenesis between COVID-19 and PD in DA neurons.

Subjects

COVID-19, Parkinson Disease, Animals, Mice, Humans, Parkinson Disease/genetics, Parkinson Disease/metabolism, 1-Methyl-4-phenyl-1,2,3,6-tetrahydropyridine/pharmacology, 1-Methyl-4-phenyl-1,2,3,6-tetrahydropyridine/therapeutic use, Molecular Docking Simulation, Pandemics, SARS-CoV-2, Animal Disease Models

19.

Application of machine learning algorithms to predict dead on arrival of broiler chickens raised without antibiotic program.

Pirompud, Pranee; Sivapirunthep, Panneepa; Punyapornwithaya, Veerasak; Chaosap, Chanporn.

Poult Sci ; 103(4): 103504, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38335671

ABSTRACT

Understanding the factors of dead-on-arrival (DOA) incidents during pre-slaughter handling is crucial for informed decision-making, improving broiler welfare, and optimizing farm profitability. In this study, 3 different machine learning (ML) algorithms - least absolute shrinkage and selection operator (LASSO), classification tree (CT), and random forest (RF) - were used together with 4 sampling techniques to optimize imbalanced data. The dataset comes from 22,115 broiler truckloads from a large producer in Thailand (2021-2022) and includes 14 independent variables covering the rearing, catching, and transportation stages. The study focuses on DOA% in the range of 0.10 to 1.20%, with a threshold for high DOA% above 0.3%, and records DOA% per truckload during pre-slaughter ante-mortem inspection. With a high DOA rate of 25.2%, the imbalanced dataset prompts the implementation of 4 methods to tune the imbalance parameters: random over sampling (ROS), random under sampling (RUS), both sampling (BOTH), and synthetic sampling or random over sampling example (ROSE). The aim is to improve the performance of the prediction model in classifying and predicting high DOA%. The comparative analysis of the different error metrics shows that RF outperforms the other models in a balanced dataset. In particular, RUS shows a significant improvement in prediction performance across all models compared to the original unbalanced dataset. The identification of the 4 most important variables for predicting high DOA percentages - mortality and culling rate, rearing stocking density, season, and mean body weight - emphasizes their importance for broiler production. This study provides valuable insights into the prediction of DOA status using an ML approach and contributes to the development of more effective strategies to mitigate high DOA percentages in commercial broiler production.

Subjects

Abattoirs, Chickens, Animals, Algorithms, Machine Learning, Anti-Bacterial Agents
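
Two of the resampling strategies named in the abstract above, random under sampling (RUS) and random over sampling (ROS), can be sketched in plain Python. The function names, the `label_of` callback, and the fixed seed are hypothetical choices for this illustration; the study's own implementation is not described in the abstract.

```python
import random


def _group_by_class(rows, label_of):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    return by_class


def random_under_sample(rows, label_of, seed=0):
    """Shrink every class to the size of the smallest class (RUS)."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, label_of)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))  # draw without replacement
    rng.shuffle(balanced)
    return balanced


def random_over_sample(rows, label_of, seed=0):
    """Grow every class to the size of the largest class (ROS)."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, label_of)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Duplicate randomly chosen rows (with replacement) to fill the gap.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced
```

With roughly one quarter of truckloads in the high-DOA class, as in the abstract, RUS would discard about two thirds of the low-DOA rows, whereas ROS would triple the high-DOA rows. Resampling is normally applied to the training data only, so that the test set keeps the original class balance.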

20.

Exploring the predictors affecting the sense of community of Korean high school students: application of random forests and SHAP.

Jang, Eunah; Chung, Hyewon.

Front Psychol ; 15: 1337512, 2024.

Article in English | MEDLINE | ID: mdl-38379618

ABSTRACT

Adolescence is a stage during which individuals develop social adaptability through meaningful interactions with others. During this period, students gradually expand their social networks outside the home, forming a sense of community. The aim of the current study was to explore the key predictors related to sense of community among Korean high school students and to develop supportive policies that enhance their sense of community. Accordingly, random forests and SHapley Additive exPlanations (SHAP) were applied to the 7th wave (11th graders) of the Korean Education Longitudinal Study 2013 data (n = 6,077). As a result, 6 predictors positively associated with sense of community were identified, including self-related variables, "multicultural acceptance," "behavioral regulation strategy," and "peer attachment," consistent with previous findings. Newly derived variables that predict sense of community include "positive recognition of volunteering," "creativity," "observance of rules" and "class attitude," which are also positively related to sense of community. The implications of these results and some suggestions for future research are also discussed.
