ABSTRACT
In this study, we investigate the adaptability of artificial agents that use Markov decision processes (MDPs) with successor feature (SF) and predecessor feature (PF) learning algorithms in a noisy T-maze. Our focus is on quantifying how varying the hyperparameters, specifically the reward learning rate (αr) and the eligibility trace decay rate (λ), can enhance their adaptability. Adaptation is evaluated by analyzing the metrics of cumulative reward, step length, adaptation rate, and adaptation step length, and the relationships between them, using Spearman's correlation tests and linear regression. Our findings reveal that an αr of 0.9 consistently yields superior adaptation across all metrics at a noise level of 0.05, whereas the optimal setting for λ varies by metric and context. In discussing these results, we emphasize the critical role of hyperparameter optimization in refining the performance and transfer learning efficacy of learning algorithms. This research advances our understanding of the functionality of PF and SF algorithms, particularly in navigating the inherent uncertainty of transfer learning tasks. By offering insights into optimal hyperparameter configurations, this study contributes to the development of more adaptive and robust learning algorithms, paving the way for future explorations in artificial intelligence and neuroscience.
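For concreteness, a minimal sketch of an SF temporal-difference update with an eligibility trace, showing where the two studied hyperparameters sit (alpha_r is the reward learning rate, lam the trace decay rate); the feature dimensionality, the other constants, and the update form are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

n = 8                                  # feature dimensionality (assumed)
psi = np.zeros((n, n))                 # successor features: SF(s) ~ psi @ phi(s)
w = np.zeros(n)                        # reward weights: r(s) ~ w @ phi(s)
trace = np.zeros(n)                    # eligibility trace over features
gamma, alpha_psi, alpha_r, lam = 0.95, 0.10, 0.9, 0.5

def one_hot(i):
    v = np.zeros(n)
    v[i] = 1.0
    return v

# One synthetic transition (s -> s') with reward; a real agent would
# generate these by acting in the noisy T-maze.
phi_s, phi_next, reward = one_hot(2), one_hot(3), 1.0

trace = gamma * lam * trace + phi_s                      # decay, then accumulate
td_err = phi_s + gamma * (psi @ phi_next) - psi @ phi_s  # vector TD error
psi += alpha_psi * np.outer(td_err, trace)               # SF update along trace
w += alpha_r * (reward - w @ phi_s) * phi_s              # reward-weight update, rate alpha_r
```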
Subject(s)
Algorithms, Spatial Learning, Spatial Learning/physiology, Artificial Intelligence, Markov Chains, Maze Learning/physiology, Humans, Reward
ABSTRACT
Introduction: As the global prevalence of obesity continues to rise, it has become a major public health concern requiring more accurate prediction methods. Traditional regression models often fail to capture the complex interactions among the genetic, environmental, and behavioral factors contributing to obesity. Methods: This study explores the potential of machine-learning techniques to improve obesity risk prediction. Various supervised learning algorithms, including a novel ANN-PSO hybrid model (an artificial neural network tuned by particle swarm optimization), were applied following comprehensive data preprocessing and evaluation. Results: The proposed ANN-PSO model achieved an accuracy of 92%, outperforming traditional regression methods. SHAP (SHapley Additive exPlanations) was employed to analyze feature importance, offering deeper insight into the influence of various factors on obesity risk. Discussion: The findings highlight the transformative role of advanced machine-learning models in public health research, offering a pathway for personalized healthcare interventions. By providing detailed obesity risk profiles, these models enable healthcare providers to tailor prevention and treatment strategies to individual needs. The results underscore the need to integrate innovative machine-learning approaches into global public health efforts to combat the growing obesity epidemic.
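A minimal particle swarm optimization loop of the kind an ANN-PSO hybrid could use to tune network weights or hyperparameters is sketched below; the objective is a toy stand-in, and all constants are assumptions rather than the study's settings:

```python
import numpy as np

def pso(objective, dim, n_particles=30, iters=100, bounds=(-5, 5),
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over `dim` dimensions with a basic PSO."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros_like(x)                          # particle velocities
    pbest = x.copy()                              # personal bests
    pbest_f = np.array([objective(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()            # global best
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([objective(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()

# Toy objective; an ANN-PSO hybrid would instead score network parameters
# by validation loss of the obesity classifier.
best, val = pso(lambda p: np.sum(p ** 2), dim=4)
```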
ABSTRACT
Lung cancer (LC) is a life-threatening disease worldwide, but earlier diagnosis and treatment can save lives. Early detection of malignant cells in the lungs, the organs responsible for oxygenating the body and expelling carbon dioxide, is therefore critical. Although computed tomography (CT) is the best imaging approach in the healthcare sector, it is challenging for physicians to identify and interpret tumours on CT scans. Artificial intelligence (AI)-assisted LC diagnosis on CT scans can help radiologists diagnose earlier, enhance performance, and decrease false negatives. Deep learning (DL) for detecting lymph node involvement on histopathological slides has also become popular because of its significance for patient diagnosis and treatment. This study introduces a computer-aided diagnosis for LC utilizing the Waterwheel Plant Algorithm with DL (CADLC-WWPADL). The primary aim of the CADLC-WWPADL approach is to identify and classify LC on CT scans. The CADLC-WWPADL method uses a lightweight MobileNet model for feature extraction and employs the WWPA for hyperparameter tuning. Furthermore, a symmetrical autoencoder (SAE) model is utilized for classification. An experimental evaluation demonstrates the detection performance of the CADLC-WWPADL technique. An extensive comparative study reports that the CADLC-WWPADL technique outperforms other models, with a maximum accuracy of 99.05% on a benchmark CT image dataset.
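Of the pipeline's three stages, only the feature-extraction step maps onto public tooling; a sketch using a pretrained Keras MobileNet backbone follows, with the WWPA tuner and SAE classifier omitted because they are not public:

```python
import numpy as np
import tensorflow as tf

# Lightweight MobileNet backbone with global average pooling, used here
# purely as a fixed feature extractor for CT slices.
backbone = tf.keras.applications.MobileNet(
    weights="imagenet", include_top=False, pooling="avg")

def extract_features(ct_slices):
    """ct_slices: float array (N, 224, 224, 3) with values in [0, 255]."""
    x = tf.keras.applications.mobilenet.preprocess_input(ct_slices)
    return backbone.predict(x, verbose=0)  # (N, 1024) feature vectors

# Random stand-in images; real input would be preprocessed CT slices.
features = extract_features(np.random.rand(2, 224, 224, 3) * 255.0)
```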
Subject(s)
Algorithms, Deep Learning, Computer-Assisted Diagnosis, Lung Neoplasms, X-Ray Computed Tomography, Humans, Lung Neoplasms/diagnostic imaging, Lung Neoplasms/diagnosis, Lung Neoplasms/pathology, X-Ray Computed Tomography/methods, Computer-Assisted Diagnosis/methods
ABSTRACT
Solar photovoltaic (PV) systems, integral to sustainable energy, are difficult to forecast because of the unpredictable environmental factors influencing energy output. This study builds and compares five distinct machine learning (ML) models that predict energy production from four independent weather variables: wind speed, relative humidity, ambient temperature, and solar irradiation. The evaluated models include multiple linear regression (MLR), decision tree regression (DTR), random forest regression (RFR), support vector regression (SVR), and multi-layer perceptron (MLP). The models' hyperparameters were tuned with the chimp optimization algorithm (ChOA), and the models were validated on data from a 264 kWp PV system installed at the Applied Science University (ASU) in Amman, Jordan. Of the five models, MLP performed best, with a root mean square error (RMSE) of 0.503, a mean absolute error (MAE) of 0.397, and a coefficient of determination (R²) of 0.99 in predicting energy from the observed environmental parameters. Finally, the results highlight that fine-tuning ML models for improved prediction accuracy in the energy production domain still benefits from advanced optimization techniques such as ChOA compared with other widely used optimization algorithms from the literature.
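The three reported scores are standard regression metrics; a small sketch of how they are computed with scikit-learn, on stand-in values rather than the ASU plant data:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Stand-ins for measured vs. MLP-predicted PV energy output.
y_true = np.array([3.1, 4.0, 2.5, 5.2])
y_pred = np.array([3.0, 4.2, 2.4, 5.0])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # root mean square error
mae = mean_absolute_error(y_true, y_pred)           # mean absolute error
r2 = r2_score(y_true, y_pred)                       # coefficient of determination
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```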
ABSTRACT
Laryngeal cancer (LC) represents a substantial world health problem, with diminished survival rates attributed to late-stage diagnoses. Correct treatment for LC is complex, particularly in the final stages, as this cancer is a complex malignancy of the head and neck region. Recently, researchers have developed different analysis methods and tools to help medical consultants recognize LC efficiently. However, these existing tools and techniques suffer from performance constraints, such as low accuracy in detecting LC at early stages, additional computational complexity, and long patient screening times. Deep learning (DL) approaches have been shown to be effective in recognizing LC. Therefore, this study develops an efficient LC detection technique using chaotic metaheuristics integrated with DL (LCD-CMDL). The LCD-CMDL technique focuses on detecting and classifying LC in throat region images. In the LCD-CMDL technique, contrast enhancement uses contrast-limited adaptive histogram equalization (CLAHE). For feature extraction, the LCD-CMDL technique applies a Squeeze-and-Excitation ResNet (SE-ResNet) model to learn complex, intrinsic features from the preprocessed images. The hyperparameters of the SE-ResNet model are tuned using a chaotic adaptive sparrow search algorithm (CSSA). Finally, an extreme learning machine (ELM) model is applied to detect and classify LC. The LCD-CMDL approach was evaluated on a benchmark throat region image database, and the experimental results show its superior performance over recent state-of-the-art approaches.
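The CLAHE preprocessing step can be sketched directly with OpenCV; the clip limit and tile size below are illustrative defaults, not the paper's settings:

```python
import cv2
import numpy as np

# Contrast-limited adaptive histogram equalization (CLAHE): local
# histogram equalization per tile, with a clip limit to bound noise
# amplification. Parameters here are common defaults, not tuned values.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))

def enhance(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    return clahe.apply(gray)  # contrast-enhanced grayscale image

# Random stand-in for a throat-region image.
enhanced = enhance((np.random.rand(256, 256, 3) * 255).astype(np.uint8))
```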
ABSTRACT
Various studies have emphasized the importance of identifying the optimal Trigger Timing (TT) for the trigger shot in In Vitro Fertilization (IVF), which is crucial for the successful maturation and release of oocytes, especially in minimal ovarian stimulation treatments. Despite its significance for the ultimate success of IVF, determining the precise TT remains a complex challenge for physicians because multiple variables are involved. This study aims to improve TT by developing a machine learning multi-output model that predicts the expected number of retrieved oocytes, mature oocytes (MII), fertilized oocytes (2PN), and usable blastocysts within a 48-h window after the trigger shot in minimal stimulation cycles. By utilizing this model, physicians can identify patients with possibly early, late, or on-time trigger shots. The study found that approximately 27% of treatments administered the trigger shot on a suboptimal day, but optimizing the TT using the developed Artificial Intelligence (AI) model can potentially increase usable blastocyst production by 46%. These findings highlight the potential of predictive models as a supplementary tool for optimizing trigger shot timing and improving IVF outcomes, particularly in minimal ovarian stimulation. The experimental results underwent statistical validation, demonstrating the accuracy and performance of the model. Overall, this study emphasizes the value of AI prediction models in enhancing TT and making the IVF process safer and more efficient.
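A generic multi-output regressor of the kind described, predicting the four outcome counts jointly, might look like the sketch below; the features, data, and base learner are synthetic stand-ins, since the study's model and variables are not public:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

# Synthetic stand-ins: 6 cycle-level features (e.g. hormone levels,
# follicle counts; assumed) and 4 targets (retrieved oocytes, MII,
# 2PN, usable blastocysts).
X = np.random.rand(200, 6)
Y = np.random.rand(200, 4)

# One independent regressor per output, wrapped as a single model.
model = MultiOutputRegressor(GradientBoostingRegressor(random_state=0))
model.fit(X, Y)
pred = model.predict(X[:5])  # (5, 4): one row of four predictions per patient
```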
Subject(s)
In Vitro Fertilization, Machine Learning, Ovulation Induction, Humans, Female, Ovulation Induction/methods, In Vitro Fertilization/methods, Adult
ABSTRACT
BACKGROUND: Optimizing in vitro shoot proliferation is a complicated task, as it is influenced by interactions of many factors as well as genotype. This study investigated the role of various concentrations of plant growth regulators (zeatin and gibberellic acid) in the successful in vitro shoot proliferation of three Punica granatum cultivars ('Faroogh', 'Atabaki' and 'Shirineshahvar'). In addition, five Machine Learning (ML) algorithms were evaluated as modeling tools for in vitro multiplication of pomegranate: Support Vector Regression (SVR), Random Forest (RF), Extreme Gradient Boosting (XGB), Ensemble Stacking Regression (ESR) and Elastic Net Multivariate Linear Regression (ENMLR). A new automatic hyperparameter optimization method named Adaptive Tree Parzen Estimator (ATPE) was developed to tune the hyperparameters. The performance of the models was evaluated and compared using statistical indicators (MAE, RMSE, RRMSE, MAPE, R and R²), and a Global Performance Indicator (GPI) was introduced to rank the models on a single parameter. Moreover, the Non-dominated Sorting Genetic Algorithm II (NSGA-II) was employed to optimize the selected prediction model. RESULTS: The results demonstrated that the ESR algorithm exhibited higher predictive accuracy than the other ML algorithms. The ESR model was subsequently optimized by NSGA-II. ESR-NSGA-II revealed that the highest proliferation rate (3.47, 3.84, and 3.22), shoot length (2.74, 3.32, and 1.86 cm), leaf number (18.18, 19.76, and 18.77), and explant survival (84.21%, 85.49%, and 56.39%) could be achieved with a medium containing 0.750, 0.654, and 0.705 mg/L zeatin and 0.50, 0.329, and 0.347 mg/L gibberellic acid in the 'Atabaki', 'Faroogh', and 'Shirineshahvar' cultivars, respectively. CONCLUSIONS: This study demonstrates that the 'Shirineshahvar' cultivar exhibited lower shoot proliferation success than the other cultivars. The results indicated the good performance of ESR-NSGA-II in modeling and optimizing in vitro propagation. ESR-NSGA-II can be applied as an up-to-date and reliable computational tool for future studies in plant in vitro culture.
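The NSGA-II stage can be illustrated with the pymoo library; because the trained ESR surrogate is not public, a standard benchmark problem stands in for the fitted model below:

```python
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.optimize import minimize
from pymoo.problems import get_problem

# ZDT1 is a placeholder objective; the study would instead optimize the
# fitted ESR model over zeatin and gibberellic acid concentrations.
problem = get_problem("zdt1")
algorithm = NSGA2(pop_size=50)

res = minimize(problem, algorithm, ("n_gen", 100), seed=1, verbose=False)
print(res.X[:3])  # sample decision vectors on the Pareto front
print(res.F[:3])  # their objective values
```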
ABSTRACT
Artificial intelligence is steadily permeating various sectors, including healthcare. This research addresses lung cancer, the deadliest cancer worldwide. Two primary factors contribute to its onset: genetic predisposition and environmental factors such as smoking and exposure to pollutants. Recognizing the need for more effective diagnostic techniques, we devised a machine learning strategy tailored to boost precision in lung cancer detection, aiming for a diagnostic method that is both less invasive and cost-effective. To this end, we proposed four methods, benchmarking them against prevalent techniques using a widely used dataset from Kaggle. Among our methods, one emerged as particularly promising, outperforming the competition in accuracy, precision and sensitivity. This method used hyperparameter tuning focused on the gamma and C parameters, both set to 10; these parameters control the kernel width and regularization strength, respectively. As a result, we achieved an accuracy of 99.16%, a precision of 98% and a sensitivity of 100%. In conclusion, our enhanced prediction mechanism surpasses traditional and contemporary strategies in lung cancer detection.
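Gamma and C are the classic hyperparameters of an RBF-kernel support vector machine, so the sketch below assumes an SVC with the reported values; a bundled scikit-learn dataset replaces the Kaggle data so the example runs self-contained, and its printed score will not match the paper's:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in dataset; the study used a Kaggle lung-cancer dataset.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# C controls regularization strength; gamma the RBF kernel width.
# C=10, gamma=10 are the paper's reported tuned values; whether they
# generalize depends entirely on the feature scale of the data at hand.
clf = make_pipeline(StandardScaler(), SVC(C=10, gamma=10))
clf.fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.3f}")
```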
ABSTRACT
Pyrolysis, a thermochemical approach for converting plastic waste to energy, has tremendous potential to manage the exponentially increasing volume of plastic waste. However, understanding the process kinetics is fundamental to engineering a sustainable process, and conventional analysis techniques provide no insight into how feedstock characteristics influence those kinetics. The present study demonstrates the efficacy of machine learning for predictive modeling of waste plastic pyrolysis, capturing the complex interrelations of the predictor variables and their influence on activation energy. The activation energy for pyrolysis of waste plastics was evaluated using four machine learning models: Random Forest, XGBoost, CatBoost, and AdaBoost regression. Feature selection based on the multicollinearity of the data and hyperparameter tuning of the models with RandomizedSearchCV were conducted. The Random Forest model outperformed the other models, with a coefficient of determination (R²) of 0.941, a root mean square error (RMSE) of 14.69, and a mean absolute error (MAE) of 8.66 on the testing dataset. The explainable artificial intelligence-based feature importance plot and the summary plot of the Shapley additive explanations identified fixed carbon content, carbon content, ash content, and degree of conversion as the significant parameters of the model, in the order fixed carbon > carbon > ash content > degree of conversion. This study highlights the potential of machine learning as a powerful tool to understand how the characteristics of plastic waste and the degree of conversion influence the activation energy, which is essential for designing large-scale operations and future scale-up of the process.
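A sketch of RandomizedSearchCV tuning a random forest in the way the study describes; the features and activation-energy targets below are synthetic stand-ins for the proximate-analysis data:

```python
import numpy as np
from scipy.stats import randint
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-ins: 5 descriptors (e.g. fixed carbon, ash, carbon,
# degree of conversion; assumed) and an activation-energy target.
X = np.random.rand(150, 5)
y = np.random.rand(150) * 200

# Random search samples hyperparameter combinations from distributions
# instead of enumerating a full grid.
param_dist = {
    "n_estimators": randint(100, 500),
    "max_depth": randint(3, 15),
    "min_samples_leaf": randint(1, 10),
}
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0), param_dist, n_iter=20, cv=5,
    scoring="neg_root_mean_squared_error", random_state=0)
search.fit(X, y)
print(search.best_params_)
```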
Subject(s)
Artificial Intelligence, Plastics, Pyrolysis, Plastics/chemistry, Machine Learning, Theoretical Models
ABSTRACT
This paper presents a comprehensive exploration of machine learning algorithms (MLAs) and feature selection techniques for accurate heart disease prediction (HDP) in modern healthcare. By focusing on diverse datasets encompassing various challenges, the research sheds light on optimal strategies for early detection. MLAs such as Decision Trees (DT), Random Forests (RF), Support Vector Machines (SVM), Gaussian Naive Bayes (NB), and others were studied, with precision and recall metrics emphasized for robust predictions. Our study addresses challenges in real-world data through data cleaning and one-hot encoding, enhancing the integrity of our predictive models. Feature selection techniques, namely Recursive Feature Elimination (RFE), Principal Component Analysis (PCA), and univariate feature selection, play a crucial role in identifying relevant features and reducing data dimensionality, and our findings showcase their impact on prediction accuracy. Optimized models for each dataset were obtained through grid-search hyperparameter tuning, with the configurations outlined in detail. Notably, 99.12% accuracy was achieved on the first Kaggle dataset, showcasing the potential for accurate HDP. Model robustness across diverse datasets was highlighted, with caution against overfitting. The study emphasizes the need for validation on unseen data and encourages ongoing research into generalizability. Serving as a practical guide, this research aids researchers and practitioners in HDP model development, informing clinical decisions and healthcare resource allocation. By providing insights into effective algorithms and techniques, the paper contributes to reducing heart disease-related morbidity and mortality, supporting the healthcare community's ongoing efforts.
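The RFE-plus-grid-search pattern the paper applies per dataset can be sketched as a single scikit-learn pipeline; a bundled dataset stands in for the heart-disease data:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset; the study used heart-disease datasets.
X, y = load_breast_cancer(return_X_y=True)

# Scale, select features by recursive elimination, then classify.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(LogisticRegression(max_iter=5000))),
    ("clf", LogisticRegression(max_iter=5000)),
])

# Grid search jointly over the number of kept features and the
# classifier's regularization strength.
grid = GridSearchCV(pipe, {"rfe__n_features_to_select": [5, 10, 15],
                           "clf__C": [0.1, 1, 10]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```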
Subject(s)
Heart Diseases, Machine Learning, Precision Medicine, Humans, Precision Medicine/methods, Algorithms, Support Vector Machine
ABSTRACT
Identifying key statements in large volumes of short, user-generated texts is essential for decision-makers to quickly grasp their key content. To address this need, this research introduces a novel unsupervised abstractive key point generation (KPG) approach applicable to unlabeled text corpora, a capability not yet seen in existing abstractive KPG methods. The proposed method uniquely combines topic modeling for unsupervised data space segmentation with abstractive summarization techniques to efficiently generate semantically representative key points from text collections. This is further enhanced by hyperparameter tuning that optimizes both the topic modeling and the abstractive summarization. Tuning the topic modeling aims at making the cluster assignment more deterministic, since the probabilistic nature of the process would otherwise lead to high variability in the output. The abstractive summarization process is optimized using a Davies-Bouldin Index specifically adapted to this use case, so that the generated key points more accurately reflect the characteristic properties of each cluster. In addition, our research recommends an automated evaluation that provides a quantitative complement to the traditional qualitative analysis of KPG. This method regards KPG as a specialized form of multi-document summarization (MDS) and employs both word-based and word-embedding-based metrics for evaluation. These criteria allow for a comprehensive and nuanced analysis of the KPG output. Demonstrated through application to a political debate on Twitter, the approach's versatility extends to various domains, such as product review analysis and survey evaluation. This research not only paves the way for innovative development in abstractive KPG methods but also sets a benchmark for their evaluation.
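The standard Davies-Bouldin Index the authors adapt is available directly in scikit-learn (lower is better: tighter, better-separated clusters); a sketch on random stand-in embeddings:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

# Random stand-ins for document embeddings produced by the topic model.
emb = np.random.rand(200, 32)

labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(emb)
# Ratio of within-cluster scatter to between-cluster separation,
# averaged over clusters; the paper adapts this score to steer which
# key points best represent each cluster.
print(f"Davies-Bouldin: {davies_bouldin_score(emb, labels):.3f}")
```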
ABSTRACT
Resource recycling is considered necessary for sustainable development, especially in smart cities, where increased urbanization and the variety of waste generated require automated waste management models. Smart technology offers a possible alternative to traditional waste management techniques, which are proving insufficient to reduce the harmful effects of trash on the environment. This paper proposes an intelligent model to enhance the classification of waste materials. The proposed model leverages the InceptionV3 deep learning architecture, augmented by multi-objective beluga whale optimization (MBWO) for hyperparameter optimization. In MBWO, the sensitivity and specificity evaluation criteria are integrated linearly as the objective function to find the optimal values of the dropout rate, learning rate, and batch size. A benchmark dataset, namely TrashNet, is adopted to verify the proposed model's performance. By strategically integrating MBWO, the model achieves a considerable increase in accuracy and efficiency in identifying waste materials, contributing to more effective and sustainable waste management strategies. The proposed intelligent waste classification model outperformed state-of-the-art models with an accuracy of 97.75%, specificity of 99.55%, F1-score of 97.58%, and sensitivity of 98.88%.
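The backbone-plus-head setup being tuned can be sketched in Keras, with the three MBWO-optimized hyperparameters surfaced as plain variables; the values shown are illustrative placeholders, not the optimizer's output:

```python
import tensorflow as tf

# The three hyperparameters MBWO searches over; values here are assumed.
dropout_rate, learning_rate, batch_size = 0.4, 1e-4, 32

# Pretrained InceptionV3 features with a small classification head.
base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(299, 299, 3))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dropout(dropout_rate),
    tf.keras.layers.Dense(6, activation="softmax"),  # TrashNet's 6 classes
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# batch_size would be passed to model.fit(..., batch_size=batch_size).
```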
Subject(s)
Deep Learning, Waste Management, Animals, Waste Management/methods, Beluga Whale, Recycling
ABSTRACT
Breast cancer has rapidly increased in prevalence in recent years, making it one of the leading causes of mortality worldwide; among all cancers, it is by far the most common. Diagnosing this illness manually requires significant time and expertise, so machine-based prediction can aid early detection and help prevent its further spread. Machine learning and explainable AI are crucial in classification, as they not only provide accurate predictions but also offer insight into how the model arrives at its decisions, aiding the understanding and trustworthiness of the classification results. In this study, we evaluate and compare the classification accuracy, precision, recall, and F1 scores of five supervised machine learning methods on a primary dataset (500 patients from Dhaka Medical College Hospital): decision tree, random forest, logistic regression, naive Bayes, and XGBoost. Additionally, we applied SHAP analysis to the XGBoost model to interpret its predictions and understand the impact of each feature on the model's output. We compared the accuracy with which the algorithms classified the data and contrasted the results with other literature in this field. After the final evaluation, XGBoost achieved the best model accuracy, at 97%.
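The XGBoost-plus-SHAP pattern the study applies can be sketched as follows; a bundled scikit-learn dataset stands in for the Dhaka Medical College data:

```python
import numpy as np
import shap
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

# Stand-in dataset; the study used a primary 500-patient dataset.
data = load_breast_cancer()
model = xgb.XGBClassifier(n_estimators=200, eval_metric="logloss")
model.fit(data.data, data.target)

# TreeExplainer computes exact SHAP values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(data.data)

# Mean |SHAP| per feature gives a global importance ranking.
importance = np.abs(shap_values).mean(axis=0)
top = np.argsort(importance)[::-1][:5]
print([data.feature_names[i] for i in top])
```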
Subject(s)
Breast Neoplasms, Humans, Female, Breast Neoplasms/diagnosis, Bayes Theorem, Bangladesh/epidemiology, Breast, Machine Learning, Hydrolases
ABSTRACT
This paper offers a thorough investigation of hyperparameter tuning for neural network architectures using datasets encompassing various combinations of methylene blue (MB) reduction by ascorbic acid (AA) reactions with different solvents and concentrations. The aim is to predict the coefficients of decay plots for MB absorbance, shedding light on the complex dynamics of the chemical reactions. Our findings reveal that the optimal model consists of five hidden layers, each with sixteen neurons and the Swish activation function. This model yields an NMSE of 0.05, 0.03, and 0.04 for predicting the coefficients A, B, and C, respectively, in the exponential decay equation A + B·e^(-x/C). These findings contribute to machine learning-based drug design, providing valuable insights into optimizing chemical reaction predictions.
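The reported best architecture is easy to state in Keras: five hidden layers of sixteen Swish-activated neurons regressing the three decay coefficients; the input width (number of reaction descriptors) is an assumption:

```python
import tensorflow as tf

# Five hidden layers x sixteen neurons with Swish, three outputs for the
# coefficients A, B, C of A + B*exp(-x/C). The input width of 4 is an
# assumed descriptor count, not stated in the abstract.
model = tf.keras.Sequential(
    [tf.keras.Input(shape=(4,))] +
    [tf.keras.layers.Dense(16, activation="swish") for _ in range(5)] +
    [tf.keras.layers.Dense(3)]  # outputs: A, B, C
)
model.compile(optimizer="adam", loss="mse")
model.summary()
```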
Subject(s)
Ascorbic Acid, Methylene Blue, Drug Design, Machine Learning, Neural Networks (Computer)
ABSTRACT
Hyperparameter tuning plays a pivotal role in the accuracy and reliability of convolutional neural network (CNN) models used in brain tumor diagnosis. These hyperparameters control various aspects of the neural network, encompassing feature extraction, spatial resolution, non-linear mapping, convergence speed, and model complexity. We propose a meticulously refined CNN hyperparameter model designed to optimize critical parameters, including filter number and size, stride, padding, pooling techniques, activation functions, learning rate, batch size, and the number of layers. Our approach leverages two publicly available brain tumor MRI datasets. The first dataset comprises 7,023 human brain images, categorized into four classes: glioma, meningioma, no tumor, and pituitary. The second dataset contains 253 images classified as "yes" and "no." Our approach delivers strong results: an average 94.25% precision, recall, and F1-score with 96% accuracy on dataset 1, and an average 87.5% precision, recall, and F1-score with 88% accuracy on dataset 2. To affirm the robustness of our findings, we perform a comprehensive comparison with existing techniques, showing that our method consistently outperforms them. By systematically fine-tuning these critical hyperparameters, our model not only improves its performance but also bolsters its generalization capabilities. This optimized CNN model provides medical experts with a more precise and efficient tool for supporting their decision-making in brain tumor diagnosis.
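A minimal CNN with the listed hyperparameters surfaced as variables makes the tuning surface concrete; the values shown are placeholders, not the paper's final settings:

```python
import tensorflow as tf

# The hyperparameters the paper tunes, as plain variables (values assumed).
n_filters, kernel_size, stride, pool_size = 32, 3, 1, 2
activation, learning_rate, n_conv_layers = "relu", 1e-3, 3

layers = [tf.keras.Input(shape=(224, 224, 1))]
for i in range(n_conv_layers):
    layers += [
        tf.keras.layers.Conv2D(n_filters * 2 ** i, kernel_size,
                               strides=stride, padding="same",
                               activation=activation),
        tf.keras.layers.MaxPooling2D(pool_size),
    ]
layers += [tf.keras.layers.Flatten(),
           tf.keras.layers.Dense(4, activation="softmax")]  # 4 tumor classes

model = tf.keras.Sequential(layers)
model.compile(tf.keras.optimizers.Adam(learning_rate),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# batch_size would be passed to model.fit(..., batch_size=...).
```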
ABSTRACT
BACKGROUND: Brain tumor is a grave illness causing fatalities worldwide. Current detection methods for brain tumors are manual and invasive and rely on histopathological analysis, and determining the tumor type after detection depends on biopsy measures and involves human subjectivity. Automated CAD techniques for brain tumor detection and classification can overcome these drawbacks. OBJECTIVE: The paper aims to create two deep learning-based CAD frameworks for automatic detection and severity grading of brain tumors: the first for brain tumor detection in brain MR images, and the second for classifying tumors into three types, Glioma, Meningioma, and Pituitary, based on severity grading. METHODS: The novelty of the work lies in the architectural design of the deep learning frameworks for detection and classification of brain tumors from brain MR images. The hyperparameters of the proposed models are tuned to achieve the optimal values that maximize the models' performance and minimize losses. RESULTS: The proposed CNN models outperform existing state-of-the-art models in both accuracy and model complexity. The proposed detection model achieved an accuracy of 98.56%, and the CNN model for severity grading achieved an accuracy of 92.36% on the BraTS dataset. CONCLUSION: The proposed models have an edge over existing CNN models in terms of lower structural complexity and appreciable accuracy with low training and test errors. The proposed CNN models can be employed for clinical diagnostic purposes to aid the medical fraternity in validating initial screening for brain tumor detection and its multi-classification.
Subject(s)
Brain Neoplasms, Deep Learning, Magnetic Resonance Imaging, Neoplasm Grading, Humans, Brain Neoplasms/diagnostic imaging, Magnetic Resonance Imaging/methods, Glioma/diagnostic imaging, Neural Networks (Computer), Meningioma/diagnostic imaging, Computer-Assisted Image Interpretation/methods
ABSTRACT
Biochar is a carbon-neutral tool for combating climate change, and artificial intelligence applications that estimate its mitigation effect on greenhouse gases (GHGs) can help scientists make more informed decisions. However, there is also evidence that biochar promotes, rather than reduces, N2O emissions. Thus, the effect of biochar on N2O in constructed wetlands (CWs) remains uncertain, and no characterization metric exists for this effect, which increases the difficulty and inaccuracy of projecting biochar-driven alleviation. Here, we provide new insight by utilizing machine learning with tree-structured Parzen estimator (TPE) optimization, assisted by a meta-analysis, to estimate the potency of biochar-driven N2O mitigation. We first synthesized datasets covering 80 studies on biochar-amended CWs worldwide. The mitigation effect size was then calculated and introduced as a new metric. TPE optimization was applied to automatically tune the hyperparameters of extreme gradient boosting (XGBoost) and random forest (RF) models, and the optimal TPE-XGBoost model achieved satisfactory prediction accuracy for both N2O flux (R² = 91.90%, RPD = 3.57) and the effect size (R² = 92.61%, RPD = 3.59). Results indicated that a high influent chemical oxygen demand/total nitrogen (COD/TN) ratio and the COD removal efficiency, interpreted via Shapley values, significantly enhanced the effect-size contribution. Among the 22 input variables, the COD/TN ratio made the greatest positive contribution to N2O flux and the second greatest to the effect size, up to 18% and 14%, respectively. Combined with a structural equation model analysis, the NH4+-N removal rate had a significant negative direct effect on N2O flux. This study implies that applying granulated biochar derived from C-rich feedstocks would maximize the net climate benefit of biochar-driven N2O mitigation in future biochar-based CWs.
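A sketch of TPE-driven XGBoost tuning in the spirit of TPE-XGBoost, using the hyperopt library; the meta-analysis dataset is not public, so synthetic data stand in for the 22 predictors and the effect-size target:

```python
import numpy as np
import xgboost as xgb
from hyperopt import fmin, hp, tpe
from sklearn.model_selection import cross_val_score

# Synthetic stand-ins for 22 predictors and the N2O effect-size target.
X, y = np.random.rand(120, 22), np.random.rand(120)

# A small, assumed search space; the study's space is not stated.
space = {
    "max_depth": hp.choice("max_depth", [3, 4, 5, 6]),
    "learning_rate": hp.loguniform("learning_rate", np.log(0.01), np.log(0.3)),
    "n_estimators": hp.choice("n_estimators", [100, 200, 400]),
}

def loss(params):
    # Minimize negative cross-validated R^2, so fmin maximizes accuracy.
    model = xgb.XGBRegressor(**params)
    return -cross_val_score(model, X, y, cv=5, scoring="r2").mean()

best = fmin(loss, space, algo=tpe.suggest, max_evals=30)
print(best)  # best trial's hyperparameter selections
```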
Subject(s)
Artificial Intelligence, Wetlands, Nitrous Oxide/analysis, Charcoal, Nitrogen/analysis, Machine Learning, Soil/chemistry
ABSTRACT
Wastewater pollution caused by organic dyes is a growing concern due to its negative impact on human health and aquatic life. To tackle this issue, advanced wastewater treatment with nano-photocatalysts has emerged as a promising solution. However, experimental procedures for identifying the optimal conditions for dye degradation can be time-consuming and expensive. To overcome this, machine learning methods have been employed to predict the degradation of organic dyes more efficiently by recognizing patterns in the process. The objective of this study is to develop a machine learning model that predicts the degradation of organic dyes and identifies the main variables affecting the photocatalytic degradation capacity and removal of organic dyes from wastewater. Nine machine learning algorithms were tested: multiple linear regression, polynomial regression, decision trees, random forest, adaptive boosting, extreme gradient boosting, k-nearest neighbors, support vector machine, and artificial neural network. The study found that the XGBoost algorithm outperformed the other models, making it ideal for predicting the photocatalytic degradation capacity of BiVO4. The results suggest that XGBoost is a suitable model for predicting the photocatalytic degradation of wastewater using BiVO4 with different dopants.
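The screening step, several regressors scored on a common cross-validation split, can be sketched compactly; the descriptors and targets below are synthetic stand-ins, and only a subset of the nine algorithms is shown:

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from xgboost import XGBRegressor

# Synthetic stand-ins for dopant/condition descriptors and degradation
# capacity; real features would come from the BiVO4 experiments.
X, y = np.random.rand(100, 6), np.random.rand(100)

models = {
    "MLR": LinearRegression(),
    "RF": RandomForestRegressor(random_state=0),
    "AdaBoost": AdaBoostRegressor(random_state=0),
    "kNN": KNeighborsRegressor(),
    "SVR": SVR(),
    "XGBoost": XGBRegressor(),
}
# Score every candidate on the same CV split, then pick the best.
for name, m in models.items():
    r2 = cross_val_score(m, X, y, cv=5, scoring="r2").mean()
    print(f"{name:8s} R2 = {r2:.3f}")
```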
Subject(s)
Nanoparticles, Wastewater, Humans, Algorithms, Coloring Agents, Machine Learning
ABSTRACT
Tuning hyperparameters, such as the regularization parameter in Ridge or Lasso regression, is often aimed at improving the predictive performance of risk prediction models. In this study, various hyperparameter tuning procedures for clinical prediction models were systematically compared and evaluated in low-dimensional data. The focus was on out-of-sample predictive performance (discrimination, calibration, and overall prediction error) of risk prediction models developed using Ridge, Lasso, Elastic Net, or Random Forest. The influence of sample size, number of predictors and events fraction on performance of the hyperparameter tuning procedures was studied using extensive simulations. The results indicate important differences between tuning procedures in calibration performance, while generally showing similar discriminative performance. The one-standard-error rule for tuning applied to cross-validation (1SE CV) often resulted in severe miscalibration. Standard non-repeated and repeated cross-validation (both 5-fold and 10-fold) performed similarly well and outperformed the other tuning procedures. Bootstrap showed a slight tendency to more severe miscalibration than standard cross-validation-based tuning procedures. Differences between tuning procedures were larger for smaller sample sizes, lower events fractions and fewer predictors. These results imply that the choice of tuning procedure can have a profound influence on the predictive performance of prediction models. The results support the application of standard 5-fold or 10-fold cross-validation that minimizes out-of-sample prediction error. Despite an increased computational burden, we found no clear benefit of repeated over non-repeated cross-validation for hyperparameter tuning. We warn against the potentially detrimental effects on model calibration of the popular 1SE CV rule for tuning prediction models in low-dimensional settings.
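The recommended procedure, standard (optionally repeated) cross-validation that minimizes out-of-sample prediction error, can be sketched with scikit-learn; the data and model below are illustrative stand-ins for a clinical risk model:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold

# Synthetic low-dimensional data standing in for a clinical dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Repeated 10-fold CV; the study found non-repeated 5- or 10-fold CV
# performs similarly well.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=5, random_state=0)

grid = GridSearchCV(
    LogisticRegression(penalty="l2", solver="lbfgs", max_iter=5000),
    {"C": np.logspace(-3, 3, 13)},   # ridge-type penalty strengths
    cv=cv, scoring="neg_log_loss")   # minimize out-of-sample prediction error
grid.fit(X, y)
print(grid.best_params_)  # C minimizing CV log loss, not the 1SE-rule choice
```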
Subject(s)
Research Design, Humans, Computer Simulation, Sample Size
ABSTRACT
Ovarian cancer, a deadly disease of the female reproductive system, is a significant challenge in medical research due to its notorious lethality, and addressing it in the current medical landscape has become more complex than ever. This research explores the complex field of ovarian cancer subtype classification and the crucial task of outlier detection through a progressive automated system. The study uses a unique dataset carefully assembled from 20 esteemed medical institutes, including tissue microarray (TMA) images at 40× magnification and whole-slide images (WSI) at 20× magnification. Beyond classifying subtypes of ovarian cancer, the research is fully committed to identifying abnormalities within this complex environment. We propose Attention Embedder, a new model with strong results in ovarian cancer subtype classification and outlier detection. On WSI images, the model achieved 96.42% training accuracy and 95.10% validation accuracy; on TMA images, it likewise performed well, with 93.45% training accuracy and 94.90% validation accuracy. Our fine-tuned hyperparameter testing also yielded strong performance on independent images: 93.56% accuracy at 20× magnification and 91.37% at 40× magnification. This study highlights how machine learning can revolutionize the medical field's ability to classify ovarian cancer subtypes and identify outliers, giving doctors a valuable tool to lessen the severe effects of the disease. Adopting this novel method is likely to improve the practice of medicine and give people living with ovarian cancer worldwide hope.