Results 1 - 20 of 102
1.
Front Big Data ; 7: 1393758, 2024.
Article in English | MEDLINE | ID: mdl-39364222

ABSTRACT

Detecting lung diseases in medical images can be quite challenging for radiologists. In some cases, even experienced experts may struggle to diagnose chest diseases accurately, leading to potential inaccuracies due to complex or unseen biomarkers. This review paper examines the datasets and machine learning techniques employed in recent research on lung disease classification, focusing on pneumonia analysis using chest X-ray images. We explore conventional machine learning methods, pretrained deep learning models, customized convolutional neural networks (CNNs), and ensemble methods. A comprehensive comparison of different classification approaches is presented, encompassing data acquisition, preprocessing, feature extraction, and classification using machine vision, machine and deep learning, and explainable AI (XAI). Our analysis highlights the superior performance of transfer learning-based methods using CNNs and ensemble models/features for lung disease classification. In addition, our review offers insights for researchers in other medical domains who use radiological images. By providing a thorough overview of various techniques, our work enables the establishment of effective strategies and the identification of suitable methods for a wide range of challenges. Currently, beyond traditional evaluation metrics, researchers emphasize the importance of XAI techniques in machine and deep learning models and their applications in classification tasks. Incorporating XAI helps in gaining a deeper understanding of the models' decision-making processes, leading to improved trust, transparency, and overall clinical decision-making. Our review serves as a resource for researchers and practitioners working on lung disease detection with machine learning and XAI, as well as for those in other domains.

2.
Front Plant Sci ; 15: 1373318, 2024.
Article in English | MEDLINE | ID: mdl-39086911

ABSTRACT

Coffee breeding programs have traditionally relied on observing plant characteristics over years, a slow and costly process. Genomic selection (GS) offers a DNA-based alternative for faster selection of superior cultivars. Stacking Ensemble Learning (SEL) combines multiple models for potentially even more accurate selection. This study explores the potential of SEL in coffee breeding, aiming to improve prediction accuracy for important traits [yield (YL), total number of fruits (NF), leaf miner infestation (LM), and cercosporiosis incidence (Cer)] in Coffea arabica. We analyzed data from 195 individuals genotyped for 21,211 single-nucleotide polymorphism (SNP) markers. To comprehensively assess model performance, we employed a cross-validation (CV) scheme. Genomic Best Linear Unbiased Prediction (GBLUP), multivariate adaptive regression splines (MARS), Quantile Random Forest (QRF), and Random Forest (RF) served as base learners. For the meta-learner within the SEL framework, various options were explored, including Ridge Regression, RF, GBLUP, and Simple Average. The SEL method was able to predict important traits in Coffea arabica, and it presented higher predictive ability (PA) than all base learner methods. The gains in PA in relation to GBLUP were 87.44% (the ratio between the PA obtained from the best stacking model and the GBLUP), 37.83%, 199.82%, and 14.59% for YL, NF, LM, and Cer, respectively. Overall, SEL presents a promising approach for GS. By combining predictions from multiple models, SEL can potentially enhance the PA of GS for complex traits.
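
As a rough illustration of the stacking idea in this abstract, the sketch below averages the predictions of two hypothetical base learners (simple averaging is one of the meta-learner options the abstract names) and scores the result. It assumes that predictive ability is the Pearson correlation between predicted and observed values, a common convention in genomic selection that the abstract does not spell out, and that a gain is the relative increase over a baseline. The toy data and all function names are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def simple_average_stack(base_predictions):
    """Meta-learner: average the base-learner predictions per individual."""
    return [sum(column) / len(column) for column in zip(*base_predictions)]


def relative_gain_percent(pa_new, pa_baseline):
    """Assumed definition of 'gain': relative increase over the baseline."""
    return 100.0 * (pa_new / pa_baseline - 1.0)


# Toy data (hypothetical): observed trait values and two base learners.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
base_predictions = [
    [1.5, 1.0, 3.5, 3.0, 5.5],  # hypothetical base learner A
    [0.5, 3.0, 2.5, 5.0, 4.5],  # hypothetical base learner B
]

stacked = simple_average_stack(base_predictions)
pa_base_a = pearson(base_predictions[0], observed)
pa_stacked = pearson(stacked, observed)

print(f"PA of learner A: {pa_base_a:.3f}")
print(f"PA of stack:     {pa_stacked:.3f}")
print(f"gain over A:     {relative_gain_percent(pa_stacked, pa_base_a):.1f}%")
```

In this toy case the errors of the two learners cancel exactly, so the stack beats either learner alone; real data would show smaller and less certain gains.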

3.
J Imaging Inform Med ; 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39138748

ABSTRACT

Pneumonia is a severe health concern, particularly for vulnerable groups, and requires early and correct classification for optimal treatment. This study addresses the use of deep learning combined with machine learning classifiers (DLxMLCs) for pneumonia classification from chest X-ray (CXR) images. We deployed modified VGG19, ResNet50V2, and DenseNet121 models for feature extraction, followed by five machine learning classifiers (logistic regression, support vector machine, decision tree, random forest, artificial neural network). The proposed approach displayed remarkable accuracy, with the VGG19 and DenseNet121 models obtaining 99.98% accuracy when combined with random forest or decision tree classifiers. ResNet50V2 achieved 99.25% accuracy with random forest. These results illustrate the advantages of merging deep learning models with machine learning classifiers for the fast and accurate identification of pneumonia. The study underlines the potential of DLxMLC systems in enhancing diagnostic accuracy and efficiency. By integrating these models into clinical practice, healthcare practitioners could greatly improve patient care and outcomes. Future research should focus on refining these models and exploring their application to other medical imaging tasks, as well as including explainability methodologies to better understand their decision-making processes and build trust in their clinical use. This technique holds promise for breakthroughs in medical imaging and patient management.

4.
Stud Health Technol Inform ; 316: 1812-1816, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176843

ABSTRACT

This study employs machine learning techniques to identify factors that influence extended Emergency Department (ED) length of stay (LOS) and derives transparent decision rules to complement the results. Leveraging a comprehensive dataset, Gradient Boosting exhibited marginally superior predictive performance compared to Random Forest for LOS classification. Notably, variables like triage acuity and the Elixhauser Comorbidity Index (ECI) emerged as robust predictors. The extracted rules optimize LOS stratification and resource allocation, demonstrating the critical role of data-driven methodologies in improving ED workflow efficiency and patient care delivery.


Subjects
Hospital Emergency Service , Length of Stay , Machine Learning , Humans , Triage

5.
SLAS Technol ; 29(4): 100159, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38909655

ABSTRACT

In today's digital world, with growing population and increasing pollution, unhealthy lifestyle habits like irregular eating, junk food consumption, and lack of exercise are becoming more common, leading to various health problems, including kidney issues. These factors directly affect human kidney health. To address this, we require early detection techniques that rely on text data. Text data contains detailed information about a patient's medical history, symptoms, test results, and treatment plans, giving a complete picture of kidney health and enabling timely intervention. In this research paper, we proposed a range of sophisticated models, such as Gradient Boosting Classifier, Light GBM, CatBoost, Support Vector Classifier (SVC), Random Boost, Logistic Regression, XGBoost, Deep Neural Network (DNN), and an Improved DNN. The Improved DNN demonstrated exceptional performance, with an accuracy of 90 %, precision of 89 %, recall of 90 %, and an F1-Score of 89.5 %. By combining traditional machine learning and deep neural networks, this integrative approach enables the identification of intricate patterns in datasets. The model's data-driven processes consistently update internal parameters, guaranteeing flexibility in response to evolving healthcare settings. This research represents a notable advancement in the progress of creating a more detailed and individualised ability to diagnose kidney stones, which could potentially lead to better clinical results and patient treatment.
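
The reported metrics can be checked against one another, since the F1-score is the harmonic mean of precision and recall. The short sketch below (the function name is ours, not the authors') recomputes F1 from the precision and recall quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract for the Improved DNN.
f1 = f1_score(0.89, 0.90)
print(f"F1 = {f1:.3f}")
```

The result rounds to 0.895, which agrees with the 89.5 % F1-score stated in the abstract.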


Subjects
Kidney Calculi , Computer Neural Networks , Kidney Calculi/diagnosis , Humans , Machine Learning

6.
Sci Rep ; 14(1): 12328, 2024 May 29.
Article in English | MEDLINE | ID: mdl-38811628

ABSTRACT

This research proposes a novel, three-tier AI-based scheme for the allocation of carbon-neutral mobility hubs. Initially, it identified optimal sites using a genetic algorithm, which optimized travel times and achieved a high fitness value of 77,000,000. Second, it involved an Ensemble-based suitability analysis of the pinpointed locations, using factors such as land use mix, densities of population and employment, and proximities of parking, biking, and transit. Each factor is weighted by its carbon emissions contribution, then incorporated into a suitability analysis model, generating scores that guide the final selection of the most suitable mobility hub sites. The final step employs a traffic assignment model to evaluate these sites' environmental and economic impacts. This includes measuring reductions in vehicle kilometers traveled and calculating other cost savings. Focusing on addressing sustainable development goals 11 and 9, this study leverages advanced techniques to enhance transportation planning policies. The Ensemble model demonstrated strong predictive accuracy, achieving an R-squared of 95% in training and 53% in testing. The identified hubs' sites reduced daily vehicle travel by 771,074 km, leading to annual savings of 225.5 million USD. This comprehensive approach integrates carbon-focused analyses and post-assessment evaluations, thereby offering a comprehensive framework for sustainable mobility hub planning.
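
The second tier, in which each factor is weighted and combined into a suitability score, can be sketched as a weighted average. The weights, factor values, and site names below are hypothetical stand-ins; the abstract does not give the actual carbon-emissions weights or the form of the ensemble model, so this shows only the scoring step under those assumptions.

```python
# Hypothetical weights standing in for each factor's carbon-emissions
# contribution (the real values are not given in the abstract).
WEIGHTS = {
    "land_use_mix": 2.0,
    "population_density": 1.0,
    "transit_proximity": 1.0,
}


def suitability_score(factors, weights=WEIGHTS):
    """Weighted average of factor scores already normalized to [0, 1]."""
    total_weight = sum(weights[name] for name in factors)
    weighted_sum = sum(value * weights[name] for name, value in factors.items())
    return weighted_sum / total_weight


# Hypothetical candidate sites, e.g. as returned by the first-tier search.
sites = {
    "site_a": {"land_use_mix": 1.0, "population_density": 0.5, "transit_proximity": 0.5},
    "site_b": {"land_use_mix": 0.0, "population_density": 1.0, "transit_proximity": 1.0},
}

scores = {name: suitability_score(factors) for name, factors in sites.items()}
best_site = max(scores, key=scores.get)
print(scores, "->", best_site)
```

Because land use mix carries twice the weight of the other factors here, site_a ranks first even though site_b scores higher on two of the three factors.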

7.
Heliyon ; 10(9): e30002, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38774065

ABSTRACT

Forecasting is of great importance in the field of renewable energy because it indicates how much energy can be produced and thus enables efficient management of energy sources. However, determining which prediction system is most suitable is very complex, as each energy infrastructure is different. This work studies the influence of several variables on predictions made with ensemble methods for different locations. In particular, it analyzes three aspects: the sampling frequency of solar panel systems, the type of neural network architecture, and the number of ensemble method blocks for each model. Following comprehensive experimentation across multiple locations, our study has identified the most effective solar energy prediction model tailored to the specific conditions of each energy infrastructure. The results offer a decisive framework for selecting the optimal system for accurate and efficient energy forecasting. The key point is the use of short time intervals, which matters regardless of the type of prediction model and of the ensemble method.

8.
Life (Basel) ; 14(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38792608

ABSTRACT

Obstructive sleep apnea/hypopnea syndrome (OSAHS) is a condition linked to severe cardiovascular and neuropsychological consequences, characterized by recurrent episodes of partial or complete upper airway obstruction during sleep, leading to compromised ventilation, hypoxemia, and micro-arousals. Polysomnography (PSG) serves as the gold standard for confirming OSAHS, yet its extended duration, high cost, and limited availability pose significant challenges. In this paper, we employ a range of machine learning techniques, including Neural Networks, Decision Trees, Random Forests, and Extra Trees, for OSAHS diagnosis. This approach aims to achieve a diagnostic process that is not only more accessible but also more efficient. The dataset utilized in this study consists of records from 601 adults assessed between 2014 and 2016 at a specialized sleep medical center in Colombia. This research underscores the efficacy of ensemble methods, specifically Random Forests and Extra Trees, achieving an area under the Receiver Operating Characteristic (ROC) curve of 89.2% and 89.6%, respectively. Additionally, a web application has been devised, integrating the optimal model, empowering qualified medical practitioners to make informed decisions through patient registration, an input of 18 variables, and the utilization of the Random Forests model for OSAHS screening.
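
For readers less familiar with the headline metric, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition on invented labels and scores; it is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical): 1 = condition present, 0 = absent.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Here eight of the nine positive-negative pairs are ordered correctly, giving an AUC of about 0.889. The pairwise loop is quadratic in the number of samples; it is chosen for clarity rather than speed.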

9.
Sci Rep ; 14(1): 7453, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38548774

ABSTRACT

Recent developments in quantum technology have opened up new opportunities for machine learning algorithms to assist the healthcare industry in diagnosing complex health disorders, such as heart disease. In this work, we summarize the effectiveness of QuEML in heart disease prediction. To evaluate the performance of QuEML against traditional machine learning algorithms, we used the Kaggle heart disease dataset, which contains 1190 samples, of which 53% are labeled as positive and the remaining 47% as negative. The performance of QuEML was evaluated in terms of accuracy, precision, recall, specificity, F1 score, and training time against traditional machine learning algorithms. The experimental results show that the proposed quantum approaches predicted around 50.03% of positive samples as positive and an average of 44.65% of negative samples as negative, whereas traditional machine learning approaches predicted around 49.78% of positive samples as positive and 44.31% of negative samples as negative. Furthermore, the computational cost of QuEML was measured: it took an average of 670 µs for training, whereas traditional machine learning algorithms took an average of 862.5 µs. Hence, QuEML was found to be a promising approach to heart disease prediction, with an accuracy 0.6% higher and a training time 192.5 µs shorter than those of traditional machine learning approaches.
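
The closing figures can be reproduced from the numbers quoted earlier in the abstract. The sketch below assumes that the four percentages are shares of all samples (true positives and true negatives), which is one reading of ambiguous wording but is consistent with the 53%/47% class split; under that assumption, accuracy is simply their sum.

```python
# Figures quoted in the abstract, read as percentages of ALL samples
# (an assumption: the abstract's wording is ambiguous on this point).
quantum = {"true_pos": 50.03, "true_neg": 44.65}
classical = {"true_pos": 49.78, "true_neg": 44.31}


def accuracy_percent(rates):
    """Accuracy = share of samples that are true positives or true negatives."""
    return rates["true_pos"] + rates["true_neg"]


acc_quantum = accuracy_percent(quantum)
acc_classical = accuracy_percent(classical)
accuracy_gap = round(acc_quantum - acc_classical, 2)

# Mean training times in microseconds, as reported.
time_saved_us = 862.5 - 670.0

print(f"quantum accuracy:   {acc_quantum:.2f}%")
print(f"classical accuracy: {acc_classical:.2f}%")
print(f"accuracy gap:       {accuracy_gap:.2f} percentage points")
print(f"training time gap:  {time_saved_us:.1f} microseconds")
```

The gap of 0.59 percentage points rounds to the 0.6% stated in the abstract, and the training-time difference matches the stated 192.5 µs.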


Subjects
Algorithms , Heart Diseases , Humans , Machine Learning

10.
J Cheminform ; 16(1): 34, 2024 Mar 22.
Article in English | MEDLINE | ID: mdl-38520014

ABSTRACT

Kinetic process models are widely applied in science and engineering, including atmospheric, physiological and technical chemistry, reactor design, or process optimization. These models rely on numerous kinetic parameters such as reaction rate, diffusion or partitioning coefficients. Determining these properties by experiments can be challenging, especially for multiphase systems, and researchers often face the task of intuitively selecting experimental conditions to obtain insightful results. We developed a numerical compass (NC) method that integrates computational models, global optimization, ensemble methods, and machine learning to identify experimental conditions with the greatest potential to constrain model parameters. The approach is based on the quantification of model output variance in an ensemble of solutions that agree with experimental data. The utility of the NC method is demonstrated for the parameters of a multi-layer model describing the heterogeneous ozonolysis of oleic acid aerosols. We show how neural network surrogate models of the multiphase chemical reaction system can be used to accelerate the application of the NC for a comprehensive mapping and analysis of experimental conditions. The NC can also be applied for uncertainty quantification of quantitative structure-activity relationship (QSAR) models. We show that the uncertainty calculated for molecules that are used to extend training data correlates with the reduction of QSAR model error. The code is openly available as the Julia package KineticCompass.
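
The core idea, quantifying how much an ensemble of acceptable parameter sets disagrees at each candidate experimental condition, can be sketched in a few lines. The toy model, parameter values, and candidate conditions below are invented for illustration; this is not the API of the KineticCompass package (which is written in Julia), and the published method involves optimization and surrogate models that are omitted here.

```python
from statistics import pvariance


def toy_model(k, d, condition):
    """Hypothetical stand-in for a kinetic model output at one condition."""
    return k * condition / (1.0 + d * condition)


# Hypothetical ensemble: (k, d) parameter sets assumed to fit existing data.
ensemble = [(1.0, 0.1), (1.0, 0.5), (1.0, 1.0)]

# Hypothetical candidate experimental conditions (e.g. a concentration).
candidates = [0.1, 1.0, 10.0]


def ensemble_variance(condition):
    """Variance of model outputs across the ensemble at one condition."""
    outputs = [toy_model(k, d, condition) for k, d in ensemble]
    return pvariance(outputs)


# The condition where the ensemble members disagree most is the one where
# a new measurement would be expected to constrain the parameters most.
best_condition = max(candidates, key=ensemble_variance)

for c in candidates:
    print(f"condition {c:>5}: variance {ensemble_variance(c):.4f}")
print("most informative candidate:", best_condition)
```

At low values of the condition all parameter sets predict nearly the same output, so an experiment there would say little about the parameter `d`; the spread grows with the condition, which is why the largest candidate is selected.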

11.
Neural Netw ; 173: 106183, 2024 May.
Article in English | MEDLINE | ID: mdl-38382397

ABSTRACT

The rising global incidence of human Mpox cases necessitates prompt and accurate identification for effective disease control. Whereas previous studies have predominantly explored traditional ensemble methods for detection, we introduce a novel approach that leverages a metaheuristic-based ensemble framework. In this research, we present a CGO-Ensemble framework designed to improve the accuracy of detecting Mpox infection in patients. Initially, we employ five transfer learning base models that integrate feature integration layers and residual blocks. These components play a crucial role in capturing significant features from the skin images, thereby enhancing the models' efficacy. In the next step, we employ a weighted averaging scheme to consolidate the predictions generated by the distinct models. To achieve the optimal allocation of weights for each base model in the ensemble process, we leverage the Chaos Game Optimization (CGO) algorithm. This strategic weight assignment enhances classification outcomes considerably, surpassing the performance of randomly assigned weights. Implementing this approach yields notably higher prediction accuracy than using individual models. We evaluate the effectiveness of our proposed approach through comprehensive experiments conducted on two widely recognized benchmark datasets: the Mpox Skin Lesion Dataset (MSLD) and the Mpox Skin Image Dataset (MSID). To gain insights into the decision-making process of the base models, we performed Gradient Class Activation Mapping (Grad-CAM) analysis. The experimental results showcase the performance of the CGO-Ensemble, which achieves an accuracy of 100% on MSLD and 94.16% on MSID. Our approach significantly outperforms other state-of-the-art optimization algorithms, traditional ensemble methods, and existing techniques in the context of Mpox detection on these datasets. These findings underscore the effectiveness of the CGO-Ensemble in accurately identifying Mpox cases, highlighting its potential in disease detection and classification.
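
The weighted averaging step can be sketched as follows. Note the substitution: a plain random search over weights stands in for the Chaos Game Optimization algorithm, which is not reproduced here, and the two "models" are hand-written probability tables rather than trained networks. All names and data are hypothetical.

```python
import random


def weighted_average(prob_sets, weights):
    """Combine per-model class probabilities using normalized weights."""
    total = sum(weights)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    return [
        [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / total
            for c in range(n_classes)
        ]
        for i in range(n_samples)
    ]


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class is the label."""
    predictions = [max(range(len(p)), key=p.__getitem__) for p in probs]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def random_search_weights(prob_sets, labels, trials=500, seed=0):
    """Random search over weights (a stand-in for CGO, not CGO itself)."""
    rng = random.Random(seed)
    best_weights = [1.0] * len(prob_sets)
    best_accuracy = accuracy(weighted_average(prob_sets, best_weights), labels)
    for _ in range(trials):
        candidate = [rng.random() + 1e-9 for _ in prob_sets]
        score = accuracy(weighted_average(prob_sets, candidate), labels)
        if score > best_accuracy:
            best_weights, best_accuracy = candidate, score
    return best_weights, best_accuracy


# Hypothetical validation data: two classes, two base models.
labels = [0, 1, 1, 0]
model_a = [[0.6, 0.4], [0.2, 0.8], [0.4, 0.6], [0.7, 0.3]]
model_b = [[0.3, 0.7], [0.6, 0.4], [0.3, 0.7], [0.8, 0.2]]
prob_sets = [model_a, model_b]

equal_accuracy = accuracy(weighted_average(prob_sets, [1.0, 1.0]), labels)
best_weights, best_accuracy = random_search_weights(prob_sets, labels)
print(f"equal weights: {equal_accuracy:.2f}, searched weights: {best_accuracy:.2f}")
```

In practice the weights would be tuned on a held-out validation set rather than on the data used for the final accuracy figure; otherwise the reported gain is optimistic.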


Subjects
Mpox , Humans , Algorithms , Computer Neural Networks , Benchmarking , Learning

12.
BMC Bioinformatics ; 25(1): 56, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38308205

ABSTRACT

BACKGROUND: Genome-wide association studies have successfully identified genetic variants associated with human disease. Various statistical approaches based on penalized and machine learning methods have recently been proposed for disease prediction. In this study, we evaluated the performance of several such methods for predicting asthma using the Korean Chip (KORV1.1) from the Korean Genome and Epidemiology Study (KoGES). RESULTS: First, single-nucleotide polymorphisms were selected via single-variant tests using logistic regression with adjustment for several epidemiological factors. Next, we evaluated the following methods for disease prediction: ridge, least absolute shrinkage and selection operator, elastic net, smoothly clipped absolute deviation, support vector machine, random forest, boosting, bagging, naïve Bayes, and k-nearest neighbor. Finally, we compared their predictive performance based on the area under the receiver operating characteristic curve, precision, recall, F1-score, Cohen's Kappa, balanced accuracy, error rate, Matthews correlation coefficient, and area under the precision-recall curve. Additionally, three oversampling algorithms were used to deal with class imbalance. CONCLUSIONS: Our results show that penalized methods exhibit better predictive performance for asthma than machine learning methods. On the other hand, in the oversampling study, random forest and boosting methods overall showed better prediction performance than penalized methods.
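
Two of the listed metrics, balanced accuracy and the Matthews correlation coefficient (MCC), are designed for imbalanced problems like the one described here. The sketch below computes both from a confusion matrix whose counts are invented to show how plain accuracy can mislead; they are not taken from the study.

```python
from math import sqrt


def accuracy(tp, fp, fn, tn):
    """Plain accuracy: share of all samples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def balanced_accuracy(tp, fp, fn, tn):
    """Mean of sensitivity (recall on cases) and specificity (recall on controls)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0.0 if any marginal total is zero."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical imbalanced confusion matrix: 10 cases, 90 controls.
counts = {"tp": 5, "fp": 10, "fn": 5, "tn": 80}

acc = accuracy(**counts)
bal_acc = balanced_accuracy(**counts)
mcc_value = mcc(**counts)
print(f"accuracy = {acc:.3f}, balanced accuracy = {bal_acc:.3f}, MCC = {mcc_value:.3f}")
```

With these counts plain accuracy is 0.85, yet only half of the cases are detected; balanced accuracy (about 0.69) and MCC (about 0.33) make that weakness visible.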


Subjects
Algorithms , Genome-Wide Association Study , Humans , Bayes Theorem , Machine Learning , Republic of Korea/epidemiology

13.
J Magn Reson Imaging ; 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38243677

ABSTRACT

Anomaly detection in medical imaging, particularly within the realm of magnetic resonance imaging (MRI), stands as a vital area of research with far-reaching implications across various medical fields. This review meticulously examines the integration of artificial intelligence (AI) in anomaly detection for MR images, spotlighting its transformative impact on medical diagnostics. We delve into the forefront of AI applications in MRI, exploring advanced machine learning (ML) and deep learning (DL) methodologies that are pivotal in enhancing the precision of diagnostic processes. The review provides a detailed analysis of preprocessing, feature extraction, classification, and segmentation techniques, alongside a comprehensive evaluation of commonly used metrics. Further, this paper explores the latest developments in ensemble methods and explainable AI, offering insights into future directions and potential breakthroughs. This review synthesizes current insights, offering a valuable guide for researchers, clinicians, and medical imaging experts. It highlights AI's crucial role in improving the precision and speed of detecting key structural and functional irregularities in MRI. Our exploration of innovative techniques and trends furthers MRI technology development, aiming to refine diagnostics, tailor treatments, and elevate patient care outcomes. LEVEL OF EVIDENCE: 5 TECHNICAL EFFICACY: Stage 1.

14.
Neural Netw ; 170: 364-375, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38029718

ABSTRACT

Neural network ensembling is a common and robust way to increase model efficiency. In this paper, we propose a new neural network ensemble algorithm based on Audibert's empirical star algorithm. We provide an optimal theoretical minimax bound on the excess squared risk. Additionally, we empirically study this algorithm on regression and classification tasks and compare it to the most popular ensembling methods.


Subjects
Algorithms , Computer Neural Networks

15.
Materials (Basel) ; 16(20)2023 Oct 14.
Article in English | MEDLINE | ID: mdl-37895669

ABSTRACT

The redox properties of quinones underlie their unique characteristics as organic battery components that outperform the conventional inorganic ones. Furthermore, these redox properties could be precisely tuned by using different substituent groups. Machine learning and statistics, on the other hand, have proven to be very powerful approaches for the efficient in silico design of novel materials. Herein, we demonstrated the machine learning approach for the prediction of the redox activity of quinones that potentially can serve as organic battery components. For the needs of the present study, a database of small quinone-derived molecules was created. A large number of quantum chemical and chemometric descriptors were generated for each molecule and, subsequently, different statistical approaches were applied to select the descriptors that most prominently characterized the relationship between the structure and the redox potential. Various machine learning methods for the screening of prospective organic battery electrode materials were deployed to select the most trustworthy strategy for the machine learning-aided design of organic redox materials. It was found that Ridge regression models perform better than Regression decision trees and Decision tree-based ensemble algorithms.

16.
Sensors (Basel) ; 23(16)2023 Aug 20.
Article in English | MEDLINE | ID: mdl-37631821

ABSTRACT

Vision-based object detection is essential for safe and efficient field operation for autonomous agricultural vehicles. However, one of the challenges in transferring state-of-the-art object detectors to the agricultural domain is the limited availability of labeled datasets. This paper seeks to address this challenge by utilizing two object detection models based on YOLOv5, one pre-trained on a large-scale dataset for detecting general classes of objects and one trained to detect a smaller number of agriculture-specific classes. To combine the detections of the models at inference, we propose an ensemble module based on a hierarchical structure of classes. Results show that applying the proposed ensemble module increases mAP@.5 from 0.575 to 0.65 on the test dataset and reduces the misclassification of similar classes detected by different models. Furthermore, by translating detections from base classes to a higher level in the class hierarchy, we can increase the overall mAP@.5 to 0.701 at the cost of reducing class granularity.
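
A class hierarchy of this kind can be used in two ways, sketched below: to decide when detections from the two models refer to the same object, and to translate labels to a coarser level. The hierarchy, the overlap threshold, and the rule of keeping the higher-scoring detection are all assumptions made for illustration; the abstract does not specify the actual merging rule of the ensemble module.

```python
# Hypothetical child -> parent class hierarchy.
HIERARCHY = {"tractor": "vehicle", "car": "vehicle", "cow": "animal"}


def parent(label):
    """Parent class of a label; top-level labels map to themselves."""
    return HIERARCHY.get(label, label)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge(general, specific, iou_threshold=0.5):
    """Assumed rule: overlapping boxes with the same parent keep the best score."""
    merged = list(specific)
    for g in general:
        clash = None
        for i, s in enumerate(merged):
            same_branch = parent(g["label"]) == parent(s["label"])
            if same_branch and iou(g["box"], s["box"]) >= iou_threshold:
                clash = i
                break
        if clash is None:
            merged.append(g)
        elif g["score"] > merged[clash]["score"]:
            merged[clash] = g
    return merged


def to_parent_level(detections):
    """Trade class granularity for robustness by relabeling to parent classes."""
    return [{**d, "label": parent(d["label"])} for d in detections]


# Hypothetical detections from the general and the agriculture-specific model.
general = [
    {"label": "car", "box": (0, 0, 10, 10), "score": 0.6},
    {"label": "person", "box": (20, 20, 30, 30), "score": 0.9},
]
specific = [{"label": "tractor", "box": (1, 1, 10, 10), "score": 0.8}]

merged = merge(general, specific)
coarse = to_parent_level(merged)
print([d["label"] for d in merged], [d["label"] for d in coarse])
```

The "car" and "tractor" boxes overlap heavily and share the parent "vehicle", so only the higher-scoring "tractor" survives; relabeling to the parent level then removes the car-versus-tractor distinction altogether, mirroring the granularity trade-off the abstract describes.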

17.
Adv Exp Med Biol ; 1424: 241-246, 2023.
Article in English | MEDLINE | ID: mdl-37486500

ABSTRACT

The high-throughput sequencing method known as RNA-Seq records the whole transcriptome of individual cells. Single-cell RNA sequencing, also known as scRNA-Seq, is widely used in biomedical research and has generated huge quantities and types of data. The noise and artifacts present in the raw data require extensive cleaning before the data can be used. In machine learning and pattern recognition applications, feature selection methods reduce computation time while improving predictions and offering a better understanding of the data. The process of discovering biomarkers is analogous to feature selection in machine learning and is especially helpful for applications in the medical field. A feature selection algorithm attempts to cut down the total number of features by eliminating those that are unnecessary or redundant while retaining those that are the most helpful. We apply FS algorithms designed for scRNA-Seq to Alzheimer's disease (AD), which is the most prevalent neurodegenerative disease in the western world and causes cognitive and behavioral impairment. AD is clinically and pathologically varied, and genetic studies imply a diversity of biological mechanisms and pathways. Over 20 new Alzheimer's disease susceptibility loci have been discovered through linkage, genome-wide association, and next-generation sequencing (Tosto G, Reitz C, Mol Cell Probes 30:397-403, 2016). In this study, we focus on the performance of three different approaches to marker gene selection and compare them using the support vector machine (SVM), the k-nearest neighbors algorithm (k-NN), and linear discriminant analysis (LDA), which are mainly supervised classification algorithms.


Subjects
Alzheimer Disease , Neurodegenerative Diseases , Humans , Alzheimer Disease/genetics , Genome-Wide Association Study , Algorithms , RNA-Seq

18.
Methods Mol Biol ; 2690: 401-417, 2023.
Article in English | MEDLINE | ID: mdl-37450162

ABSTRACT

The attachment of a virion to its cellular receptor on the host organism, which occurs through virus-host protein-protein interactions (PPIs), is a decisive step for viral pathogenicity and infectivity. Therefore, a vast number of wet-lab experimental techniques are used to study virus-host PPIs. Given the great number and enormous variety of virus-host PPIs and the cost and labor of laboratory work, however, computational approaches to analyzing the available interaction data and predicting previously unidentified interactions have been on the rise. Among them, machine-learning-based models are receiving increasing attention, with a large body of resources and tools proposed recently. In this chapter, we first provide the methodology, with the major steps toward the development of a virus-host PPI prediction tool. Next, we discuss the challenges involved and evaluate several existing machine-learning-based virus-host PPI prediction tools. Finally, we describe our experience with several ensemble techniques applied to prediction results retrieved from individual PPI prediction tools. Overall, based on our experience, we recognize that there is still room for the development of new individual and/or ensemble virus-host PPI prediction tools that leverage existing tools.


Subjects
Protein Interaction Mapping , Viruses , Protein Interaction Mapping/methods , Machine Learning , Computational Biology/methods

19.
J Electr Eng Technol ; 18(2): 719-733, 2023.
Article in English | MEDLINE | ID: mdl-37521955

ABSTRACT

With increasing demand for energy, the penetration of alternative sources such as renewable energy in power grids has increased. Solar energy is one of the most common and well-known sources of energy in existing networks. However, because of its non-stationary and non-linear characteristics, solar irradiance must be predicted to make photovoltaic (PV) plants more reliable and to manage power supply and demand. Although there are various methods to predict solar irradiance, this paper gives an overview of recent studies focused on solar irradiance forecasting with ensemble methods, which are divided into two main categories: competitive and cooperative ensemble forecasting. Parameter diversity and data diversity are treated as forms of competitive ensemble forecasting, while preprocessing and post-processing are treated as forms of cooperative ensemble forecasting. All of these ensemble forecasting methods are investigated in this study. Finally, conclusions are drawn and recommendations for future studies are discussed.

20.
Entropy (Basel) ; 25(2)2023 Jan 29.
Article in English | MEDLINE | ID: mdl-36832611

ABSTRACT

Today's world faces a serious public health problem with cancer. One type of cancer that begins in the breast and spreads to other body areas is breast cancer (BC). Breast cancer is one of the most prevalent cancers that claim the lives of women. It is also becoming clearer that most cases of breast cancer are already advanced when they are brought to the doctor's attention by the patient. The patient may have the evident lesion removed, but the seeds have reached an advanced stage of development or the body's ability to resist them has weakened considerably, rendering them ineffective. Although it is still much more common in more developed nations, it is also quickly spreading to less developed countries. The motivation behind this study is to use an ensemble method for the prediction of BC, as an ensemble model aims to automatically manage the strengths and weaknesses of each of its separate models, resulting in the best decision being made overall. The main objective of this paper is to predict and classify breast cancer using AdaBoost ensemble techniques. The weighted entropy is computed for the target column. Taking each attribute's weights results in the weighted entropy. Each class's likelihood is represented by the weights. The amount of information gained increases with a decrease in entropy. Both individual and homogeneous ensemble classifiers, created by combining AdaBoost with different single classifiers, have been used in this work. To deal with the class imbalance issue as well as noise, the synthetic minority over-sampling technique (SMOTE) was used as part of the data mining pre-processing. The suggested approach uses a decision tree (DT) and naive Bayes (NB) with AdaBoost ensemble techniques. The experimental findings showed 97.95% accuracy for prediction using the AdaBoost-random forest classifier.
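
The relationship between entropy and information gain mentioned in this abstract can be made concrete with a small sketch. It takes one reading of "weighted entropy", namely Shannon entropy in which each class is weighted by its probability; the abstract does not define the term precisely, and the labels and split below are invented.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the groups."""
    n = len(labels)
    remainder = sum(len(group) / n * entropy(group) for group in groups)
    return entropy(labels) - remainder


# Hypothetical target column: M = malignant, B = benign.
labels = ["M", "M", "B", "B"]

# Hypothetical split on some attribute that separates the classes perfectly.
perfect_split = [["M", "M"], ["B", "B"]]
# Hypothetical split that tells us nothing about the class.
useless_split = [["M", "B"], ["M", "B"]]

parent_entropy = entropy(labels)
gain_perfect = information_gain(labels, perfect_split)
gain_useless = information_gain(labels, useless_split)
print(parent_entropy, gain_perfect, gain_useless)
```

The perfect split drives the entropy of each group to zero and so gains the full one bit, while the uninformative split leaves entropy unchanged and gains nothing, which is the sense in which information gain rises as entropy falls.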
