Results 1 - 20 of 1,159
1.
Water Res ; 267: 122544, 2024 Sep 29.
Article in English | MEDLINE | ID: mdl-39383645

ABSTRACT

Remote sensing water quality monitoring technology can effectively supplement the shortcomings of traditional water quality monitoring methods in spatiotemporal dynamic monitoring capabilities. At present, although the spectral feature-based remote sensing water quality inversion models have achieved many successes, there could still be a problem of insufficient generalization ability in monitoring the water quality of complex river networks in large cities. In this paper, we propose a spectro-environmental factors integrated ensemble learning model for urban river network water quality inversion. We analyzed the correlation between water quality parameters, spectral reflectance, and environmental factors based on an in-situ dataset collected in the northern part of Shanghai. Using the Hot Spot Analysis (Getis-Ord Gi*), we found that river network water quality parameters have different patterns in different urban functional zones. Furthermore, daily average temperature, total rainfall within the seven days, and several band combinations were also selected as the environmental and spectral features using factor analysis and Pearson correlation coefficient analysis. After the feature analysis, the spectro-environmental factors integrated ensemble learning model was trained. Compared with the spectral-based machine learning inversion models, the coefficients of determination R2 increased by about 0.50. Our model was also tested in three different test areas within and outside the in-situ sampling areas in Shanghai based on low-altitude multispectral remote sensing images. The R2 results for total phosphorus (TP), ammonia nitrogen (NH3-N), and chemical oxygen demand (COD) within the in-situ sampling areas were 0.52, 0.58, and 0.56 respectively. The mean absolute percentage error (MAPE) results were 53.36%, 63.95%, and 22.46% respectively. After adding the area outside the in-situ sampling areas, the R2 results for TP, NH3-N, and COD were 0.47, 0.47, and 0.53. The MAPE values were 49.38%, 74.46%, and 20.49%. Our research provided a new remote sensing water quality inversion method to be utilized in complex urban river networks which exhibited solid accuracy and generalization ability.
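The two scores quoted in this abstract, R2 and MAPE, can be computed from paired observed and predicted values. The following is a minimal sketch using the standard definitions; the function names and the sample readings are hypothetical and are not taken from the study.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical total-phosphorus readings (mg/L), not data from the study.
observed = [0.10, 0.20, 0.40, 0.80]
predicted = [0.12, 0.18, 0.50, 0.70]
print(f"R2 = {r2_score(observed, predicted):.3f}, MAPE = {mape(observed, predicted):.2f}%")
```

Because MAPE divides each error by the observed value, small concentrations can inflate it even when R2 is moderate, which is one plausible reading of why the two scores diverge for some parameters.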

2.
Genome Biol ; 25(1): 260, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39379999

ABSTRACT

BACKGROUND: Polygenic risk score (PRS) is a major research topic in human genetics. However, a significant gap exists between PRS methodology and applications in practice due to often unavailable individual-level data for various PRS tasks including model fine-tuning, benchmarking, and ensemble learning. RESULTS: We introduce an innovative statistical framework to optimize and benchmark PRS models using summary statistics of genome-wide association studies. This framework builds upon our previous work and can fine-tune virtually all existing PRS models while accounting for linkage disequilibrium. In addition, we provide an ensemble learning strategy named PUMAS-ensemble to combine multiple PRS models into an ensemble score without requiring external data for model fitting. Through extensive simulations and analysis of many complex traits in the UK Biobank, we demonstrate that this approach closely approximates gold-standard analytical strategies based on external validation, and substantially outperforms state-of-the-art PRS methods. CONCLUSIONS: Our method is a powerful and general modeling technique that can continue to combine the best-performing PRS methods out there through ensemble learning and could become an integral component for all future PRS applications.
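As a rough illustration of the quantities involved: a polygenic risk score is conventionally a weighted sum of effect-allele dosages, and an ensemble score can be formed by combining the scores of several models. The sketch below assumes a simple linear combination with fixed mixing weights. All names and numbers are hypothetical, and it does not attempt the step the abstract describes, which is estimating such weights from GWAS summary statistics.

```python
def polygenic_score(dosages, effect_weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies of the effect allele)."""
    return sum(w * d for w, d in zip(effect_weights, dosages))


def ensemble_score(model_scores, mixing_weights):
    """Linear combination of per-model scores (an assumed form, for illustration only)."""
    return sum(a * s for a, s in zip(mixing_weights, model_scores))


# Hypothetical individual with three variants and two hypothetical PRS models.
dosages = [0, 1, 2]
weights_a = [0.1, -0.2, 0.3]
weights_b = [0.0, 0.1, 0.2]

score_a = polygenic_score(dosages, weights_a)
score_b = polygenic_score(dosages, weights_b)
combined = ensemble_score([score_a, score_b], [0.5, 0.5])
print(score_a, score_b, combined)
```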


Subjects
Benchmarking, Genome-Wide Association Study, Multifactorial Inheritance, Genome-Wide Association Study/methods, Humans, Genetic Models, Genetic Predisposition to Disease, Linkage Disequilibrium, Genetic Risk Score
3.
Sci Rep ; 14(1): 23516, 2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39384798

ABSTRACT

TextNetTopics (Yousef et al. in Front Genet 13:893378, 2022. https://doi.org/10.3389/fgene.2022.893378) is a recently developed approach that performs text classification using topics (a topic is a group of terms or words) extracted by Latent Dirichlet Allocation topic modeling as features, rather than individual words. This approach enables TextNetTopics to achieve dimensionality reduction while preserving and embedding more thematic and semantic information in the text document representations. In this article, we introduce a novel approach, the Ensemble Topic Model for Topic Selection (ENTM-TS), an advancement of TextNetTopics. ENTM-TS integrates multiple topic models using the Grouping, Scoring, and Modeling approach, thereby mitigating the performance variability introduced by employing individual topic modeling methods within TextNetTopics. Additionally, we performed a thorough comparative study to evaluate TextNetTopics' performance using eleven state-of-the-art topic modeling algorithms. We used the topics extracted by each algorithm as input to the G component of the TextNetTopics tool to select the most compelling topic model with regard to its predictive behavior for text classification. We conducted our comprehensive evaluation using the Drug-Induced Liver Injury textual dataset from the CAMDA community and the WOS-5736 dataset. The experimental results show that Latent Semantic Indexing provides comparable performance measures with fewer discriminative features when compared with other topic modeling methods. Moreover, our evaluation reveals that the performance of ENTM-TS surpasses or aligns with the optimal outcomes obtained from individual topic models across the two datasets, establishing it as a robust and effective enhancement in text classification tasks.

4.
Heliyon ; 10(16): e36097, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39247275

ABSTRACT

Cassava is one of the most important carbohydrate food sources, consumed in many African and Asian countries. Cassava leaf disease is the major issue affecting production. Deep learning and transfer learning models have been used, with different approaches, for automatic early detection of cassava leaf disease as a multiclass classification task. Existing approaches deal with imbalanced datasets when predicting the classes. This research develops a hybrid ensemble deep transfer model approach for early leaf disease detection. Data augmentation was applied to the raw data to balance the dataset. Three distinct new hybrid models were developed, namely Ensemble(InceptionV3 + DenseNet-BC-121-32 + Xception), Ensemble(ResNet50V2 + DenseNet-BC-121-32), and Ensemble(ResNet50V2 + ResNet50). The proposed model shows high performance. It was compared broadly with a custom Convolutional Neural Network and with pre-trained models. The ensemble that combined InceptionV3, Xception, and DenseNet-BC-121-32 obtained the highest accuracy: 88.83% for five-class and 97.89% for two-class classification.

5.
PeerJ ; 12: e17975, 2024.
Article in English | MEDLINE | ID: mdl-39247551

ABSTRACT

Link prediction (LP) is the task of identifying potential, missing, and spurious links in complex networks. Protein-protein interaction (PPI) networks are important for understanding the underlying biological mechanisms of diseases. Many complex networks have been constructed using LP methods; however, only a limited number of studies focus on disease-related gene predictions and evaluate these genes using various evaluation criteria. The main objective of the study is to investigate the effect of a simple ensemble method on disease-related gene predictions. Disease-related gene predictions based on local similarity indices (LSIs) were integrated by a simple ensemble decision method, simple majority voting (SMV), on the PPI network to detect accurate disease-related genes. A human PPI network was used to discover potential disease-related genes, with four LSIs used for the gene prediction. The LSIs discovered potential links between disease-related genes, which were obtained from the OMIM database for gastric, colorectal, breast, prostate, and lung cancers. The LSI-based disease-related genes were ranked by their LSI scores in descending order to retrieve the top 10, 50, and 100 disease-related genes. SMV integrated the four LSI-based predictions to obtain the SMV-based top 10, 50, and 100 disease-related genes. The performance of the LSI-based and SMV-based genes was evaluated separately by overlap analyses, which were performed with the GeneCard disease-gene relation dataset and Gene Ontology (GO) terms. The GO terms were used for biological assessment of the gene lists inferred by the LSIs and SMV on all cancer types. The Adamic-Adar (AA), Resource Allocation Index (RAI), and SMV-based gene lists generally achieved good performance on all cancers in both overlap analyses. SMV also outperformed the other methods on breast cancer data. Increasing the number of top-ranked disease-related genes selected also enhanced the performance of SMV.
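The pipeline described above (score candidate genes with local similarity indices, keep each index's top-k list, then retain genes by simple majority voting) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's code: the toy network, the gene names, and the rule of scoring a candidate by summing its index value against the known disease genes are all hypothetical, and three indices are used instead of four so that a majority is always well defined.

```python
import math
from collections import Counter


def common_neighbors(graph, u, v):
    return len(graph[u] & graph[v])


def adamic_adar(graph, u, v):
    # A shared neighbor has degree >= 2, so log(degree) is always positive.
    return sum(1.0 / math.log(len(graph[z])) for z in graph[u] & graph[v])


def resource_allocation(graph, u, v):
    return sum(1.0 / len(graph[z]) for z in graph[u] & graph[v])


def top_k(graph, seeds, index, k):
    """Rank non-seed genes by their summed index score against the seed genes."""
    candidates = [n for n in graph if n not in seeds]
    ranked = sorted(candidates, key=lambda n: (-sum(index(graph, s, n) for s in seeds), n))
    return ranked[:k]


def simple_majority_vote(gene_lists):
    """Keep genes that appear in more than half of the lists."""
    counts = Counter(g for genes in gene_lists for g in set(genes))
    return sorted(g for g, c in counts.items() if c > len(gene_lists) / 2)


# Hypothetical undirected PPI network stored as adjacency sets.
graph = {
    "G1": {"G2", "G3"},
    "G2": {"G1", "G3", "G4"},
    "G3": {"G1", "G2", "G4", "G5"},
    "G4": {"G2", "G3"},
    "G5": {"G3"},
}
seeds = {"G1"}  # hypothetical known disease gene
lists = [top_k(graph, seeds, idx, 2) for idx in (common_neighbors, adamic_adar, resource_allocation)]
print(lists, simple_majority_vote(lists))
```

In this toy case the common-neighbors list disagrees with the other two on its second gene, and the vote keeps only the genes that at least two of the three indices agree on.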


Subjects
Computational Biology, Humans, Computational Biology/methods, Protein Interaction Maps/genetics, Neoplasms/genetics, Genetic Databases, Gene Regulatory Networks/genetics, Genetic Predisposition to Disease, Algorithms
6.
Comput Biol Med ; 182: 109084, 2024 Sep 08.
Article in English | MEDLINE | ID: mdl-39250874

ABSTRACT

BACKGROUND: This study aimed to assess the efficacy of various supervised longitudinal learning approaches, comparing traditional statistical models and machine learning algorithms for prediction with longitudinal data. The primary objectives were to evaluate the predictive performance of different supervised longitudinal learning methods for low birth weight (LBW) and very low birth weight (VLBW) based on prenatal ultrasound measurements. Additionally, the study sought to extract interpretable risk features for disease prediction. METHODS: The evaluation involved benchmarking the performance of longitudinal models against conventional machine learning methods. Classification accuracy for LBW and VLBW at birth, as well as prediction accuracy for birth weight using prenatal sonographic ultrasound measurements, were assessed. RESULTS: Among the learning approaches we investigated in this study, the longitudinal machine learning approach, specifically, the mixed effect random forest (MERF), delivered the overall best performance in predicting birthweights and classifying LBW/VLBW disease status. CONCLUSION: The MERF combined the power of advanced machine learning algorithms to accommodate the inherent within-individual dependence in the observed data, delivering satisfactory performance in predicting the birthweight and classifying LBW/VLBW disease status. The study emphasized the importance of incorporating previous ultrasound measurements and considering correlations between repeated measurements for accurate prediction. The interpretable trees algorithm used for risk feature extraction proved reliable and applicable to other learning algorithms. These findings underscored the potential of longitudinal learning methods in improving birth weight prediction and highlighted the relevance of consistent risk features in line with established literature.

7.
Appl Spectrosc ; : 37028241276013, 2024 Sep 09.
Article in English | MEDLINE | ID: mdl-39252509

ABSTRACT

The miniature fiber Raman spectroscopy detection technology can reflect the properties of biomolecules through spectral characteristics and has the advantages of noninvasiveness, real-time, safety, label-free operation, and potential for early cancer diagnosis. This technology holds promise for developing portable, low-cost, intraoperative tumor detection instruments. Glioma is one of the most common malignant tumors of the central nervous system with rapid growth and a short disease course. However, the considerable heterogeneity of the glioma sample leads to substantial intraclass variance in collected spectra, coupled with the miniature Raman spectrometer's low signal-to-noise ratio. These factors diminish the accuracy of the brain glioma recognition model. To address this issue, a glioma identification method based on digital multimodal spectra integrated with deep learning features fusion (DMS-DLFF) using the miniature Raman spectrometer is proposed. Different from existing multimodal tumor detection methods employing multiple spectral instruments, DMS-DLFF enhances tumor identification accuracy without increasing hardware costs. The method mathematically decomposes the original spectra to Raman and fluorescence spectra, so as to augment the biospectral information. Then, the deep learning method is used to extract the feature information of the two kinds of spectra, respectively, and the digital multimodal spectral fusion is realized at the feature level. Moreover, a two-layer pattern recognition model is constructed based on the ensemble strategy, amalgamating the strengths of diverse classifiers. Meanwhile, the bagging strategy is introduced to improve support vector machine algorithms, one of the basic classifiers. Compared with traditional methodologies, DMS-DLFF operates at both the feature level and decision level, employing high-information-density feature vectors to train ensemble classification models for increasing overall recognition accuracy. This study collected 260 Raman spectra of glioma and 151 Raman spectra of normal brain tissue. The accuracy, sensitivity, and specificity were 91.9%, 96.7%, and 80.8%, respectively. The proposed method outperforms traditional algorithms in brain glioma detection, which helps doctors formulate precise surgical plans and thereby improve patient prognosis.

8.
IEEE Trans Comput Soc Syst ; 11(1): 247-266, 2024 Feb.
Article in English | MEDLINE | ID: mdl-39239536

ABSTRACT

An adaptive interpretable ensemble model based on a three-dimensional Convolutional Neural Network (3DCNN) and a Genetic Algorithm (GA), i.e., 3DCNN+EL+GA, was proposed to differentiate subjects with Alzheimer's Disease (AD) or Mild Cognitive Impairment (MCI) and to further identify, in a data-driven way, the discriminative brain regions that contribute significantly to the classifications. In addition, the discriminative brain sub-regions at the voxel level were located within these brain regions using a gradient-based attribution method designed for CNNs. Besides disclosing the discriminative brain sub-regions, the testing results on the datasets from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Open Access Series of Imaging Studies (OASIS) indicated that 3DCNN+EL+GA outperformed other state-of-the-art deep learning algorithms and that the identified discriminative brain regions (e.g., the rostral hippocampus, caudal hippocampus, and medial amygdala) were linked to emotion, memory, language, and other essential brain functions impaired early in the AD process. Future research is needed to examine the generalizability of the proposed method and ideas to discern discriminative brain regions for other brain disorders, such as severe depression, schizophrenia, autism, and cerebrovascular diseases, using neuroimaging.

9.
Comput Struct Biotechnol J ; 23: 3175-3185, 2024 Dec.
Article in English | MEDLINE | ID: mdl-39253057

ABSTRACT

5-formylcytidine (f5C) is a unique post-transcriptional RNA modification found in mRNA and tRNA at the wobble site, playing a crucial role in mitochondrial protein synthesis and potentially contributing to the regulation of translation. Recent studies have unveiled that the f5C modifications may drive mitochondrial mRNA translation to power cancer metastasis. Accurate identification of f5C sites is essential for further unraveling their molecular functions and regulatory mechanisms, but there are currently no computational methods available for predicting their locations. In this study, we introduce an innovative ensemble approach, successfully enabling the computational recognition of Saccharomyces cerevisiae f5C. We conducted a comprehensive model selection process that involved multiple basic machine learning and deep learning algorithms such as recurrent neural networks, convolutional neural networks and Transformer-based models. Initially trained only on sequence information, these individual models achieved an AUROC ranging from 0.7104 to 0.7492. Through the integration of 32 novel domain-derived genomic features, the performance of individual models has significantly improved to an AUROC between 0.7309 and 0.8076. To further enhance accuracy and robustness, we then constructed the ensembles of these individual models with different combinations. The best performance attained by our ensemble models reached an AUROC of 0.8391. Shapley additive explanations were conducted to explain the significant contributions of genomic features, providing insights into the putative distribution of f5C across various topological regions and potentially paving the way for revealing their functional relevance within distinct genomic contexts. A freely accessible web server that allows real-time analysis of user-uploaded sites can be accessed at: www.rnamd.org/Resf5C-Pred.

10.
PeerJ Comput Sci ; 10: e2254, 2024.
Article in English | MEDLINE | ID: mdl-39314734

ABSTRACT

Code smells refer to poor design and implementation choices by software engineers that might affect the overall software quality. Code smell detection using machine learning models has become a popular research area, with the aim of building effective models capable of detecting different code smells in multiple programming languages. However, the process of building such effective models has not reached a state of stability, and most of the existing research focuses on Java code smell detection. The main objective of this article is to propose dynamic ensembles using two strategies, namely greedy search and backward elimination, which are capable of accurately detecting code smells in two programming languages (i.e., Java and Python), and which are less complex than full stacking ensembles. The detection performance of dynamic ensembles was investigated within the context of four Java and two Python code smells. The greedy search and backward elimination strategies yielded different lists of base models from which to build dynamic ensembles. In comparison to full stacking ensembles, dynamic ensembles yielded less complex models when they were used to detect most of the investigated Java and Python code smells, with the backward elimination strategy resulting in less complex models. Dynamic ensembles were able to perform comparably against full stacking ensembles with no significant detection loss. This article concludes that dynamic stacking ensembles were able to facilitate the effective and stable detection performance of Java and Python code smells over all base models and with less complexity than full stacking ensembles.
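The two selection strategies named above can be sketched on a toy validation set. Note the simplification: the article builds stacking ensembles, whereas this sketch scores a candidate subset by averaging the base models' predicted probabilities, purely to keep the example short. The model names, probabilities, and labels are hypothetical.

```python
def ensemble_accuracy(members, probs, labels):
    """Accuracy of the averaged class-1 probability, thresholded at 0.5."""
    correct = 0
    for i, label in enumerate(labels):
        mean_prob = sum(probs[m][i] for m in members) / len(members)
        correct += (int(mean_prob > 0.5) == label)
    return correct / len(labels)


def greedy_search(probs, labels):
    """Forward selection: add the model that helps most, stop when nothing improves."""
    chosen, best = [], 0.0
    remaining = sorted(probs)
    while remaining:
        acc, model = max(
            ((ensemble_accuracy(chosen + [m], probs, labels), m) for m in remaining),
            key=lambda pair: pair[0],
        )
        if acc <= best:
            break
        chosen.append(model)
        remaining.remove(model)
        best = acc
    return chosen, best


def backward_elimination(probs, labels):
    """Start from all models; drop one at a time while accuracy does not fall."""
    chosen = sorted(probs)
    best = ensemble_accuracy(chosen, probs, labels)
    while len(chosen) > 1:
        acc, drop = max(
            ((ensemble_accuracy([m for m in chosen if m != d], probs, labels), d) for d in chosen),
            key=lambda pair: pair[0],
        )
        if acc < best:
            break
        chosen.remove(drop)
        best = acc
    return chosen, best


# Hypothetical validation labels (1 = smelly) and per-model class-1 probabilities.
labels = [1, 0, 1, 1, 0, 1]
probs = {
    "m1": [0.9, 0.2, 0.4, 0.8, 0.3, 0.7],
    "m2": [0.7, 0.4, 0.9, 0.3, 0.4, 0.6],
    "m3": [0.4, 0.7, 0.55, 0.6, 0.2, 0.1],
}
print(greedy_search(probs, labels))
print(backward_elimination(probs, labels))
```

On this toy data both strategies settle on the same two-model subset, which scores higher than the full three-model ensemble; on real data the two strategies can return different lists, as the abstract reports.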

11.
PeerJ Comput Sci ; 10: e2289, 2024.
Article in English | MEDLINE | ID: mdl-39314740

ABSTRACT

Given the exponential growth of available data in large networks, an accurate and explainable intrusion detection system has become necessary to effectively discover attacks in such networks. To deal with this challenge, we propose a two-phase Explainable Ensemble deep learning-based method (EED) for intrusion detection. In the first phase, a new ensemble intrusion detection model using three one-dimensional long short-term memory networks (LSTM) is designed for accurate attack identification. The outputs of the three classifiers are aggregated using a meta-learner algorithm, resulting in refined and improved results. In the second phase, the interpretability and explainability of EED outputs are enhanced by leveraging the capabilities of SHapley Additive exPlanations (SHAP). Factors contributing to the identification and classification of attacks are highlighted, which allows security experts to understand and interpret the attack behavior and then implement effective response strategies to improve network security. Experiments conducted on real datasets have shown the effectiveness of EED compared to conventional intrusion detection methods in terms of both accuracy and explainability. The EED method identifies and classifies attacks with high accuracy while providing transparency and interpretability.

12.
Talanta ; 280: 126793, 2024 Dec 01.
Article in English | MEDLINE | ID: mdl-39222596

ABSTRACT

Dry matter content (DMC), firmness and soluble solid content (SSC) are important indicators for assessing the quality attributes and determining the maturity of kiwifruit. However, traditional measurement methods are time-consuming, labor-intensive, and destructive to the kiwifruit, leading to resource wastage. To solve this problem, this study tracked the flowering, fruiting, maturing, and collecting processes of Ya'an red-heart kiwifruit and proposed a non-destructive method for kiwifruit quality attribute assessment and maturity identification that combines fluorescence hyperspectral imaging (FHSI) technology and chemometrics. First, three different spectral data preprocessing methods were adopted, and PLSR was used to evaluate the quality attributes (DMC, firmness, and SSC) of kiwifruit. Next, the accuracy of different models in discriminating kiwifruit maturity was compared, and an ensemble learning model based on LightGBM and GBDT models was constructed. The results indicate that the ensemble learning model outperforms single machine learning models. In addition, the application effects of the 'Convolutional Neural Network'-'Multilayer Perceptron' (CNN-MLP) model under different optimization algorithms were compared. To improve the robustness of the model, an improved whale optimization algorithm (IWOA) was introduced by modifying the acceleration factor. Overall, the IWOA-CNN-MLP model performs the best in discriminating the maturity of kiwifruit, with a test accuracy of 0.916 and a loss of 0.23. In addition, compared with the basic model, the accuracy of the integrated learning model SG-MSC-SEL was improved by about 12%-20%. The research findings will provide new perspectives for the evaluation of kiwifruit quality and maturity discrimination using FHSI and chemometric methods, thereby promoting further research and applications in this field.


Subjects
Actinidia, Fruit, Hyperspectral Imaging, Actinidia/chemistry, Actinidia/growth & development, Hyperspectral Imaging/methods, Fruit/chemistry, Fruit/growth & development, Chemometrics, Neural Networks (Computer), Food Quality, Fluorescence, Quality Control
13.
Heliyon ; 10(17): e36631, 2024 Sep 15.
Article in English | MEDLINE | ID: mdl-39281628

ABSTRACT

Commodity futures are an important hedging tool in material trade, and by accurately predicting prices, countries and firms are able to make informed production and consumption decisions. This paper introduces a novel machine learning ensemble method that combines decomposition algorithms and physical optimization algorithms to predict commodity futures prices. First, the VMD (variational mode decomposition) is optimized by the RIME algorithm (rime optimization algorithm) to obtain the optimal modal decomposition results. The trend and seasonal terms are then predicted using the ELM (Extreme Learning Machine) and FA (Fourier Attention) models, respectively, and the results are finally synthesized. The results show that the MAPE (mean absolute percentage error) values of the one-step, three-step, and six-step methods for predicting crude oil prices are 0.48%, 0.66%, and 0.75%, respectively, and the MAPE values for the soybean predictions are 0.22%, 0.27%, and 0.37%, respectively. The empirical results and ablation experiments show that the method outperforms other benchmark models in terms of both horizontal and directional accuracy. Notably, it performs particularly well in predicting soybean futures prices, which demonstrates the ability of our model to better capture the characteristics of both the time and frequency domains of the series, to take sufficient consideration of the series characteristics, and to ensure robustness.

14.
Appl Spectrosc ; : 37028241278902, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-39233644

ABSTRACT

Diabetes mellitus is a prevalent chronic disease necessitating timely identification for effective management. This paper introduces a reliable, straightforward, and efficient method for the minimally invasive identification of diabetes mellitus through nanosecond pulsed laser-induced breakdown spectroscopy (LIBS) by integrating a state-of-the-art machine learning approach. LIBS spectra were collected from urine samples of diabetic and healthy individuals. Principal component analysis and an ensemble learning classification model were used to identify significant changes in LIBS peak intensity between the diseased and normal urine samples. The model, integrating six distinct classifiers and cross-validation techniques, exhibited high accuracy (96.5%) in predicting diabetes mellitus. Our findings emphasize the potential of LIBS for diabetes mellitus identification in urine samples. This technique may hold potential for future applications in diagnosing other health conditions.

15.
J Orthop Surg Res ; 19(1): 539, 2024 Sep 04.
Article in English | MEDLINE | ID: mdl-39227869

ABSTRACT

BACKGROUND: Machine learning (ML) is extensively employed for forecasting the outcome of various illnesses. The objective of the study was to develop ML-based classifiers using a stacking ensemble strategy to predict the Japanese Orthopedic Association (JOA) recovery rate for patients with degenerative cervical myelopathy (DCM). METHODS: A total of 672 patients with DCM were included in the study and labeled with the JOA recovery rate at 1-year follow-up. All data were collected during 2012-2023 and were randomly divided into training and testing (8:2) sub-datasets. A total of 91 initial ML classifiers were developed, and the top 3 initial classifiers with the best performance were further stacked into an ensemble classifier with a support vector machine (SVM) classifier. The area under the curve (AUC) was the main indicator to assess the prediction performance of all classifiers. The primary predicted outcome was the JOA recovery rate. RESULTS: By applying an ensemble learning strategy (e.g., stacking), the accuracy of the ML classifier improved after combining three widely used ML models (e.g., RFE-SVM, EmbeddingLR-LR, and RFE-AdaBoost). Decision curve analysis showed the merits of the ensemble classifiers, as the curves of the top 3 initial classifiers varied a lot in predicting the JOA recovery rate in DCM patients. CONCLUSIONS: The ensemble classifiers successfully predicted the JOA recovery rate in DCM patients, showing high potential for assisting physicians in managing DCM patients and making full use of medical resources.


Subjects
Cervical Vertebrae, Machine Learning, Humans, Cervical Vertebrae/surgery, Male, Female, Middle Aged, Treatment Outcome, Aged, Spinal Cord Diseases/surgery, Support Vector Machine, Recovery of Function, Follow-Up Studies, Forecasting
16.
Heliyon ; 10(16): e35792, 2024 Aug 30.
Article in English | MEDLINE | ID: mdl-39229515

ABSTRACT

Dynamic ensemble selection has emerged as a promising approach for hyperspectral image classification. However, selecting relevant features and informative samples remains a pressing challenge. To address this issue, we introduce two novel dynamic residual ensemble learning methods. The first proposed method is called multi-features driven dynamic weighted residuals ensemble learning (MF-DWRL). This method leverages various combinations of features to construct classifier pools that incorporate feature differences. The K-Nearest Neighbors algorithm is employed to establish the region of competence (RoC) in the dynamic ensemble selection process. By assessing the performance of the RoC, the feature sets that yield the highest classification accuracy are identified as the optimal feature combinations. Additionally, the classification accuracy is utilized as prior information to guide the residual adjustments of each classifier. The second method, known as features and samples double-driven dynamic weighted residual ensemble learning (FS-DWRL), further enhances the performance of the ensemble. This approach not only considers the selection of feature combinations but also takes into account the informative samples. By jointly optimizing the feature and sample selection processes, FS-DWRL achieves superior classification accuracy compared to existing state-of-the-art methods. To evaluate the effectiveness of the proposed methods, three hyperspectral datasets from China (WHU-Hi-HanChuan, WHU-Hi-LongKou, and WHU-Hi-HongHu) are used for classification experiments. For these datasets, the proposed methods achieve the highest classification accuracies of 90.57%, 98.77%, and 91.08%, respectively. The MF-DWRL and FS-DWRL methods exhibit significant improvements in classification accuracy.
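The core idea of dynamic ensemble selection with a K-Nearest Neighbors region of competence can be sketched as follows: for each query, find its k nearest validation samples, measure each classifier's accuracy on just those samples, and let the locally best classifiers vote. This is a generic sketch, not the MF-DWRL or FS-DWRL method (the weighted residual adjustment is not modeled). The two threshold classifiers, each using a different feature, and all data points are hypothetical.

```python
import math
from collections import Counter


def region_of_competence(query, val_X, k):
    """Indices of the k validation samples closest to the query."""
    order = sorted(range(len(val_X)), key=lambda i: math.dist(query, val_X[i]))
    return order[:k]


def des_predict(query, classifiers, val_X, val_y, k=3):
    """Predict with the classifier(s) that are most accurate near the query."""
    roc = region_of_competence(query, val_X, k)
    local_acc = [sum(clf(val_X[i]) == val_y[i] for i in roc) / k for clf in classifiers]
    best = max(local_acc)
    selected = [clf for clf, acc in zip(classifiers, local_acc) if acc == best]
    votes = Counter(clf(query) for clf in selected)
    return votes.most_common(1)[0][0]


# Two hypothetical base classifiers, each looking at a different feature.
def classifier_a(x):
    return int(x[1] > 0.5)


def classifier_b(x):
    return int(x[0] > 0.75)


# Hypothetical validation set: feature 1 is informative on the left half,
# feature 0 on the right half.
val_X = [(0.1, 0.2), (0.2, 0.8), (0.3, 0.9), (0.2, 0.1),
         (0.6, 0.9), (0.9, 0.1), (0.95, 0.2), (0.6, 0.8)]
val_y = [0, 1, 1, 0, 0, 1, 1, 0]

pool = [classifier_a, classifier_b]
print(des_predict((0.25, 0.85), pool, val_X, val_y))
print(des_predict((0.9, 0.15), pool, val_X, val_y))
```

Neither toy classifier is reliable everywhere, but each is selected in the region where its feature is informative, which is the behavior a feature-diverse classifier pool is meant to exploit.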

17.
Heliyon ; 10(17): e37141, 2024 Sep 15.
Article in English | MEDLINE | ID: mdl-39319161

ABSTRACT

Agriculture has notably become one of the fields experiencing intensive digital transformation. Leveraging state-of-the-art techniques in this domain has provided numerous advantages for agricultural activities. Deep learning (DL) algorithms have proven beneficial in addressing various agricultural challenges. This study presents a comprehensive investigation into applying DL models for palm disease detection and classification in the context of smart agriculture. The research aims to address the limitations observed in previous studies and improve the robustness and generalizability of the results. To achieve this, a two-stage optimization methodology is employed. First, transfer learning and fine-tuning techniques are applied using various pre-trained deep neural network models. The experiments show promising results, with all models achieving high accuracy rates during training and validation. Furthermore, their performance on unseen test data is also assessed to ensure practical applicability. The top-performing models are MobileNetV2 (92.48%), ResNet (92.42%), ResNetRS50 (92.30%), and DenseNet121 (92.01%). Second, a deep ensemble learning approach is applied to enhance the models' generalization capability further. The best-performing models with different criteria are combined using the ensemble technique, resulting in remarkable improvements in disease detection tasks. DELM1 emerges as the most successful ensemble model, achieving an ROC AUC Score of 99%. This study demonstrates the effectiveness of deep ensemble learning models in palm disease detection and classification for smart agriculture applications. The findings contribute to advancing disease detection systems and emphasize the potential of ensemble learning. The study provides valuable insights for future research, guiding the application of DL techniques to address critical agricultural challenges and improve crop health monitoring systems. Another contribution is combining various plant diseases and insect pest classes using diverse datasets. A comprehensive classification system is achieved by considering different disease classes and stages within the white scale category, improving the model's robustness.

18.
Front Genet ; 15: 1336891, 2024.
Article in English | MEDLINE | ID: mdl-39319317

ABSTRACT

The traditional single nucleotide polymorphism (SNP)-wise approach in genome-wide association studies examines the marginal association between each SNP and the outcome separately and applies multiple testing adjustments to the resulting p-values to reduce false positives. However, the approach suffers from a lack of power in identifying biomarkers. We design an ensemble machine learning approach that aggregates results from logistic regression models fitted on multiple subsamples, which helps to identify biomarkers from high-dimensional genomic data. We use different methods to analyze a genome-wide association study from the Alzheimer's Disease Neuroimaging Initiative. The SNP-wise approach does not identify any significant signal, while our novel approach provides a list of ranked SNPs associated with the cognitive functions of interest.

19.
Front Robot AI ; 11: 1445565, 2024.
Article in English | MEDLINE | ID: mdl-39346742

ABSTRACT

Diabetic Retinopathy (DR) is a serious eye condition that occurs due to high blood sugar levels in patients with Diabetes Mellitus. If left untreated, DR can result in blindness. Automated neural network-based methods for grading DR show potential for early detection. However, the uneven and non-quadrilateral forms of DR lesions pose difficulties for traditional Convolutional Neural Network (CNN)-based architectures. To address this challenge and explore a novel algorithm architecture, this work investigates the use of contrasting cluster assignments in retinal fundus images with the Swapping Assignments between multiple Views (SwAV) algorithm for DR grading. In an ablation study, SwAV outperformed other CNN- and Transformer-based models, both independently and in ensemble configurations, reaching an accuracy of 87.00% despite having fewer parameters and layers. The proposed approach outperforms existing state-of-the-art models in classification metrics, complexity, and prediction time. The findings offer great potential for medical practitioners, allowing for more accurate diagnosis of DR and earlier treatment to avoid visual loss.

20.
BMC Med Res Methodol ; 24(1): 221, 2024 Sep 27.
Article in English | MEDLINE | ID: mdl-39333904

ABSTRACT

Diabetes is thought to be the most common illness in underdeveloped nations. Early detection and competent medical care are crucial steps in reducing the effects of diabetes. Examining the signs associated with diabetes is one of the most effective ways to identify the condition. The problem of missing data has not been well investigated in existing work, and existing studies on diabetes detection lack accuracy and robustness. The datasets available for automated diabetes detection frequently contain missing information, which can negatively impact machine learning model performance. To address this problem, this work proposes an automated diabetes prediction method that achieves high accuracy and effectively manages missing values. The proposed strategy employs a stacked ensemble voting classifier built from three machine learning models, together with a KNN imputer to handle missing values. Using the KNN imputer, the proposed model achieves accuracy, precision, recall, F1 score, and MCC of 98.59%, 99.26%, 99.75%, 99.45%, and 99.24%, respectively. The study compares the proposed model with seven other machine learning techniques in two scenarios, one with missing values eliminated and the other with the KNN imputer. The outcomes demonstrate the superiority of the proposed model over current state-of-the-art methods and confirm its efficacy. This work demonstrates the capability of the KNN imputer and examines the problem of missing values in diabetes detection. Medical professionals can use the results to improve care for diabetes patients and detect problems early.
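
To illustrate the idea behind KNN imputation mentioned above, here is a minimal sketch that fills each missing value with the mean of that feature over the k most similar rows. The abstract does not specify the implementation or the value of k, so the distance rule (root mean squared difference over features observed in both rows), the `knn_impute` function, and the small patient table are assumptions made purely for illustration.

```python
import math
from typing import List, Optional

Row = List[Optional[float]]


def knn_impute(rows: List[Row], k: int = 2) -> List[List[float]]:
    """Replace each None with the mean of that feature over the k nearest donor rows.

    Distance between two rows uses only the features observed in both of them.
    """
    def distance(a: Row, b: Row) -> float:
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf  # nothing in common, cannot compare
        return math.sqrt(sum(shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where feature j is observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if not nearest:
                raise ValueError(f"no usable neighbours for row {i}, feature {j}")
            filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical records: [glucose, BMI, age]; the third patient has no BMI recorded.
records: List[Row] = [
    [148.0, 33.6, 50.0],
    [85.0, 26.6, 31.0],
    [150.0, None, 52.0],
    [89.0, 28.1, 21.0],
    [146.0, 32.0, 49.0],
]

imputed = knn_impute(records, k=2)
```

In this toy table the missing BMI is filled from the two patients with the closest glucose and age values. Unlike dropping incomplete rows, this keeps all five records available for training.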


Subjects
Algorithms , Data Mining , Diabetes Mellitus , Machine Learning , Humans , Data Mining/methods , Data Mining/statistics & numerical data , Diabetes Mellitus/diagnosis , Female , Male , Middle Aged , Adult