Results 1 - 12 of 12
1.
Biomimetics (Basel) ; 9(5), 2024 May 13.
Article in English | MEDLINE | ID: mdl-38786502

ABSTRACT

One of the significant challenges in scaling agile software development is organizing software development teams to ensure effective communication among members while equipping them with the capabilities to deliver business value independently. A formal approach to address this challenge involves modeling it as an optimization problem: given a professional staff, how can they be organized to optimize the number of communication channels, considering both intra-team and inter-team channels? In this article, we propose applying a set of bio-inspired algorithms to solve this problem. We introduce an enhancement that incorporates ensemble learning into the resolution process to achieve nearly optimal results. Ensemble learning integrates multiple machine-learning strategies with diverse characteristics to boost optimizer performance. Furthermore, the studied metaheuristics offer an excellent opportunity to explore their linear convergence, contingent on the exploration and exploitation phases. The results produce more precise definitions for team sizes, aligning with industry standards. Our approach demonstrates superior performance compared to the traditional versions of these algorithms.
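The objective described above can be illustrated with a small sketch. The abstract does not give the cost function, so the following is an assumption: every pair of people in the same team shares one channel, giving n(n-1)/2 intra-team channels, and every pair of teams shares one inter-team channel. The function names are hypothetical, and the exhaustive search over even splits merely stands in for the bio-inspired optimizers the article studies.

```python
def intra_team_channels(team_size: int) -> int:
    """Pairwise channels inside one team: n * (n - 1) / 2 (assumed model)."""
    return team_size * (team_size - 1) // 2


def total_channels(team_sizes: list[int]) -> int:
    """Intra-team channels plus one channel per pair of teams (assumed model)."""
    intra = sum(intra_team_channels(n) for n in team_sizes)
    teams = len(team_sizes)
    inter = teams * (teams - 1) // 2
    return intra + inter


def best_even_split(staff: int) -> tuple[int, list[int]]:
    """Try every team count and split the staff as evenly as possible.

    Returns (channel count, team sizes) for the cheapest split found.
    This brute-force loop is only a baseline, not a metaheuristic.
    """
    best = None
    for k in range(1, staff + 1):
        base, extra = divmod(staff, k)
        sizes = [base + 1] * extra + [base] * (k - extra)
        cost = total_channels(sizes)
        if best is None or cost < best[0]:
            best = (cost, sizes)
    return best
```

Under these assumptions, a single team of ten people needs 45 channels, while splitting the same staff into several smaller teams lowers the total considerably; a real formulation would also constrain each team to hold the skills it needs to deliver independently.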

2.
Heliyon ; 10(8): e29398, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38655356

ABSTRACT

The automatic identification of human physical activities, commonly referred to as Human Activity Recognition (HAR), has garnered significant interest and application across various sectors, including entertainment, sports, and notably health. Within the realm of health, a myriad of applications exists, contingent upon the nature of experimentation, the activities under scrutiny, and the methodology employed for data and information acquisition. This diversity opens doors to multifaceted applications, including support for the well-being and safeguarding of elderly individuals afflicted with neurodegenerative diseases, especially in the context of smart homes. Within the existing literature, a multitude of datasets from both indoor and outdoor environments have surfaced, significantly contributing to the activity identification processes. One prominent dataset, from the CASAS project developed by Washington State University (WSU), encompasses experiments conducted in indoor settings. This dataset facilitates the identification of a range of activities, such as cleaning, cooking, eating, washing hands, and even making phone calls. This article introduces a model founded on the principles of Semi-supervised Ensemble Learning, enabling the harnessing of the potential inherent in distance-based clustering analysis. This technique aids in the identification of distinct clusters, each encapsulating unique activity characteristics. These clusters serve as pivotal inputs for the subsequent classification process, which leverages supervised techniques. The outcomes of this approach exhibit great promise, as evidenced by the quality metrics' analysis, showcasing favorable results compared to the existing state-of-the-art methods. This integrated framework not only contributes to the field of HAR but also holds immense potential for enhancing the capabilities of smart homes and related applications.
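The two-stage idea in this abstract (distance-based clusters feeding a supervised classifier) can be sketched roughly as follows. Everything concrete here is an assumption made for illustration: the centroids are taken as already fitted, the cluster index is simply appended as an extra feature, and a 1-nearest-neighbour rule stands in for whatever supervised learners the authors used.

```python
import math


def nearest_centroid(sample, centroids):
    """Index of the centroid closest to the sample (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(sample, centroids[i]))


def add_cluster_feature(samples, centroids):
    """Unsupervised stage: append each sample's cluster index as a feature."""
    return [list(s) + [nearest_centroid(s, centroids)] for s in samples]


def predict_1nn(train_x, train_y, query):
    """Supervised stage: label of the closest labelled training sample."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


# Hypothetical two-dimensional sensor features and activity labels.
centroids = [(0.0, 0.0), (10.0, 10.0)]
train_x = add_cluster_feature([(1.0, 0.0), (9.0, 10.0)], centroids)
train_y = ["eating", "cooking"]
query = add_cluster_feature([(8.0, 9.0)], centroids)[0]
predicted = predict_1nn(train_x, train_y, query)
```

In practice the centroids would be learned from the unlabelled portion of the data, which is where the semi-supervised aspect comes in.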

3.
Genes (Basel) ; 14(2), 2023 01 20.
Article in English | MEDLINE | ID: mdl-36833196

ABSTRACT

Context: Inferring gene regulatory networks (GRN) from high-throughput gene expression data is a challenging task for which different strategies have been developed. Nevertheless, no universally best method exists, and each method has its advantages, intrinsic biases, and application domains. Thus, in order to analyze a dataset, users should be able to test different techniques and choose the most appropriate one. This step can be particularly difficult and time consuming, since most methods' implementations are made available independently, possibly in different programming languages. The implementation of an open-source library containing different inference methods within a common framework is expected to be a valuable toolkit for the systems biology community. Results: In this work, we introduce GReNaDIne (Gene Regulatory Network Data-driven Inference), a Python package that implements 18 machine learning data-driven gene regulatory network inference methods. It also includes eight generalist preprocessing techniques, suitable for both RNA-seq and microarray dataset analysis, as well as four normalization techniques dedicated to RNA-seq. In addition, this package makes it possible to combine the results of different inference tools to form robust and efficient ensembles. This package has been successfully assessed on the DREAM5 challenge benchmark dataset. The open-source GReNaDIne Python package is made freely available in a dedicated GitLab repository, as well as in the Python Package Index (PyPI), the official third-party software repository. The latest documentation on the GReNaDIne library is also available at Read the Docs, an open-source software documentation hosting platform. Contribution: The GReNaDIne tool represents a technological contribution to the field of systems biology. This package can be used to infer gene regulatory networks from high-throughput gene expression data using different algorithms within the same framework. In order to analyze their datasets, users can apply a battery of preprocessing and postprocessing tools, choose the most suitable inference method from the GReNaDIne library, and even combine the output of different methods to obtain more robust results. The results format provided by GReNaDIne is compatible with well-known complementary refinement tools such as PYSCENIC.


Subject(s)
Computational Biology , Gene Regulatory Networks , Computational Biology/methods , Saint Vincent and the Grenadines , Software , Gene Expression
4.
Rev Socionetwork Strateg ; 16(2): 259-289, 2022.
Article in English | MEDLINE | ID: mdl-36159389

ABSTRACT

Fake news detection continues to be a major problem that affects our society today. Fake news can be classified using a variety of methods. Predicting and detecting fake news has proven to be challenging even for machine learning algorithms. This research employs Legitimacy, a unique ensemble machine learning model to accomplish the task of Credibility-Based Fake News Detection. The Legitimacy ensemble combines the learning potential of a Two-Class Boosted Decision Tree and a Two-Class Neural Network. The ensemble technique follows a pseudo-mixture-of-experts methodology. For the gating model, an instance of Two-Class Logistic Regression is implemented. This study validates Legitimacy using a standard dataset with features relating to the credibility of news publishers to predict fake news. These features are analysed using the ensemble algorithm. The results of these experiments are examined using four evaluation methodologies. The analysis of the results reveals positive performance with the use of the ensemble ML method with an accuracy of 96.9%. This ensemble's performance is compared with the performance of the two base machine learning models of the ensemble. The performance of the ensemble surpasses that of the two base models. The performance of Legitimacy is also analysed as the size of the dataset increases to demonstrate its scalability. Hence, based on our selected dataset, the Legitimacy ensemble model has proven to be most appropriate for Credibility-Based Fake News Detection.
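A minimal sketch of how a logistic-regression gate could blend two experts is given below. The abstract only names the components (a boosted decision tree, a neural network, and a logistic-regression gating model), so the blending rule, the function names, and all numeric values are assumptions for illustration rather than the authors' configuration.

```python
import math


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def gate_weight(features, coefficients, intercept):
    """Logistic-regression gate: weight in (0, 1) given to the first expert."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return sigmoid(z)


def mixture_probability(p_tree: float, p_net: float, gate: float) -> float:
    """Blend the two experts' fake-news probabilities with the gate weight."""
    return gate * p_tree + (1.0 - gate) * p_net


# Hypothetical publisher-credibility features and gate parameters.
gate = gate_weight([0.0, 0.0], [1.0, 1.0], 0.0)   # sigmoid(0) = 0.5
blended = mixture_probability(0.9, 0.3, gate)      # halfway between the experts
```

With the gate at 0.5 the ensemble simply averages the two experts; as the gate moves toward 0 or 1 it defers to the neural network or the tree, respectively.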

5.
Sensors (Basel) ; 22(16), 2022 Aug 16.
Article in English | MEDLINE | ID: mdl-36015882

ABSTRACT

To improve the monitoring of the electrical power grid, it is necessary to evaluate the influence of contamination in relation to leakage current and its progression to a disruptive discharge. In this paper, insulators were tested in a saline chamber to simulate the increase of salt contamination on their surface. From the time series forecasting of the leakage current, it is possible to evaluate the development of the fault before a flashover occurs. For a complete evaluation, the long short-term memory (LSTM), group method of data handling (GMDH), adaptive neuro-fuzzy inference system (ANFIS), bootstrap aggregation (bagging), sequential learning (boosting), random subspace, and stacked generalization (stacking) ensemble learning models are analyzed. From the results of the best structure of the models, the hyperparameters are evaluated and the wavelet transform is used to obtain an enhanced model. The contribution of this paper is related to the improvement of well-established models using the wavelet transform, thus obtaining hybrid models that can be used for several applications. The results showed that using the wavelet transform leads to an improvement in all the models used, especially the wavelet ANFIS model, which had the best result with a mean RMSE of 1.58 × 10⁻³. Furthermore, the standard deviation was 2.18 × 10⁻¹⁹, showing that the model is stable and robust for the application under study. Future work can be performed using other components of the distribution power grid susceptible to contamination because they are installed outdoors.
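The two computational ingredients named in this abstract, a wavelet transform applied to the series and the RMSE used to score forecasts, can be sketched as follows. The abstract does not say which wavelet family was used, so the single-level Haar transform here is an assumption chosen for brevity, and the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input).

    Returns (approximation, detail) coefficient lists, each half as long.
    """
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


def rmse(actual, predicted):
    """Root mean squared error between two equal-length series."""
    squared = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))
```

A common way to use such a decomposition is to shrink or zero the detail coefficients before reconstruction, so that the forecasting model is trained on a smoother version of the leakage-current series.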


Subject(s)
Fuzzy Logic , Neural Networks, Computer , Forecasting , Time Factors
6.
Front Genet ; 13: 834724, 2022.
Article in English | MEDLINE | ID: mdl-35692843

ABSTRACT

This study aimed to perform a genome-wide association analysis (GWAS) using the Random Forest (RF) approach for scanning candidate genes for age at first calving (AFC) in Nellore cattle. Additionally, potential epistatic effects were investigated using linear mixed models with pairwise interactions between all markers with high importance scores within the tree ensemble non-linear structure. Data from Nellore cattle were used, including records of animals born between 1984 and 2015 and raised in commercial herds located in different regions of Brazil. The estimated breeding values (EBV) were computed and used as the response variable in the genomic analyses. After quality control, the remaining number of animals and SNPs considered were 3,174 and 360,130, respectively. Five independent RF analyses were carried out, considering different initialization seeds. The importance score of each SNP was averaged across the independent RF analyses to rank the markers according to their predictive relevance. A total of 117 SNPs associated with AFC were identified, which spanned 10 autosomes (2, 3, 5, 10, 11, 17, 18, 21, 24, and 25). In total, 23 non-overlapping genomic regions embedded 262 candidate genes for AFC. Enrichment analysis and previous evidence in the literature revealed that many candidate genes annotated close to the lead SNPs have key roles in fertility, including embryo pre-implantation and development, embryonic viability, male germinal cell maturation, and pheromone recognition. Furthermore, some genomic regions previously associated with fertility and growth traits in Nellore cattle were also detected in the present study, reinforcing the effectiveness of RF for pre-screening candidate regions associated with complex traits. 
Complementary analyses revealed that many SNPs top-ranked in the RF-based GWAS did not present a strong marginal linear effect but are potentially involved in epistatic hotspots between genomic regions in different autosomes, notably BTA 3, 5, 11, and 21. The reported results are expected to enhance the understanding of genetic mechanisms involved in the biological regulation of AFC in this cattle breed.

7.
Sensors (Basel) ; 22(9), 2022 Apr 29.
Article in English | MEDLINE | ID: mdl-35591091

ABSTRACT

The research area of Ambient Assisted Living (AAL) focuses on generating innovative technology, products, and services that provide assistance, medical care, and rehabilitation to older adults, extending the time during which they can live independently, whether they suffer from neurodegenerative diseases or some disability. This area is responsible for the development of Activity Recognition Systems (ARS), a valuable tool for identifying the type of activity carried out by older adults so that they can be given the assistance they need to carry out their daily activities normally. This article reviews the literature and the evolution of the techniques for processing this type of data, covering supervised, unsupervised, ensemble, deep, reinforcement, and transfer learning as well as metaheuristic approaches applied to the health sciences, and reports the metrics of recent experiments for researchers in this area of knowledge. The review indicates that models based on reinforcement or transfer learning constitute a promising line of work for the processing and analysis of human activity recognition.


Subject(s)
Ambient Intelligence , Disabled Persons , Activities of Daily Living , Aged , Human Activities , Humans , Technology
8.
Mol Divers ; 25(3): 1361-1373, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34264440

ABSTRACT

Trypanosomatid-caused diseases are among the neglected infectious diseases with the highest disease burden, affecting about 27 million people worldwide and, in particular, socio-economically vulnerable populations. Trypanothione synthetase (TryS) is considered one of the most attractive drug targets within the thiol-polyamine metabolism of trypanosomatids, being unique, essential and druggable. Here, we have compiled a dataset of 401 T. brucei TryS inhibitors that includes compounds with inhibitory data reported in the literature as well as data acquired in-house. QSAR classifiers were derived from this dataset and validated using publicly available and open-source software, thus assuring the portability of the obtained models. The performance and robustness of the resulting models were substantially improved through ensemble learning. The performance of the individual models and the model ensembles was further assessed through retrospective virtual screening campaigns. Finally, as an application example, the chosen model ensemble has been applied in a prospective virtual screening campaign on the DrugBank 5.1.6 compound library. All the in-house scripts used in this study are available on request, whereas the dataset has been included as supplementary material.


Subject(s)
Amide Synthases/chemistry , Drug Discovery/methods , Enzyme Inhibitors/chemistry , Machine Learning , Algorithms , Amide Synthases/antagonists & inhibitors , Amide Synthases/metabolism , Antiprotozoal Agents/chemistry , Antiprotozoal Agents/pharmacology , Databases, Pharmaceutical , Drug Evaluation, Preclinical/methods , Drug Evaluation, Preclinical/standards , Enzyme Inhibitors/pharmacology , Humans , Metabolic Networks and Pathways , Models, Theoretical , ROC Curve , Structure-Activity Relationship
9.
Entropy (Basel) ; 22(9), 2020 Sep 12.
Article in English | MEDLINE | ID: mdl-33286789

ABSTRACT

Sentiment polarity classification in social media is a very important task, as it enables gathering trends on particular subjects given a set of opinions. Currently, great advances have been made by using deep learning techniques, such as word embeddings, recurrent neural networks, and encoders such as BERT. Unfortunately, these techniques require large amounts of data, which, in some cases, are not available. In order to model this situation, challenges such as the Spanish TASS, organized by the Spanish Society for Natural Language Processing (SEPLN), have been proposed, which pose particular difficulties. First, there is an awkward imbalance between the training and test sets, the latter being more than eight times the size of the former. Another difficulty is the marked imbalance in the distribution of classes, which also differs between the two sets. Finally, there are four different labels, which creates the need to adapt current classification methods for multiclass handling. Traditional machine learning methods, such as Naïve Bayes, Logistic Regression, and Support Vector Machines, achieve modest performance in these conditions, but when they are used as an ensemble it is possible to attain competitive performance. Several strategies to build classifier ensembles have been proposed; this paper proposes estimating an optimal weighting scheme using a Differential Evolution algorithm focused on dealing with the particular issues that multiclass classification and unbalanced corpora pose. The ensemble with the proposed optimized weighting scheme is able to improve the classification results on the full test set of the TASS challenge (General corpus), achieving state-of-the-art performance when compared with other works on this task, which make no use of NLP techniques.
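The weighting scheme described here can be pictured as a weighted vote over the class probabilities produced by each base classifier. The sketch below shows only the voting step; the weights are hand-picked placeholders for values that the paper obtains with Differential Evolution, and the function name and the example probabilities are hypothetical.

```python
def weighted_vote(probabilities, weights):
    """Pick the class with the largest weighted sum of probabilities.

    probabilities: one list of per-class probabilities per base classifier.
    weights: one non-negative weight per base classifier.
    Returns the index of the winning class.
    """
    n_classes = len(probabilities[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, probabilities))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=combined.__getitem__)


# Hypothetical outputs of three classifiers over four polarity labels.
outputs = [
    [0.7, 0.1, 0.1, 0.1],  # e.g. Naive Bayes
    [0.3, 0.4, 0.2, 0.1],  # e.g. Logistic Regression
    [0.2, 0.5, 0.2, 0.1],  # e.g. Support Vector Machine
]
equal = weighted_vote(outputs, [1.0, 1.0, 1.0])
tuned = weighted_vote(outputs, [0.2, 1.0, 1.0])
```

Changing the weights flips the decision from class 0 to class 1 in this example, which is the degree of freedom an optimizer can exploit when the class distribution is unbalanced.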

10.
J Biomed Inform ; 111: 103575, 2020 11.
Article in English | MEDLINE | ID: mdl-32976990

ABSTRACT

Epidemiological time series forecasting plays an important role in public health systems, due to its ability to allow managers to develop strategic planning to avoid possible epidemics. In this paper, a hybrid learning framework is developed to forecast multi-step-ahead (one, two, and three-month-ahead) meningitis cases in four states of Brazil. First, the proposed approach applies an ensemble empirical mode decomposition (EEMD) to decompose the data into intrinsic mode functions and residual components. Then, each component is used as the input of five different forecasting models, and, from there, forecasted results are obtained. Finally, all combinations of models and components are developed, and for each case, the forecasted results are combined by weighted integration (WI) to formulate a heterogeneous ensemble forecaster for the monthly meningitis cases. In the final stage, a multi-objective optimization (MOO) using the Non-Dominated Sorting Genetic Algorithm - version II is employed to find a set of candidate weights, and then the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is applied to choose the adequate set of weights. Next, the most adequate model is the one with the best out-of-sample generalization capacity in terms of performance criteria including mean absolute error (MAE), relative root mean squared error (RRMSE), and symmetric mean absolute percentage error (sMAPE). By using MOO, the intention is to enhance the performance of the forecasting models by simultaneously improving their accuracy and stability measures. To assess the model's performance, comparisons based on metrics are conducted with: (i) EEMD, heterogeneous ensemble integrated by direct strategy, or simple sum; (ii) EEMD, homogeneous ensemble of components WI; (iii) models without signal decomposition. At this stage, MAE, RRMSE, and sMAPE criteria as well as the Diebold-Mariano statistical test are adopted. 
In all twelve scenarios, the proposed framework produced more accurate and stable forecasts, and in 89.17% of the cases the errors of the proposed approach were statistically lower than those of the other approaches. These results showed that combining EEMD, heterogeneous ensemble and WI with weights obtained by optimization can yield precise and stable forecasts. The modeling developed in this paper is promising and can be used by managers to support decision making.
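The three error criteria named in this abstract can be computed as in the sketch below. The abstract does not spell out the formulas, and both RRMSE and sMAPE have several variants in the literature, so the definitions used here (RMSE divided by the mean of the observed values, and the mean of 2|y - f| / (|y| + |f|)) are assumptions; the case counts in the example are made up.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


def rrmse(actual, forecast):
    """RMSE divided by the mean of the observed values (assumed variant)."""
    mse = sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual)
    return math.sqrt(mse) / (sum(actual) / len(actual))


def smape(actual, forecast):
    """Symmetric MAPE as a fraction in [0, 2] (assumed variant)."""
    terms = [2.0 * abs(y - f) / (abs(y) + abs(f)) for y, f in zip(actual, forecast)]
    return sum(terms) / len(terms)


# Hypothetical monthly case counts and forecasts.
observed = [10.0, 20.0]
predicted = [12.0, 18.0]
scores = (mae(observed, predicted), rrmse(observed, predicted), smape(observed, predicted))
```

Note that this sMAPE variant is undefined when an observed value and its forecast are both zero, which can happen with monthly case counts and would need separate handling.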


Subject(s)
Epidemics , Meningitis , Brazil , Forecasting , Humans , Meningitis/diagnosis , Meningitis/epidemiology
11.
Mini Rev Med Chem ; 20(14): 1447-1460, 2020.
Article in English | MEDLINE | ID: mdl-32072906

ABSTRACT

BACKGROUND: Since their introduction in the virtual screening field, Receiver Operating Characteristic (ROC) curve-derived metrics have been widely used for benchmarking of computational methods and algorithms intended for virtual screening applications. Whereas in classification problems, the ratio between sensitivity and specificity for a given score value is very informative, a practical concern in virtual screening campaigns is to predict the actual probability that a predicted hit will prove truly active when submitted to experimental testing (in other words, the Positive Predictive Value - PPV). Estimation of such probability is, however, hindered by its dependency on the yield of actives of the screened library, which cannot be known a priori. OBJECTIVE: To explore the use of PPV surfaces derived from simulated ranking experiments (retrospective virtual screening) as a complementary tool to ROC curves, for both benchmarking and optimization of score cutoff values. METHODS: The utility of the proposed approach is assessed in retrospective virtual screening experiments with four datasets used to infer QSAR classifiers: inhibitors of Trypanosoma cruzi trypanothione synthetase; inhibitors of Trypanosoma brucei N-myristoyltransferase; inhibitors of GABA transaminase; and anticonvulsant activity in the 6 Hz seizure model. RESULTS: Besides illustrating the utility of PPV surfaces to compare the performance of machine learning models for virtual screening applications and to select an adequate score threshold, our results also suggest that ensemble learning provides models with better predictivity and more robust behavior. CONCLUSION: PPV surfaces are valuable tools to assess virtual screening tools and choose score thresholds to be applied in prospective in silico screens. Ensemble learning approaches seem to consistently lead to improved predictivity and robustness.
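The dependence of PPV on the yield of actives follows from Bayes' rule: PPV = Se·Ya / (Se·Ya + (1 - Sp)·(1 - Ya)), where Se is sensitivity, Sp is specificity, and Ya is the fraction of actives in the library. The sketch below evaluates that relation over a range of yields, which gives one slice of a PPV surface; the function names and the example values are illustrative and are not taken from the paper.

```python
def positive_predictive_value(sensitivity, specificity, yield_of_actives):
    """PPV from Bayes' rule for a given score cutoff and library composition."""
    true_pos = sensitivity * yield_of_actives
    false_pos = (1.0 - specificity) * (1.0 - yield_of_actives)
    if true_pos + false_pos == 0.0:
        return float("nan")  # no compound is predicted active
    return true_pos / (true_pos + false_pos)


def ppv_slice(sensitivity, specificity, yields):
    """PPV at one cutoff (one Se/Sp pair) across several assumed yields."""
    return [positive_predictive_value(sensitivity, specificity, y) for y in yields]


# Hypothetical cutoff with Se = 0.8 and Sp = 0.9, evaluated at three yields.
ppv_values = ppv_slice(0.8, 0.9, [0.001, 0.01, 0.5])
```

Even with good sensitivity and specificity, the PPV stays low when actives are rare, which is why a cutoff chosen from a ROC curve alone can be misleading for a prospective screen.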


Subject(s)
Machine Learning , Quantitative Structure-Activity Relationship , 4-Aminobutyrate Transaminase/antagonists & inhibitors , 4-Aminobutyrate Transaminase/metabolism , Animals , Anticonvulsants/chemistry , Anticonvulsants/therapeutic use , Area Under Curve , Protozoan Proteins/antagonists & inhibitors , Protozoan Proteins/metabolism , ROC Curve , Seizures/drug therapy , Seizures/pathology , Trypanosoma/metabolism
12.
Comput Biol Med ; 89: 135-143, 2017 10 01.
Article in English | MEDLINE | ID: mdl-28800442

ABSTRACT

It is estimated that in 2015, approximately 1.8 million people infected by tuberculosis died, most of them in developing countries. Many of those deaths could have been prevented if the disease had been detected at an earlier stage, but the most advanced diagnosis methods are still cost prohibitive for mass adoption. One of the most popular tuberculosis diagnosis methods is the analysis of frontal thoracic radiographs; however, the impact of this method is diminished by the need for individual analysis of each radiograph by properly trained radiologists. Significant research can be found on automating diagnosis by applying computational techniques to medical images, thereby eliminating the need for individual image analysis and greatly diminishing overall costs. In addition, recent advances in deep learning have achieved excellent results in classifying images in diverse domains, but its application to tuberculosis diagnosis remains limited. Thus, the focus of this work is to produce an investigation that will advance the research in the area, presenting three proposals for applying pre-trained convolutional neural networks as feature extractors to detect the disease. The proposals presented in this work are implemented and compared to the current literature. The obtained results are competitive with published works, demonstrating the potential of pre-trained convolutional networks as medical image feature extractors.


Subject(s)
Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Tuberculosis, Pulmonary/diagnostic imaging , Female , Humans , Male , Tuberculosis, Pulmonary/diagnosis