ABSTRACT
Drought is a complex phenomenon that impacts human activities and the environment; for this reason, predicting its behavior is crucial to mitigating such effects. Deep learning techniques are emerging as a powerful tool for this task. The main goal of this work is to review the state of the art and characterize the deep learning techniques used for drought prediction. The results suggest that the most widely used climate indices were the Standardized Precipitation Index (SPI) and the Standardized Precipitation Evapotranspiration Index (SPEI). Among multispectral indices, the Normalized Difference Vegetation Index (NDVI) is the most widely used. The countries producing the most scientific knowledge in this area are located in Asia and Oceania, whereas the Americas and Africa are the regions with the fewest publications. Concerning deep learning methods, the Long Short-Term Memory (LSTM) network is the algorithm most often implemented for this task, either in its canonical form or combined with other deep learning techniques (hybrid methods). In conclusion, this review reveals a need for more scientific knowledge about drought prediction using multispectral indices and deep learning techniques in the Americas and Africa; it is therefore an opportunity to characterize the phenomenon in developing countries.
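As a concrete illustration of one of these climate indices, the sketch below computes a simplified, rank-based variant of the SPI. The operational index fits a gamma distribution to precipitation totals; this stdlib-only sketch substitutes empirical plotting positions, and the function name and data are illustrative assumptions.

```python
# Hedged sketch: an empirical (rank-based) variant of the Standardized
# Precipitation Index. The operational SPI fits a gamma distribution; here we
# use Weibull plotting positions and the standard-normal inverse CDF.
from statistics import NormalDist

def empirical_spi(precip):
    """Map precipitation totals to standard-normal quantiles by rank."""
    n = len(precip)
    order = sorted(range(n), key=lambda i: precip[i])
    spi = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p = rank / (n + 1)          # plotting position, avoids 0 and 1
        spi[i] = NormalDist().inv_cdf(p)
    return spi

vals = empirical_spi([10, 55, 30, 80, 5, 40])
# the driest month receives the most negative SPI value
print(min(vals) == vals[4])
```

Negative values indicate drier-than-usual conditions and positive values wetter ones, which is what makes the index convenient as a prediction target for LSTM-style models.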
ABSTRACT
The enhanced multi-objective symbolic discretization for time series (eMODiTS) uses an evolutionary process to identify an appropriate discretization scheme for the Time Series Classification (TSC) task. It discretizes using a unique alphabet cut for each word segment. However, this kind of scheme has a high computational cost; therefore, this study implemented surrogate models to minimize it. The general procedure is summarized below.
• K-nearest neighbors for regression, support vector regression, and Radial Basis Function neural networks were implemented as surrogate models to estimate the objective values of eMODiTS, including the discretization process.
• An archive-based update strategy was introduced to maintain diversity in the training set.
• Finally, the model update process uses a hybrid (fixed and dynamic) approach for the surrogate models' evolution control.
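A minimal sketch of the first point above: a k-nearest-neighbor regression surrogate that estimates an objective value from an archive of already-evaluated solutions. The archive layout and data are illustrative assumptions, not eMODiTS's actual representation.

```python
# Minimal k-nearest-neighbour regression surrogate: predict an objective
# value as the mean of the k closest archived evaluations.
def knn_surrogate(archive, x, k=3):
    """archive: list of (solution_vector, objective_value) pairs."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    nearest = sorted(archive, key=lambda pair: dist(pair[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Archive of evaluations of f(x) = x^2 at a few points.
archive = [((0.0,), 0.0), ((1.0,), 1.0), ((2.0,), 4.0), ((3.0,), 9.0)]
print(knn_surrogate(archive, (1.1,), k=2))  # averages f(1) and f(2) -> 2.5
```

Because the surrogate answers from the archive instead of running the expensive discretization-and-classification pipeline, each estimated evaluation is nearly free; the archive-based update strategy then decides which true evaluations to add back into this training set.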
ABSTRACT
Mixed integer nonlinear programming (MINLP) addresses optimization problems that involve continuous and discrete/integer decision variables, as well as nonlinear functions. These problems often exhibit multiple discontinuous feasible parts due to the presence of integer variables. Discontinuous feasible parts can be analyzed as subproblems, some of which may be highly constrained. This significantly impacts the performance of evolutionary algorithms (EAs), whose operators are generally insensitive to constraints, leading to the generation of numerous infeasible solutions. In this article, a variant of the differential evolution (DE) algorithm with a gradient-based repair method for MINLP problems (G-DEmi) is proposed. The repair method aims to fix promising infeasible solutions in different subproblems using gradient information from the constraint set. Extensive experiments were conducted to evaluate the performance of G-DEmi on a set of MINLP benchmark problems and a real-world case. The results demonstrated that G-DEmi outperformed several state-of-the-art algorithms. Notably, G-DEmi did not require novel improvement strategies in the variation operators to promote diversity; instead, it relies on effective exploration within each subproblem. Furthermore, the gradient-based repair method was successfully extended to other DE variants, demonstrating its capability in a more general context.
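The gradient-based repair idea can be sketched as follows: the vector of constraint violations is driven toward zero with a Newton-like step built from the pseudo-inverse of a finite-difference Jacobian. This is a generic sketch under illustrative names, not the exact G-DEmi implementation.

```python
# Hedged sketch of gradient-based repair: move an infeasible point toward
# the feasible region using the pseudo-inverse of the Jacobian of the
# violated constraints (inequality constraints of the form g_i(x) <= 0).
import numpy as np

def gradient_repair(x, constraints, steps=10, eps=1e-6):
    """constraints: list of callables g_i with g_i(x) <= 0 when feasible."""
    x = np.array(x, dtype=float)
    for _ in range(steps):
        v = np.array([max(0.0, g(x)) for g in constraints])  # violations
        if np.all(v == 0.0):
            break                                            # already feasible
        # Finite-difference Jacobian of the violation vector.
        J = np.zeros((len(constraints), len(x)))
        for j in range(len(x)):
            xp = x.copy()
            xp[j] += eps
            vp = np.array([max(0.0, g(xp)) for g in constraints])
            J[:, j] = (vp - v) / eps
        x = x - np.linalg.pinv(J) @ v                        # correction step
    return x

repaired = gradient_repair([2.0, 2.0], [lambda x: x[0] + x[1] - 1.0])
print(repaired)  # lands close to the boundary x0 + x1 = 1
```

In the MINLP setting the integer variables are fixed by the subproblem under consideration, so a repair of this kind acts only on the continuous part of the solution.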
ABSTRACT
This article presents a study intended to design a model with 90% reliability that helps predict school dropout in secondary and higher education institutions by implementing machine learning techniques. Information was collected from open data of the 2015 Intercensal Survey and the 2010 and 2020 Population and Housing Censuses carried out by the National Institute of Statistics and Geography, which contain information about the inhabitants and homes in the 32 federal entities of Mexico. The data were harmonized, and twenty variables were selected based on correlation. After cleaning the data, the sample contained 1,080,782 records in total. Supervised learning was used to create the model, automating data processing with training and testing and applying the following techniques: Artificial Neural Networks, Support Vector Machines, Linear Ridge and Lasso Regression, Bayesian Optimization, and Random Forest; the first two achieved a reliability greater than 99% and the last one 91%.
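The correlation-based variable selection step mentioned above can be sketched as keeping the k variables whose absolute Pearson correlation with the dropout label is largest. The column names and data here are illustrative assumptions, not the survey variables.

```python
# Sketch of correlation-based variable selection: rank columns by the
# absolute Pearson correlation with the target label and keep the top k.
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_by_correlation(columns, target, k):
    """columns: dict name -> list of values; keep the k most correlated."""
    ranked = sorted(columns, key=lambda c: -abs(pearson(columns[c], target)))
    return ranked[:k]

cols = {"income": [1, 2, 3, 4], "noise": [5, 1, 4, 2], "grade": [2, 4, 6, 8]}
dropout = [0, 0, 1, 1]
print(sorted(select_by_correlation(cols, dropout, k=2)))  # ['grade', 'income']
```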
Subjects
Machine Learning, Student Dropouts, Humans, Bayes Theorem, Reproducibility of Results, Computer Neural Networks, Support Vector Machine
ABSTRACT
This paper proposes a tuning approach for the event-triggered controller (ETCTA) in the robotic system stabilization task, where the reduction of the stabilization error and the data broadcasting of the control update are considered simultaneously. The approach is stated as a dynamic optimization problem, and the best controller parameters are obtained using fourteen different bio-inspired optimization algorithms. The statistical results reveal that, among the tested bio-inspired optimization algorithms, the most reliable algorithm for the proposed tuning problem is the differential evolution variant DE/Best/1/Exp. The obtained result is validated both in numerical simulation and on a laboratory prototype. The simulation results indicate that the obtained control parameters can also deal with disturbances and reference changes not considered in the ETCTA's optimization problem formulation without significantly worsening the control design objective. Experimental results show that the proposed event-triggered control tuning approach provides the best trade-off between the number of control signal updates and the position error among the compared tuning approaches, decreasing the data broadcasting of the control update by around 86.33% with a non-significant increase in the stabilization error of around 26.53%.
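A minimal sketch of the event-triggering idea on a first-order plant: the control signal is recomputed only when the tracking error has drifted beyond a threshold since the last update, which reduces data broadcasting at a small cost in accuracy. The plant, gain, and threshold values are illustrative assumptions, not the paper's robotic system.

```python
# Event-triggered proportional control of the toy plant dx/dt = -x + u.
# The control is recomputed only when the error has changed by more than
# `threshold` since the last update; threshold=0 recovers time-triggering.
def simulate(threshold, kp=2.0, dt=0.01, steps=2000, target=1.0):
    x, u, last_err, updates = 0.0, 0.0, float("inf"), 0
    for _ in range(steps):
        err = target - x
        if abs(err - last_err) > threshold:   # triggering condition
            u = kp * err                       # recompute control signal
            last_err = err
            updates += 1
        x += dt * (-x + u)                     # integrate the plant
    return x, updates

x_et, n_et = simulate(threshold=0.05)          # event-triggered
x_tt, n_tt = simulate(threshold=0.0)           # time-triggered baseline
print(n_et < n_tt and abs(x_tt - x_et) < 0.1)  # fewer updates, similar error
```

The tuning problem described in the abstract amounts to choosing parameters such as the gain and the triggering threshold so that this updates-versus-error trade-off is optimized.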
Subjects
Robotic Surgical Procedures, Robotics, Algorithms, Computer Simulation
ABSTRACT
The identification of subnetworks of interest, or active modules, by integrating biological networks with molecular profiles is a key resource for informing on the processes perturbed in different cellular conditions. Here we propose MOGAMUN, a Multi-Objective Genetic Algorithm to identify active modules in MUltiplex biological Networks. MOGAMUN optimizes both the density of interactions and the scores of the nodes (e.g., their differential expression). We compare MOGAMUN with state-of-the-art methods representative of different algorithms dedicated to the identification of active modules in single networks. MOGAMUN identifies dense and high-scoring modules that are also easier to interpret. In addition, to our knowledge, MOGAMUN is the first method able to use multiplex networks. Multiplex networks are composed of different layers of physical and functional relationships between genes and proteins. Each layer is associated with its own meaning, topology, and biases; the multiplex framework allows exploiting this diversity of biological networks. We applied MOGAMUN to identify cellular processes perturbed in Facio-Scapulo-Humeral muscular Dystrophy by integrating RNA-seq expression data with a multiplex biological network. We identified different active modules of interest, thereby providing new angles for investigating the pathomechanisms of this disease. Availability: MOGAMUN is available at https://github.com/elvanov/MOGAMUN and as a Bioconductor package at https://bioconductor.org/packages/release/bioc/html/MOGAMUN.html. Contact: anais.baudot@univ-amu.fr.
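The two objectives can be sketched for a toy module as follows: interaction density (edges present among the module's nodes over edges possible) and the average node score. The graph encoding and scores below are illustrative assumptions, not MOGAMUN's actual data structures.

```python
# Sketch of the two objectives optimized for a candidate module:
# interaction density and average node score (e.g., differential expression).
def density(nodes, edges):
    """edges: set of frozenset pairs; counts edges inside `nodes`."""
    n = len(nodes)
    possible = n * (n - 1) / 2
    inside = sum(1 for e in edges if e <= set(nodes))
    return inside / possible if possible else 0.0

def average_score(nodes, score):
    return sum(score[v] for v in nodes) / len(nodes)

edges = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]}
score = {"A": 2.0, "B": 1.0, "C": 3.0, "D": 0.5}
module = ["A", "B", "C"]
print(density(module, edges), average_score(module, score))  # 1.0 2.0
```

In a multiplex setting the density term would be evaluated per layer, since each layer contributes its own edge set; the genetic algorithm keeps modules that are non-dominated on both objectives.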
Subjects
Algorithms, Biological Models, Computational Biology, Computer Simulation, Nucleic Acid Databases, Gene Regulatory Networks, Humans, Genetic Models, Facioscapulohumeral Muscular Dystrophy/genetics, Facioscapulohumeral Muscular Dystrophy/metabolism, RNA-Seq, Software, Systems Biology, Systems Integration, Systems Theory, Transcriptome
ABSTRACT
The efficient speed regulation of four-bar mechanisms is required for many industrial processes. These mechanisms are hard to control due to the highly nonlinear behavior and the presence of uncertainties or disturbances. In this paper, different Pareto-front approximation search approaches in the adaptive controller tuning based on online multiobjective metaheuristic optimization are studied through their application in the four-bar mechanism speed regulation problem. Dominance-based, decomposition-based, metric-driven, and hybrid search approaches included in the algorithms, such as nondominated sorting genetic algorithm II, multiobjective evolutionary algorithm based on decomposition and differential evolution, S-metric selection evolutionary multiobjective algorithm, and nondominated sorting genetic algorithm III, respectively, are considered in this paper. Also, a proposed metric-driven algorithm based on the differential evolution and the hypervolume indicator (HV-MODE) is incorporated into the analysis. The comparative descriptive and nonparametric statistical evidence presented in this paper shows the effectiveness of the adaptive controller tuning based on online multiobjective metaheuristic optimization and reveals the advantages of the metric-driven search approach.
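A sketch of the hypervolume indicator that drives the metric-driven search, for the two-objective minimization case: the area dominated by a non-dominated front with respect to a reference point. The front and reference point are illustrative.

```python
# 2-D hypervolume (minimization): area dominated by a non-dominated front
# with respect to a reference point, computed by a sweep over sorted points.
def hypervolume_2d(front, ref):
    """front: list of (f1, f2) points, assumed mutually non-dominated."""
    pts = sorted(front)                 # ascending f1 implies descending f2
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        hv += (ref[0] - f1) * (prev_f2 - f2)   # strip added by this point
        prev_f2 = f2
    return hv

front = [(1.0, 4.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, ref=(5.0, 5.0)))  # 4 + 6 + 2 = 12.0
```

A metric-driven algorithm such as HV-MODE prefers candidate solutions whose inclusion increases this value, which pushes the population toward a well-spread approximation of the Pareto front.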
ABSTRACT
An essential aspect of the interaction between people and computers is the recognition of facial expressions. A key issue in this process is selecting relevant features to classify facial expressions accurately. This study examines the selection of optimal geometric features to classify six basic facial expressions: happiness, sadness, surprise, fear, anger, and disgust. Inspired by the Facial Action Coding System (FACS) and the Moving Picture Experts Group 4th standard (MPEG-4), an initial set of 89 features was proposed. These features are normalized distances and angles in 2D and 3D computed from 22 facial landmarks. To select a minimum set of features with the maximum classification accuracy, two selection methods and four classifiers were tested. The first selection method, principal component analysis (PCA), obtained 39 features. The second, a genetic algorithm (GA), obtained 47 features. The experiments ran on the Bosphorus and UIVBFED data sets with 86.62% and 93.92% median accuracy, respectively. Our main finding is that the reduced feature set obtained by the GA is the smallest among methods of comparable accuracy, which has implications for reducing recognition time.
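The kind of geometric features described above can be sketched as a normalized 2D distance and an angle computed from landmark coordinates; the landmark positions and the normalization reference (inter-ocular distance) used here are illustrative assumptions.

```python
# Two example geometric features from facial landmarks: a distance
# normalized by a reference distance, and an angle at a vertex, in degrees.
import math

def distance(p, q):
    return math.hypot(q[0] - p[0], q[1] - p[1])

def normalized_distance(p, q, ref_p, ref_q):
    """Distance p-q divided by a reference distance (e.g. inter-ocular)."""
    return distance(p, q) / distance(ref_p, ref_q)

def angle(p, vertex, q):
    """Angle in degrees at `vertex` formed by rays toward p and q."""
    a1 = math.atan2(p[1] - vertex[1], p[0] - vertex[0])
    a2 = math.atan2(q[1] - vertex[1], q[0] - vertex[0])
    return abs(math.degrees(a1 - a2)) % 360

left_eye, right_eye = (30.0, 40.0), (70.0, 40.0)
mouth_l, mouth_r = (35.0, 80.0), (65.0, 80.0)
print(normalized_distance(mouth_l, mouth_r, left_eye, right_eye))  # 0.75
```

Normalizing by a stable reference distance makes the features invariant to face scale, which is what allows a fixed classifier to work across subjects and image resolutions.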
Subjects
Automated Facial Recognition, Emotions, Facial Expression, Humans
ABSTRACT
This paper presents two swim operators added to the chemotaxis process of the modified bacterial foraging optimization algorithm to solve three instances of the synthesis of four-bar planar mechanisms. One swim favors exploration, while the second promotes fine movements in the neighborhood of each bacterium. The combined effect of the new operators aims to increase the production of better solutions during the search; as a consequence, the algorithm's ability to escape from local optima is enhanced. The algorithm is tested through four experiments, and its results are compared against two BFOA-based algorithms and a differential evolution algorithm designed for mechanical design problems. The overall results indicate that the proposed algorithm outperforms the other BFOA-based approaches and finds highly competitive mechanisms with a single set of parameter values and, in the first synthesis problem, with fewer evaluations than the differential evolution algorithm, which required a parameter fine-tuning process for each optimization problem.
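The two swim behaviors can be sketched as follows: an exploratory swim jumps anywhere within the search bounds, while an exploitative swim perturbs the bacterium inside a small neighborhood. Step sizes and function names are assumptions for illustration, not the paper's exact operators.

```python
# Sketch of the two swim operators: a large exploratory move across the
# search space versus a fine exploitative move near the current position.
import random

def exploratory_swim(x, bounds, rng):
    """Large random move anywhere inside the search bounds (ignores x)."""
    return [rng.uniform(lo, hi) for (lo, hi) in bounds]

def exploitative_swim(x, bounds, rng, radius=0.01):
    """Fine move inside a small neighborhood of the current position."""
    new = []
    for xi, (lo, hi) in zip(x, bounds):
        step = rng.uniform(-radius, radius) * (hi - lo)
        new.append(min(max(xi + step, lo), hi))   # clamp to the bounds
    return new

rng = random.Random(7)
bounds = [(0.0, 10.0)] * 2
print(exploitative_swim([5.0, 5.0], bounds, rng))  # stays near (5, 5)
```

Alternating the two moves during chemotaxis gives the coarse jumps needed to leave local optima and the fine steps needed to polish a promising mechanism design.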
Subjects
Algorithms, Artificial Intelligence, Bacterial Physiological Phenomena, Biological Evolution, Computer Simulation, Chemotaxis/physiology
ABSTRACT
In this work, we present a novel application of time series discretization using evolutionary programming for the classification of precancerous cervical lesions. The approach optimizes the number of intervals into which the length and amplitude of the time series should be compressed, preserving the information important for classification purposes. Using evolutionary programming, the search for a good discretization scheme is guided by a cost function that considers three criteria: the entropy with respect to the classification, the complexity measured as the number of different strings needed to represent the complete data set, and the compression rate assessed as the length of the discrete representation. This discretization approach is evaluated on time series data based on temporal patterns observed during a classical test used in cervical cancer detection; the classification accuracy reached by our method is compared with the well-known time series discretization algorithm SAX and the dimensionality reduction method PCA. Statistical analysis of the classification accuracy shows that the discrete representation is as efficient as the complete raw representation for the present application, reducing the dimensionality of the time series length by 97%. This representation is also very competitive in terms of classification accuracy when compared with similar approaches.
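The discretization idea can be sketched as cutting the series into segments, mapping each segment mean to an alphabet symbol, and measuring the resulting compression rate. The fixed scheme below is an illustrative assumption, not one found by the evolutionary search.

```python
# Sketch of symbolic time series discretization: equal-length segments,
# segment means binned over the amplitude range into alphabet symbols.
def discretize(series, n_segments, alphabet="abcd"):
    lo, hi = min(series), max(series)
    seg_len = len(series) // n_segments
    word = ""
    for s in range(n_segments):
        seg = series[s * seg_len:(s + 1) * seg_len]
        mean = sum(seg) / len(seg)
        idx = min(int((mean - lo) / (hi - lo + 1e-12) * len(alphabet)),
                  len(alphabet) - 1)
        word += alphabet[idx]
    return word

series = [1, 1, 2, 2, 8, 8, 9, 9, 5, 5, 1, 1]
word = discretize(series, n_segments=4)
print(word, len(word) / len(series))  # compressed to a third of the length
```

The evolutionary search described above would vary the number of segments and the amplitude cut points, scoring each candidate scheme by classification entropy, string-count complexity, and this compression rate.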
Subjects
Precancerous Lesions/classification, Uterine Cervical Neoplasms/classification, Female, Humans, Principal Component Analysis
ABSTRACT
The bias-variance dilemma is a well-known and important problem in Machine Learning. It basically relates the generalization capability (goodness of fit) of a learning method to its corresponding complexity. When we have enough data at hand, it is possible to use these data in such a way as to minimize overfitting (the risk of selecting a complex model that generalizes poorly). Unfortunately, in many situations we simply do not have the required amount of data. Thus, we need methods capable of efficiently exploiting the available data while avoiding overfitting. Different metrics have been proposed to achieve this goal: the Minimum Description Length principle (MDL), Akaike's Information Criterion (AIC), and the Bayesian Information Criterion (BIC), among others. In this paper, we focus on crude MDL and empirically evaluate its performance in selecting models with a good balance between goodness of fit and complexity: the so-called bias-variance dilemma, decomposition, or tradeoff. Although the graphical interaction between these dimensions (bias and variance) is ubiquitous in the Machine Learning literature, few works present experimental evidence of this interaction. In our experiments, we argue that the resulting graphs allow us to gain insights that are difficult to unveil otherwise: crude MDL naturally selects balanced models in terms of bias-variance, which need not be the gold-standard ones. We carry out these experiments using a specific model: a Bayesian network. Despite these motivating results, we should not overlook three other components that may significantly affect the final model selection: the search procedure, the noise rate, and the sample size.
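A crude two-part MDL score can be sketched as the data code length (negative log-likelihood) plus a model code length that grows with the number of parameters. This generic form is illustrative; it is not the exact coding scheme the paper uses for Bayesian networks.

```python
# Crude two-part MDL: bits to encode the data given the model, plus bits
# to encode the model itself, here a (k/2)*log2(n) parameter penalty.
import math

def crude_mdl(log_likelihood, n_params, n_samples):
    """Lower is better: data code length plus model code length."""
    fit_term = -log_likelihood / math.log(2)           # nats -> bits
    complexity_term = (n_params / 2) * math.log2(n_samples)
    return fit_term + complexity_term

# A richer model fits slightly better but pays a larger complexity penalty.
simple = crude_mdl(log_likelihood=-520.0, n_params=4, n_samples=1000)
complex_ = crude_mdl(log_likelihood=-515.0, n_params=40, n_samples=1000)
print(simple < complex_)  # the balanced model wins here
```

This is the bias-variance balance in miniature: the fit term falls as complexity rises, the penalty term grows with it, and crude MDL selects the model at which the sum is smallest.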
Subjects
Algorithms, Bias, Bayes Theorem, Databases as Topic, Probability
ABSTRACT
A nonrigid image registration method for the spatiotemporal alignment of image sequences obtained from colposcopy examinations, aimed at detecting precancerous lesions of the cervix, is proposed in this paper. The approach is based on computing a time series for each pixel in the first image of the sequence and dividing that image into small windows. A search process is then carried out to find the window with the highest affinity in each image of the sequence and replace it with the window in the reference image. The affinity value is based on a polynomial approximation of the computed time series, and the search is bounded by a search radius that defines the neighborhood of each window. The proposed approach is tested on ten 310-frame real cases in two experiments: the first determines the best values for the window size and the search radius, and the second compares the best results obtained against four registration methods from the specialized literature. The results show a robust and competitive performance of the proposed approach, with a significantly lower runtime than the compared methods.
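The bounded window search can be sketched as scanning candidate positions within the search radius and keeping the best-matching window. For simplicity this sketch scores affinity with a sum of squared differences rather than the paper's polynomial approximation of per-pixel time series.

```python
# Bounded window search: for a reference window, scan offsets within a
# search radius in the target frame and return the best-matching offset.
def ssd(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_window(target, ref_win, top, left, size, radius):
    """target: 2-D list (frame); returns (dy, dx) of the best match."""
    ref = [v for row in ref_win for v in row]
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > len(target) or x + size > len(target[0]):
                continue                      # candidate leaves the frame
            cand = [target[y + i][x + j] for i in range(size) for j in range(size)]
            score = ssd(ref, cand)
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]

frame = [[0] * 6 for _ in range(6)]
frame[3][4] = 9                               # bright pixel moved to (3, 4)
ref = [[0, 0], [0, 9]]                        # reference window to relocate
print(best_window(frame, ref, top=1, left=1, size=2, radius=2))  # (1, 2)
```

Bounding the scan by the search radius is what keeps the per-frame cost low, which is consistent with the reduced runtime reported for the method.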