ABSTRACT
Knowledge distillation (KD) is a well-established technique in deep learning that transfers dark knowledge from a teacher model to a student model, thereby improving the student's performance. In randomized neural networks, however, the simple network topology and the weak relationship between model performance and model size mean that conventional KD fails to improve performance. In this work, we propose a self-distillation pipeline for randomized neural networks: the network's own predictions are treated as an additional target and mixed with the weighted original target to form a distillation target that carries dark knowledge and supervises the training of the model. All predictions produced during the multi-generation self-distillation process can be integrated by a multi-teacher method. By induction, we further derive a method for infinite self-distillation (ISD) of randomized neural networks. We then provide a theoretical analysis of the self-distillation method for randomized neural networks and demonstrate the effectiveness of the proposed method in practical applications on several benchmark datasets.
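A minimal sketch of the self-distillation loop described above, assuming an RVFL-style randomized network whose output weights are solved in closed form by ridge regression; the mixing weight alpha, the ridge penalty, and the number of generations are illustrative assumptions rather than the paper's settings.

import numpy as np

rng = np.random.default_rng(0)

def rvfl_features(X, n_hidden=256):
    # Fixed random hidden layer with direct input-to-output links (RVFL style).
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    return np.hstack([X, np.tanh(X @ W + b)])

def ridge_fit(H, T, lam=1e-2):
    # Closed-form output weights: beta = (H^T H + lam I)^{-1} H^T T.
    return np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ T)

def self_distill(X, Y, generations=3, alpha=0.5):
    # Each generation is trained on a mixture of the original targets and the
    # previous generation's predictions (the distillation target).
    H = rvfl_features(X)
    target = Y.astype(float).copy()
    for _ in range(generations):
        beta = ridge_fit(H, target)
        target = alpha * Y + (1.0 - alpha) * (H @ beta)
    return beta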
ABSTRACT
Rescheduling is a necessary procedure in a flexible job shop when newly arrived priority jobs must be inserted into an existing schedule. Instability measures the amount of change made to the existing schedule and is an important metric for evaluating the quality of rescheduling solutions. This paper focuses on a flexible job-shop rescheduling problem (FJRP) for new job insertion. First, it formulates the FJRP for new job insertion arising from pump remanufacturing. The paper deals with bi-objective FJRPs that minimize: 1) instability and 2) one of the following indices: a) makespan; b) total flow time; c) machine workload; and d) total machine workload. Next, it discretizes a novel and simple metaheuristic named Jaya, resulting in DJaya, and improves it to solve the FJRP. Two simple heuristics are employed to generate high-quality initial solutions. It then proposes five objective-oriented local search operators and four ensembles of them to improve the performance of DJaya, as sketched after this abstract. Finally, it performs experiments on seven real-life cases of different scales from pump remanufacturing and compares DJaya with several state-of-the-art algorithms. The results show that DJaya is effective and efficient for solving the concerned FJRPs.
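For context, the continuous Jaya update rule that DJaya discretizes can be sketched as below; the sphere objective, bounds, and population settings are placeholders and do not reflect the FJRP encoding, the initialization heuristics, or the local-search ensembles of the paper.

import numpy as np

rng = np.random.default_rng(1)

def jaya(objective, dim=10, pop_size=20, iters=200, lo=-5.0, hi=5.0):
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.apply_along_axis(objective, 1, pop)
    for _ in range(iters):
        best, worst = pop[fit.argmin()], pop[fit.argmax()]
        r1 = rng.random((pop_size, dim))
        r2 = rng.random((pop_size, dim))
        # Move each solution toward the best and away from the worst.
        cand = np.clip(pop + r1 * (best - np.abs(pop)) - r2 * (worst - np.abs(pop)), lo, hi)
        cand_fit = np.apply_along_axis(objective, 1, cand)
        better = cand_fit < fit            # greedy replacement
        pop[better], fit[better] = cand[better], cand_fit[better]
    return pop[fit.argmin()], fit.min()

best_x, best_f = jaya(lambda x: float(np.sum(x ** 2)))   # toy usage on a sphere function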
ABSTRACT
Developing efficient evolutionary algorithms attracts many researchers because optimization problems arise in numerous real-world applications. A new differential evolution algorithm, sTDE-dR, is proposed to improve search quality and to avoid premature convergence and stagnation. The population is clustered into multiple tribes and uses an ensemble of different mutation and crossover strategies. A competitive success-based scheme is introduced to determine the life cycle of each tribe and its participation ratio in the next generation: the mean success of each tribe is used to compute its share of the next generation, which guarantees that only the successful tribes with the best adaptive schemes guide the search toward the optimal solution. Within each tribe, a different adaptive scheme controls the scaling factor and the crossover rate, and the population size is reduced over the run by a dynamic reduction method. A comprehensive comparison of the proposed heuristic against several state-of-the-art algorithms is performed on the challenging benchmark set from the CEC2014 real-parameter single-objective competition. The results confirm the robustness of the proposed approach compared with other state-of-the-art algorithms.
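As a rough illustration of the success-based participation idea, the sketch below turns each tribe's mean success into its share of the next generation; the tribe counts and the minimum-share floor are illustrative assumptions, not the paper's exact formulation.

import numpy as np

def participation_ratios(successes, trials, floor=0.05):
    # Mean success (improving trials / total trials) per tribe, normalized so the
    # shares sum to one; a small floor keeps unsuccessful tribes minimally alive.
    mean_success = np.asarray(successes, dtype=float) / np.maximum(np.asarray(trials), 1)
    shares = np.maximum(mean_success, floor)
    return shares / shares.sum()

# Example: tribe 1 produced 12 improving offspring out of 40 trials, tribe 2 only 3, tribe 3 20.
print(participation_ratios([12, 3, 20], [40, 40, 40]))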
ABSTRACT
Deep neural network-based methods have recently achieved excellent performance in the visual tracking task. Because very few training samples are available in visual tracking, these approaches rely heavily on extremely large auxiliary datasets such as ImageNet to pretrain the model. To address the discrepancy between the source domain (the auxiliary data) and the target domain (the object being tracked), they must be fine-tuned during the tracking process. However, such methods are sensitive to hyper-parameters such as the learning rate, the maximum number of epochs, and the mini-batch size. It is therefore worth investigating whether pretraining and fine-tuning through conventional back-propagation are essential for visual tracking. In this paper, we shed light on this line of research by proposing the convolutional random vector functional link (CRVFL) neural network, which can be regarded as a marriage of the convolutional neural network and the random vector functional link network, to simplify the visual tracking system. The parameters in the convolutional layer are randomly initialized and kept fixed; only the parameters in the fully connected layer need to be learned. We further propose an elegant approach to update the tracker. On the widely used visual tracking benchmark, without any auxiliary data, a single CRVFL model achieves 79.0% on the precision plot at a threshold of 20 pixels, and an ensemble of CRVFL models yields the best result of 86.3%.
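A minimal sketch of the CRVFL idea, in which random convolutional filters are fixed and only the fully connected output weights are solved in closed form by ridge regression; the filter count, kernel size, ReLU activation, and ridge penalty are illustrative assumptions rather than the tracker's actual configuration.

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(2)

def random_conv_features(images, n_filters=8, k=5):
    # images: (N, H, W). Fixed random filters, ReLU responses, flattened per image.
    filters = rng.standard_normal((n_filters, k, k))
    patches = sliding_window_view(images, (k, k), axis=(1, 2))   # (N, H-k+1, W-k+1, k, k)
    resp = np.einsum('nhwij,fij->nfhw', patches, filters)
    return np.maximum(resp, 0.0).reshape(images.shape[0], -1)

def fit_output_layer(F, Y, lam=1e-2):
    # Only the fully connected layer is learned, in closed form via ridge regression.
    return np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ Y)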
ABSTRACT
Wind energy is a clean and abundant renewable energy source. Accurate wind speed forecasting is essential for power dispatch planning, unit commitment decisions, maintenance scheduling, and regulation. However, wind is intermittent, and wind speed is difficult to predict. This brief proposes a novel wind speed forecasting method that integrates empirical mode decomposition (EMD) and support vector regression (SVR). EMD is used to decompose the wind speed time series into several intrinsic mode functions (IMFs) and a residue. A feature vector combining one historical data point from each IMF and from the residue is then generated to train the SVR. The proposed EMD-SVR model is evaluated on a wind speed dataset and outperforms several recently reported methods with respect to accuracy or computational complexity.
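A minimal sketch of the EMD-SVR pipeline, assuming the PyEMD package for the decomposition and scikit-learn's SVR for regression; the RBF kernel, its hyper-parameters, and the one-step-back feature construction are illustrative choices rather than the brief's exact setup.

import numpy as np
from PyEMD import EMD            # pip install EMD-signal
from sklearn.svm import SVR

def emd_svr_forecast(wind_speed):
    # Decompose the wind speed series into IMFs and a residue.
    emd = EMD()
    emd(wind_speed)
    imfs, residue = emd.get_imfs_and_residue()
    components = np.vstack([imfs, residue])       # (n_imfs + 1, T)
    # Feature at time t: one historical value (t-1) from each IMF and the residue.
    X = components[:, :-1].T                      # (T-1, n_imfs + 1)
    y = wind_speed[1:]                            # one-step-ahead target
    model = SVR(kernel='rbf', C=10.0, epsilon=0.01)
    model.fit(X, y)
    return model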
ABSTRACT
Differential evolution (DE) is one of the most powerful stochastic real-parameter optimizers of current interest. In this paper, we propose a new mutation strategy, a fitness-induced parent selection scheme for the binomial crossover of DE, and a simple but effective scheme for adapting two of its most important control parameters, with the objective of improved performance. The new mutation operator, which we call DE/current-to-gr_best/1, is a variant of the classical DE/current-to-best/1 scheme. It uses the best member of a group (whose size is q% of the population size) of randomly selected solutions from the current generation to perturb the parent (target) vector, unlike DE/current-to-best/1, which always picks the best vector of the entire population to perturb the target vector. In our modified recombination framework, a biased parent selection scheme lets each mutant undergo the usual binomial crossover with one of the p top-ranked individuals from the current population rather than with the target vector of the same index, as is done in all standard variants of DE. A DE variant obtained by integrating the proposed mutation, crossover, and parameter adaptation strategies with the classical DE framework (developed in 1995) is compared with two classical and four state-of-the-art adaptive DE variants on 25 standard numerical benchmarks from the IEEE Congress on Evolutionary Computation 2005 competition and special session on real-parameter optimization. Our comparative study indicates that the proposed schemes improve the performance of DE by a large margin, making it statistically superior to the state-of-the-art DE variants on a wide variety of test problems. Finally, we experimentally demonstrate that integrating one or more of the proposed strategies into existing powerful DE variants such as jDE and JADE also enhances their performance.
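A minimal sketch of the DE/current-to-gr_best/1 mutation and the p-best binomial crossover described above; q and p are written as fractions of the population, and the fixed values of q, p, F, and Cr shown here are placeholders, whereas the paper adapts F and Cr during the run.

import numpy as np

rng = np.random.default_rng(3)

def current_to_gr_best_1(pop, fit, i, F=0.8, q=0.15):
    # Perturb target i toward the best of a random group covering a fraction q of the population.
    NP = len(pop)
    group = rng.choice(NP, size=max(2, int(q * NP)), replace=False)
    gr_best = pop[group[np.argmin(fit[group])]]
    r1, r2 = rng.choice([j for j in range(NP) if j != i], size=2, replace=False)
    return pop[i] + F * (gr_best - pop[i]) + F * (pop[r1] - pop[r2])

def p_best_binomial_crossover(pop, fit, mutant, p=0.1, Cr=0.9):
    # Cross the mutant with one of the top-ranked individuals instead of the target vector.
    top = np.argsort(fit)[:max(1, int(p * len(pop)))]
    parent = pop[rng.choice(top)]
    mask = rng.random(len(mutant)) < Cr
    mask[rng.integers(len(mutant))] = True        # guarantee at least one gene from the mutant
    return np.where(mask, mutant, parent)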