ABSTRACT
Detection of abnormal situations in mobile systems not only provides predictions about risky situations but also has the potential to increase energy efficiency. In this study, unsupervised hybrid anomaly detection approaches were developed using two real-world drives of a battery electric vehicle. The anomaly detection performance of hybrid models created by combining a Long Short-Term Memory (LSTM) Autoencoder, the Local Outlier Factor (LOF), and the Mahalanobis distance was evaluated with the silhouette score, Davies-Bouldin index, and Calinski-Harabasz index, and the potential energy recovery rates were also determined. The two driving datasets were assessed for chaotic behavior using the Lyapunov exponent, Kolmogorov-Sinai entropy, and fractal dimension metrics. The developed hybrid models are superior to their constituent sub-methods in anomaly detection, with Hybrid Model-2 achieving 2.92% better anomaly detection performance than Hybrid Model-1. In terms of potential energy saving, Hybrid Model-1 provided a 31.26% improvement, while Hybrid Model-2 provided 31.48%. A close relationship between anomaly and chaoticity was also observed. In a literature dominated by cybersecurity and visual sources, this work develops a strategy for energy-efficiency-based anomaly detection and chaotic analysis from driving data obtained without additional sensors.
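A minimal sketch of the hybrid-scoring idea described above: LOF and Mahalanobis-distance scores are normalized and combined, and the resulting anomaly split is evaluated with the silhouette score. The synthetic "drive telemetry," the equal score weights, and the top-5% flagging rule are illustrative assumptions, not the authors' implementation (which also involves an LSTM-Autoencoder).

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Hypothetical stand-in for drive telemetry (e.g. speed, current, voltage, ...)
normal = rng.normal(0.0, 1.0, size=(300, 4))
anomalous = rng.normal(6.0, 1.0, size=(15, 4))
X = np.vstack([normal, anomalous])

# Sub-method 1: LOF scores (higher = more anomalous)
lof = LocalOutlierFactor(n_neighbors=20)
lof.fit(X)
lof_score = -lof.negative_outlier_factor_

# Sub-method 2: Mahalanobis distance to the data centroid
mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - mu
maha = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

# Hybrid score: equally weighted min-max-normalized sub-scores
def norm(s):
    return (s - s.min()) / (s.max() - s.min())

hybrid = 0.5 * norm(lof_score) + 0.5 * norm(maha)

# Flag the top 5% as anomalies and evaluate the split with the silhouette score
labels = (hybrid > np.quantile(hybrid, 0.95)).astype(int)
sil = silhouette_score(X, labels)
```

The same `labels` could then feed the Davies-Bouldin and Calinski-Harabasz indices from `sklearn.metrics` for the multi-metric evaluation the abstract describes.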
ABSTRACT
This study proposes a separation method to identify the temperature-induced response in long-term monitoring data contaminated by noise and other action-induced effects. In the proposed method, the original measured data are transformed using the local outlier factor (LOF), and the LOF threshold is determined by minimizing the variance of the modified data. Savitzky-Golay convolution smoothing is then utilized to filter the noise of the modified data. Furthermore, this study proposes an optimization algorithm, AOHHO, which hybridizes the Aquila Optimizer (AO) and Harris Hawks Optimization (HHO) to identify the optimal LOF threshold, employing the exploration ability of the AO and the exploitation ability of the HHO. Four benchmark functions illustrate that the proposed AOHHO has a stronger search ability than four other metaheuristic algorithms. A numerical example and in situ measured data are utilized to evaluate the performance of the proposed separation method. The results show that its separation accuracy is better than that of the wavelet-based method and the machine-learning-based method in different time windows; the maximum separation errors of those two methods are about 2.2 times and 5.1 times that of the proposed method, respectively.
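The core loop of the separation idea can be sketched as follows: LOF-flagged samples are replaced by interpolation, the threshold is chosen to minimize the variance of the modified series (here by a simple grid search standing in for the AOHHO metaheuristic), and Savitzky-Golay smoothing removes the remaining noise. The signal model and the threshold grid are illustrative assumptions.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(1)
t = np.linspace(0, 4 * np.pi, 500)
temperature_effect = 5.0 * np.sin(0.5 * t)          # slow temperature-like trend
signal = temperature_effect + rng.normal(0.0, 0.3, t.size)
spike_idx = rng.choice(t.size, size=10, replace=False)
signal[spike_idx] += rng.normal(8.0, 1.0, size=10)  # other-action-induced effects

# Score each (time, value) sample with LOF
X = np.column_stack([t, signal])
lof = LocalOutlierFactor(n_neighbors=30)
lof.fit(X)
score = -lof.negative_outlier_factor_

def replace_outliers(threshold):
    """Replace samples whose LOF score exceeds the threshold by interpolation."""
    mask = score > threshold
    out = signal.copy()
    out[mask] = np.interp(t[mask], t[~mask], signal[~mask])
    return out

# Choose the LOF threshold that minimizes the variance of the modified data
# (a grid search stands in here for the AOHHO optimizer)
grid = np.linspace(1.2, 3.0, 19)
best_threshold = min(grid, key=lambda th: replace_outliers(th).var())
modified = replace_outliers(best_threshold)

# Savitzky-Golay convolution smoothing filters the remaining noise
separated = savgol_filter(modified, window_length=51, polyorder=3)
```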
ABSTRACT
As the world progresses toward a digitally connected and sustainable future, the integration of semi-supervised anomaly detection in wastewater treatment plants (WWTPs) promises to become an essential tool in preserving water resources and ensuring the continuous effectiveness of plants. When these complex and dynamic systems are coupled with limited historical anomaly data or complex anomalies, powerful tools capable of detecting subtle deviations from normal behavior are crucial to enable the early detection of equipment malfunctions. To address this challenge, in this study we analyzed five semi-supervised machine learning (SSL) techniques: Isolation Forest (IF), Local Outlier Factor (LOF), One-Class Support Vector Machine (OCSVM), Multilayer Perceptron Autoencoder (MLP-AE), and Convolutional Autoencoder (Conv-AE), for detecting different anomalies (complete, concurrent, and complex) of the Dissolved Oxygen (DO) sensor and aeration valve in the WWTP. The best results were obtained with the Conv-AE algorithm: an accuracy of 98.36% for complete faults, 97.81% for concurrent faults, and 98.64% for complex faults (a combination of incipient and concurrent faults). Additionally, we developed an anomaly detection system for the most effective semi-supervised technique, which can determine the detection delay time and generate a fault alarm for each considered anomaly.
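The semi-supervised setting above (fit on normal operation only, then score faults at test time) can be sketched with three of the listed detectors. The 2-D "DO sensor" features, the stuck-low fault, and the model hyperparameters are illustrative assumptions, not the study's actual data or tuned models.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)
# Hypothetical 2-D features (e.g. DO reading and valve position); the
# semi-supervised setting means training uses normal operation data only
X_train = rng.normal(8.0, 0.5, size=(400, 2))
X_test = np.vstack([rng.normal(8.0, 0.5, size=(100, 2)),   # normal operation
                    rng.normal(4.0, 0.5, size=(100, 2))])  # stuck-low fault
y_true = np.r_[np.ones(100), -np.ones(100)]                # +1 normal, -1 fault

models = {
    "IF": IsolationForest(random_state=0).fit(X_train),
    "LOF": LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_train),
    "OCSVM": OneClassSVM(nu=0.05).fit(X_train),
}
# All three expose the same predict() convention: +1 inlier, -1 outlier
accuracy = {name: float((m.predict(X_test) == y_true).mean())
            for name, m in models.items()}
```

The autoencoder variants (MLP-AE, Conv-AE) follow the same pattern with a reconstruction-error threshold in place of `predict`.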
ABSTRACT
P450nor is a heme-containing enzyme that catalyzes the conversion of nitric oxide (NO) to nitrous oxide (N2O). Its catalytic mechanism has attracted attention in chemistry, biology, and environmental engineering. The catalytic cycle of P450nor is proposed to consist of three major steps; the reaction mechanism of the last step, N2O generation, remains unknown. In this study, the reaction pathway of N2O generation from intermediate I was explored with B3LYP calculations using an active-center model, after the validity of the model was examined. In the validation, we compared the heme distortions between P450nor and other oxidoreductases, suggesting a small effect of the protein environment on the N2O generation reaction in P450nor. We then evaluated the effect of the electrostatic environment of P450nor on the hydride affinity of the active site with quantum mechanics/molecular mechanics (QM/MM) calculations, confirming that the affinity was unchanged with or without the protein environment. The active-center model for P450nor showed that the N2O generation step of the enzymatic reaction proceeds over a reasonable barrier height without the protein environment. Consequently, our findings strongly suggest that the N2O generation reaction from intermediate I depends solely on the intrinsic reactivity of the heme cofactor bound to the cysteine residue.
Subjects
Nitric Oxide, Oxidoreductases, Oxidoreductases/metabolism, Nitric Oxide/metabolism, Nitrous Oxide/metabolism, Molecular Dynamics Simulation, Heme
ABSTRACT
Electroencephalogram (EEG) data are typically affected by artifacts. The detection and removal of bad channels (i.e., with poor signal-to-noise ratio) is a crucial initial step. EEG data acquired from different populations require different cleaning strategies due to the inherent differences in the data quality, the artifacts' nature, and the employed experimental paradigm. To deal with such differences, we propose a robust EEG bad channel detection method based on the Local Outlier Factor (LOF) algorithm. Unlike most existing bad channel detection algorithms that look for the global distribution of channels, LOF identifies bad channels relative to the local cluster of channels, which makes it adaptable to any kind of EEG. To test the performance and versatility of the proposed algorithm, we validated it on EEG acquired from three populations (newborns, infants, and adults) and using two experimental paradigms (event-related and frequency-tagging). We found that LOF can be applied to all kinds of EEG data after calibrating its main hyperparameter: the LOF threshold. We benchmarked the performance of our approach with the existing state-of-the-art (SoA) bad channel detection methods. We found that LOF outperforms all of them by improving the F1 Score, our chosen performance metric, by about 40% for newborns and infants and 87.5% for adults.
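The channel-level use of LOF described above can be sketched directly: each channel's time series is one observation, LOF scores each channel against its local cluster of channels, and the LOF threshold is the hyperparameter to calibrate per population. The simulated recording and the threshold value are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(3)
n_channels, n_samples = 32, 1000
# Simulated recording: shared oscillatory activity plus per-channel noise
shared = 2.0 * np.sin(np.linspace(0, 20 * np.pi, n_samples))
eeg = shared + rng.normal(0.0, 1.0, size=(n_channels, n_samples))
bad_channels = [5, 17]
eeg[bad_channels] = rng.normal(0.0, 15.0, size=(2, n_samples))  # poor SNR

# Each channel is one observation; LOF scores a channel relative to its
# local cluster of channels rather than the global distribution
lof = LocalOutlierFactor(n_neighbors=10)
lof.fit(eeg)
lof_score = -lof.negative_outlier_factor_

lof_threshold = 2.5   # the main hyperparameter, calibrated per population
detected = sorted(np.flatnonzero(lof_score > lof_threshold).tolist())
```

In practice the threshold would be calibrated separately for newborn, infant, and adult data, as the abstract describes.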
Subjects
Electroencephalography, Signal Processing, Computer-Assisted, Adult, Algorithms, Artifacts, Electroencephalography/methods, Humans, Infant, Newborn, Signal-to-Noise Ratio
ABSTRACT
This paper proposes a new diagnostic method for sensor signals collected during semiconductor manufacturing. These signals provide important information for predicting the quality and yield of the finished product. Much of the data gathered during this process is time series data for fault detection and classification (FDC) in real time. This means that time series classification (TSC) must be performed during fabrication. With advances in semiconductor manufacturing, the distinction between normal and abnormal data has become increasingly significant as new challenges arise in their identification. One challenge is that an extremely high FDC performance is required, which directly impacts productivity and yield. However, general classification algorithms can have difficulty separating normal and abnormal data because of subtle differences. Another challenge is that the frequency of abnormal data is remarkably low. Hence, engineers can use only normal data to develop their models. This study presents a method that overcomes these problems and improves the FDC performance; it consists of two phases. Phase I has three steps: signal segmentation, feature extraction based on local outlier factors (LOF), and one-class classification (OCC) modeling using the isolation forest (iF) algorithm. Phase II, the test stage, consists of three steps: signal segmentation, feature extraction, and anomaly detection. The performance of the proposed method is superior to that of other baseline methods.
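The two-phase structure above can be sketched as follows: Phase I segments normal-only traces, extracts per-segment features augmented with an LOF-based density feature, and fits an isolation forest one-class model; Phase II applies the same segmentation and feature extraction to test traces and flags anomalies. The trace model, segment features, and fault type are illustrative assumptions, not the paper's actual sensor signals.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(4)

def extract_features(trace, n_segments=10):
    """Steps 1-2: segment the signal and summarize each segment."""
    segments = np.array_split(trace, n_segments)
    return np.array([[s.mean(), s.std()] for s in segments]).ravel()

# Only normal traces are available for model development
F_train = np.array([extract_features(rng.normal(0, 1, 500)) for _ in range(200)])

# LOF (novelty mode) supplies an extra density-based feature
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(F_train)

def augment(F):
    return np.column_stack([F, -lof.score_samples(F)])

# One-class classification (OCC) model trained on normal data only
occ = IsolationForest(random_state=0).fit(augment(F_train))

# Phase II: segment, extract features, detect anomalies (+1 normal, -1 fault)
F_normal = np.array([extract_features(rng.normal(0, 1, 500)) for _ in range(20)])
F_faulty = np.array([extract_features(rng.normal(0, 3, 500)) for _ in range(20)])
pred = occ.predict(augment(np.vstack([F_normal, F_faulty])))
accuracy = float((pred == np.r_[np.ones(20), -np.ones(20)]).mean())
```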
Subjects
Algorithms, Semiconductors, Diffusion
ABSTRACT
The aim of this paper is to provide an extended analysis of outlier detection, using probabilistic and AI techniques, applied in a demo pilot demand response in blocks of buildings project, based on real experiments and energy data collection with detected anomalies. A numerical algorithm was created to differentiate between natural energy peaks and outliers, so that data cleaning could be applied first. Then, the impact on the energy baseline used for the demand response computation was calculated with improved precision relative to other referenced methods and to the original data processing. For the demo pilot implemented in the Technical University of Cluj-Napoca block of buildings, without cleaning the energy baseline data it was in some cases impossible to compute the established key performance indicators (peak power reduction, energy savings, cost savings, CO2 emissions reduction), or the resulting values were far higher (>50%) and unrealistic. Therefore, in real-case business models, outlier removal is crucial. In recent years, both companies and academic communities have pooled their efforts to generate new abstractions, interfaces, approaches for scalability, and crowdsourcing techniques. Quantitative and qualitative methods have been created with the aim of reducing errors and have been covered in multiple surveys and overviews of outlier detection.
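One simple way to separate natural peaks from outliers, in the spirit of the numerical algorithm above, is to compare each sample against a robust per-hour profile: natural peaks recur at the same hour every day, metering outliers do not. The load model, the three injected outliers, and the robust-z threshold are illustrative assumptions, not the project's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)
days, hours = 14, 24
# Hypothetical building load with a recurring daily evening peak
hour = np.tile(np.arange(hours), days)
base = 50 + 40 * np.exp(-0.5 * ((hour - 19) / 2.0) ** 2)
load = base + rng.normal(0, 2, days * hours)
faulty_idx = [40, 170, 250]
load[faulty_idx] += 120.0          # metering outliers, far above natural peaks

# Robust per-hour profile: median and MAD of each hour across days
profile = load.reshape(days, hours)
hour_median = np.median(profile, axis=0)
hour_mad = np.median(np.abs(profile - hour_median), axis=0) + 1e-9
robust_z = np.abs(load - np.tile(hour_median, days)) / np.tile(hour_mad, days)
is_outlier = robust_z > 25.0       # natural peaks match the profile, so pass

# Data cleaning before computing the energy baseline and the KPIs
cleaned = load.copy()
cleaned[is_outlier] = np.tile(hour_median, days)[is_outlier]
```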
ABSTRACT
Outlier detection in data streams is crucial to successful data mining. However, this task is made increasingly difficult by the enormous growth in the quantity of data generated by the expansion of the Internet of Things (IoT). Recent advances in outlier detection based on the density-based local outlier factor (LOF) algorithm do not consider variations in data that change over time; for example, a new cluster of data points may appear in the data stream over time. Therefore, we present a novel algorithm for streaming data, referred to as time-aware density-based incremental local outlier detection (TADILOF), to overcome this issue. In addition, we have developed a means of estimating the LOF score, termed the "approximate LOF," based on historical information following the removal of outdated data. The results of experiments demonstrate that TADILOF outperforms current state-of-the-art methods in terms of AUC while achieving similar performance in terms of execution time. Moreover, we present an application of the proposed scheme to the development of an air-quality monitoring system.
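The time-awareness problem above can be illustrated with a naive sliding-window LOF: old points expire, so a new cluster that drifts into the stream quickly stops being flagged. This sketch recomputes LOF on every arrival, which is exactly the cost that TADILOF's incremental updates and approximate LOF avoid; the window size and threshold are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(6)

# Time-aware sliding window: old points expire so that clusters appearing
# later in the stream can become "normal"
window_size = 200
window = list(rng.normal(0.0, 1.0, size=(window_size, 2)))

def process(point, threshold=1.8):
    """Append a point, expire the oldest, and flag the new point via LOF."""
    window.append(point)
    del window[0]
    lof = LocalOutlierFactor(n_neighbors=20)
    lof.fit(np.asarray(window))
    return -lof.negative_outlier_factor_[-1] > threshold  # score of newest point

# A point far from the current distribution is flagged ...
flag_outlier = process(np.array([8.0, 8.0]))
# ... but once a new cluster around (8, 8) drifts in, similar points are not
for p in rng.normal(8.0, 0.5, size=(60, 2)):
    process(p)
flag_cluster = process(np.array([8.0, 8.0]))
```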
ABSTRACT
Mobile activity recognition is significant to the development of human-centric pervasive applications, including elderly care, personalized recommendations, etc. Nevertheless, the distribution of inertial sensor data can be influenced to a great extent by varying users, which means that the performance of an activity recognition classifier trained on one user’s dataset will degrade when transferred to others. In this study, we focus on building a personalized classifier to detect four categories of human activities: light intensity activity, moderate intensity activity, vigorous intensity activity, and fall. To solve the problem caused by the differing distributions of inertial sensor signals, a user-adaptive algorithm based on K-Means clustering, the local outlier factor (LOF), and the multivariate Gaussian distribution (MGD) is proposed. To automatically cluster and annotate a specific user’s activity data, an improved K-Means algorithm with a novel initialization method is designed. By quantifying each sample’s informative degree in a labeled individual dataset, the most profitable samples can be selected for adapting the activity recognition model. Through experiments, we conclude that the proposed models can adapt to new users with good recognition performance.
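The cluster-then-annotate-then-classify pipeline above can be sketched with standard K-Means and per-cluster Gaussians (the LOF-based sample selection is omitted here). The 2-D "accelerometer features," the intensity-based cluster annotation rule, and the class layout are illustrative assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
# Hypothetical 2-D accelerometer features for three activity intensities
light = rng.normal([1.0, 1.0], 0.3, size=(100, 2))
moderate = rng.normal([4.0, 4.0], 0.3, size=(100, 2))
vigorous = rng.normal([8.0, 8.0], 0.3, size=(100, 2))
X = np.vstack([light, moderate, vigorous])

# Cluster the unlabeled user data, then annotate clusters by mean intensity
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
order = np.argsort(km.cluster_centers_.sum(axis=1))      # low -> high intensity
names = ["light", "moderate", "vigorous"]
label_of_cluster = {c: names[i] for i, c in enumerate(order)}

# Fit one multivariate Gaussian (MGD) per cluster; classify new samples by
# maximum likelihood under the per-cluster Gaussians
mgds = {c: multivariate_normal(X[km.labels_ == c].mean(axis=0),
                               np.cov(X[km.labels_ == c], rowvar=False))
        for c in range(3)}

def classify(x):
    best = max(mgds, key=lambda c: mgds[c].pdf(x))
    return label_of_cluster[best]
```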
ABSTRACT
In this paper, we present a novel onboard robust visual algorithm for long-term arbitrary 2D and 3D object tracking using a reliable global-local object model for unmanned aerial vehicle (UAV) applications, e.g., autonomously tracking and chasing a moving target. The first main component of this algorithm is a global matching and local tracking approach: the algorithm initially finds feature correspondences using an improved binary descriptor developed for global feature matching, while an iterative Lucas-Kanade optical flow algorithm is employed for local feature tracking. The second main module is an efficient local geometric filter (LGF), which handles outlier feature correspondences based on a new forward-backward pairwise dissimilarity measure, thereby maintaining pairwise geometric consistency. In the proposed LGF module, hierarchical agglomerative clustering, i.e., bottom-up aggregation, is applied using an effective single-link method. The third module is a heuristic local outlier factor (to the best of our knowledge, utilized for the first time to deal with outlier features in a visual tracking application), which further maximizes the representation of the target object; here, outlier feature detection is formulated as a binary classification problem over the output features of the LGF module. Extensive UAV flight experiments show that the proposed visual tracker achieves real-time frame rates of more than thirty-five frames per second on an i7 processor at 640 × 512 image resolution and outperforms the most popular state-of-the-art trackers in terms of robustness, efficiency, and accuracy.
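The LGF idea of keeping pairwise-consistent correspondences via single-link bottom-up aggregation can be sketched on motion vectors: inlier features on the target move coherently between frames, so they chain into one cluster, while mismatches scatter. The motion model, cluster cut distance, and largest-cluster rule are illustrative assumptions, not the paper's dissimilarity measure.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(8)
# Motion vectors of feature correspondences between two frames: inliers on
# the target move coherently; outliers (background, mismatches) do not
inliers = np.array([3.0, -1.0]) + rng.normal(0, 0.1, size=(40, 2))
outliers = rng.uniform(-10.0, 10.0, size=(6, 2))
motion = np.vstack([inliers, outliers])

# Bottom-up aggregation with the single-link method; correspondences whose
# pairwise dissimilarity stays below the cut chain into one cluster
Z = linkage(motion, method="single")
labels = fcluster(Z, t=1.0, criterion="distance")

# Keep the largest cluster as the geometrically consistent correspondence set
largest = np.bincount(labels).argmax()
consistent = np.flatnonzero(labels == largest)
```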
ABSTRACT
The genitalia of male insects have been widely used in taxonomic identification and systematics and are potentially involved in maintaining reproductive isolation between species. Although sexual selection has been invoked to explain patterns of morphological variation in genitalia among populations and species, developmental plasticity in genitalia likely contributes to observed variation but has been rarely examined, particularly in wild populations. Bilateral gynandromorphs are individuals that are genetically male on one side of the midline and genetically female on the other, while mosaic gynandromorphs have only a portion of their body developing as the opposite sex. Gynandromorphs might offer unique insights into developmental plasticity because individuals experience abnormal cellular interactions at the genitalic midline. In this study, we compare the genitalia and wing patterns of gynandromorphic Anna and Melissa blue butterflies, Lycaeides anna (Edwards) (formerly L. idas anna) and L. melissa (Edwards) (Lepidoptera: Lycaenidae), to the morphology of normal individuals from the same populations. Gynandromorph wing markings all fell within the range of variation of normal butterflies; however, a number of genitalic measurements were outliers when compared with normal individuals. From these results, we conclude that the gynandromorphs' genitalia, but not wing patterns, can be abnormal when compared with normal individuals and that the gynandromorphic genitalia do not deviate developmentally in a consistent pattern across individuals. Finally, genetic mechanisms are considered for the development of gynandromorphism in Lycaeides butterflies.
Subjects
Butterflies/anatomy & histology, Wings, Animal/anatomy & histology, Animals, Butterflies/growth & development, Female, Genitalia/anatomy & histology, Genitalia/growth & development, Male, United States, Wings, Animal/growth & development
ABSTRACT
Electroencephalography (EEG) is emerging as a valuable method to investigate neurocognitive functions shortly after birth. However, obtaining high-quality EEG data from human newborn recordings is challenging. Compared to adults and older infants, datasets are typically much shorter due to newborns' limited attentional span, and much noisier due to non-stereotyped artifacts mainly caused by uncontrollable movements. We propose Newborn EEG Artifact Removal (NEAR), a pipeline for EEG artifact removal designed explicitly for human newborns. NEAR is based on two key steps: 1) a novel bad channel detection tool based on the Local Outlier Factor (LOF), a robust outlier detection algorithm; 2) a parameter calibration procedure for adapting the Artifact Subspace Reconstruction (ASR) algorithm, developed for artifact removal in mobile adult EEG, to newborn EEG data. Tests on simulated data showed that NEAR outperforms existing methods in removing representative newborn non-stereotypical artifacts. NEAR was validated on two developmental populations (newborns and 9-month-old infants) recorded with two different experimental designs (frequency-tagging and ERP). Results show that NEAR artifact removal successfully reproduces established EEG responses from noisy datasets, with a higher statistical significance than that obtained by existing artifact removal methods. The EEGLAB-based NEAR pipeline is freely available at https://github.com/vpKumaravel/NEAR.
Subjects
Artifacts, Signal Processing, Computer-Assisted, Adult, Algorithms, Electroencephalography/methods, Humans, Infant, Infant, Newborn, Movement
ABSTRACT
Age-related macular degeneration (AMD) is a retinal disorder affecting the elderly, and society's aging population means that the disease is becoming increasingly prevalent. Vision in patients with early AMD is usually unaffected or nearly normal, but central vision may be weakened or even lost if timely treatment is not performed. Early diagnosis is therefore particularly important to prevent the further exacerbation of AMD. This paper proposes a novel automatic method for detecting AMD in optical coherence tomography (OCT) images based on deep learning and the local outlier factor (LOF) algorithm. A ResNet-50 model with L2-constrained softmax loss was retrained to extract features from OCT images, and the LOF algorithm was used as the classifier. The proposed method was trained on the UCSD dataset and tested on both the UCSD and Duke datasets, with accuracies of 99.87% and 97.56%, respectively. Even though the model was trained only on the UCSD dataset, it obtained good detection accuracy when tested on another dataset. Comparison with other methods also indicates the efficiency of the proposed method in detecting AMD.
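The "deep features + LOF classifier" pattern above can be sketched with stand-in feature vectors: LOF in novelty mode is fit on features of one class, and images of the other class then surface as outliers. The random unit vectors below stand in for L2-normalized CNN features (real features would come from the retrained ResNet-50); the class layout is an illustrative assumption.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(9)

def unit(v):
    """L2-normalize rows, mimicking L2-constrained feature embeddings."""
    return v / np.linalg.norm(v, axis=1, keepdims=True)

# Stand-ins for 4-D CNN feature vectors of normal and AMD OCT images
normal_feats = unit(rng.normal([1, 0, 0, 0], 0.05, size=(300, 4)))
amd_feats = unit(rng.normal([0, 1, 0, 0], 0.05, size=(50, 4)))

# Fit LOF in novelty mode on normal-class features; AMD features then score
# as outliers at test time (+1 inlier, -1 outlier)
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(normal_feats)
pred_normal = lof.predict(unit(rng.normal([1, 0, 0, 0], 0.05, size=(50, 4))))
pred_amd = lof.predict(amd_feats)
acc = float((np.r_[pred_normal, -pred_amd] == 1).mean())
```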
ABSTRACT
Copy number variation (CNV) is a common type of structural variation in the human genome. Accurate detection of CNVs from tumor genomes can provide crucial information for the study of tumorigenesis and cancer precision diagnosis. However, the contamination of tumor genomes by normal genomes and the crude profiles of the read depth make such a task difficult. In this paper, we propose an alternative approach, called CIRCNV, for the detection of CNVs from sequencing data. CIRCNV is an extension of our previously developed method CNV-LOF, which uses local outlier factors to predict CNVs. Comparatively, CIRCNV can be performed on individual tumor samples and has the following two new features: (1) it transfers the read depth (RD) profile from a line shape to a circular shape via a polar coordinate transformation, in order to improve the efficiency of the RD profile for the detection of CNVs; and (2) it performs a second round of CNV declaration based on the true circular RD profile, which is recovered by estimating tumor purity. We test and validate the performance of CIRCNV on simulated and real sequencing data and compare it with several peer methods. The results demonstrate that CIRCNV achieves superior performance in terms of sensitivity and precision. We expect that our proposed method will complement existing methods and become a routine tool in the field of tumor genome variation analysis.
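The polar-transform step can be sketched as follows: the line-shaped read-depth (RD) profile is wrapped onto a circle, so the bins of a CNV segment stand off the ring and LOF can score them as local outliers. The bin counts, CNV amplitude, and LOF threshold are illustrative assumptions, not CIRCNV's actual parameters.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(10)
n_bins = 400
rd = rng.normal(100.0, 5.0, n_bins)     # read-depth profile per genomic bin
cnv_bins = range(200, 210)
rd[cnv_bins] += 80.0                    # a hypothetical duplication event

# Polar coordinate transformation: bin index -> angle, read depth -> radius,
# so the line-shaped RD profile becomes a circle
theta = 2.0 * np.pi * np.arange(n_bins) / n_bins
points = np.column_stack([rd * np.cos(theta), rd * np.sin(theta)])

# LOF over the circular profile; the neighborhood is chosen larger than the
# CNV segment so its bins cannot mask each other
lof = LocalOutlierFactor(n_neighbors=30)
lof.fit(points)
score = -lof.negative_outlier_factor_
called = np.flatnonzero(score > 2.0)    # declared CNV bins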
ABSTRACT
Traditional kernel principal component analysis (KPCA)-based nonlinear process monitoring methods may not perform well because their Gaussian distribution assumption is often violated in real industrial processes. To overcome this deficiency, this paper proposes a modified KPCA method based on a double-weighted local outlier factor (DWLOF-KPCA). To avoid assuming a specific data distribution, the local outlier factor (LOF) is introduced to construct two LOF-based monitoring statistics, which substitute for the traditional T² and SPE statistics, respectively. To provide better online monitoring performance, a double-weighted LOF method is further designed: it assigns weights to each component to highlight the key components carrying significant fault information, and it uses a moving window to weight the historical statistics, reducing drastic fluctuations in the monitoring results. Finally, simulations on a numerical example and the Tennessee Eastman (TE) benchmark process demonstrate the superiority of the proposed DWLOF-KPCA method.
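The motivation for replacing T² with an LOF-based statistic can be sketched with plain PCA (standing in for KPCA; the double-weighting and moving window are omitted): on two-mode "normal" data, a fault lying between the modes is close to the pooled mean, yet the density-based statistic flags it, with a control limit taken from a training quantile instead of a Gaussian F/chi-square limit. The two-mode data and the fault location are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(11)
# Normal operation with two modes -> clearly non-Gaussian training data
mode_a = rng.normal([0.0, 0.0, 0.0], 0.5, size=(150, 3))
mode_b = rng.normal([4.0, 4.0, 0.0], 0.5, size=(150, 3))
X_train = np.vstack([mode_a, mode_b])

pca = PCA(n_components=2).fit(X_train)
T_train = pca.transform(X_train)

# LOF-based monitoring statistic in the score space replaces T^2; the
# control limit is a quantile of the training statistic, not an F limit
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(T_train)
limit = np.quantile(-lof.negative_outlier_factor_, 0.99)

# A fault *between* the modes sits near the pooled mean (low T^2) but in a
# low-density region, so the LOF statistic exceeds the limit
X_fault = rng.normal([2.0, 2.0, 0.0], 0.3, size=(50, 3))
X_ok = rng.normal([0.0, 0.0, 0.0], 0.5, size=(50, 3))
stat_fault = -lof.score_samples(pca.transform(X_fault))
stat_ok = -lof.score_samples(pca.transform(X_ok))
detection_rate = float((stat_fault > limit).mean())
false_alarm_rate = float((stat_ok > limit).mean())
```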