ABSTRACT
Currently, Android apps are easily targeted by malicious network traffic because of their constant network access. These threats can steal vital information and disrupt commerce, social systems, and banking markets. In this paper, we present a malware detection system based on word2vec-based transfer learning and multi-model image representation. The proposed method combines the textual and texture features of network traffic to leverage the advantages of both types. Initially, the transfer learning method is used to extract a trained vocabulary from network traffic. Then, the malware-to-image algorithm visualizes network bytes for visual analysis of data traffic. Next, texture features are extracted from the malware images using a combination of the scale-invariant feature transform (SIFT) and oriented FAST and rotated BRIEF (ORB). Moreover, a convolutional neural network (CNN) is designed to extract deep features from the trained vocabulary and the texture features. Finally, an ensemble model classifies and detects malware based on the combination of textual and texture features. The proposed method is tested on two standard datasets, CIC-AAGM2017 and CICMalDroid 2020, which together comprise 10.2K malware and 3.2K benign samples. Furthermore, an explainable AI experiment is performed to interpret the proposed approach.
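A minimal Python sketch of the malware-to-image and SIFT/ORB texture steps described above, assuming opencv-python >= 4.4 (where SIFT ships in the main package); the function names and the 64-pixel image width are illustrative, not the paper's exact settings.

```python
import numpy as np
import cv2

def bytes_to_image(payload: bytes, width: int = 64) -> np.ndarray:
    """Map raw network bytes onto a 2-D grayscale image (malware-to-image)."""
    buf = np.frombuffer(payload, dtype=np.uint8)
    height = int(np.ceil(len(buf) / width))
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[:len(buf)] = buf
    return padded.reshape(height, width)

def texture_features(img: np.ndarray):
    """Extract SIFT and ORB descriptors from the traffic image."""
    sift = cv2.SIFT_create()
    orb = cv2.ORB_create(nfeatures=500)
    _, sift_desc = sift.detectAndCompute(img, None)
    _, orb_desc = orb.detectAndCompute(img, None)
    return sift_desc, orb_desc
```

In the full pipeline, these descriptors would be combined with the word2vec vocabulary features before the CNN and ensemble stages.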
Subjects
Algorithms; Neural Networks, Computer; Machine Learning
ABSTRACT
A proper method for real-time monitoring of organic biomass degradation, and its evaluation for safeguarding the ecosystem, is the need of the hour. The work process designed in this study demarcates anaerobic digestion potential using kinetic modelling and web-GIS application methods. Wastewater sources that cause pollution are identified through satellite maps covering the solid earth, drainage systems, surface structures, landfilling, and land use. The captured data are used to identify the concentration of available sludge. Based on literature resources, multi-influencing-factor techniques are introduced along with an overlay method to differentiate the digestion potential of sludge sources. This study optimizes the biodegradation potential of domestic sewage at different sludge concentrations in a pilot model operated with samples identified through a topographical drainage survey. The devices are realized using the Internet of Things (IoT), which is proving to be a promising trend. A kinetic study and methanogenic assay tests are performed with three different cation-binding agents to determine their solubilization potential and methane evolution, which is further subjected to digestion-potential analysis under anaerobic conditions for possible application in the field of environmental science. Risk analysis reveals that the landfilling method has the highest impact on maintaining a sustainable environment. The results on natural biodegradation may be used for individual household wastewater management in the locality.
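The abstract does not name the kinetic model; a common choice for cumulative methane yield is the modified Gompertz equation, so the following Python sketch assumes that form. The time and yield arrays are illustrative placeholders, not the study's data.

```python
import numpy as np
from scipy.optimize import curve_fit

def gompertz(t, P, Rm, lam):
    """Cumulative methane yield M(t): P = potential, Rm = max rate, lam = lag."""
    return P * np.exp(-np.exp(Rm * np.e / P * (lam - t) + 1.0))

t_days = np.array([0, 2, 4, 8, 12, 16, 20, 25, 30], dtype=float)   # illustrative
methane = np.array([0, 5, 18, 60, 110, 150, 175, 190, 196], dtype=float)

(P, Rm, lam), _ = curve_fit(gompertz, t_days, methane, p0=[200, 15, 1])
print(f"potential={P:.1f} mL/gVS, max rate={Rm:.1f} mL/gVS/d, lag={lam:.1f} d")
```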
Subjects
Bioreactors; Internet of Things; Anaerobiosis; Biodegradation, Environmental; Ecosystem; Geographic Information Systems; Methane; Risk Assessment; Sewage
ABSTRACT
The exponential growth in population and the overall reliance on electrical and electronic devices have increased the demand for energy production. This requires precise energy management systems that can forecast consumers' usage for future policymaking. Embedded smart sensors attached to electricity meters and home appliances enable power suppliers to effectively analyze energy usage and to generate and distribute electricity into residential areas based on their level of energy consumption. Therefore, this paper proposes a clustering-based analysis of energy consumption to categorize consumers' electricity usage into different levels. First, a deep autoencoder that transforms the low-dimensional energy consumption data into high-level representations was trained. Second, the high-level representations were fed into an adaptive self-organizing map (SOM) clustering algorithm. Afterward, the levels of electricity consumption were established by conducting statistical analysis on the clustered data. Finally, the results were visualized in graphs and calendar views, and the predicted levels of energy consumption were plotted over the city map, providing the providers with a compact overview for energy utilization analysis.
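A minimal sketch of the two-stage pipeline in Python, assuming Keras for the autoencoder and the third-party minisom package as a stand-in for the adaptive SOM variant; layer sizes, grid size, and the random placeholder data are illustrative assumptions.

```python
import numpy as np
from tensorflow.keras import layers, models
from minisom import MiniSom

X = np.random.rand(1000, 24).astype("float32")   # placeholder: 24 hourly readings

inp = layers.Input(shape=(24,))
h = layers.Dense(64, activation="relu")(inp)     # high-level representation
out = layers.Dense(24, activation="sigmoid")(h)
autoencoder = models.Model(inp, out)
encoder = models.Model(inp, h)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=20, batch_size=32, verbose=0)

codes = encoder.predict(X, verbose=0)
som = MiniSom(4, 4, codes.shape[1], sigma=1.0, learning_rate=0.5)
som.train_random(codes, 5000)
clusters = [som.winner(c) for c in codes]        # SOM grid cell = cluster label
```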
ABSTRACT
Electric energy consumption forecasting is an interesting, challenging, and important issue in energy management and equipment efficiency improvement. Existing approaches are predictive models that forecast a specific profile, i.e., a time series of a whole building or an individual household in a smart building. In practice, each smart building contains many profiles, which makes per-profile modeling time-consuming and expensive in system resources. Therefore, this study develops a robust framework for Multiple Electric Energy Consumption forecasting (MEC) in a smart building using Transfer Learning and Long Short-Term Memory (TLL), the so-called MEC-TLL framework. In this framework, we first employ a k-means clustering algorithm to cluster the daily load demand of the many profiles in the training set. In this phase, we also perform silhouette analysis to specify the optimal number of clusters for the experimental datasets. Next, this study develops the MEC training algorithm, which uses a cluster-based strategy for transfer learning of the Long Short-Term Memory models to reduce computational time. Finally, extensive experiments compare the computational time and various performance metrics for multiple electric energy consumption forecasting on two smart buildings in South Korea. The experimental results indicate that our proposed approach achieves low computational overhead while attaining superior performance. Therefore, the proposed approach can be applied effectively for intelligent energy management in smart buildings.
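A minimal Python sketch of the clustering phase described above: k-means over daily load profiles with silhouette analysis to pick the number of clusters. The placeholder data and the 2-10 search range are assumptions; one LSTM would then be trained per cluster and transferred to its member profiles.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

daily_loads = np.random.rand(500, 24)   # placeholder: one 24-h profile per row

best_k, best_score = None, -1.0
for k in range(2, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(daily_loads)
    score = silhouette_score(daily_loads, labels)
    if score > best_score:
        best_k, best_score = k, score
print(f"optimal clusters: {best_k} (silhouette={best_score:.3f})")
```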
ABSTRACT
For efficient and effective energy management, accurate energy consumption forecasting is required in energy management systems (EMSs). Recently, several artificial intelligence-based techniques have been proposed for accurate electric load forecasting; moreover, complete energy consumption data are critical for the prediction. However, owing to diverse reasons, such as device malfunctions and signal transmission errors, missing data are frequently observed in actual data. Many imputation methods have previously been proposed to compensate for missing values; however, these methods have achieved limited success in imputing electric energy consumption data because the missing periods are long and the dependency on historical data is high. In this study, we propose a novel missing-value imputation scheme for electricity consumption data. The proposed scheme uses a bagging ensemble of multilayer perceptrons (MLPs), called a softmax ensemble network, wherein the ensemble weight of each MLP is determined by a softmax function. This ensemble network learns electric energy consumption data with explanatory variables and imputes the missing values in these data. To evaluate the performance of our scheme, we performed diverse experiments on real electric energy consumption data and confirmed that the proposed scheme delivers superior performance compared with other imputation methods.
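A minimal Python sketch of the softmax ensemble idea: a bagging ensemble of MLPs whose combination weights come from a softmax over negative validation errors. The exact weighting rule, member count, and placeholder data are assumptions; the paper's formulation may differ.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

X, y = np.random.rand(2000, 10), np.random.rand(2000)   # placeholder data
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

members, errors = [], []
rng = np.random.default_rng(0)
for i in range(5):                                       # 5 bagged MLPs
    idx = rng.integers(0, len(X_tr), len(X_tr))          # bootstrap resample
    m = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500, random_state=i)
    m.fit(X_tr[idx], y_tr[idx])
    errors.append(np.mean((m.predict(X_val) - y_val) ** 2))
    members.append(m)

w = np.exp(-np.array(errors)); w /= w.sum()              # softmax ensemble weights
impute = lambda Xm: sum(wi * m.predict(Xm) for wi, m in zip(w, members))
```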
ABSTRACT
Due to industrialization and the rising demand for energy, global energy consumption has been increasing rapidly. Recent studies show that the biggest portion of energy is consumed in residential buildings; in European Union countries, up to 40% of the total energy is consumed by households. Most residential buildings and industrial zones are equipped with smart sensors, such as metering electric sensors, that are inadequately utilized for better energy management. In this paper, we develop a hybrid convolutional neural network (CNN) with a long short-term memory autoencoder (LSTM-AE) model for future energy prediction in residential and commercial buildings. The central focus of this research is to utilize smart meters' data for energy forecasting in order to enable appropriate energy management in buildings. We performed extensive research using several deep learning-based forecasting models and propose an optimal hybrid CNN with the LSTM-AE model. To the best of our knowledge, we are the first to incorporate the aforementioned models under the umbrella of a unified framework with some utility preprocessing. Initially, the CNN model extracts features from the input data, which are then fed to the LSTM encoder to generate encoded sequences. The encoded sequences are decoded by a following LSTM decoder and passed to the final dense layer for energy prediction. The experimental results using different evaluation metrics show that the proposed hybrid model works well; it records the smallest values for mean square error (MSE), mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) compared with other state-of-the-art forecasting methods on the UCI residential building dataset. Furthermore, we conducted experiments on Korean commercial building data, and the results indicate that our proposed hybrid model is a worthy contribution to energy forecasting.
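A minimal Keras sketch of the hybrid architecture described above: Conv1D features feed an LSTM encoder, whose code is decoded by a second LSTM before the final dense output. Layer widths and the 24-step window are illustrative assumptions, not the paper's tuned values.

```python
from tensorflow.keras import layers, models

n_steps, n_features = 24, 1
model = models.Sequential([
    layers.Conv1D(64, 3, activation="relu", padding="same",
                  input_shape=(n_steps, n_features)),   # CNN feature extractor
    layers.MaxPooling1D(2),
    layers.LSTM(64),                 # LSTM encoder -> encoded sequence
    layers.RepeatVector(n_steps),
    layers.LSTM(64),                 # LSTM decoder
    layers.Dense(1),                 # final dense layer: energy prediction
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```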
ABSTRACT
With the large-scale deployment of smart meters worldwide, research in non-intrusive load monitoring (NILM) has seen a significant rise due to its dual use for real-time monitoring of end-user appliances and user-centric feedback on power consumption. NILM is a technique for estimating the state and the power consumption of an individual appliance on a consumer's premises using a single point of measurement, such as a smart meter. Although several NILM techniques exist, there is no meaningful and accurate metric to evaluate them for multi-state devices such as fridges and heat pumps. In this paper, we demonstrate the inadequacy of the existing metrics and propose a new metric that combines both event classification and energy estimation of an operational state to give a more realistic and accurate evaluation of the performance of existing NILM techniques. In particular, we use unsupervised clustering techniques to identify the operational states of a device from a labeled dataset and to compute a penalty threshold for predictions that are too far from the ground truth. Our work includes an experimental evaluation of state-of-the-art NILM techniques on widely used datasets of power consumption measured in real-world environments.
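A minimal Python sketch of the state-identification step: cluster an appliance's power readings into operational states, then derive a penalty threshold for far-off predictions. The three-state assumption, placeholder file, and half-gap threshold rule are illustrative, not the paper's exact formulation.

```python
import numpy as np
from sklearn.cluster import KMeans

power = np.loadtxt("fridge_power.csv")                 # placeholder appliance trace
km = KMeans(n_clusters=3, n_init=10, random_state=0)   # e.g. off / low / high
states = km.fit_predict(power.reshape(-1, 1))

centroids = np.sort(km.cluster_centers_.ravel())
# Penalize a prediction when it lies farther from the true state's centroid
# than half the gap to the nearest neighbouring state.
penalty_threshold = np.diff(centroids).min() / 2.0
print(f"states (W): {centroids}, penalty threshold: {penalty_threshold:.1f} W")
```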
ABSTRACT
Flying ad-hoc networks (FANETs) are a vibrant research area nowadays, with many military and civil applications. Limited battery energy and the high mobility of micro unmanned aerial vehicles (UAVs) give rise to their two main problems: short flight time and inefficient routing. In this paper, we address both problems by means of efficient clustering. First, we adjust the transmission power of the UAVs by anticipating their operational requirements. An optimal transmission range yields a minimum packet loss ratio (PLR) and better link quality, which ultimately saves the energy consumed during communication. Second, we use a variant of the K-Means Density clustering algorithm to select cluster heads. Optimal cluster heads enhance the cluster lifetime and reduce the routing overhead. The proposed model outperforms state-of-the-art artificial intelligence techniques such as the Ant Colony Optimization-based and Grey Wolf Optimization-based clustering algorithms. The performance of the proposed algorithm is evaluated in terms of the number of clusters, cluster building time, cluster lifetime, and energy consumption.
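A minimal Python sketch of density-aware cluster-head selection: k-means partitions UAV positions, and the UAV with the most neighbours within transmission range in each cluster becomes the head. The density rule stands in for the paper's K-Means Density variant, whose exact formulation is not given in the abstract; positions and ranges are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

positions = np.random.rand(40, 2) * 1000   # placeholder: 40 UAVs in a 1 km square
n_clusters, tx_range = 5, 250.0
labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(positions)

heads = {}
for c in range(n_clusters):
    members = positions[labels == c]
    dists = np.linalg.norm(members[:, None] - members[None, :], axis=-1)
    density = (dists < tx_range).sum(axis=1)   # neighbours within transmit range
    heads[c] = members[density.argmax()]       # densest UAV becomes cluster head
```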
ABSTRACT
In a mobile cloud computing environment, the cooperation of distributed computing objects is one of the most important requirements for providing successful cloud services. To satisfy this requirement, all members employed in the cooperation group need to share knowledge for mutual understanding. Even though an ontology can be the right tool for this goal, several issues arise in constructing a proper ontology. As the cost and complexity of managing knowledge increase with its scale, reducing the size of the ontology is one of the critical issues. In this paper, we propose a method for extracting an ontology module to increase the utility of knowledge. For a given signature, this method extracts the ontology module, which is semantically self-contained to fulfill the needs of the service, by considering the syntactic structure and semantic relations of concepts. By employing this module instead of the original ontology, the cooperation of computing objects can be performed with less computing load and complexity. In particular, when multiple external ontologies need to be combined for more complex services, this method can be used to optimize the size of the shared knowledge.
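The paper's combined syntactic-plus-semantic criterion is not reproduced in the abstract; as a rough illustration, the Python sketch below treats the ontology as a directed graph of concept dependencies and keeps everything reachable from the signature, the simplest self-contained closure. The concept names are hypothetical.

```python
import networkx as nx

onto = nx.DiGraph()
onto.add_edges_from([
    ("CloudService", "ComputingObject"), ("ComputingObject", "Resource"),
    ("Sensor", "Device"), ("Device", "Resource"),
])  # placeholder concept-dependency graph (edge = "depends on")

signature = {"CloudService"}
module_nodes = set(signature)
for concept in signature:
    module_nodes |= nx.descendants(onto, concept)   # transitive dependencies
module = onto.subgraph(module_nodes)                # the extracted module
print(sorted(module.nodes))                         # excludes Sensor/Device
```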
Subjects
Computer Systems; Information Dissemination/methods; Information Storage and Retrieval/methods; Knowledge Bases; Algorithms; Computational Biology/methods; Humans
ABSTRACT
We present a classification framework that combines multiple heterogeneous classifiers in the presence of class-label noise. An extension of m-Mediods-based modeling is presented that generates models of the various classes while identifying and filtering noisy training data. This noise-free data is further used to learn models for other classifiers, such as GMM and SVM. A weight-learning method is then introduced that learns per-class weights for the different classifiers to construct an ensemble. For this purpose, we apply a genetic algorithm to search for an optimal weight vector on which the classifier ensemble is expected to give the best accuracy. The proposed approach is evaluated on a variety of real-life datasets and compared with existing standard ensemble techniques such as AdaBoost, Bagging, and Random Subspace Methods. Experimental results show the superiority of the proposed ensemble method over its competitors, especially in the presence of class-label noise and imbalanced classes.
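A minimal Python sketch of the GA weight search: each chromosome is a per-classifier, per-class weight matrix, and fitness is weighted-vote accuracy on a validation set. For brevity this uses truncation selection and mutation only (no crossover), and the score arrays are placeholders, so it is a simplified stand-in for the paper's GA.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clf, n_cls = 3, 4                        # e.g. m-Mediods, GMM, SVM over 4 classes
proba = rng.random((n_clf, 200, n_cls))    # placeholder: validation class scores
y_val = rng.integers(0, n_cls, 200)

def fitness(w):                            # w: (n_clf, n_cls) weight matrix
    combined = (w[:, None, :] * proba).sum(axis=0)
    return (combined.argmax(axis=1) == y_val).mean()

pop = rng.random((30, n_clf, n_cls))
for gen in range(50):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[scores.argsort()[-10:]]              # keep the 10 fittest
    children = parents[rng.integers(0, 10, 30)].copy()
    children += rng.normal(0, 0.05, children.shape)    # Gaussian mutation
    pop = np.clip(children, 0, 1)
best = pop[np.argmax([fitness(w) for w in pop])]       # final ensemble weights
```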
Subjects
Artificial Intelligence/classification; Statistics as Topic/methods; Information Management/classification; Information Management/methods
ABSTRACT
Ubiquitination is an essential post-translational modification mechanism involving the bonding of the ubiquitin protein to a substrate protein. It is crucial in a variety of physiological activities, including cell survival and differentiation, and innate and adaptive immunity. Any alteration in the ubiquitin system leads to the development of various human diseases. Numerous studies show the high reversibility and dynamic nature of the ubiquitin system, which makes experimental identification quite difficult. To solve this issue, this article develops a machine learning model that aims to predict ubiquitination sites precisely. We deeply investigate the ubiquitination data, which are processed through different feature extraction methods, followed by classification. The evaluation and assessment are conducted using jackknife tests and 10-fold cross-validation. The proposed method demonstrated remarkable performance, with 100%, 99.88%, and 99.84% accuracy on Dataset-I, Dataset-II, and Dataset-III, respectively. Using the jackknife test, the method achieves 100%, 99.91%, and 99.99% for Dataset-I, Dataset-II, and Dataset-III, respectively. This analysis concludes that the proposed method outperforms the state of the art in identifying ubiquitination sites and is helpful for the development of current clinical therapies. The source code and datasets will be made available at GitHub.
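A minimal Python sketch of the evaluation protocol only: jackknife (leave-one-out) and 10-fold cross-validation with scikit-learn. The random-forest classifier and placeholder feature matrix are generic stand-ins; the paper's feature extractors and model are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, KFold, LeaveOneOut
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(300, 50)        # placeholder: sequence-derived features
y = np.random.randint(0, 2, 300)   # 1 = ubiquitination site, 0 = non-site

clf = RandomForestClassifier(n_estimators=100, random_state=0)
acc_10fold = cross_val_score(clf, X, y, cv=KFold(10, shuffle=True, random_state=0)).mean()
acc_jackknife = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
print(f"10-fold: {acc_10fold:.4f}, jackknife: {acc_jackknife:.4f}")
```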
ABSTRACT
Gait recognition is the identification of individuals based on how they walk. It can identify an individual of interest without their intervention, making it well suited for surveillance from afar. Computer-aided silhouette-based gait analysis is frequently employed due to its efficiency and effectiveness. However, covariate conditions have a significant influence on individual recognition because they conceal essential features that help recognize individuals by their walking style. To address such issues, we propose a novel deep-learning framework that tackles covariate conditions in gait by proposing the regions subject to them; features extracted from those regions are neglected, using custom kernels, to keep the model's performance effective. The proposed technique sets apart static and dynamic areas of interest, where static areas contain covariates, and features are then learnt from the dynamic regions unaffected by covariates to effectively recognize individuals. The features were extracted using three customized kernels, and the results were concatenated to produce a fused feature map. Afterward, a CNN learns and extracts the features from the proposed regions to recognize an individual. The suggested approach is an end-to-end system that eliminates the requirement for manual region proposal and feature extraction, which improves gait-based identification of individuals in real-world scenarios. The experimentation is performed on publicly available datasets, i.e., CASIA A and CASIA C. The findings indicate that subjects wearing bags produced 90% accuracy and subjects wearing coats produced 58% accuracy. Likewise, recognizing individuals with different walking speeds also exhibited excellent results, with an accuracy of 94% for fast and 96% for slow-paced walk patterns, showing improvement over previous deep learning methods.
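A minimal Keras sketch of the three-kernel fusion idea: parallel convolution branches with different kernel sizes over a silhouette image, concatenated into a fused feature map before classification. The kernel sizes, channel widths, input shape, and subject count are illustrative assumptions.

```python
from tensorflow.keras import layers, models

n_subjects = 20                                    # illustrative, e.g. CASIA A
inp = layers.Input(shape=(128, 88, 1))             # silhouette image
branches = [
    layers.Conv2D(32, k, activation="relu", padding="same")(inp)
    for k in [(3, 3), (5, 5), (7, 7)]              # three customized kernels
]
fused = layers.Concatenate()(branches)             # fused feature map
x = layers.MaxPooling2D(2)(fused)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)
out = layers.Dense(n_subjects, activation="softmax")(x)
model = models.Model(inp, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```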
ABSTRACT
Gait identification based on Deep Learning (DL) techniques has recently emerged as a biometric technology for surveillance. We leveraged the vulnerabilities and decision-making abilities of the DL model in gait-based autonomous surveillance systems, in the setting where attackers have no access to the underlying model gradients or structure, using a patch-based black-box adversarial attack with Reinforcement Learning (RL). These automated surveillance systems are secured, blocking the attacker's access, so the attack is conducted in an RL framework in which the agent's goal is to determine the optimal image location that causes the model to perform incorrectly when perturbed with random pixels. The proposed adversarial attack presents encouraging results (maximum success rate = 77.59%). Researchers should explore such system-resilience scenarios (e.g., when attackers have no system access) before deploying these models in surveillance applications.
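A simplified Python sketch of the black-box patch attack loop: the model is queried only through its outputs (no gradients), and the reward is the drop in confidence for the true label. A greedy random search over patch locations stands in for the full RL agent; the function names and patch size are illustrative.

```python
import numpy as np

def patch_attack(model_predict, image, true_label, patch=8, trials=200, seed=0):
    """model_predict: callable returning a class-probability vector."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    base_conf = model_predict(image)[true_label]
    best_adv, best_reward = image, 0.0
    for _ in range(trials):
        y, x = rng.integers(0, h - patch), rng.integers(0, w - patch)
        adv = image.copy()
        adv[y:y + patch, x:x + patch] = rng.random((patch, patch, *image.shape[2:]))
        reward = base_conf - model_predict(adv)[true_label]   # confidence drop
        if reward > best_reward:
            best_adv, best_reward = adv, reward
    return best_adv, best_reward
```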
Subjects
Neural Networks, Computer; Reinforcement, Psychology; Biometry; Gait; Technology
ABSTRACT
Daily peak load forecasting (DPLF) and total daily load forecasting (TDLF) are essential for optimal power system operation from one day to one week later. This study develops a Cubist-based incremental learning model to perform accurate and interpretable DPLF and TDLF. To this end, we employ time-series cross-validation to effectively reflect recent electrical load trends and patterns when constructing the model. We also analyze variable importance to identify the most crucial factors in the Cubist model. In the experiments, we used two publicly available building datasets and three educational building cluster datasets. The results showed that the proposed model yielded averages of 7.77 and 10.06 in mean absolute percentage error and coefficient of variation of the root mean square error, respectively. We also confirmed that temperature and holiday information are significant external factors, and electrical loads one day and one week ago are significant internal factors.
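A minimal Python sketch of the time-series cross-validation used to keep the model current with recent load trends, via scikit-learn's TimeSeriesSplit. A gradient-boosting regressor is a plainly named stand-in for the Cubist rule-based model, and the data arrays are placeholders.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.ensemble import GradientBoostingRegressor

X = np.random.rand(730, 8)        # placeholder: 2 years of daily features
y = np.random.rand(730) + 1.0     # placeholder daily peak load

mapes = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = GradientBoostingRegressor(random_state=0).fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    mapes.append(np.mean(np.abs((y[test_idx] - pred) / y[test_idx])) * 100)
print(f"MAPE per fold: {np.round(mapes, 2)}")   # each fold trains only on the past
```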
Subjects
Electricity; Neural Networks, Computer; Forecasting; Temperature
ABSTRACT
Lung abnormalities in humans are steadily increasing due to various causes, and early recognition and treatment are widely recommended. Tuberculosis (TB) is one such lung disease, and due to its occurrence rate and severity, the World Health Organization (WHO) lists TB among the top ten causes of death. Clinical-level detection of TB is usually performed using biomedical imaging methods, and chest X-ray is a commonly adopted imaging modality. This work aims to develop an automated procedure to detect TB from X-ray images using VGG-UNet-supported joint segmentation and classification. The phases of the proposed scheme involve: (i) image collection and resizing, (ii) deep-feature mining, (iii) segmentation of the lung section, (iv) local-binary-pattern (LBP) generation and feature extraction, (v) optimal feature selection using the spotted hyena algorithm (SHA), (vi) serial feature concatenation, and (vii) classification and validation. This research considered 3000 test images (1500 healthy and 1500 TB class) for the assessment, and the proposed experiment was implemented in Matlab®. This work applies pretrained models to detect TB in X-rays with improved accuracy, achieving a classification accuracy of >99% with a fine-tree classifier.
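The study itself is implemented in Matlab®; as an illustration of the LBP stage (iv), here is a minimal Python equivalent with scikit-image: compute local binary patterns over the segmented lung region and summarize them as a histogram feature vector. The file path and P/R parameters are assumptions.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from skimage.io import imread

lung = imread("segmented_lung.png", as_gray=True)   # placeholder path
P, R = 8, 1                                         # 8 neighbours, radius 1
lbp = local_binary_pattern(lung, P, R, method="uniform")
hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
# 'hist' would then be serially concatenated with the deep features
# before SHA-based feature selection and classification.
```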
Subjects
Hyaenidae; Tuberculosis; Algorithms; Animals; Humans; Lung/diagnostic imaging; Tuberculosis/diagnostic imaging
ABSTRACT
With the development of big data and cloud computing technologies, the importance of pseudonymized information has grown. However, tools for verifying whether a de-identification methodology is correctly applied to ensure data confidentiality and usability are insufficient. This paper proposes a method for verifying de-identification techniques for personal healthcare information, considering data confidentiality and usability. Data are generated and preprocessed based on actual statistical data, personal information datasets, and de-identification datasets derived from medical data, so as to represent the de-identification technique as a numeric dataset. Five tree-based regression models (decision tree, random forest, gradient boosting machine, extreme gradient boosting, and light gradient boosting machine) are constructed on the de-identification dataset to effectively discover nonlinear relationships between dependent and independent variables in numerical datasets. The most effective model is then selected for personal information data in which pseudonymization is essential for data utilization. Shapley additive explanations, an explainable artificial intelligence technique, is applied to the most effective model to establish pseudonymization policies and to present a machine-learning process that selects an appropriate de-identification methodology.
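A minimal Python sketch of the explainability step: fit a tree-based regressor (LightGBM, one of the five candidates named above) and apply SHAP's TreeExplainer to rank the variables driving its predictions. The placeholder data and model choice are assumptions; the paper selects the best of five models first.

```python
import numpy as np
import shap
from lightgbm import LGBMRegressor

X = np.random.rand(1000, 12)       # placeholder: numeric de-identification data
y = np.random.rand(1000)

model = LGBMRegressor(random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap.summary_plot(shap_values, X)  # global feature-importance view
```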
ABSTRACT
Recently, the electroencephalogram (EEG) signal has shown excellent potential for a new person-identification technique. Several studies have described the EEG as having unique features, universality, and natural robustness, making it a new track for preventing spoofing attacks. EEG signals are recordings of the brain's electrical activity, measured by placing electrodes (channels) at various scalp positions. However, traditional EEG-based systems incur high complexity with many channels, and some channels carry information critical for the identification system while others do not. Several studies have proposed single-objective approaches to EEG channel selection for person identification. Unfortunately, these studies focused only on increasing the accuracy rate without balancing accuracy against the total number of selected EEG channels. The novelty of this paper is a multiobjective binary version of the cuckoo search algorithm (MOBCS-KNN) that finds optimal EEG channel selections for person identification. The proposed method uses a weighted-sum technique to implement the multiobjective approach, with a KNN classifier for EEG-based biometric person identification. It is worth mentioning that this is the first investigation of a multiobjective technique for the EEG channel selection problem. A standard EEG motor imagery dataset is used to evaluate the performance of the MOBCS-KNN. The experiments show that the MOBCS-KNN obtained an accuracy of 93.86% using only 24 sensors with AR20 autoregressive coefficients. Another important point is that the MOBCS-KNN finds channels that are not too close to each other, capturing relevant information from all over the head. In conclusion, the MOBCS-KNN algorithm achieves the best results compared with other metaheuristic algorithms, and the recommended approach can inform future directions in other research areas.
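A minimal Python sketch of the weighted-sum fitness at the core of the multiobjective formulation: it balances KNN classification error against the fraction of channels kept. The alpha weight, KNN settings, and one-feature-per-channel placeholder data are illustrative assumptions (the paper uses AR20 coefficients per channel).

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def fitness(mask, X, y, alpha=0.99):
    """mask: binary vector over EEG channels (1 = channel selected)."""
    if mask.sum() == 0:
        return 1.0                                   # worst possible fitness
    acc = cross_val_score(KNeighborsClassifier(5),
                          X[:, mask.astype(bool)], y, cv=5).mean()
    return alpha * (1 - acc) + (1 - alpha) * mask.sum() / len(mask)

X = np.random.rand(200, 64)            # placeholder: one feature per channel
y = np.random.randint(0, 10, 200)      # placeholder subject labels
print(fitness(np.random.randint(0, 2, 64), X, y))   # cuckoo search minimizes this
```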
Subjects
Brain-Computer Interfaces; Electroencephalography; Algorithms; Delivery of Health Care; Electrodes; Humans
ABSTRACT
The excessive number of COVID-19 cases reported worldwide so far, together with a high rate of false alarms in diagnosis via the conventional polymerase chain reaction method, has led to an increased number of high-resolution computed tomography (CT) examinations. Manual inspection of these, besides being slow, is susceptible to human error, especially because of the uncanny resemblance between CT scans of COVID-19 and those of pneumonia, and therefore demands a proportional increase in the number of expert radiologists. Artificial intelligence-based computer-aided diagnosis of COVID-19 using CT scans has recently been introduced and has proven effective in terms of accuracy and computation time. In this work, a similar framework for classification of COVID-19 using CT scans is proposed. The proposed method comprises four core steps: (i) preparing a database of three classes, namely COVID-19, pneumonia, and normal; (ii) modifying three pretrained deep learning models, namely VGG16, ResNet50, and ResNet101, for the classification of COVID-19-positive scans; (iii) proposing an activation function and improving the firefly algorithm for feature selection; and (iv) fusing the optimal selected features using a descending-order serial approach and classifying them with multiclass supervised learning algorithms. We demonstrate that, when performed on a publicly available dataset, this system attains an improved accuracy of 97.9% with a computational time of almost 34 seconds.
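A minimal Python sketch of the deep-feature extraction and serial fusion (step iv), using two of the three named backbones for brevity; pooled features are concatenated per scan, here in descending order of feature-vector length as one reading of the "descending order serial approach." The firefly selection step is omitted, and the CT batch is a placeholder.

```python
import numpy as np
from tensorflow.keras.applications import VGG16, ResNet50
from tensorflow.keras.applications.vgg16 import preprocess_input as vgg_pre
from tensorflow.keras.applications.resnet50 import preprocess_input as res_pre

scans = np.random.rand(16, 224, 224, 3) * 255     # placeholder CT batch
vgg = VGG16(weights="imagenet", include_top=False, pooling="avg")
res = ResNet50(weights="imagenet", include_top=False, pooling="avg")

f_vgg = vgg.predict(vgg_pre(scans.copy()), verbose=0)   # (16, 512)
f_res = res.predict(res_pre(scans.copy()), verbose=0)   # (16, 2048)
fused = np.concatenate([f_res, f_vgg], axis=1)          # serial fusion, longest first
```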
Subjects
COVID-19; Deep Learning; Artificial Intelligence; Computers; Humans; SARS-CoV-2; Tomography, X-Ray Computed
ABSTRACT
This study presents a precise way to detect the third heart sound (S3), recognized as an important indication of heart failure, based on nonlinear signal decomposition and time-frequency localization. Detection of the S3 is difficult due to its significantly low energy and frequency. Moreover, the detected S3 may be mistaken for an abnormal second heart sound (S2) with a fixed split, an issue not previously addressed in the literature. To detect such an S3, the Hilbert vibration decomposition method is applied to decompose the heart sound into a certain number of subcomponents while preserving the phase information intact. Thus, the time information of all decomposed components is unchanged, which further expedites proper identification and localization of any section of the signal. Next, the proposed localization step is applied to the decomposed subcomponents using the smoothed pseudo Wigner-Ville distribution followed by the reassignment method. Finally, based on the positional information, the S3 is distinguished and confirmed by measuring the time delay between the S2 and S3. In total, 82 sets of cardiac cycles collected from different databases, including the Texas Heart Institute database, are examined to evaluate the proposed method. The result analysis shows that the proposed method detects the S3 correctly provided that its normalized temporal energy is larger than 0.16 and its frequency is above 34 Hz. In a performance analysis, the proposed method achieves an S3 detection accuracy as high as 93.9%, significantly higher than that of other methods. These findings demonstrate the robustness of the proposed idea for detecting a substantially low-energy S3.
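A rough Python sketch of only the final confirmation step: locate candidate sound peaks on the signal envelope and check the S2-to-S3 delay. The Hilbert vibration decomposition and smoothed pseudo Wigner-Ville localization themselves are not reproduced; the sampling rate, file path, peak-detection settings, and 120-180 ms delay window are all assumptions.

```python
import numpy as np
from scipy.signal import hilbert, find_peaks

fs = 2000                                  # assumed sampling rate (Hz)
pcg = np.load("cardiac_cycle.npy")         # placeholder: one decomposed cycle
envelope = np.abs(hilbert(pcg))

peaks, _ = find_peaks(envelope, height=0.1 * envelope.max(),
                      distance=int(0.05 * fs))
# An S3 typically follows S2 by roughly 120-180 ms (assumed window).
for s2, cand in zip(peaks, peaks[1:]):
    delay = (cand - s2) / fs
    if 0.12 <= delay <= 0.18:
        print(f"possible S3 at {cand / fs:.3f} s (delay {delay * 1000:.0f} ms)")
```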