Results 1 - 20 of 669
1.
Nano Lett ; 24(23): 7091-7099, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38804877

ABSTRACT

Multimodal perception can capture more precise and comprehensive information than unimodal approaches. However, current sensory systems typically merge multimodal signals at computing terminals after parallel processing and transmission, which risks losing spatial association information and requires time stamps to maintain temporal coherence for time-series data. Here we demonstrate bioinspired in-sensor multimodal fusion, which effectively enhances comprehensive perception and reduces data transfer between sensory terminals and computation units. By adopting floating-gate phototransistors with reconfigurable photoresponse plasticity, we realize agile spatial and spatiotemporal fusion under nonvolatile and volatile photoresponse modes. To realize optimal spatial estimation, we integrate spatial information from visual-tactile signals. For dynamic events, we capture and fuse spatiotemporal information from visual-audio signals in real time, realizing a dance-music synchronization recognition task without a time-stamping process. This in-sensor multimodal fusion approach provides the potential to simplify multimodal integration systems, extending the in-sensor computing paradigm.

2.
Methods ; 218: 94-100, 2023 10.
Article in English | MEDLINE | ID: mdl-37507060

ABSTRACT

In recent years, healthcare data from sources such as clinical institutions, patients, and pharmaceutical industries have become increasingly abundant. However, due to the complexity of the healthcare system and data privacy concerns, aggregating and utilizing these data in a centralized manner can be challenging. Federated learning (FL) has emerged as a promising solution for distributed training in edge computing scenarios, utilizing on-device user data while reducing server costs. In traditional FL, a central server trains a global model on randomly sampled client data and combines the models collected from different clients into one global model. However, for datasets that are not independent and identically distributed (non-i.i.d.), randomly selecting the users that contribute to training is not optimal and can lead to poor model training performance. To address this limitation, we propose the Federated Multi-Center Clustering algorithm (FedMCC) to enhance robustness and accuracy for all clients. FedMCC leverages the Model-Agnostic Meta-Learning (MAML) algorithm, focusing on training a robust base model during the initial training phase and better capturing features from different users. Subsequently, clustering methods ensure that features among users within each cluster are similar, approximating an i.i.d. training process in each round and resulting in more effective training of the global model. We validate the effectiveness and generalizability of FedMCC through extensive experiments on public healthcare datasets. The results demonstrate that FedMCC achieves improved performance and accuracy for all clients while maintaining data privacy and security, showcasing its potential for various healthcare applications.


Subjects
Algorithms, Privacy, Humans, Cluster Analysis
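
To make the cluster-then-aggregate idea concrete, a minimal sketch is shown below; it assumes clients are grouped by the similarity of their flattened model updates and averaged per cluster, which is an illustrative reading of FedMCC rather than the authors' implementation (the MAML warm-up phase is omitted).

```python
# Hypothetical sketch: cluster clients by update similarity, then average per cluster.
import numpy as np
from sklearn.cluster import KMeans

def cluster_and_aggregate(client_updates, n_clusters=3):
    """client_updates: list of 1-D numpy arrays (flattened model deltas)."""
    X = np.stack(client_updates)                      # shape: (n_clients, n_params)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    # One aggregated model per cluster, so training within a cluster is closer to i.i.d.
    cluster_models = {c: X[labels == c].mean(axis=0) for c in range(n_clusters)}
    return labels, cluster_models

# Example with random stand-in updates from 10 clients
rng = np.random.default_rng(0)
updates = [rng.normal(size=100) for _ in range(10)]
labels, models = cluster_and_aggregate(updates)
print(labels, {c: m.shape for c, m in models.items()})
```
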
3.
Sensors (Basel) ; 24(3)2024 Jan 26.
Article in English | MEDLINE | ID: mdl-38339545

ABSTRACT

Myocardial Infarction (MI), commonly known as heart attack, is a cardiac condition characterized by damage to a portion of the heart, specifically the myocardium, due to the disruption of blood flow. Given its recurring and often asymptomatic nature, there is a need for continuous monitoring using wearable devices. This paper proposes a single-microcontroller system designed for the automatic detection of MI based on the edge computing paradigm. Two solutions for MI detection are evaluated, based on Machine Learning (ML) and Deep Learning (DL) techniques. The developed algorithms follow two different approaches currently available in the literature and are optimized for deployment on low-resource hardware. The feasibility of implementing them on a single 32-bit microcontroller with an ARM Cortex-M4 core was assessed, and a comparison in terms of accuracy, inference time, and memory usage is detailed. The ML technique involves significant data processing for feature extraction, coupled with a simpler Neural Network (NN). The DL method, in contrast, employs spectrogram analysis for feature extraction and a Convolutional Neural Network (CNN), with a longer inference time and higher memory utilization. Both methods run on the same low-power hardware, reaching accuracies of 89.40% and 94.76%, respectively. The final prototype is an energy-efficient system capable of real-time detection of MI without the need to connect to remote servers or the cloud. All processing is performed at the edge, enabling NN inference on the same microcontroller.


Subjects
Heart Diseases, Myocardial Infarction, Humans, Myocardial Infarction/diagnosis, Heart, Myocardium, Algorithms
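
As a rough illustration of the DL branch, the sketch below defines a tiny spectrogram CNN and counts its parameters as a proxy for the memory footprint that matters on a Cortex-M4 class target; the input size, layer widths, and class count are assumptions, not the paper's architecture.

```python
# Illustrative sketch (not the paper's model): a tiny CNN over ECG spectrogram patches.
import torch
import torch.nn as nn

class TinySpectrogramCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 8 * 8, n_classes)  # assumes 32x32 input spectrograms

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = TinySpectrogramCNN()
params = sum(p.numel() for p in model.parameters())
print(f"parameters: {params}")        # rough proxy for flash/RAM footprint on a microcontroller
dummy = torch.randn(1, 1, 32, 32)
print(model(dummy).shape)             # torch.Size([1, 2])
```
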
4.
Sensors (Basel) ; 24(7)2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38610489

ABSTRACT

In the mobile edge computing (MEC) environment, edge caching can provide timely data response services for intelligent scenarios. However, due to the limited storage capacity of edge nodes and the possibility of malicious node behavior, selecting which contents to cache and realizing decentralized, secure data caching remain challenging. In this paper, a blockchain-based decentralized and proactive caching strategy is proposed for an MEC environment to address this problem. The novelty is that blockchain is adopted in an MEC environment together with a proactive caching strategy based on node utility, and the corresponding optimization problem is formulated; the blockchain builds a secure and reliable service environment. Methodologically, the optimal caching strategy is obtained using linear relaxation technology and the interior point method. In a content caching system there is a trade-off between cache space and node utility, which the proposed caching strategy addresses. There is also a trade-off between the consensus delay of the blockchain and the caching latency of content; an offline consensus authentication method is adopted to reduce the influence of consensus delay on content caching. The key finding is that the proposed algorithm reduces latency and ensures secure data caching in an IoT environment. Finally, simulation experiments show that the proposed algorithm achieves up to 49.32%, 43.11%, and 34.85% improvements in cache hit rate, average content response latency, and average system utility, respectively, compared with a random content caching algorithm, and up to 9.67%, 8.11%, and 5.95% improvements, respectively, compared with a greedy content caching algorithm.
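
The linear-relaxation step mentioned in the abstract can be pictured as a fractional knapsack over candidate contents; the sketch below solves such a relaxation with an off-the-shelf LP solver under assumed utility and size values, and is not the paper's exact formulation.

```python
# Sketch of the linear-relaxation step only (assumed formulation): choose which contents
# to cache, maximizing total utility subject to the node's storage capacity.
import numpy as np
from scipy.optimize import linprog

utility = np.array([4.0, 2.5, 6.0, 1.5, 3.0])   # assumed per-content utility scores
size = np.array([3.0, 1.0, 4.0, 1.0, 2.0])      # content sizes
capacity = 6.0                                   # edge node storage budget

# linprog minimizes, so negate the utilities; x_i in [0, 1] is the relaxed caching decision.
res = linprog(c=-utility, A_ub=[size], b_ub=[capacity],
              bounds=[(0, 1)] * len(utility), method="highs")
print(res.x)                        # fractional caching decisions
print((res.x > 0.5).astype(int))    # naive rounding back to a 0/1 caching strategy
```
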

5.
Sensors (Basel) ; 24(15)2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39124124

ABSTRACT

A complete low-power, low-cost, and wireless solution for bridge structural health monitoring is presented. The work includes monitoring nodes with a modular hardware design and low power consumption, built around a control and resource management board called CoreBoard and a dedicated sensorization board called SensorBoard. The firmware is designed as a set of parallelized FreeRTOS tasks that manage the hardware resources and implement the Random Decrement Technique to minimize the amount of data to be transmitted securely over the NB-IoT network. The presented solution is validated through the characterization of its energy consumption, which guarantees an autonomy of more than 10 years with a daily 8-minute monitoring period, and through two deployments: a pilot laboratory structure and the Eduardo Torroja bridge in Posadas (Córdoba, Spain). The results are compared with two calibrated commercial systems, obtaining an error lower than 1.72% in modal analysis frequencies. The architecture and the results obtained position the presented design as a new solution in the state of the art and, thanks to its autonomy, low cost, and the graphical device management interface presented, allow its deployment and integration in the current IoT paradigm.
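
A minimal numpy sketch of the Random Decrement Technique the firmware implements is given below; the trigger rule, segment length, and synthetic signal are assumptions chosen only to show how the technique compresses a long record into a short free-decay estimate.

```python
# Minimal sketch of the Random Decrement Technique: average fixed-length segments that all
# start when the signal crosses a trigger level, so the random (forced) part cancels and
# the free-decay signature remains -- far less data to transmit than the raw record.
import numpy as np

def random_decrement(signal, trigger, seg_len):
    starts = np.where((signal[:-1] < trigger) & (signal[1:] >= trigger))[0] + 1
    starts = starts[starts + seg_len <= len(signal)]
    if len(starts) == 0:
        raise ValueError("no trigger crossings found")
    segments = np.stack([signal[s:s + seg_len] for s in starts])
    return segments.mean(axis=0)

# Synthetic example: noisy decaying oscillation standing in for an accelerometer record
t = np.arange(0, 60, 0.01)
x = np.sin(2 * np.pi * 2.5 * t) * np.exp(-0.01 * t) + 0.3 * np.random.randn(len(t))
rd = random_decrement(x, trigger=x.std(), seg_len=500)
print(rd.shape)  # (500,)
```
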

6.
Sensors (Basel) ; 24(13)2024 Jul 02.
Article in English | MEDLINE | ID: mdl-39001087

ABSTRACT

The growing importance of edge and fog computing in the modern IT infrastructure is driven by the rise of decentralized applications. However, resource allocation within these frameworks is challenging due to varying device capabilities and dynamic network conditions. Conventional approaches often result in poor resource use and slowed advancements. This study presents a novel strategy for enhancing resource allocation in edge and fog computing by integrating machine learning with the blockchain for reliable trust management. Our proposed framework, called CyberGuard, leverages the blockchain's inherent immutability and decentralization to establish a trustworthy and transparent network for monitoring and verifying edge and fog computing transactions. CyberGuard combines the Trust2Vec model with conventional machine-learning models like SVM, KNN, and random forests, creating a robust mechanism for assessing trust and security risks. Through detailed optimization and case studies, CyberGuard demonstrates significant improvements in resource allocation efficiency and overall system performance in real-world scenarios. Our results highlight CyberGuard's effectiveness, evidenced by a remarkable accuracy, precision, recall, and F1-score of 98.18%, showcasing the transformative potential of our comprehensive approach in edge and fog computing environments.
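
A hedged sketch of how SVM, KNN, and random-forest votes might be combined for trust scoring is shown below; the features are random stand-ins for Trust2Vec-style embeddings, and the blockchain and Trust2Vec components of CyberGuard are not represented.

```python
# Illustrative sketch: combining SVM, KNN, and random-forest votes for node trust scoring.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 16))                  # stand-in node/transaction embeddings
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # stand-in "trusted / untrusted" labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
ensemble = VotingClassifier(
    estimators=[("svm", SVC(probability=True)),
                ("knn", KNeighborsClassifier(n_neighbors=5)),
                ("rf", RandomForestClassifier(n_estimators=100))],
    voting="soft")
ensemble.fit(X_tr, y_tr)
print("held-out accuracy:", ensemble.score(X_te, y_te))
```
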

7.
Sensors (Basel) ; 24(13)2024 Jul 04.
Article in English | MEDLINE | ID: mdl-39001116

ABSTRACT

This study investigates the dynamic deployment of unmanned aerial vehicles (UAVs) using edge computing in a forest fire scenario. We consider the dynamically changing characteristics of forest fires and the correspondingly varying resource requirements. On this basis, the paper models a two-timescale UAV dynamic deployment scheme that accounts for dynamic changes in both the number and positions of UAVs. On the slow timescale, we use a gated recurrent unit (GRU) to predict the number of future users and determine the number of UAVs based on the resource requirements; UAVs with low energy are replaced accordingly. On the fast timescale, a deep-reinforcement-learning-based UAV position deployment algorithm enables low-latency processing of computational tasks by adjusting UAV positions in real time to meet the ground devices' computational demands. The simulation results demonstrate that the proposed scheme achieves better prediction accuracy, and that the number and positions of UAVs adapt to changes in resource demand while reducing task execution delays.
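
The slow-timescale predictor could look roughly like the sketch below: a GRU mapping a window of past user counts to the next count; the window length, layer sizes, and the rule for converting predicted load into a UAV count are assumptions.

```python
# Rough sketch of a slow-timescale predictor: a GRU that maps a window of past
# user counts to the next count (layer sizes and window length are assumptions).
import torch
import torch.nn as nn

class UserCountGRU(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                # x: (batch, window, 1)
        out, _ = self.gru(x)
        return self.head(out[:, -1, :])  # predicted user count for the next slot

model = UserCountGRU()
window = torch.rand(4, 12, 1) * 50      # 4 samples, 12 past time slots of user counts
pred = model(window)
# The predicted load could then set the UAV count, e.g. ceil(pred / users_per_uav).
print(pred.shape)  # torch.Size([4, 1])
```
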

8.
Sensors (Basel) ; 24(10)2024 May 08.
Article in English | MEDLINE | ID: mdl-38793843

ABSTRACT

Edge computing provides higher computational power and lower transmission latency by offloading tasks to nearby edge nodes with available computational resources, meeting the requirements of time-sensitive and computationally complex tasks. Resource allocation schemes are essential to this process. To allocate resources effectively, metadata must be attached to a task to indicate what kind of resources are needed and how much computational capacity is required. However, these metadata are sensitive and can be exposed to eavesdroppers, which can lead to privacy breaches. In addition, edge nodes are vulnerable to corruption because of their limited cybersecurity defenses, so attackers can easily obtain private end-device information through unprotected metadata or corrupted edge nodes. To address this problem, we propose a metadata-private resource allocation scheme that uses searchable encryption to protect metadata privacy and zero-knowledge proofs to resist semi-malicious edge nodes. We formally prove that the proposed scheme satisfies the required security notions and experimentally demonstrate its effectiveness.
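
One generic way to hide metadata keywords while still letting a scheduler match them is an HMAC-style searchable token, sketched below; this is a textbook construction used for illustration, not the paper's scheme, and it omits the zero-knowledge-proof component entirely.

```python
# Generic sketch of keyword-token matching for metadata privacy (not the paper's scheme):
# the device publishes HMAC tokens of its resource needs; a scheduler can test for a
# keyword it was given a trapdoor for, but learns nothing about other metadata.
import hmac
import hashlib

def make_token(key: bytes, keyword: str) -> bytes:
    return hmac.new(key, keyword.encode(), hashlib.sha256).digest()

device_key = b"per-task secret shared with the authorized scheduler"
task_metadata = ["gpu", "latency-sensitive", "2-cpu-cores"]
encrypted_index = {make_token(device_key, kw) for kw in task_metadata}

# A scheduler holding a trapdoor for "gpu" can test membership without seeing plaintext.
trapdoor = make_token(device_key, "gpu")
print(trapdoor in encrypted_index)                         # True
print(make_token(device_key, "fpga") in encrypted_index)   # False
```
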

9.
Sensors (Basel) ; 24(8)2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38676147

ABSTRACT

This paper focuses on the use of smart manufacturing in lathe-cutting tool machines, which can experience thermal deformation during long-term processing, leading to displacement errors in the cutting head and damage to the final product. The study uses time-series thermal compensation to develop a predictive system for thermal displacement in machine tools that is applicable in industry using edge computing technology. Two experiments were carried out to optimize the temperature prediction models and predict the displacement of five axes at the measured temperature points. First, possible variances in the time-series data are examined, based on the data obtained for changes in time, speed, torque, and temperature at various locations of the machine tool. Using the viable machine-learning models identified, the study then examines various cutting settings, temperature points, and machine speeds to forecast the future five-axis displacement. Second, to verify the precision of the models created in the initial phase, other time-series models are examined and trained, and their effectiveness is compared with the models from the first phase. In total, seven models were trained: WNN, LSTNet, TPA-LSTM, XGBoost, BiLSTM, CNN, and GA-LSTM. The study found that the GA-LSTM model outperforms the three next-best models (LSTM, GRU, and XGBoost) with an average precision greater than 90%. Based on the analysis of training time and model precision, the study concluded that a system using LSTM, GRU, and XGBoost should be designed and applied for thermal compensation using edge devices such as the Raspberry Pi.

10.
Sensors (Basel) ; 24(8)2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38676197

ABSTRACT

Federated learning (FL) in mobile edge computing has emerged as a promising machine-learning paradigm in the Internet of Things, enabling distributed training without exposing private data. It allows multiple mobile devices (MDs) to collaboratively create a global model. FL not only addresses the issue of private data exposure but also alleviates the burden on a centralized server, which is common in conventional centralized learning. However, a critical issue in FL is the computational load that local training imposes on MDs, which often have limited computing capabilities. This limitation makes it challenging for MDs to contribute actively to the training process. To tackle this problem, this paper proposes an adaptive dataset management (ADM) scheme that aims to reduce the burden of local training on MDs. Through an empirical study on the influence of dataset size on accuracy improvement over communication rounds, we confirm that dataset size has a diminishing impact on accuracy gain. Based on this finding, we introduce a discount factor that represents this reduced impact of dataset size on accuracy gain over communication rounds. A theoretical framework is then presented for the ADM problem, which involves determining how much the dataset should be reduced across classes while considering both the proposed discount factor and the Kullback-Leibler divergence (KLD). The ADM problem is a non-convex optimization problem; to solve it, we propose a greedy heuristic algorithm that determines a suboptimal solution with low complexity. Simulation results demonstrate that our proposed scheme effectively alleviates the training burden on MDs while maintaining acceptable training accuracy.
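
The two quantities the ADM scheme trades off can be sketched as below: an assumed geometric discount on the value of extra data over rounds, and the KLD between the reduced and original class distributions; the paper's actual greedy rule may differ.

```python
# Sketch of the two ingredients the ADM scheme balances (formulas are assumptions):
# a discount factor that shrinks the value of extra data over rounds, and the KL
# divergence between the reduced class distribution and the original one.
import numpy as np
from scipy.stats import entropy

def discounted_gain(round_idx, gamma=0.9):
    # Assumed form: the accuracy gain from extra data decays geometrically with rounds.
    return gamma ** round_idx

def kld_after_reduction(class_counts, keep_fraction):
    original = class_counts / class_counts.sum()
    reduced_counts = class_counts * keep_fraction
    reduced = reduced_counts / reduced_counts.sum()
    return entropy(reduced, original)   # KL(reduced || original)

counts = np.array([400, 300, 200, 100])          # per-class samples on one device
keep = np.array([0.5, 0.6, 0.8, 1.0])            # candidate per-class keep fractions
print(kld_after_reduction(counts, keep))          # distribution distortion introduced
print([discounted_gain(r) for r in range(5)])     # how much extra data is "worth" per round
```
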

11.
Sensors (Basel) ; 24(6)2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38544101

ABSTRACT

Recently, the integration of unmanned aerial vehicles (UAVs) with edge computing has emerged as a promising paradigm for providing computational support to Internet of Things (IoT) applications in remote, disaster-stricken, and maritime areas. In UAV-aided edge computing, the offloading decision plays a central role in optimizing overall system performance, and the UAV trajectory directly affects that decision. In general, ground IoT devices offload computation-intensive tasks to UAV-aided edge servers, and the UAVs plan their trajectories based on the task generation rate. Researchers are therefore attempting to optimize the offloading decision jointly with the trajectory, and numerous studies are examining the impact of the trajectory on offloading decisions. In this survey, we review existing trajectory-aware offloading decision techniques, focusing on design concepts, operational features, and outstanding characteristics. They are then compared in terms of design principles and operational characteristics. Open issues and research challenges are discussed, along with future directions.

12.
Sensors (Basel) ; 24(6)2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38544005

ABSTRACT

With the development of the Internet of Things (IoT) technology, massive amounts of sensor data in applications such as fire monitoring need to be transmitted to edge servers for timely processing. However, there is an energy-hole phenomenon in transmitting data only through terrestrial multi-hop networks. In this study, we focus on the data collection task in an unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) network, where a UAV is deployed as the mobile data collector for the ground sensor nodes (SNs) to ensure high information freshness. Meanwhile, the UAV is equipped with an edge server for data caching. We first establish a rigorous mathematical model in which the age of information (AoI) is used as a measure of information freshness, related to both the data collection time and the UAV's flight time. Then a mixed-integer non-convex optimization problem is formulated to minimize the peak AoI of the collected data. To solve the problem efficiently, we propose an iterative two-step algorithm named the AoI-minimized association and trajectory planning (AoI-MATP) algorithm. In each iteration, the optimal SN-collection point (CP) associations and CP locations for the parameter ε are first obtained by the affinity propagation clustering algorithm. The optimal UAV trajectory is found using an improved elite genetic algorithm. Simulation results show that based on the optimized ε, the AoI-MATP algorithm can achieve a balance between data collection time and flight time, reducing the peak AoI of the collected data.
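
The first step of each AoI-MATP iteration, as described, can be sketched with an off-the-shelf affinity propagation clustering of sensor-node positions; the coordinates and clustering settings below are assumptions, and the genetic-algorithm trajectory step is only referenced in a comment.

```python
# Sketch of the association step: affinity propagation groups ground sensor nodes, and
# each exemplar serves as a candidate collection point (CP) for the UAV to visit.
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(1)
sn_positions = rng.uniform(0, 1000, size=(40, 2))   # 40 sensor nodes in a 1 km x 1 km area

ap = AffinityPropagation(damping=0.9, random_state=0).fit(sn_positions)
cp_locations = ap.cluster_centers_      # exemplar SNs acting as collection points
associations = ap.labels_               # SN -> CP assignment
print(len(cp_locations), "collection points")
# A UAV trajectory (e.g., from a genetic algorithm) would then visit cp_locations in order.
```
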

13.
Sensors (Basel) ; 24(6)2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38544128

ABSTRACT

With the exponential growth of wireless devices and the demand for real-time processing, traditional server architectures face challenges in meeting ever-increasing computational requirements. This paper proposes a collaborative edge computing framework to offload and process tasks efficiently in such environments. By equipping a moving unmanned aerial vehicle (UAV) as the mobile edge computing (MEC) server, the proposed architecture aims to relieve the burden on roadside unit (RSU) servers. Specifically, we propose a two-layer edge intelligence scheme to allocate network computing resources. The first layer intelligently offloads and allocates tasks generated by wireless devices in the vehicular system, and the second layer uses a partially observable stochastic game (POSG), solved by dueling deep Q-learning, to allocate the computing resources of each processing node (PN) to different tasks. Meanwhile, we propose a weighted position optimization algorithm for the UAV's movement in the system to facilitate task offloading and task processing. Simulation results demonstrate the improved performance of the proposed scheme.
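
The dueling Q-network used by the second layer's resource-allocation agent typically splits into value and advantage streams; the sketch below shows that standard construction with arbitrary sizes, not the paper's exact network.

```python
# Standard dueling Q-network head (sizes are arbitrary), of the kind a resource-allocation
# agent could use: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state value V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # per-action advantage A(s, a)

    def forward(self, state):
        h = self.backbone(state)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=1, keepdim=True)

qnet = DuelingQNet(state_dim=10, n_actions=4)   # e.g., 4 discrete PN allocation choices
q_values = qnet(torch.randn(2, 10))
print(q_values.argmax(dim=1))                   # greedy allocation action per observed state
```
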

14.
Sensors (Basel) ; 24(5)2024 Feb 20.
Article in English | MEDLINE | ID: mdl-38474892

ABSTRACT

This paper describes the design and optimization of a smart algorithm based on artificial intelligence to increase the accuracy of an ocean water current meter. The main purpose of water current meters is to obtain the fundamental frequency of the ocean waves and currents. The limiting factor in such underwater applications is power consumption, which is why only ultra-low-power microcontrollers are used. Moreover, current extraction algorithms assume that the processed signal is defined within a fixed bandwidth. In our approach, which belongs to the edge computing research area, we use a deep neural network to determine the narrow bandwidth for filtering the fundamental frequency of the ocean waves and currents on board the instrument. The proposed solution is implemented on an 8 MHz ARM Cortex-M0+ microcontroller without a floating-point unit, requiring only 9.54 ms in the worst case for the deep-neural-network-based solution. Compared with a greedy algorithm in terms of computational effort, our worst-case approach is 1.81 times faster than a fast Fourier transform with a length of 32 samples, and the proposed solution is 2.33 times better when an artificial neural network approach is adopted.
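
Once a network has predicted a narrow band, the fundamental can be estimated by evaluating the DFT only inside that band; the numpy sketch below shows such a band-limited search, with the band hard-coded as a stand-in for the network's output and the sampling rate assumed.

```python
# Sketch of a band-limited frequency search: evaluate the DFT only inside the narrow
# band a neural network is assumed to have predicted, instead of computing a full-band FFT.
import numpy as np

def band_limited_peak(x, fs, f_lo, f_hi, n_bins=16):
    n = np.arange(len(x))
    freqs = np.linspace(f_lo, f_hi, n_bins)
    # Direct DFT only at the candidate frequencies inside the predicted band
    power = np.abs(np.exp(-2j * np.pi * np.outer(freqs, n) / fs) @ x)
    return freqs[np.argmax(power)]

fs = 8.0                                   # Hz, assumed wave-sensor sampling rate
t = np.arange(0, 64) / fs
x = np.sin(2 * np.pi * 0.31 * t) + 0.2 * np.random.randn(len(t))
print(band_limited_peak(x, fs, f_lo=0.2, f_hi=0.4))   # ~0.31 Hz fundamental
```
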

15.
Sensors (Basel) ; 24(5)2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38475104

ABSTRACT

The effects of climate change and the rapid growth of societies often lead to water scarcity and inadequate water quality, resulting in a significant number of diseases. The digitalization of infrastructure and the use of Digital Twins are presented as alternatives for optimizing resources and the necessary infrastructure in the water cycle. This paper presents a framework for the development of a Digital Twin platform for a wastewater treatment plant, based on a microservices architecture whose design is optimized for edge computing deployment. The platform aims to optimize the operation and maintenance processes of the plant's systems by employing machine learning techniques, process modeling and simulation, and by leveraging the information contained in BIM models to support decision-making.

16.
Sensors (Basel) ; 24(7)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38610369

ABSTRACT

Video surveillance systems are integral to bolstering safety and security across multiple settings. With the advent of deep learning (DL), a specialization within machine learning (ML), these systems have been significantly augmented to facilitate DL-based video surveillance services with notable precision. Nevertheless, DL-based video surveillance services, which require tracking object movement and motion (e.g., to identify unusual object behaviors), can demand a significant portion of computational and memory resources, including GPU computing power for model inference and GPU memory for model loading. To tackle these computational demands, this study introduces a novel video surveillance management system designed to optimize operational efficiency. At its core, the system is built on a two-tiered edge computing architecture (i.e., client and server connected through socket transmission). In this architecture, the primary edge (i.e., the client side) handles initial processing tasks, such as object detection, and is connected via a Universal Serial Bus (USB) cable to the Closed-Circuit Television (CCTV) camera, directly at the source of the video feed. This immediate processing reduces data transfer latency by detecting objects in real time. Meanwhile, the secondary edge (i.e., the server side) hosts a dynamic threshold-control module for releasing DL models, reducing needless GPU usage. This module is a novel addition that dynamically adjusts the threshold time after which a DL model is released. By dynamically optimizing this threshold, the system can manage GPU usage effectively, ensuring resources are allocated efficiently. Moreover, we utilize federated learning (FL) to streamline the training of a Long Short-Term Memory (LSTM) network for predicting imminent object appearances by amalgamating data from diverse camera sources while ensuring data privacy and optimized resource allocation. Furthermore, in contrast to the static threshold values or moving-average techniques used in previous approaches, we employ a Deep Q-Network (DQN) methodology to manage threshold values dynamically. This approach efficiently balances the trade-off between GPU memory conservation and DL-model reloading latency, enabled by incorporating LSTM-derived predictions as inputs to determine the optimal timing for releasing the DL model. The results highlight the potential of our approach to significantly improve the efficiency and effective usage of computational resources in video surveillance systems, opening the door to enhanced security in various domains.
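
The interaction between the LSTM prediction and the release threshold can be pictured with the small controller below; the DQN-tuned threshold is replaced by a fixed stand-in value, and the reload heuristic is an assumption.

```python
# Simplified sketch of the secondary edge's release logic: unload the detection model
# from GPU memory when no object is expected before the (DQN-tuned) threshold expires.
# Here the threshold is a fixed stand-in value instead of a learned one.
class ModelReleaseController:
    def __init__(self, release_threshold_s=30.0):
        self.release_threshold_s = release_threshold_s
        self.model_loaded = True

    def step(self, seconds_since_last_object, predicted_next_appearance_s):
        idle_long_enough = seconds_since_last_object > self.release_threshold_s
        nothing_expected_soon = predicted_next_appearance_s > self.release_threshold_s
        if self.model_loaded and idle_long_enough and nothing_expected_soon:
            self.model_loaded = False          # release GPU memory
        elif not self.model_loaded and predicted_next_appearance_s < 5.0:
            self.model_loaded = True           # reload ahead of the predicted appearance
        return self.model_loaded

ctrl = ModelReleaseController()
print(ctrl.step(seconds_since_last_object=45, predicted_next_appearance_s=120))  # False
print(ctrl.step(seconds_since_last_object=46, predicted_next_appearance_s=3))    # True
```
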

17.
Sensors (Basel) ; 24(7)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38610415

ABSTRACT

In Vehicular Edge Computing Network (VECN) scenarios, vehicle mobility causes uncertainty in channel state information, which makes it difficult to guarantee Quality of Service (QoS) during computation offloading and the resource allocation of a Vehicular Edge Computing Server (VECS). A multi-user computation offloading and resource allocation optimization model, together with a computation offloading and resource allocation algorithm based on the Deep Deterministic Policy Gradient (DDPG), is proposed to address this problem. First, the problem is modeled as a Mixed Integer Nonlinear Programming (MINLP) problem with the objective of minimizing the total system delay. Then, in response to the large state space and the coexistence of discrete and continuous variables in the action space, a reinforcement learning algorithm based on DDPG is proposed. Finally, the proposed method is used to solve the problem and is compared with three benchmark schemes. Compared with the baseline algorithms, the proposed scheme can effectively select the task offloading mode, reasonably allocate VECS computing resources, and ensure the QoS of task execution, while exhibiting stability and scalability. Simulation results show that the total completion time of the proposed scheme can be reduced by 24-29% compared with existing state-of-the-art techniques.

18.
Sensors (Basel) ; 24(9)2024 Apr 27.
Article in English | MEDLINE | ID: mdl-38732905

ABSTRACT

High-pressure pipelines are critical for transporting hazardous materials over long distances, but they face threats from third-party interference activities. Preventive measures are implemented, but interference accidents can still occur, making high-quality detection strategies vital. This paper proposes an end-to-end Artificial Intelligence of Things (AIoT) solution to detect potential interference threats in real time. The solution involves a smart visual sensor capable of processing images using state-of-the-art computer vision algorithms and transmitting alerts to pipeline operators in real time. The system's core is an object-detection model (e.g., You Only Look Once version 4 (YOLOv4) and DETR with Improved deNoising anchOr boxes (DINO)), trained on a custom Pipeline Visual Threat Assessment (Pipe-VisTA) dataset. Among the trained models, DINO achieved the best Mean Average Precision (mAP) of 71.2% on the unseen test dataset. However, for deployment on an edge computer with limited computational capability (the NVIDIA Jetson Nano), the simpler, TensorRT-optimized YOLOv4 model was used, which achieved a mAP of 61.8% on the test dataset. The developed AIoT device captures an image with a camera, processes it at the edge using the trained YOLOv4 model to detect potential threats, transmits threat alerts to a Fleet Portal via LoRaWAN, and hosts the alerts on a dashboard via a satellite network. The device was fully field-tested to ensure its functionality prior to deployment for the SEA Gas use case. The AIoT smart solution has been deployed across a 10 km stretch of the SEA Gas pipeline in the Murray Bridge section; in total, 48 AIoT devices and three Fleet Portals are installed to ensure line-of-sight communication between the devices and portals.
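
The per-frame flow on the primary edge device can be sketched as below; run_detector() and send_lorawan_alert() are placeholders for the TensorRT-optimized YOLOv4 inference and the LoRaWAN uplink, both of which depend on hardware-specific SDKs not shown here.

```python
# Skeleton of the on-device loop described in the abstract; run_detector() and
# send_lorawan_alert() are placeholders for the TensorRT YOLOv4 inference and the
# LoRaWAN uplink, which rely on hardware-specific SDKs not represented here.
import cv2

THREAT_CLASSES = {"excavator", "drill_rig", "truck"}   # assumed interference classes

def run_detector(frame):
    """Placeholder: return a list of (class_name, confidence) detections."""
    return []

def send_lorawan_alert(detections):
    """Placeholder: push a compact alert message to the Fleet Portal."""
    print("ALERT:", detections)

cap = cv2.VideoCapture(0)                 # USB camera attached to the edge computer
for _ in range(1000):                     # bounded loop for the sketch; a real device runs forever
    ok, frame = cap.read()
    if not ok:
        break
    hits = [(c, p) for c, p in run_detector(frame) if c in THREAT_CLASSES and p > 0.5]
    if hits:
        send_lorawan_alert(hits)          # only alerts leave the device, not raw video
cap.release()
```
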

19.
Sensors (Basel) ; 24(9)2024 May 01.
Article in English | MEDLINE | ID: mdl-38733003

ABSTRACT

In the context of the rapid development of the Internet of Vehicles, virtual reality, autonomous driving, and the industrial Internet, the number of terminal devices in the network is growing explosively. As a result, more and more information is generated at the edge of the network, dramatically increasing the data throughput of the mobile communication network. As a key technology of the fifth-generation mobile communication network, mobile edge caching stores popular data at edge servers deployed at the network edge, avoiding the data transmission delay of the backhaul link and the occurrence of network congestion. With the growing scale of the network, however, distributing hot data from cloud servers to edge servers generates huge energy consumption. To support the green and sustainable development of the communication industry and reduce the energy consumed in distributing the data to be cached at edge servers, we make the first attempt to propose and solve the problem of edge caching data distribution with minimum energy consumption (ECDDMEC). We first model and formulate the problem as a constrained optimization problem and prove its NP-hardness. Subsequently, we design a greedy algorithm with a computational complexity of O(n²) to solve the problem approximately. Experimental results show that, compared with the strategy in which each edge server requests data directly from the cloud server, the strategy obtained by the algorithm significantly reduces the energy consumption of data distribution.
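
The O(n²) greedy can be read as a Prim-style construction in which each uncached edge server fetches the content from whichever already-cached node (or the cloud) costs the least energy; the sketch below follows that assumed reading, not necessarily the authors' exact algorithm.

```python
# Prim-style greedy sketch of the assumed ECDDMEC idea: each uncached edge server fetches
# the content from whichever source (cloud or an already-cached server) costs the least,
# keeping a running best-cost array so the whole construction stays O(n^2).
def greedy_distribution(cloud_cost, server_cost):
    """cloud_cost[i]: energy cloud -> server i; server_cost[j][i]: energy server j -> i."""
    n = len(cloud_cost)
    best_cost = list(cloud_cost)          # cheapest known source for each server
    best_src = ["cloud"] * n
    cached, plan, total = set(), [], 0.0
    for _ in range(n):
        i = min((k for k in range(n) if k not in cached), key=lambda k: best_cost[k])
        cached.add(i); plan.append((best_src[i], i)); total += best_cost[i]
        for k in range(n):                # newly cached server i may be a cheaper source
            if k not in cached and server_cost[i][k] < best_cost[k]:
                best_cost[k], best_src[k] = server_cost[i][k], f"server{i}"
    return plan, total

cloud = [10.0, 12.0, 9.0]
s2s = [[0.0, 2.0, 5.0],
       [2.0, 0.0, 3.0],
       [5.0, 3.0, 0.0]]
print(greedy_distribution(cloud, s2s))   # e.g. fetch to server 2 from the cloud, then 2->1, 1->0
```
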

20.
Sensors (Basel) ; 24(15)2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39123922

ABSTRACT

Interest in deploying deep reinforcement learning (DRL) models on low-power edge devices, such as Autonomous Mobile Robots (AMRs) and Internet of Things (IoT) devices, has risen significantly, because performing inference locally eliminates the latency and reliability issues of wireless communication and offers the privacy benefits of processing data on-device. Deploying such energy-intensive models on power-constrained devices is not always feasible, however, which has led to the development of model compression techniques that reduce the size and computational complexity of DRL policies. Policy distillation, the most popular of these methods, first lowers the number of network parameters by transferring the behavior of a large teacher network to a smaller student model before deploying the students at the edge. This works well with deterministic policies that operate over discrete actions. However, many power-constrained real-world tasks, such as those in robotics, are formulated with continuous action spaces, which standard policy distillation does not support. In this work, we extend the policy distillation method to support the compression of DRL models designed to solve these continuous control tasks, with an emphasis on maintaining the stochastic nature of continuous DRL algorithms. Experiments show that our methods can compress such policies by up to 750% while maintaining, or even exceeding by up to 41%, their teacher's performance on two popular continuous control tasks.
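
For continuous actions, the student can be trained to match the teacher's Gaussian action distribution rather than a discrete argmax; the sketch below shows a per-state KL objective under that assumption, which is an illustrative loss form rather than the authors' exact objective.

```python
# Minimal sketch of a continuous-action distillation loss: match the student's Gaussian
# policy to the teacher's per-state Gaussian via KL divergence (assumed objective form).
import torch
from torch.distributions import Normal, kl_divergence

def distillation_loss(teacher_mu, teacher_std, student_mu, student_std):
    teacher = Normal(teacher_mu, teacher_std)    # teacher tensors are detached by the caller
    student = Normal(student_mu, student_std)
    return kl_divergence(teacher, student).sum(dim=-1).mean()

# Stand-in batch: 8 states, 2-dimensional continuous action (e.g., torque commands)
teacher_mu, teacher_std = torch.randn(8, 2), torch.full((8, 2), 0.3)
student_mu = torch.randn(8, 2, requires_grad=True)
student_log_std = torch.zeros(8, 2, requires_grad=True)

loss = distillation_loss(teacher_mu.detach(), teacher_std, student_mu, student_log_std.exp())
loss.backward()                      # gradients flow into the smaller student network only
print(float(loss))
```
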
