ABSTRACT
In this paper, we investigate how to efficiently utilize channel bandwidth in heterogeneous hybrid optical and acoustic underwater sensor networks, where sensor nodes adopt different Media Access Control (MAC) protocols to transmit data packets to a common relay node over optical or acoustic channels. We propose a new MAC protocol based on deep reinforcement learning (DRL), referred to as optical and acoustic dual-channel deep-reinforcement learning multiple access (OA-DLMA), in which the sensor nodes running the OA-DLMA protocol are called agents and the remainder are non-agents. The agents can learn the transmission patterns of coexisting non-agents and find an optimal channel access strategy without any prior information. Moreover, to further enhance network performance, we develop a differentiated reward policy that rewards specific actions on the optical and acoustic channels differently, with priority compensation given to the optical channel to achieve higher data transmission. Furthermore, we derive the optimal short-term sum throughput and channel utilization analytically and conduct extensive simulations to evaluate the OA-DLMA protocol. Simulation results show that our protocol achieves near-optimal performance and significantly outperforms other existing protocols in terms of short-term sum throughput and channel utilization.
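The abstract gives no reward values or network architecture, so the following is a minimal, hypothetical Python sketch of the differentiated reward policy it describes, with a tabular epsilon-greedy Q-learning update standing in for the paper's deep network. All action names, reward constants, and hyperparameters are assumptions for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical constants: the abstract says successful optical transmissions
# receive "priority compensation" over acoustic ones, but gives no numbers,
# so these values are assumptions.
ACTIONS = ("wait", "tx_acoustic", "tx_optical")
R_OPTICAL_SUCCESS = 2.0   # assumed priority compensation for the optical channel
R_ACOUSTIC_SUCCESS = 1.0
R_COLLISION = -1.0
R_WAIT = 0.0
ALPHA, GAMMA_RL, EPSILON = 0.1, 0.9, 0.1  # assumed learning hyperparameters

def differentiated_reward(action: str, success: bool) -> float:
    """Reward actions on the two channels differently, favoring optical."""
    if action == "wait":
        return R_WAIT
    if not success:
        return R_COLLISION
    return R_OPTICAL_SUCCESS if action == "tx_optical" else R_ACOUSTIC_SUCCESS

# Tabular Q-table as a stand-in for the paper's DQN; the state would be the
# agent's recent history of (action, observation) pairs.
q_table = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def choose_action(state) -> str:
    """Epsilon-greedy channel access: explore occasionally, otherwise
    pick the action with the highest estimated value."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(q_table[state], key=q_table[state].get)

def update(state, action, reward, next_state) -> None:
    """One-step Q-learning update toward reward + discounted next value."""
    best_next = max(q_table[next_state].values())
    q_table[state][action] += ALPHA * (
        reward + GAMMA_RL * best_next - q_table[state][action]
    )
```

Because `R_OPTICAL_SUCCESS > R_ACOUSTIC_SUCCESS`, the learned values steer agents toward the optical channel whenever it is likely to succeed, which is the intended effect of the priority compensation described above.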
Subject(s)
Acoustics, Computer Communication Networks, Computer Simulation
ABSTRACT
Underwater Wireless Sensor Networks (UWSNs) have recently aroused increasing interest among researchers in industry, the military, commerce, and academia. Due to the harsh underwater environment, energy efficiency is a significant issue that must be considered when routing in UWSNs. Underwater positioning is also a particularly tricky task because of the high attenuation of radio-frequency signals underwater. In this paper, we propose an energy-efficient depth-based opportunistic routing algorithm with Q-learning (EDORQ) for UWSNs to guarantee energy-saving and reliable data transmission. It combines the respective advantages of the Q-learning technique and opportunistic routing (OR) without requiring full-dimensional location information, improving network performance in terms of energy consumption, average network overhead, and packet delivery ratio. In EDORQ, the void detection factor, residual energy, and depth information of candidate nodes are jointly considered when defining the Q-value function, which helps proactively detect void nodes in advance while also reducing energy consumption. In addition, a simple and scalable void-node recovery mode is proposed for candidate-set selection so as to rescue packets stuck at void nodes. Furthermore, we design a novel method that sets the holding time for scheduling packet forwarding based on the Q-value, so as to alleviate packet collisions and redundant transmissions. We conduct extensive simulations on the Aqua-Sim platform (NS-2) to evaluate the performance of the proposed algorithm and compare it with three other routing algorithms. The results show that the proposed algorithm significantly improves performance in terms of energy efficiency, packet delivery ratio, and average network overhead without sacrificing too much average packet delay.
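The abstract names the three ingredients of the Q-value (void detection factor, residual energy, depth) and says holding time is set from the Q-value, but gives no formulas. The sketch below is a hypothetical Python rendering of that scheme; the weighted-sum form, the weights, and the linear holding-time rule are all assumptions.

```python
# Minimal sketch of EDORQ-style candidate scoring and holding-time scheduling.
ALPHA, BETA, GAMMA = 0.4, 0.3, 0.3   # assumed weights, summing to 1
MAX_HOLD_TIME = 1.0                  # assumed maximum holding time (seconds)

def q_value(void_factor: float, residual_energy: float, depth_gain: float) -> float:
    """Score a candidate forwarder from normalized inputs in [0, 1]:
    void_factor     -> 1.0 means no void risk detected downstream
    residual_energy -> fraction of the node's initial energy remaining
    depth_gain      -> normalized depth advance toward the surface sink
    """
    return ALPHA * void_factor + BETA * residual_energy + GAMMA * depth_gain

def holding_time(q: float, max_hold: float = MAX_HOLD_TIME) -> float:
    """Schedule forwarding by Q-value: the best candidate waits least,
    forwards first, and its transmission suppresses the other candidates,
    reducing packet collisions and redundant transmissions."""
    return max_hold * (1.0 - q)

# Example: a candidate with no void risk, 80% residual energy, and good
# depth gain gets a high Q-value and therefore a short holding time.
q = q_value(void_factor=1.0, residual_energy=0.8, depth_gain=0.9)
print(q, holding_time(q))   # 0.91 0.09 (approximately)
```

Folding the void detection factor into the score is what lets the protocol avoid void nodes proactively: a candidate with downstream void risk scores low, waits longer, and is preempted by safer forwarders.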