ABSTRACT
Software-defined networking (SDN) has become one of the critical technologies for data center networks, as it can improve network performance from a global perspective using artificial intelligence algorithms. Owing to its strong decision-making and generalization abilities, deep reinforcement learning (DRL) has been used in SDN intelligent routing and scheduling mechanisms. However, traditional DRL algorithms suffer from slow convergence and instability, resulting in poor network quality of service (QoS) for an extended period before convergence. To address these problems, we propose an automatic QoS architecture based on multistep DRL (AQMDRL) to optimize the QoS performance of SDN. AQMDRL uses a multistep approach to solve the overestimation and underestimation problems of the deep deterministic policy gradient (DDPG) algorithm. The multistep approach replaces the one-step Q-value function with the maximum value over the n-step action values currently estimated by the neural network; this reduces the possibility of positive errors in the Q-value function and effectively improves convergence stability. In addition, we adapt a prioritized experience sampling mechanism based on SumTree binary trees to improve the convergence rate of the multistep DDPG algorithm. Our experiments show that the proposed AQMDRL significantly improves convergence performance and effectively reduces the network transmission delay of SDN compared with existing DRL algorithms.
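As a hedged illustration only (the abstract does not give the exact formulation), a common n-step bootstrapped target for a DDPG-style critic, written in standard notation with discount factor $\gamma$, target critic $Q'$, and target actor $\mu'$, looks as follows; the multistep variant described above would take the maximum of such targets over the considered step counts.

```latex
% Illustrative n-step target under standard DDPG notation (an assumption,
% not necessarily the paper's exact definition).
y_t^{(n)} = \sum_{k=0}^{n-1} \gamma^{k}\, r_{t+k}
          + \gamma^{n}\, Q'\!\bigl(s_{t+n},\, \mu'(s_{t+n})\bigr)
```

Accumulating several observed rewards before bootstrapping reduces the reliance on a single, possibly over- or underestimated Q-value estimate, which is consistent with the stability argument made in the abstract.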
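The prioritized sampling component is likewise not detailed here. Below is a minimal sketch, assuming a standard proportional prioritization scheme, of how a SumTree supports O(log n) priority updates and priority-proportional sampling; the class and method names are hypothetical, not taken from the paper.

```python
import random

class SumTree:
    """Minimal sum-tree for proportional prioritized experience sampling.

    Internal nodes store the sum of their children's priorities, so sampling
    a transition with probability proportional to its priority is O(log n).
    """

    def __init__(self, capacity):
        self.capacity = capacity              # max number of stored transitions
        self.tree = [0.0] * (2 * capacity)    # binary tree stored as a flat array
        self.data = [None] * capacity         # circular buffer of transitions
        self.next_idx = 0

    def add(self, priority, transition):
        # Store the transition and set its priority (overwrites oldest when full).
        idx = self.next_idx
        self.data[idx] = transition
        self.update(idx, priority)
        self.next_idx = (idx + 1) % self.capacity

    def update(self, idx, priority):
        # Update the leaf and propagate the new sums up to the root.
        pos = idx + self.capacity
        self.tree[pos] = priority
        pos //= 2
        while pos >= 1:
            self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]
            pos //= 2

    def total(self):
        return self.tree[1]                   # root holds the total priority mass

    def sample(self):
        # Draw a uniform prefix sum and descend to the matching leaf.
        s = random.uniform(0.0, self.total())
        pos = 1
        while pos < self.capacity:            # descend until a leaf is reached
            left = 2 * pos
            if s <= self.tree[left]:
                pos = left
            else:
                s -= self.tree[left]
                pos = left + 1
        idx = pos - self.capacity
        return idx, self.tree[pos], self.data[idx]
```

Because each draw walks one root-to-leaf path, transitions with larger priorities (e.g., larger temporal-difference errors) are replayed more often while updates remain logarithmic in the buffer size, which is the usual motivation for pairing a SumTree with DDPG-style training.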