Abstract: The objective is to study an on-line Hidden Markov model (HMM) estimation-based Q-learning algorithm for partially observable Markov decision process (POMDP) on finite state and action sets.
Abstract: In this paper, we aim to design a deep reinforcement learning (DRL) based control solution to navigating a swarm of unmanned aerial vehicles (UAVs) to fly around an unexplored target area ...