Dynamic SDN-Based Radio Access Network Slicing With Deep Reinforcement Learning for URLLC and eMBB Services

Title	Dynamic SDN-Based Radio Access Network Slicing With Deep Reinforcement Learning for URLLC and eMBB Services
Publication Type	Journal Article
Year of Publication	2022
Authors	Filali, A, Mlika, Z, Cherkaoui, S, Kobbane, A
Journal	IEEE Transactions on Network Science and Engineering
Volume	9
Pagination	2174-2187
Keywords	5G mobile communication systems, Deep learning, EMBB, Heuristic algorithms, Heuristics algorithm, Learning algorithms, Low-latency communication, Markov processes, Multi agent systems, Network slicing, Optimisations, Optimization, Quality of service, Quality-of-service, Radio access networks, Reinforcement learning, Resource allocation, Resource Management, Software agents, Software defined networking, Software-defined networkings, Time measurement, Ultra reliable low latency communication, URLLC
Abstract	Radio access network (RAN) slicing is a key technology that enables 5G networks to support the heterogeneous requirements of generic services, namely ultra-reliable low-latency communication (URLLC) and enhanced mobile broadband (eMBB). In this paper, we propose a two-time-scale RAN slicing mechanism to optimize the performance of URLLC and eMBB services. On the large time scale, an SDN controller allocates radio resources to gNodeBs according to the requirements of the eMBB and URLLC services. On the short time scale, each gNodeB allocates its available resources to its end users and requests, if needed, additional resources from adjacent gNodeBs. We formulate this problem as a non-linear binary program and prove its NP-hardness. Next, for each time scale, we model the problem as a Markov decision process (MDP): the large time scale is modeled as a single-agent MDP, whereas the short time scale is modeled as a multi-agent MDP. We leverage the exponential-weight algorithm for exploration and exploitation (EXP3) to solve the single-agent MDP of the large time scale and the multi-agent deep Q-learning (DQL) algorithm to solve the multi-agent MDP of the short-time-scale resource allocation. Extensive simulations show that our approach is efficient under different network parameter configurations and that it outperforms recent benchmark solutions. © 2013 IEEE.
URL	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85126273319&doi=10.1109%2fTNSE.2022.3157274&partnerID=40&md5=44d6a41c3d7b8b6d8c4d6918856f36c5
DOI	10.1109/TNSE.2022.3157274
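
The abstract names EXP3 as the learner used on the large time scale. As a rough illustration only, the sketch below shows the textbook EXP3 update for an adversarial multi-armed bandit; it is not the paper's formulation. The class name `Exp3`, the parameter `gamma`, and the idea of treating each arm as one candidate resource-allocation choice with a reward normalised to [0, 1] are assumptions made for this example.

```python
import math
import random


class Exp3:
    """Textbook EXP3 bandit learner (illustrative sketch, hypothetical names)."""

    def __init__(self, n_arms, gamma=0.1, seed=None):
        if not 0.0 < gamma <= 1.0:
            raise ValueError("gamma must be in (0, 1]")
        self.n_arms = n_arms
        self.gamma = gamma              # exploration rate
        self.weights = [1.0] * n_arms   # one exponential weight per arm
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def select_arm(self):
        # Sample one arm according to the current mixed distribution.
        probs = self.probabilities()
        return self.rng.choices(range(self.n_arms), weights=probs, k=1)[0]

    def update(self, arm, reward):
        # Importance-weighted reward estimate; reward is assumed to lie in [0, 1].
        probs = self.probabilities()
        estimate = reward / probs[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale so the largest weight is 1; probabilities are unchanged,
        # but this avoids floating-point overflow over long runs.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


# Toy usage: arm 2 always pays 1, the other arms pay 0 (assumed reward model).
learner = Exp3(n_arms=3, gamma=0.1, seed=0)
for _ in range(500):
    arm = learner.select_arm()
    learner.update(arm, 1.0 if arm == 2 else 0.0)
final_probs = learner.probabilities()
```

In this toy run the probability mass shifts toward the arm that keeps paying off, while the `gamma / n_arms` term keeps every arm explored with non-zero probability.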