An optimal resource assignment and mode selection for vehicular communication using proximal on-policy scheme (open access)
- Authors
- Budhiraja, Ishan; Alphy, Anna; Pandey, Pawan; Garg, Sahil; Choi, Bong Jun; Hassan, Mohammad Mehedi
- Issue Date
- Nov-2024
- Publisher
- ELSEVIER
- Keywords
- DRL; DDPG; MDP; POPS; V2X
- Citation
- ALEXANDRIA ENGINEERING JOURNAL, v.107, pp. 268-279
- Pages
- 12
- Journal Title
- ALEXANDRIA ENGINEERING JOURNAL
- Volume
- 107
- Start Page
- 268
- End Page
- 279
- URI
- https://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/49913
- DOI
- 10.1016/j.aej.2024.07.010
- ISSN
- 1110-0168
2090-2670
- Abstract
- Vehicle-to-everything (V2X) communication is essential in 5G and upcoming networks, as it enables seamless interaction between vehicles and infrastructure and ensures the reliable transmission of critical, time-sensitive data. Challenges such as unstable communication in highly mobile vehicular networks, limited channel state information, high transmission overhead, and significant communication costs hinder vehicle-to-vehicle (V2V) communication. To tackle these issues, a unified approach based on distributed deep reinforcement learning is proposed to enhance overall network performance while meeting quality of service (QoS), latency, and rate requirements. Because the underlying problem is NP-hard and non-convex, a machine learning framework based on the Markov decision process (MDP) is adopted to obtain a robust strategy; this framework supports the formulation of a reward function and the selection of optimal actions. Furthermore, a spectrum-based allocation framework employing multi-agent deep reinforcement learning (MADRL) is introduced. The deep deterministic policy gradient (DDPG) within this framework enables the global exchange of historical data during the primary learning phase, removing the need for signal interaction and manual intervention in optimizing system efficiency. The data transmission policy follows an augmented online policy scheme, the proximal online policy scheme (POPS), which reduces computational complexity during learning; the complexity is further controlled by applying the clipped surrogate technique in the learning phase. Simulation results validate that the proposed method outperforms existing decentralized systems, achieving a higher average data transmission rate and ensuring QoS satisfaction.
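The "clipped surrogate" technique the abstract refers to is the PPO-style clipped objective, in which the policy-update ratio is bounded to limit the size of each policy step. A minimal NumPy sketch of that objective follows; the function name, the default clipping parameter `epsilon=0.2`, and the array-based interface are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def clipped_surrogate_loss(ratio, advantage, epsilon=0.2):
    """PPO-style clipped surrogate loss (illustrative sketch).

    ratio:     pi_new(a|s) / pi_old(a|s) per sampled action
    advantage: estimated advantage per sampled action
    epsilon:   clipping range; limits how far a single update can move the policy
    Returns the negative clipped objective (a loss to minimize).
    """
    unclipped = ratio * advantage
    # Clip the probability ratio to [1 - epsilon, 1 + epsilon] before weighting
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
    # Taking the elementwise minimum makes the objective pessimistic,
    # so large policy changes cannot be rewarded by the surrogate.
    return -np.minimum(unclipped, clipped).mean()

# Example: a ratio of 1.5 with positive advantage is clipped at 1.2,
# so the update gains nothing from moving the policy further.
loss = clipped_surrogate_loss(np.array([1.5]), np.array([1.0]))
```

The pessimistic minimum is the design choice that keeps learning stable at low cost: it behaves like a trust-region constraint without requiring the second-order machinery of TRPO, which is why the abstract credits the technique with reducing computational complexity during learning.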
- Appears in Collections
- College of Information Technology > School of Computer Science and Engineering > 1. Journal Articles