An optimal resource assignment and mode selection for vehicular communication using proximal on-policy scheme (open access)
- Authors
- Budhiraja, Ishan; Alphy, Anna; Pandey, Pawan; Garg, Sahil; Choi, Bong Jun; Hassan, Mohammad Mehedi
- Issue Date
- Nov-2024
- Publisher
- ELSEVIER
- Keywords
- DRL; DDPG; MDP; POPS; V2X
- Citation
- ALEXANDRIA ENGINEERING JOURNAL, v.107, pp. 268-279
- Pages
- 12
- Journal Title
- ALEXANDRIA ENGINEERING JOURNAL
- Volume
- 107
- Start Page
- 268
- End Page
- 279
- URI
- https://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/49913
- DOI
- 10.1016/j.aej.2024.07.010
- ISSN
- 1110-0168
2090-2670
- Abstract
- Vehicle-to-everything (V2X) communication is essential in 5G and upcoming networks, as it enables seamless interaction between vehicles and infrastructure and ensures the reliable transmission of critical, time-sensitive data. Challenges such as unstable communication in highly mobile vehicular networks, limited channel state information, high transmission overhead, and significant communication costs hinder vehicle-to-vehicle (V2V) communication. To tackle these issues, a unified approach based on distributed deep reinforcement learning is proposed to enhance overall network performance while meeting quality of service (QoS), latency, and rate requirements. Because the underlying problem is NP-hard and non-convex, a machine learning framework based on the Markov decision process (MDP) is adopted to obtain a robust strategy; this framework supports the formulation of a reward function and the selection of optimal actions. Furthermore, a spectrum-based allocation framework employing multi-agent deep reinforcement learning (MADRL) is introduced. The deep deterministic policy gradient (DDPG) within this framework enables the global exchange of historical data during the primary learning phase, removing the need for signal interaction and manual intervention in optimizing system efficiency. The data transmission policy follows an augmented online policy scheme, the proximal online policy scheme (POPS), which reduces computational complexity during learning; the complexity is further controlled by applying the clipped surrogate technique in the learning phase. Simulation results validate that the proposed method outperforms existing decentralized systems, achieving a higher average data transmission rate and ensuring QoS satisfaction.
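The "clipped surrogate" technique the abstract refers to is the PPO-style clipped objective, in which the policy-update ratio is bounded to limit the size of each policy step. A minimal NumPy sketch of that objective follows; the function name, the default clipping parameter `epsilon=0.2`, and the array-based interface are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def clipped_surrogate_loss(ratio, advantage, epsilon=0.2):
    """PPO-style clipped surrogate loss (illustrative sketch).

    ratio:     pi_new(a|s) / pi_old(a|s) per sampled action
    advantage: estimated advantage per sampled action
    epsilon:   clipping range; limits how far a single update can move the policy
    Returns the negative clipped objective (a loss to minimize).
    """
    unclipped = ratio * advantage
    # Clip the probability ratio to [1 - epsilon, 1 + epsilon] before weighting
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
    # Taking the elementwise minimum makes the objective pessimistic,
    # so large policy changes cannot be rewarded by the surrogate.
    return -np.minimum(unclipped, clipped).mean()

# Example: a ratio of 1.5 with positive advantage is clipped at 1.2,
# so the update gains nothing from moving the policy further.
loss = clipped_surrogate_loss(np.array([1.5]), np.array([1.0]))
```

The pessimistic minimum is the design choice that keeps learning stable at low cost: it behaves like a trust-region constraint without requiring the second-order machinery of TRPO, which is why the abstract credits the technique with reducing computational complexity during learning.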
- Appears in Collections
- College of Information Technology > School of Computer Science and Engineering > 1. Journal Articles