Learning to Schedule Joint Radar-Communication with Deep Multi-Agent Reinforcement Learning
- Authors
- Lee, J.; Niyato, T.D.; Guan, Y.L.; Kim, D.I.
- Issue Date
- Jan-2022
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- Accidents; Automotive engineering; Cameras; communication; deep learning; Radar; Reinforcement learning; Sensor systems; Sensors; task scheduling; vehicle safety
- Citation
- IEEE Transactions on Vehicular Technology, v.71, no.1, pp. 406-422
- Pages
- 17
- Indexed
- SCIE; SCOPUS
- Journal Title
- IEEE Transactions on Vehicular Technology
- Volume
- 71
- Number
- 1
- Start Page
- 406
- End Page
- 422
- URI
- https://scholarworks.bwise.kr/skku/handle/2021.sw.skku/92784
- DOI
- 10.1109/TVT.2021.3124810
- ISSN
- 0018-9545; 1939-9359
- Abstract
- Radar detection and communication are two of several sub-tasks essential for the operation of next-generation autonomous vehicles (AVs). The former is required for sensing and perception, more frequently so under unfavorable environmental conditions such as heavy precipitation; the latter is needed to transmit time-critical data. The forthcoming proliferation of faster 5G networks utilizing mmWave is likely to cause interference with automotive radar sensors, which has motivated a body of research on Joint Radar Communication (JRC) systems and solutions. This paper considers the problem of time-sharing for JRC, with the additional simultaneous objective of minimizing the average age of information (AoI) transmitted by a JRC-equipped AV. We first formulate the problem as a Markov Decision Process (MDP). We then propose a more general multi-agent system, with an appropriate medium access control (MAC) protocol, which is formulated as a partially observed Markov game (POMG). To solve the POMG, we propose a multi-agent extension of the Proximal Policy Optimization (PPO) algorithm, along with algorithmic features to enhance learning from raw observations. Simulations are run with a range of environmental parameters to mimic variations in real-world operation. The results show that the chosen deep reinforcement learning methods allow the agents to obtain good results with minimal a priori knowledge about the environment.
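As a minimal sketch (not the paper's implementation), the age-of-information objective in the abstract can be illustrated by the standard AoI recurrence: the age grows by one each time slot and resets when the slot is used for communication. The function names, the reset-to-1 convention, and the boolean schedule below are illustrative assumptions.

```python
# Hypothetical AoI sketch: in each slot a JRC node either senses (radar)
# or transmits (communication). Convention assumed here: a successful
# transmission resets AoI to 1; otherwise AoI grows by one slot.
def step_aoi(aoi: int, transmitted: bool) -> int:
    """Advance AoI by one slot; a transmission resets it to 1."""
    return 1 if transmitted else aoi + 1

def average_aoi(schedule) -> float:
    """Average AoI over a schedule of booleans (True = communicate)."""
    aoi, total = 0, 0
    for transmitted in schedule:
        aoi = step_aoi(aoi, transmitted)
        total += aoi
    return total / len(schedule)

# Transmitting in every slot pins AoI at 1; sparser communication
# (more radar slots) raises the average age, which is the trade-off
# the scheduling agent must learn to balance.
```

The reinforcement learning problem in the paper then rewards the agent for keeping this average age low while still allocating enough slots to radar sensing.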
- Appears in Collections
- Information and Communication Engineering > School of Electronic and Electrical Engineering > 1. Journal Articles