IVDR: Imitation learning with Variational inference and Distributional Reinforcement learning to find Optimal Driving Strategy
- Authors
- Joo, K.; Woo, S.S.
- Issue Date
- 2021
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- Distributional RL (DR); Imitation learning; Reinforcement Learning (RL); Soft Actor-Critic (SAC); Variational Inference (VI)
- Citation
- Proceedings - 20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021, pp.256 - 262
- Indexed
- SCOPUS
- Journal Title
- Proceedings - 20th IEEE International Conference on Machine Learning and Applications, ICMLA 2021
- Start Page
- 256
- End Page
- 262
- URI
- https://scholarworks.bwise.kr/skku/handle/2021.sw.skku/97480
- DOI
- 10.1109/ICMLA52953.2021.00047
- ISSN
- 0000-0000
- Abstract
- State-of-the-art autonomous driving technology has advanced significantly by leveraging reinforcement learning (RL) algorithms, because it is difficult to apply a rule-based driving method that reflects every traffic condition. RL can produce a potentially optimal driving strategy for urban, rural, and motorway roads under various environmental conditions, such as speed limits and school zones. However, adjusting the parameters of the RL reward mechanism is challenging because driving styles differ widely between users, and training an RL agent that reflects all complex traffic conditions takes a massive amount of time and resources. If the agent imitates the driving behavior of an expert, the RL algorithm can proceed more quickly. We therefore propose a novel imitation learning framework, imitation learning with variational inference and distributional reinforcement learning (IVDR), which combines an expert's driving behavior with the continuous behavior of an agent and uses a deep reinforcement learning approach to mimic the expert's driving behavior. Our results show that IVDR achieves an 80% better learning speed than other approaches and a 12% higher average reward. Our work shows great promise for using RL in autonomous driving and real vehicle driving simulation. © 2021 IEEE.
- Appears in Collections
- Computing and Informatics > Computer Science and Engineering > 1. Journal Articles