Pruning with Scaled Policy Constraints for Light-weight Reinforcement Learning

Park, Seongmin; Kim, Hyungmin; Kim, Hyunhak; Choi, Jungwook

doi:10.1109/ACCESS.2024.3367002


Access: Open access

Authors: Park, Seongmin; Kim, Hyungmin; Kim, Hyunhak; Choi, Jungwook

Issue Date: Feb-2024

Publisher: Institute of Electrical and Electronics Engineers Inc.

Keywords: Autonomous Driving; Behavioral sciences; Cloning; Computational modeling; D4RL; Data models; Decision making; Deep Reinforcement Learning; Drone Control; Drones; Fine-tuning; Hardware; Mobile robots; Model Compression; Offline Reinforcement Learning; Reinforcement learning; Robotics; Robots; Structured Pruning

Citation: IEEE Access, v.12, pp 36055 - 36065

Pages: 11

Indexed: SCIE; SCOPUS

Journal Title: IEEE Access

Volume: 12

Start Page: 36055

End Page: 36065

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/195137

DOI: 10.1109/ACCESS.2024.3367002

ISSN: 2169-3536

Abstract: The increasing computational demands of Deep Reinforcement Learning (DRL) models, particularly for embedded systems in autonomous vehicles and drones, present significant challenges owing to their extensive neural network complexities. Previous DRL compression strategies predominantly focused on unstructured pruning, which is effective for reducing model size but requires specialized hardware for computational acceleration. Conversely, structurally pruned DRL models can be accelerated on standard hardware, though they typically encounter performance issues at higher pruning rates due to structural constraints. In response to these challenges, this paper introduces an advanced structured pruning methodology, combined with scaled policy constraints (SPC) for DRL models. Our approach overcomes the performance limitations of conventional structured pruning, achieving high pruning rates while maintaining robust model performance. Enhanced performance restoration after pruning is achieved by fine-tuning with SPC and applying structural regularization, thus ensuring efficient decision-making with a minimal computational burden. Extensive evaluations on the D4RL benchmark and in a drone control simulation environment confirm the effectiveness of our method. Our approach maintains performance integrity even at high pruning rates, with less than a 2% decrease in normalized score at 90% pruning in D4RL and preserving cumulative reward at 87% pruning in drone control simulation. Significantly, our approach also enables considerable computational acceleration on standard hardware. We implemented our method on the NVIDIA Jetson Xavier NX board and achieved a 2.5-fold speed-up on devices with NVIDIA Volta GPUs and more than a 2-fold speed-up on those with NVIDIA Carmel ARMv8.2 CPUs. These outcomes highlight our method’s suitability for real-time, resource-constrained applications, demonstrating its practicality and efficiency.
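
Illustration (not taken from the paper): the abstract's point that structured pruning speeds up inference on standard hardware follows from the fact that removing whole neurons leaves smaller dense weight matrices. The minimal NumPy sketch below shows this for one hidden layer of a policy MLP. The function names, the layer sizes, and the L2-norm importance criterion are all assumptions chosen for the example; the abstract does not state the authors' pruning criterion, and the SPC fine-tuning and structural regularization steps are not shown.

```python
import numpy as np


def prune_hidden_neurons(w1, b1, w2, prune_ratio):
    """Remove whole hidden neurons (structured pruning) from one layer.

    w1: (hidden, in) weights into the hidden layer
    b1: (hidden,)    hidden-layer biases
    w2: (out, hidden) weights out of the hidden layer
    Returns smaller dense arrays containing only the kept neurons.
    """
    hidden = w1.shape[0]
    n_keep = max(1, int(round(hidden * (1.0 - prune_ratio))))
    # Assumed importance score: L2 norm of each neuron's incoming weights.
    importance = np.linalg.norm(w1, axis=1)
    # Indices of the n_keep most important neurons, in original order.
    keep = np.sort(np.argsort(importance)[-n_keep:])
    # Dropping neuron j removes row j of w1, entry j of b1, column j of w2.
    return w1[keep], b1[keep], w2[:, keep]


def forward(obs, w1, b1, w2, b2):
    """Two-layer policy: ReLU hidden layer, tanh-bounded action output."""
    h = np.maximum(0.0, w1 @ obs + b1)
    return np.tanh(w2 @ h + b2)


# Hypothetical sizes: 17-dim observation, 256 hidden units, 6-dim action.
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(256, 17)), np.zeros(256)
w2, b2 = rng.normal(size=(6, 256)), np.zeros(6)

pw1, pb1, pw2 = prune_hidden_neurons(w1, b1, w2, prune_ratio=0.9)
action = forward(rng.normal(size=17), pw1, pb1, pw2, b2)
```

Because the pruned arrays are ordinary dense matrices, the forward pass needs no sparse kernels. Unstructured pruning would instead zero individual weights and keep the original matrix shapes, which is why it typically needs specialized hardware to gain speed.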


Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles


Related Researcher


Choi, Jung wook: COLLEGE OF ENGINEERING (SCHOOL OF ELECTRONIC ENGINEERING)
