Rocorl: Transferable Reinforcement Learning-Based Robust Control for Cyber-Physical Systems with Limited Data Updates

Yoo, G.[Yoo, G.]; Yoo, M.[Yoo, M.]; Yeom, I.[Yeom, I.]; Woo, H.[Woo, H.]

doi:10.1109/ACCESS.2020.3044945

Cited 2 times in Web of Science

Cited 2 times in Scopus

Rocorl: Transferable Reinforcement Learning-Based Robust Control for Cyber-Physical Systems with Limited Data Updates (open access)

Authors: Yoo, G.[Yoo, G.]; Yoo, M.[Yoo, M.]; Yeom, I.[Yeom, I.]; Woo, H.[Woo, H.]

Issue Date: 2020

Publisher: Institute of Electrical and Electronics Engineers Inc.

Keywords: Cyber-physical system; model-based learning; real-time data; reinforcement learning; stale observations

Citation: IEEE Access, v.8, pp.225370 - 225383

Indexed: SCIE; SCOPUS

Journal Title: IEEE Access

Volume: 8

Start Page: 225370

End Page: 225383

URI: https://scholarworks.bwise.kr/skku/handle/2021.sw.skku/2193

DOI: 10.1109/ACCESS.2020.3044945

ISSN: 2169-3536

Abstract: Autonomous control systems increasingly use machine learning technologies to process sensor data and to make timely, informed decisions about control functions based on the processing results. Among such technologies, reinforcement learning (RL) with deep neural networks has recently been recognized as a feasible solution, since it enables learning by interaction with the environments of control systems. In this paper, we consider RL-based control models and address the problem of temporally outdated observations often incurred in dynamic cyber-physical environments. This problem can hinder broad adoption of RL methods for autonomous control systems. Specifically, we present an RL-based robust control model, namely rocorl, that exploits a hierarchical learning structure in which a set of low-level policy variants is trained for stale observations, and their learned knowledge is then transferred to a target environment limited in timely data updates. In doing so, we employ an autoencoder-based observation transfer scheme for systematically training a set of transferable control policies and an aggregated model-based learning scheme for data-efficiently training a high-level orchestrator in the hierarchy. Our experiments show that rocorl is robust against various conditions of distributed sensor data updates, compared with several other models including a state-of-the-art POMDP method. © 2013 IEEE.
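Illustrative note: the stale-observation problem named in the abstract can be pictured with a small sketch. It is not the authors' rocorl implementation and does not reproduce the autoencoder or orchestrator; it only shows, under assumed names (`StaleObservationBuffer`, `periods`, `observe` are all hypothetical), how a controller's view can lag the true state when each sensor refreshes on its own schedule.

```python
from typing import List, Sequence


class StaleObservationBuffer:
    """Holds the last delivered reading of each sensor (hypothetical helper).

    Sensor i refreshes only on steps that are multiples of periods[i];
    between refreshes the controller keeps seeing the old (stale) value.
    """

    def __init__(self, periods: Sequence[int], initial: Sequence[float]) -> None:
        if len(periods) != len(initial):
            raise ValueError("periods and initial must have the same length")
        if any(p < 1 for p in periods):
            raise ValueError("each update period must be >= 1")
        self.periods = list(periods)
        self.last_seen = list(initial)
        self.age = [0] * len(periods)  # steps since each sensor last refreshed
        self.step_count = 0

    def observe(self, true_state: Sequence[float]) -> List[float]:
        """Return the controller's view of true_state at the current step."""
        for i, period in enumerate(self.periods):
            if self.step_count % period == 0:
                # Fresh reading arrives for this sensor.
                self.last_seen[i] = true_state[i]
                self.age[i] = 0
            else:
                # No update this step: the old value is reused and ages.
                self.age[i] += 1
        self.step_count += 1
        return list(self.last_seen)


if __name__ == "__main__":
    # Assumed toy setup: sensor 0 updates every step, sensor 1 every 3 steps.
    buf = StaleObservationBuffer(periods=[1, 3], initial=[0.0, 0.0])
    for t in range(5):
        state = [float(t), float(10 * t)]
        print(t, buf.observe(state), buf.age)
```

In this toy run the second sensor's reading stays at its step-0 value until step 3, so a policy acting at steps 1 and 2 sees a mix of fresh and outdated data. The per-sensor `age` is one quantity a controller could use to tell how stale each input is.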

Appears in Collections: Computing and Informatics > Computer Science and Engineering > 1. Journal Articles

Related Researcher: YEOM, IK JUN, Computing and Informatics (Computer Science and Engineering)
