FA3C: FPGA-Accelerated Deep Reinforcement Learning
| DC Field | Value | Language |
| --- | --- | --- |
dc.contributor.author | Cho, Hyungmin | - |
dc.contributor.author | Oh, P. | - |
dc.contributor.author | Park, J. | - |
dc.contributor.author | Jung, W. | - |
dc.contributor.author | Lee, J. | - |
dc.date.available | 2021-03-17T08:00:41Z | - |
dc.date.created | 2021-02-26 | - |
dc.date.issued | 2019 | - |
dc.identifier.issn | 0000-0000 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hongik/handle/2020.sw.hongik/12784 | - |
dc.description.abstract | Deep Reinforcement Learning (Deep RL) is applied to many areas where an agent learns how to interact with the environment to achieve a certain goal, such as video game playing and robot control. Deep RL exploits a DNN to eliminate the need for handcrafted feature engineering that requires prior domain knowledge. The Asynchronous Advantage Actor-Critic (A3C) is one of the state-of-the-art Deep RL methods. In this paper, we present an FPGA-based A3C Deep RL platform, called FA3C. Traditionally, FPGA-based DNN accelerators have mainly focused on inference only by exploiting fixed-point arithmetic. Our platform targets both inference and training using single-precision floating-point arithmetic. We demonstrate the performance and energy efficiency of FA3C using multiple A3C agents that learn the control policies of six Atari 2600 games. Its performance is better than that of a high-end GPU-based platform (NVIDIA Tesla P100). FA3C achieves 27.9% better performance than a state-of-the-art GPU-based implementation. Moreover, the energy efficiency of FA3C is 1.62x better than that of the GPU-based implementation. | - |
dc.publisher | Association for Computing Machinery | - |
dc.title | FA3C: FPGA-Accelerated Deep Reinforcement Learning | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Cho, Hyungmin | - |
dc.identifier.doi | 10.1145/3297858.3304058 | - |
dc.identifier.scopusid | 2-s2.0-85064698044 | - |
dc.identifier.wosid | 000584356000036 | - |
dc.identifier.bibliographicCitation | International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 499-513 | - |
dc.relation.isPartOf | International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS | - |
dc.citation.title | International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS | - |
dc.citation.startPage | 499 | - |
dc.citation.endPage | 513 | - |
dc.type.rims | ART | - |
dc.type.docType | Proceedings Paper | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Hardware & Architecture | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Software Engineering | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Theory & Methods | - |
dc.subject.keywordAuthor | reinforcement learning | - |
dc.subject.keywordAuthor | deep neural networks | - |
dc.subject.keywordAuthor | FPGA | - |