Learning to Schedule Network Resources Throughput and Delay Optimally Using Q+-Learning
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Bae, Jeongmin | - |
dc.contributor.author | Lee, Joohyun | - |
dc.contributor.author | Chong, Song | - |
dc.date.accessioned | 2021-11-08T04:34:46Z | - |
dc.date.available | 2021-11-08T04:34:46Z | - |
dc.date.issued | 2021-04 | - |
dc.identifier.issn | 1063-6692 | - |
dc.identifier.issn | 1558-2566 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/106214 | - |
dc.description.abstract | As network architectures become more complex and user requirements become more diverse, efficient network resource management grows more important. However, existing throughput-optimal scheduling algorithms such as the max-weight algorithm suffer from poor delay performance. In this paper, we present reinforcement learning-based network scheduling algorithms for a single-hop downlink scenario that achieve throughput-optimality and converge to minimal delay. To this end, we first formulate the network optimization problem as a Markov decision process (MDP). Then, we introduce a new state-action value function called the Q+-function and develop a reinforcement learning algorithm called Q+-learning with UCB (Upper Confidence Bound) exploration, which guarantees small performance loss during learning. We also derive an upper bound on the sample complexity of our algorithm, which improves on the best known bound for Q-learning with UCB exploration by a factor of γ², where γ is the discount factor of the MDP. Finally, via simulation, we verify that our algorithm achieves a delay reduction of up to 40.8% compared to the max-weight algorithm over various scenarios. We also show that Q+-learning with UCB exploration converges to an ϵ-optimal policy 10 times faster than Q-learning with UCB. | - |
dc.format.extent | 14 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | Learning to Schedule Network Resources Throughput and Delay Optimally Using Q+-Learning | - |
dc.type | Article | - |
dc.publisher.location | United States | - |
dc.identifier.doi | 10.1109/tnet.2021.3051663 | - |
dc.identifier.scopusid | 2-s2.0-85100475440 | - |
dc.identifier.wosid | 000641964600020 | - |
dc.identifier.bibliographicCitation | IEEE/ACM Transactions on Networking, v.29, no.2, pp 750 - 763 | - |
dc.citation.title | IEEE/ACM Transactions on Networking | - |
dc.citation.volume | 29 | - |
dc.citation.number | 2 | - |
dc.citation.startPage | 750 | - |
dc.citation.endPage | 763 | - |
dc.type.docType | Article | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Telecommunications | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Hardware & Architecture | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Theory & Methods | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Telecommunications | - |
dc.subject.keywordPlus | ALLOCATION | - |
dc.subject.keywordPlus | OPTIMIZATION | - |
dc.subject.keywordPlus | MODEL | - |
dc.subject.keywordAuthor | Network resource management | - |
dc.subject.keywordAuthor | throughput and delay optimality | - |
dc.subject.keywordAuthor | reinforcement learning | - |
dc.subject.keywordAuthor | upper confidence bound | - |
dc.identifier.url | https://ieeexplore.ieee.org/document/9336288 | - |
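The abstract above contrasts the classic max-weight scheduler with a Q-learning-style scheduler that uses UCB exploration. The sketch below is a minimal, hypothetical illustration of those two ideas in Python, not the paper's Q+-learning algorithm: the single-hop downlink model (Bernoulli arrivals and services), the negative-backlog reward used as a delay proxy, the count-based UCB bonus, and all constants are illustrative assumptions.

```python
# Hypothetical sketch: a toy single-hop downlink with N user queues, comparing
# the classic max-weight scheduler with tabular Q-learning plus a UCB exploration
# bonus. This is NOT the paper's Q+-learning algorithm; the queue/channel model,
# the reward (negative total backlog), and all constants are illustrative only.
import math
import random
from collections import defaultdict

N_QUEUES = 3
ARRIVAL_P = [0.3, 0.4, 0.2]   # assumed Bernoulli packet-arrival probabilities
SERVICE_P = [0.9, 0.7, 0.8]   # assumed probability a scheduled packet departs
MAX_Q = 5                     # truncate queue lengths to keep the state space tabular

def step(queues, action):
    """Serve one queue, then add new arrivals; reward is the negative backlog (a delay proxy)."""
    q = list(queues)
    if q[action] > 0 and random.random() < SERVICE_P[action]:
        q[action] -= 1
    for i in range(N_QUEUES):
        if random.random() < ARRIVAL_P[i]:
            q[i] = min(q[i] + 1, MAX_Q)
    return tuple(q), -sum(q)

def max_weight(queues):
    """Max-weight baseline: serve the queue with the largest backlog x service rate."""
    return max(range(N_QUEUES), key=lambda i: queues[i] * SERVICE_P[i])

def q_learning_ucb(episodes=200, horizon=500, gamma=0.9, alpha=0.1, c=2.0):
    """Tabular Q-learning with a count-based UCB bonus (a generic stand-in for the paper's scheme)."""
    Q = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(episodes):
        s = tuple([0] * N_QUEUES)
        for t in range(1, horizon + 1):
            # pick the action with the highest Q-value plus an exploration bonus
            def score(a):
                n = counts[(s, a)]
                return Q[(s, a)] + c * math.sqrt(math.log(t + 1) / (n + 1))
            a = max(range(N_QUEUES), key=score)
            s2, r = step(s, a)
            counts[(s, a)] += 1
            target = r + gamma * max(Q[(s2, b)] for b in range(N_QUEUES))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q

if __name__ == "__main__":
    random.seed(0)
    # Average backlog under the max-weight baseline over one long run.
    s, total = tuple([0] * N_QUEUES), 0
    for _ in range(10_000):
        s, r = step(s, max_weight(s))
        total += -r
    print("max-weight avg backlog:", total / 10_000)
    Q = q_learning_ucb()
    print("learned Q-table size:", len(Q))
```

Running the script prints the average backlog of the max-weight baseline and the size of the learned Q-table; deriving a greedy policy from `Q` and running it through the same simulation loop would allow a like-for-like delay comparison in the spirit of the paper's evaluation.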