Privacy-Preserving Intelligent Resource Allocation for Federated Edge Learning in Quantum Internet
- Authors
- Xu, M.[Xu, M.]; Niyato, D.[Niyato, D.]; Yang, Z.[Yang, Z.]; Xiong, Z.[Xiong, Z.]; Kang, J.[Kang, J.]; Kim, D.I.[Kim, D.I.]; Shen, X.[Shen, X.]
- Issue Date
- 1-Jan-2023
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- deep reinforcement learning; Federated edge learning; quantum key distribution (QKD); resource allocation
- Citation
- IEEE Journal on Selected Topics in Signal Processing, v.17, no.1, pp.142 - 157
- Indexed
- SCIE; SCOPUS
- Journal Title
- IEEE Journal on Selected Topics in Signal Processing
- Volume
- 17
- Number
- 1
- Start Page
- 142
- End Page
- 157
- URI
- https://scholarworks.bwise.kr/skku/handle/2021.sw.skku/103078
- DOI
- 10.1109/JSTSP.2022.3224591
- ISSN
- 1932-4553
- Abstract
- Federated learning (FL) is an emerging technology for empowering various applications that generate large amounts of data in intelligent cyber-physical systems (ICPS). Though FL can address users' concerns about data privacy, its maintenance still depends on efficient incentive mechanisms. For long-term incentivization to participants in data federation under dynamic environments, deep reinforcement learning as a promising technology has been extensively studied. However, the non-stationary problem caused by the heterogeneity of ICPS devices results in a serious effect on the convergence rate of existing single-agent reinforcement learning. In this paper, we propose a multi-agent learning-based incentive mechanism to capture the stationarity approximation in FL with heterogeneous ICPS. First, we formulate the secure communication and data resource allocation problem as a Stackelberg game in FL with multiple participants. Then, to tackle the heterogeneous problem, we model this multi-agent game as a partially observable Markov decision process. Particularly, a multi-agent federated reinforcement learning algorithm is proposed to learn the allocation policies efficiently by dwindling variances in policy evaluation caused by interaction among multiple devices without sharing privacy information. Moreover, the proposed algorithm is proved to attain convergence at an expected rate. Lastly, extensive experimental results demonstrate that our proposed algorithm significantly outperforms baselines. © 2007-2012 IEEE.
- Appears in Collections
- Information and Communication Engineering > School of Electronic and Electrical Engineering > 1. Journal Articles