NysAct: A Scalable Preconditioned Gradient Descent using Nyström Approximation
- Authors
- Hyunsuk Ko (고현석)
- Issue Date
- Dec-2024
- Publisher
- IEEE
- Keywords
- Deep learning optimization; Gradient preconditioning; Nyström approximation
- Citation
- IEEE International Conference on BigData, pp. 1442-1449
- Pages
- 8
- Indexed
- SCOPUS
- Journal Title
- IEEE International Conference on BigData
- Start Page
- 1442
- End Page
- 1449
- URI
- https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/121997
- DOI
- 10.1109/BigData62323.2024.10825352
- Abstract
- Adaptive gradient methods are computationally efficient and converge quickly, but they often suffer from poor generalization. In contrast, second-order methods enhance convergence and generalization but typically incur high computational and memory costs. In this work, we introduce NysAct, a scalable first-order gradient preconditioning method that strikes a balance between state-of-the-art first-order and second-order optimization methods. NysAct leverages an eigenvalue-shifted Nyström method to approximate the activation covariance matrix, which is used as a preconditioning matrix, significantly reducing time and memory complexities with minimal impact on test accuracy. Our experiments show that NysAct not only achieves improved test accuracy compared to both first-order and second-order methods but also requires considerably fewer computational resources than existing second-order methods. (An illustrative code sketch of the Nyström preconditioning idea appears at the end of this record.)
- Appears in Collections
- COLLEGE OF ENGINEERING SCIENCES > SCHOOL OF ELECTRICAL ENGINEERING > 1. Journal Articles

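The abstract describes the method only at a high level. Below is a minimal, illustrative NumPy sketch of the general idea it mentions: an eigenvalue-shifted Nyström approximation of a layer's activation covariance, used to precondition a gradient. This is not the authors' NysAct implementation; the function name, the uniform column-sampling scheme, the sketch size `m`, the shift `rho`, the damping `lam`, and the Woodbury-based solve are all assumptions made for illustration.

```python
import numpy as np

def nystrom_preconditioned_grad(acts, grad, m=64, rho=1e-3, lam=1e-3, rng=None):
    """Illustrative sketch (not the paper's algorithm): precondition a gradient
    with an eigenvalue-shifted Nystrom approximation of the activation covariance.

    acts : (n, d) activation matrix for one layer
    grad : (d,)   gradient to precondition
    m    : number of sampled columns (sketch size, assumed)
    rho  : eigenvalue shift added to the core matrix (assumed)
    lam  : damping added before inversion (assumed)
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = acts.shape

    # Activation covariance C = A^T A / n (d x d); formed explicitly here only
    # for clarity. A memory-efficient variant would build the sampled columns directly.
    C = acts.T @ acts / n

    # Sample m columns uniformly without replacement (sampling scheme is an assumption).
    idx = rng.choice(d, size=m, replace=False)
    C1 = C[:, idx]             # (d, m) sampled columns
    W = C1[idx, :]             # (m, m) core matrix
    M = W + rho * np.eye(m)    # eigenvalue-shifted core

    # Nystrom approximation: C_hat = C1 M^{-1} C1^T.
    # Preconditioned gradient p = (C_hat + lam I)^{-1} grad, computed via Woodbury:
    # (lam I + C1 M^{-1} C1^T)^{-1} = (1/lam) [I - C1 (lam M + C1^T C1)^{-1} C1^T]
    inner = lam * M + C1.T @ C1                       # (m, m) system instead of (d, d)
    p = (grad - C1 @ np.linalg.solve(inner, C1.T @ grad)) / lam
    return p


# Toy usage: one layer with 512 input features and a batch of 256 activations.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.standard_normal((256, 512))
    grad = rng.standard_normal(512)
    p = nystrom_preconditioned_grad(acts, grad, m=64, rng=rng)
    print(p.shape)  # (512,)
```

Solving through the m x m core matrix rather than the full d x d covariance is what gives a Nyström-style preconditioner its reduced time and memory footprint, consistent with the scalability claim in the abstract.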