A Continual Learning algorithm based on Orthogonal Gradient Descent beyond Neural Tangent Kernel regime
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Da Eun | - |
dc.contributor.author | Nakamura, Kensuke | - |
dc.contributor.author | Tak, Jae-Ho | - |
dc.contributor.author | Hong, Byung-Woo | - |
dc.date.accessioned | 2023-09-28T13:40:56Z | - |
dc.date.available | 2023-09-28T13:40:56Z | - |
dc.date.issued | 2023 | - |
dc.identifier.issn | 2169-3536 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/67894 | - |
dc.description.abstract | Continual learning aims to enable neural networks to learn new tasks without catastrophic forgetting of previously learned knowledge. Orthogonal Gradient Descent algorithms have been proposed as an effective solution to mitigate catastrophic forgetting. However, these algorithms often rely on the Neural Tangent Kernel regime, which imposes limitations on network architecture. In this study, we propose a novel method to construct an orthonormal basis set for orthogonal projection by leveraging Catastrophic Forgetting Loss. In contrast to the conventional gradient-based basis that reflects an update of model within an infinitesimal range, our loss-based basis can account for the variance within two distinct points in the model parameter space, thus overcoming the limitations of the Neural Tangent Kernel regime. We provide both quantitative and qualitative analysis of the proposed method, discussing its advantages over conventional gradient-based baselines. Our approach is extensively evaluated on various model architectures and datasets, demonstrating a significant performance advantage, especially for deep or narrow networks where the Neural Tangent Kernel regime is violated. Furthermore, we offer a mathematical analysis based on higher-order Taylor series to provide theoretical justification. This study introduces a novel theoretical framework and a practical algorithm, potentially inspiring further research in areas such as continual learning, network debugging, and one-pass learning. | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | A Continual Learning algorithm based on Orthogonal Gradient Descent beyond Neural Tangent Kernel regime | - |
dc.type | Article | - |
dc.identifier.doi | 10.1109/ACCESS.2023.3303869 | - |
dc.identifier.bibliographicCitation | IEEE Access, v.11, pp 85395 - 85404 | - |
dc.description.isOpenAccess | Y | - |
dc.identifier.wosid | 001051639800001 | - |
dc.identifier.scopusid | 2-s2.0-85167806971 | - |
dc.citation.endPage | 85404 | - |
dc.citation.startPage | 85395 | - |
dc.citation.title | IEEE Access | - |
dc.citation.volume | 11 | - |
dc.type.docType | Article | - |
dc.publisher.location | United States | - |
dc.subject.keywordAuthor | Catastrophic forgetting | - |
dc.subject.keywordAuthor | Computational modeling | - |
dc.subject.keywordAuthor | continual learning | - |
dc.subject.keywordAuthor | Kernel | - |
dc.subject.keywordAuthor | Neural networks | - |
dc.subject.keywordAuthor | neural tangent kernel | - |
dc.subject.keywordAuthor | orthogonal gradient descent | - |
dc.subject.keywordAuthor | orthogonal projection | - |
dc.subject.keywordAuthor | Predictive models | - |
dc.subject.keywordAuthor | Principal component analysis | - |
dc.subject.keywordAuthor | Task analysis | - |
dc.subject.keywordAuthor | Training | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Telecommunications | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Telecommunications | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |