Prune Your Model Before Distill It

Park, J.; No, A.

Authors: Park, J.; No, A.

Issue Date: 1-Jan-2022

Publisher: Springer Science and Business Media Deutschland GmbH

Keywords: Knowledge distillation; Label smoothing regularization (LSR); Neural network compression; Pruning

Citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v.13671 LNCS, pp.120 - 136

Journal Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume: 13671 LNCS

Start Page: 120

End Page: 136

URI: https://scholarworks.bwise.kr/hongik/handle/2020.sw.hongik/30622

DOI: 10.1007/978-3-031-20083-0_8

ISSN: 0302-9743

Abstract: Knowledge distillation transfers knowledge from a cumbersome teacher to a small student. Recent results suggest that a student-friendly teacher is better suited to distillation because it provides more transferable knowledge. In this work, we propose a novel framework, “prune, then distill,” that first prunes the model to make it more transferable and then distills it to the student. We provide several exploratory examples in which the pruned teacher teaches better than the original unpruned network. We further show theoretically that the pruned teacher acts as a regularizer in distillation, which reduces the generalization error. Based on this result, we propose a novel neural network compression scheme in which the student network is formed from the pruned teacher and the “prune, then distill” strategy is then applied. The code is available at https://github.com/ososos888/prune-then-distill. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
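
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the repository linked above). It assumes unstructured magnitude pruning and a temperature-softened KL distillation loss, neither of which the abstract specifies, and it uses a toy linear teacher. The names magnitude_prune, linear_logits, and distillation_loss, and all numeric values, are hypothetical.

```python
import math

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.
    Assumption: unstructured magnitude pruning, chosen only for illustration."""
    k = int(len(weights) * sparsity)  # number of weights to zero
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def linear_logits(flat_weights, x):
    """Logits of a toy linear model whose rows are stored flat, len(x) per row."""
    n = len(x)
    rows = [flat_weights[i:i + n] for i in range(0, len(flat_weights), n)]
    return [sum(w * xi for w, xi in zip(row, x)) for row in rows]

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared.
    Assumption: a standard soft-target loss, not necessarily the paper's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Step 1 ("prune"): sparsify a toy teacher with 3 classes and 2 input features.
teacher_weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
pruned_teacher = magnitude_prune(teacher_weights, sparsity=0.5)

# Step 2 ("then distill"): the pruned teacher supplies the soft targets.
x = [1.0, 2.0]
teacher_logits = linear_logits(pruned_teacher, x)
student_logits = [0.5, 0.3, -0.4]  # placeholder output of a smaller student
loss = distillation_loss(student_logits, teacher_logits)
```

In an actual training loop, loss would be minimized with respect to the student's parameters, typically alongside a cross-entropy term on the true labels.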

Files in This Item: There are no files associated with this item.

Appears in Collections: College of Engineering > School of Electronic & Electrical Engineering > 1. Journal Articles
