Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling

Shim, Kyuhong; Choi, Iksoo; Sung, Wonyong; Choi, Jung wook

doi:10.1109/ISOCC53507.2021.9613933

Full metadata record

DC Field	Value	Language
dc.contributor.author	Shim, Kyuhong	-
dc.contributor.author	Choi, Iksoo	-
dc.contributor.author	Sung, Wonyong	-
dc.contributor.author	Choi, Jung wook	-
dc.date.accessioned	2022-07-06T11:33:21Z	-
dc.date.available	2022-07-06T11:33:21Z	-
dc.date.created	2022-03-07	-
dc.date.issued	2021-11	-
dc.identifier.issn	2163-9612	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/140374	-
dc.description.abstract	Recently, the necessity of multiple attention heads in the transformer architecture has been questioned [1]. Removing less important heads from a large network is a promising strategy to reduce computation cost and parameter count. However, pruning attention heads in multihead attention does not reduce the overall load evenly, because the feedforward modules are not affected. In this study, we apply attention head pruning to the All-Attention [2] transformer, where the savings in computation are proportional to the number of pruned heads. This improved computing efficiency comes at the cost of pruning sensitivity, which we stabilize with three training techniques. Our attention head pruning yields considerably fewer parameters at comparable perplexity for transformer-based language modeling.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IEEE	-
dc.title	Layer-wise Pruning of Transformer Attention Heads for Efficient Language Modeling	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Choi, Jung wook	-
dc.identifier.doi	10.1109/ISOCC53507.2021.9613933	-
dc.identifier.scopusid	2-s2.0-85123372999	-
dc.identifier.wosid	000861550500152	-
dc.identifier.bibliographicCitation	Proceedings - International SoC Design Conference 2021, ISOCC 2021, pp.357 - 358	-
dc.relation.isPartOf	Proceedings - International SoC Design Conference 2021, ISOCC 2021	-
dc.citation.title	Proceedings - International SoC Design Conference 2021, ISOCC 2021	-
dc.citation.startPage	357	-
dc.citation.endPage	358	-
dc.type.rims	ART	-
dc.type.docType	Proceedings Paper	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Computer Science, Hardware & Architecture	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.subject.keywordPlus	Computational linguistics	-
dc.subject.keywordPlus	Computation costs	-
dc.subject.keywordPlus	Computing efficiency	-
dc.subject.keywordPlus	Feed forward	-
dc.subject.keywordPlus	Language model	-
dc.subject.keywordPlus	Larger networks	-
dc.subject.keywordPlus	Layer-wise	-
dc.subject.keywordPlus	Multihead	-
dc.subject.keywordPlus	Multihead attention	-
dc.subject.keywordPlus	Pruning	-
dc.subject.keywordPlus	Transformer	-
dc.subject.keywordPlus	Modeling languages	-
dc.subject.keywordAuthor	multihead attention	-
dc.subject.keywordAuthor	pruning	-
dc.subject.keywordAuthor	transformer	-
dc.identifier.url	https://ieeexplore.ieee.org/document/9613933	-
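
The abstract's claim that head pruning saves more in an All-Attention transformer than in a standard one can be illustrated with a rough per-layer parameter count. This is a minimal sketch, not the authors' method: it assumes a standard layer has four attention projections plus a two-matrix feedforward module, and that an All-Attention layer replaces the feedforward module with per-head persistent key/value vectors. Biases, layer normalization, and embeddings are ignored, and the function names and the sizes used below (d_model=512, 8 heads, 2048 persistent vectors) are hypothetical.

```python
def standard_layer_params(d_model, n_heads, d_ff, heads_kept):
    """Approximate weights in a standard transformer layer (assumed layout)."""
    d_head = d_model // n_heads
    # Q, K, V and output projections shrink with the number of kept heads.
    attention = 4 * d_model * d_head * heads_kept
    # The feedforward module is untouched by head pruning.
    feedforward = 2 * d_model * d_ff
    return attention + feedforward


def all_attention_layer_params(d_model, n_heads, n_persistent, heads_kept):
    """Approximate weights in an All-Attention layer (assumed layout)."""
    d_head = d_model // n_heads
    # Each head owns its projections plus persistent key and value vectors,
    # so every weight in the layer belongs to some head.
    per_head = 4 * d_model * d_head + 2 * n_persistent * d_head
    return per_head * heads_kept


def remaining_fraction(full, pruned):
    """Share of the layer's weights left after pruning."""
    return pruned / full


# Hypothetical sizes: keep 4 of 8 heads in each kind of layer.
std_full = standard_layer_params(512, 8, 2048, heads_kept=8)
std_half = standard_layer_params(512, 8, 2048, heads_kept=4)
aa_full = all_attention_layer_params(512, 8, 2048, heads_kept=8)
aa_half = all_attention_layer_params(512, 8, 2048, heads_kept=4)

print(remaining_fraction(std_full, std_half))  # standard layer
print(remaining_fraction(aa_full, aa_half))    # All-Attention layer
```

Under these assumptions, removing half of the heads leaves about five sixths of the standard layer's weights but exactly half of the All-Attention layer's weights, which is the proportionality the abstract describes.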

Appears in Collections: Seoul College of Engineering > Seoul School of Electronic Engineering > 1. Journal Articles

Related Researcher: Choi, Jung wook, College of Engineering (School of Electronic Engineering)
