MSTR: Multi-Scale Transformer for End-to-End Human-Object Interaction Detection

Kim, Bumsoo; Mun, Jonghwan; On, Kyoung-Woon; Shin, Minchul; Lee, Junhyun; Kim, Eun Sol

doi:10.1109/CVPR52688.2022.01897

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kim, Bumsoo	-
dc.contributor.author	Mun, Jonghwan	-
dc.contributor.author	On, Kyoung-Woon	-
dc.contributor.author	Shin, Minchul	-
dc.contributor.author	Lee, Junhyun	-
dc.contributor.author	Kim, Eun Sol	-
dc.date.accessioned	2022-12-20T10:37:13Z	-
dc.date.available	2022-12-20T10:37:13Z	-
dc.date.created	2022-12-07	-
dc.date.issued	2022-06	-
dc.identifier.issn	1063-6919	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/173242	-
dc.description.abstract	Human-Object Interaction (HOI) detection is the task of identifying a set of (human, object, interaction) triplets from an image. Recent work proposed transformer encoder-decoder architectures that successfully eliminated the need for many hand-designed components in HOI detection through end-to-end training. However, they are limited to single-scale feature resolution, providing suboptimal performance in scenes containing humans, objects, and their interactions with vastly different scales and distances. To tackle this problem, we propose a Multi-Scale TRansformer (MSTR) for HOI detection powered by two novel HOI-aware deformable attention modules called Dual-Entity attention and Entity-conditioned Context attention. While existing deformable attention comes at a huge cost in HOI detection performance, our proposed attention modules of MSTR learn to effectively attend to sampling points that are essential to identify interactions. In experiments, we achieve the new state-of-the-art performance on two HOI detection benchmarks.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IEEE Computer Society	-
dc.title	MSTR: Multi-Scale Transformer for End-to-End Human-Object Interaction Detection	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Kim, Eun Sol	-
dc.identifier.doi	10.1109/CVPR52688.2022.01897	-
dc.identifier.scopusid	2-s2.0-85141778735	-
dc.identifier.wosid	000870783005038	-
dc.identifier.bibliographicCitation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, v.2022-June, pp.19556 - 19565	-
dc.relation.isPartOf	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	-
dc.citation.title	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	-
dc.citation.volume	2022-June	-
dc.citation.startPage	19556	-
dc.citation.endPage	19565	-
dc.type.rims	ART	-
dc.type.docType	Proceedings Paper	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Imaging Science & Photographic Technology	-
dc.relation.journalWebOfScienceCategory	Computer Science, Artificial Intelligence	-
dc.relation.journalWebOfScienceCategory	Imaging Science & Photographic Technology	-
dc.subject.keywordAuthor	Scene analysis and understanding	-
dc.identifier.url	https://ieeexplore.ieee.org/document/9878434	-

Appears in Collections: Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles

Related Researcher

Kim, Eun Sol: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
