Parallel Pathway Dense Video Captioning With Deformable Transformer
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Choi, Wangyu | - |
dc.contributor.author | Chen, Jiasi | - |
dc.contributor.author | Yoon, Jongwon | - |
dc.date.accessioned | 2023-02-21T05:39:51Z | - |
dc.date.available | 2023-02-21T05:39:51Z | - |
dc.date.issued | 2022-12 | - |
dc.identifier.issn | 2169-3536 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/111561 | - |
dc.description.abstract | Dense video captioning is a very challenging task because it requires a high-level understanding of the video story, as well as pinpointing details such as objects and motions, to produce a consistent and fluent description of the video. Many existing solutions divide this problem into two sub-tasks, event detection and captioning, and solve them sequentially ("localize-then-describe" or the reverse). Consequently, the final outcome is highly dependent on the performance of the preceding modules. In this paper, we break with this sequential approach by proposing a parallel pathway dense video captioning (PPVC) framework that localizes and describes events simultaneously, without any bottlenecks. We introduce a representation organization network at the branching point of the parallel pathway that organizes the encoded video feature by considering the entire storyline. Then, an event localizer localizes events without any event-proposal generation network, while a sentence generator describes the events with attention to the fluency and coherence of sentences. Our method has several advantages over existing work: (i) the final output does not depend on the output of preceding modules, and (ii) it improves on existing parallel decoding methods by relieving the information bottleneck. We evaluate the performance of PPVC on the large-scale benchmark datasets ActivityNet Captions and YouCook2. PPVC not only outperforms existing algorithms on the majority of metrics but also improves on the state-of-the-art parallel decoding method by 5.4% and 4.9% on the two datasets, respectively. | - |
dc.format.extent | 12 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
dc.title | Parallel Pathway Dense Video Captioning With Deformable Transformer | - |
dc.type | Article | - |
dc.publisher.location | United States | - |
dc.identifier.doi | 10.1109/ACCESS.2022.3228821 | - |
dc.identifier.scopusid | 2-s2.0-85144806731 | - |
dc.identifier.wosid | 000902044700001 | - |
dc.identifier.bibliographicCitation | IEEE Access, v.10, pp 129899 - 129910 | - |
dc.citation.title | IEEE Access | - |
dc.citation.volume | 10 | - |
dc.citation.startPage | 129899 | - |
dc.citation.endPage | 129910 | - |
dc.type.docType | Article | - |
dc.description.isOpenAccess | Y | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Telecommunications | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Telecommunications | - |
dc.subject.keywordAuthor | Machine learning | - |
dc.subject.keywordAuthor | deep learning | - |
dc.subject.keywordAuthor | video and language | - |
dc.subject.keywordAuthor | video captioning | - |
dc.identifier.url | https://ieeexplore.ieee.org/document/9982599 | - |
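The abstract's central idea can be sketched in a few lines: a shared representation organization step contextualizes the encoded frame features, and then an event localizer and a sentence generator each consume that shared representation independently, so neither pathway waits on the other's output. The toy functions below are illustrative stand-ins invented for this sketch, not the paper's actual networks; only the overall parallel-pathway structure reflects the abstract.

```python
# Minimal sketch of the parallel-pathway idea, assuming toy stand-in
# "models" (the paper's networks are deformable-transformer based).

def organize_representation(frame_features):
    """Stand-in for the representation organization network: average each
    frame feature with its neighbors so every position carries some
    storyline-wide context."""
    n = len(frame_features)
    organized = []
    for i in range(n):
        window = frame_features[max(0, i - 1):min(n, i + 2)]
        organized.append(sum(window) / len(window))
    return organized

def localize_events(organized, threshold=0.5):
    """Stand-in event localizer: mark contiguous runs of high activation
    as (start, end) event segments -- no proposal network, and no use of
    the sentence generator's output."""
    events, start = [], None
    for i, value in enumerate(organized):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            events.append((start, i - 1))
            start = None
    if start is not None:
        events.append((start, len(organized) - 1))
    return events

def generate_captions(organized, vocabulary):
    """Stand-in sentence generator: pick one word per frame by bucketing
    the activation value -- it never looks at the localizer's output."""
    k = len(vocabulary)
    return [vocabulary[min(int(v * k), k - 1)] for v in organized]

# Both pathways branch from the same organized features; because neither
# depends on the other, they could run in either order or concurrently.
features = [0.1, 0.9, 0.9, 0.1, 0.1, 0.9, 0.9, 0.1]
shared = organize_representation(features)
events = localize_events(shared)        # pathway 1
captions = generate_captions(shared, ["idle", "walking", "running"])  # pathway 2
```

The point of the sketch is structural: in a "localize-then-describe" pipeline, `generate_captions` would receive `events` as input, so a localization error would propagate into every caption; here both functions see only `shared`, mirroring the decomposition the abstract describes.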