Detailed Information


DFT-based transformation invariant pooling layer for visual classification

Full metadata record
DC Field | Value | Language
dc.contributor.author | Ryu, Jongbin | -
dc.contributor.author | Yang, Ming-Hsuan | -
dc.contributor.author | Lim, Jongwoo | -
dc.date.accessioned | 2022-07-11T09:28:33Z | -
dc.date.available | 2022-07-11T09:28:33Z | -
dc.date.created | 2021-05-11 | -
dc.date.issued | 2018-10 | -
dc.identifier.issn | 0302-9743 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/149307 | -
dc.description.abstract | We propose a novel discrete Fourier transform-based pooling layer for convolutional neural networks. The DFT magnitude pooling replaces the traditional max/average pooling layer between the convolution and fully-connected layers to retain translation invariance and shape preserving (aware of shape difference) properties based on the shift theorem of the Fourier transform. Thanks to the ability to handle image misalignment while keeping important structural information in the pooling stage, the DFT magnitude pooling improves the classification accuracy significantly. In addition, we propose the DFT+ method for ensemble networks using the middle convolution layer outputs. The proposed methods are extensively evaluated on various classification tasks using the ImageNet, CUB 2010-2011, MIT Indoors, Caltech 101, FMD and DTD datasets. The AlexNet, VGG-VD 16, Inception-v3, and ResNet are used as the base networks, upon which DFT and DFT+ methods are implemented. Experimental results show that the proposed methods improve the classification performance in all networks and datasets. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | Springer Verlag | -
dc.title | DFT-based transformation invariant pooling layer for visual classification | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Lim, Jongwoo | -
dc.identifier.doi | 10.1007/978-3-030-01264-9_6 | -
dc.identifier.scopusid | 2-s2.0-85055696119 | -
dc.identifier.bibliographicCitation | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v.11218 LNCS, pp.89 - 104 | -
dc.relation.isPartOf | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | -
dc.citation.title | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | -
dc.citation.volume | 11218 LNCS | -
dc.citation.startPage | 89 | -
dc.citation.endPage | 104 | -
dc.type.rims | ART | -
dc.type.docType | Conference Paper | -
dc.description.journalClass | 1 | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scopus | -
dc.subject.keywordPlus | Computer vision | -
dc.subject.keywordPlus | Convolution | -
dc.subject.keywordPlus | Discrete Fourier transforms | -
dc.subject.keywordPlus | Image enhancement | -
dc.subject.keywordPlus | Neural networks | -
dc.subject.keywordPlus | Classification accuracy | -
dc.subject.keywordPlus | Classification performance | -
dc.subject.keywordPlus | Convolutional neural network | -
dc.subject.keywordPlus | Fully-connected layers | -
dc.subject.keywordPlus | Structural information | -
dc.subject.keywordPlus | Transformation invariants | -
dc.subject.keywordPlus | Translation invariance | -
dc.subject.keywordPlus | Visual classification | -
dc.subject.keywordPlus | Classification (of information) | -
dc.identifier.url | https://link.springer.com/chapter/10.1007/978-3-030-01264-9_6 | -
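
The abstract in the record above credits the shift theorem of the Fourier transform for both translation invariance and shape awareness. A minimal sketch of that idea follows, using only the Python standard library. It is not the authors' layer: the function names, the naive DFT, the circular (wrap-around) shift, and the rule of keeping the k x k lowest frequencies are all assumptions made for illustration, since the abstract does not specify them.

```python
import cmath
import math


def dft2_magnitude(x):
    """Magnitude of the 2D DFT of a real H x W grid (naive, for small inputs)."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for m in range(h):
                for n in range(w):
                    angle = -2.0 * math.pi * (u * m / h + v * n / w)
                    acc += x[m][n] * cmath.exp(1j * angle)
            out[u][v] = abs(acc)
    return out


def circular_shift(x, dy, dx):
    """Shift a grid by (dy, dx) with wrap-around at the borders."""
    h, w = len(x), len(x[0])
    return [[x[(m - dy) % h][(n - dx) % w] for n in range(w)] for m in range(h)]


def dft_magnitude_pool(x, k):
    """Keep the k x k lowest-frequency DFT magnitudes (hypothetical crop rule).

    k is assumed odd and no larger than either side of the grid.
    """
    mag = dft2_magnitude(x)
    h, w = len(mag), len(mag[0])
    # Signed frequencies -(k//2)..(k//2); negative ones wrap to the array end.
    freqs = range(-(k // 2), k // 2 + 1)
    return [[mag[fu % h][fv % w] for fv in freqs] for fu in freqs]


def average_pool(x):
    """Global average pooling, for comparison."""
    return sum(sum(row) for row in x) / (len(x) * len(x[0]))


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-size grids."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# Two toy feature maps with the same total activation but different shapes.
horizontal_bar = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
vertical_bar = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

pooled = dft_magnitude_pool(horizontal_bar, 3)
pooled_shifted = dft_magnitude_pool(circular_shift(horizontal_bar, 2, 1), 3)
pooled_vertical = dft_magnitude_pool(vertical_bar, 3)

# A shift only changes the DFT phase, so the pooled magnitudes agree.
print("shift difference:", max_abs_diff(pooled, pooled_shifted))
# Average pooling cannot tell the two bars apart; the magnitudes can.
print("average pool equal:", average_pool(horizontal_bar) == average_pool(vertical_bar))
print("shape difference:", max_abs_diff(pooled, pooled_vertical))
```

The invariance shown here is exact only for circular shifts. A real feature map shifted with zero padding or cropping would satisfy it only approximately, and a practical layer would use an FFT routine rather than this quadruple loop.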
Appears in Collections: Seoul College of Engineering > Seoul Department of Computer Software > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.