An enhanced 3DCNN-ConvLSTM for spatiotemporal multimedia data analysis
- Authors
- Wang, Tian; Li, Jiakun; Zhang, Mengyi; Zhu, Aichun; Snoussi, Hichem; Choi, Chang
- Issue Date
- Jan-2021
- Publisher
- WILEY
- Keywords
- action recognition; ConvLSTM; 3DCNN
- Citation
- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, v.33, no.2
- Journal Title
- CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE
- Volume
- 33
- Number
- 2
- URI
- https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/81185
- DOI
- 10.1002/cpe.5302
- ISSN
- 1532-0626
- Abstract
- Human action recognition remains a challenging and complex task in computer vision. Combining a CNN with an RNN is a common and effective network structure for this task. Specifically, we use a 3DCNN for the CNN part and a ConvLSTM for the RNN part. We divide the video evenly into multiple temporal segments and compress each segment into one feature map with a pooling layer. Our novel contribution is adding a pooling layer, a dropout layer, and a batch normalization layer to the ConvLSTM. We test our model on the KTH, UCF-11, and HMDB51 datasets and achieve high action recognition accuracy.
- Appears in Collections
- College of IT Convergence > Department of Computer Engineering > 1. Journal Articles
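
The segmentation step described in the abstract (divide the video evenly into temporal segments, then compress each segment into one map by pooling) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice between max and average pooling, and the use of plain nested lists as stand-ins for feature maps are all assumptions made for illustration, and the 3DCNN and ConvLSTM stages are omitted.

```python
from typing import List

# A "frame" here is a 2D grid of floats standing in for a feature map (assumption).
Frame = List[List[float]]


def split_into_segments(num_frames: int, num_segments: int) -> List[range]:
    """Split frame indices 0..num_frames-1 into contiguous, near-equal segments."""
    if not 1 <= num_segments <= num_frames:
        raise ValueError("num_segments must be between 1 and num_frames")
    # Integer boundaries; any remainder frames end up in the later segments.
    bounds = [i * num_frames // num_segments for i in range(num_segments + 1)]
    return [range(bounds[i], bounds[i + 1]) for i in range(num_segments)]


def pool_segment(frames: List[Frame], mode: str = "max") -> Frame:
    """Collapse the time axis of one segment into a single map (temporal pooling)."""
    if mode == "max":
        reduce = max
    elif mode == "avg":
        reduce = lambda values: sum(values) / len(values)
    else:
        raise ValueError("mode must be 'max' or 'avg'")
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [reduce([frame[r][c] for frame in frames]) for c in range(width)]
        for r in range(height)
    ]


def compress_video(video: List[Frame], num_segments: int, mode: str = "max") -> List[Frame]:
    """Return one pooled map per temporal segment."""
    segments = split_into_segments(len(video), num_segments)
    return [pool_segment([video[t] for t in seg], mode) for seg in segments]


if __name__ == "__main__":
    # Toy video: 6 frames of size 1x2, where frame t is filled with the value t.
    toy_video = [[[float(t), float(t)]] for t in range(6)]
    print(compress_video(toy_video, num_segments=3, mode="avg"))
```

In this sketch a 6-frame input with 3 segments yields 3 pooled maps, one per segment; in a full model each pooled map would then be passed on as one time step of the recurrent part.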