Detailed Information

Multi-Modal Excavator Activity Recognition Using Two-Stream CNN-LSTM with RGB and Point Cloud Inputs (open access)

Authors
Cho, Hyuk Soo; Latif, Kamran; Sharafat, Abubakar; Seo, Jongwon
Issue Date
Jul-2025
Publisher
MDPI
Keywords
excavator; activity recognition; deep learning; multi-modal; two-stream CNN-LSTM; point cloud; data fusion
Citation
Applied Sciences-Basel, v.15, no.15, pp. 1-28
Pages
28
Indexed
SCIE
SCOPUS
Journal Title
Applied Sciences-Basel
Volume
15
Number
15
Start Page
1
End Page
28
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208627
DOI
10.3390/app15158505
ISSN
2076-3417
Abstract
In recent years, deep learning algorithms have been increasingly applied to activity recognition in construction, particularly for excavators, to automate processes and to improve safety and productivity through continuous monitoring of earthmoving activities. These algorithms analyze construction videos to classify excavator activities. However, previous studies have focused solely on single-source external videos, which limits the recognition capability of the resulting models. This paper introduces a novel multi-modal, deep learning-based methodology for recognizing excavator activities from multi-stream input data. It processes point clouds and RGB images with a two-stream convolutional neural network and long short-term memory (CNN-LSTM) architecture that extracts spatiotemporal features for activity recognition. A dataset of 495,000 video frames of synchronized RGB and point cloud data was collected across multiple construction sites under varying conditions. The dataset covers five key excavator activities: Approach, Digging, Dumping, Idle, and Leveling. To assess the proposed method, the performance of the two-stream CNN-LSTM architecture is compared with that of single-stream CNN-LSTM models applied separately to the same RGB and point cloud datasets. The proposed multi-stream approach achieved an accuracy of 94.67%, outperforming existing state-of-the-art single-stream models, which achieved 90.67% accuracy for the RGB-based model and 92.00% for the point cloud-based model. These findings underscore the potential of the proposed method for automatic real-time monitoring of excavator activities and lay the groundwork for future integration into digital twin systems for proactive maintenance and intelligent equipment management.
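The abstract does not state how the RGB and point cloud streams are combined, so the following is only an illustrative sketch of one common option, score-level (late) fusion. It assumes each stream has already produced one raw score (logit) per activity class; the function names, the `rgb_weight` parameter, and the example scores are hypothetical and are not taken from the paper.

```python
import math

# The five activity classes named in the abstract.
ACTIVITIES = ["Approach", "Digging", "Dumping", "Idle", "Leveling"]


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(rgb_logits, cloud_logits, rgb_weight=0.5):
    """Weighted average of per-stream class probabilities (assumed fusion rule).

    rgb_logits / cloud_logits stand in for the outputs of the RGB and
    point cloud streams; rgb_weight is a hypothetical tuning parameter.
    """
    if len(rgb_logits) != len(ACTIVITIES) or len(cloud_logits) != len(ACTIVITIES):
        raise ValueError("each stream must provide one score per activity")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")

    rgb_probs = softmax(rgb_logits)
    cloud_probs = softmax(cloud_logits)
    fused = [
        rgb_weight * r + (1.0 - rgb_weight) * c
        for r, c in zip(rgb_probs, cloud_probs)
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return ACTIVITIES[best], fused


if __name__ == "__main__":
    # Made-up scores: the RGB stream leans towards Approach,
    # the point cloud stream leans more strongly towards Digging.
    label, probs = fuse_streams([2.0, 0.1, 0.0, 0.0, 0.0],
                                [0.0, 3.0, 0.0, 0.0, 0.0])
    print(label, [round(p, 3) for p in probs])
```

Averaging probabilities lets the more confident stream dominate when the two disagree. Feature-level fusion, where the streams' feature vectors are concatenated before a shared classifier, is another plausible design that this sketch does not cover.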
Appears in
Collections
Seoul College of Engineering > Seoul Department of Civil and Environmental Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Seo, Jong won
COLLEGE OF ENGINEERING (DEPARTMENT OF CIVIL AND ENVIRONMENTAL ENGINEERING)