Excavator activity recognition under occlusion via multi-camera deep learning

Sharafat, Abubakar; Latif, Kamran; Deng, Tao; Seo, Jongwon

doi:10.1016/j.rineng.2025.108611

Excavator activity recognition under occlusion via multi-camera deep learning (open access)

Authors: Sharafat, Abubakar; Latif, Kamran; Deng, Tao; Seo, Jongwon

Issue Date: Mar-2026

Publisher: Elsevier B.V.

Keywords: Activity recognition; Deep learning; Excavator; Multi-camera input; Occlusion; Two-stream convolutional neural networks

Citation: Results in Engineering, v.29, pp 1 - 17

Pages: 17

Indexed: SCOPUS; ESCI

Journal Title: Results in Engineering

Volume: 29

Start Page: 1

End Page: 17

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/211568

DOI: 10.1016/j.rineng.2025.108611

ISSN: 2590-1230

Abstract: Accurate recognition of excavator activities is essential for automating construction processes. However, existing single-camera vision-based recognition methods lose effectiveness under occlusion. Occlusions are inherent in earthwork operations, arising from physical obstructions or from the nature of the work itself, and they disrupt critical visual and motion cues. To address this challenge, this study presents a deep learning-based methodology built on a multi-camera, two-stream Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) architecture for excavator activity recognition under occlusion. The proposed approach uses two synchronized cameras, one external and one in-cabin, each providing RGB and optical flow inputs. Each video input is processed by a dedicated CNN to extract spatial and motion features, which are fused and passed to a Long Short-Term Memory (LSTM) network to capture temporal dependencies. A fully connected layer then classifies five excavator activities. To evaluate performance under different levels of occlusion, three datasets were curated, representing no occlusion, partial occlusion, and full occlusion scenarios, and the method was compared with an existing single-camera CNN-LSTM approach using identical settings. The proposed method achieved recognition accuracies of 92.38 % and 91.43 % on the partial and full occlusion datasets, minor decreases of 2.97 % and 3.92 %, respectively. In comparison, the single-camera approach exhibited notable accuracy reductions of approximately 6.0 % and 10.0 % for partial and full occlusion, respectively. These findings highlight a significant improvement in the robustness and reliability of the multi-camera approach for excavator activity recognition in occluded real-world construction environments.
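The abstract quotes accuracies and decreases for the two occluded datasets but does not state the no-occlusion accuracy directly. A minimal sketch of the arithmetic follows, assuming (this is an interpretation, not something the record states) that each decrease is an absolute difference in percentage points from the no-occlusion result. The dictionary layout and the `implied_baseline` helper are hypothetical names used only for illustration.

```python
from decimal import Decimal

# Figures quoted in the abstract for the multi-camera method, in percent.
reported = {
    "partial": {"accuracy": Decimal("92.38"), "drop": Decimal("2.97")},
    "full": {"accuracy": Decimal("91.43"), "drop": Decimal("3.92")},
}


def implied_baseline(accuracy: Decimal, drop: Decimal) -> Decimal:
    """Recover the no-occlusion accuracy.

    Assumption: the drop is an absolute percentage-point difference
    from the no-occlusion accuracy, so baseline = accuracy + drop.
    """
    return accuracy + drop


baselines = {
    name: implied_baseline(v["accuracy"], v["drop"])
    for name, v in reported.items()
}

for name, baseline in baselines.items():
    print(f"{name} occlusion: implied no-occlusion accuracy = {baseline} %")
```

Under that assumption both occluded cases imply the same no-occlusion accuracy, 95.35 %, which suggests the two quoted decreases are measured against a single baseline. The single-camera baseline cannot be recovered the same way, because the abstract gives only its approximate reductions.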

Appears in Collections: College of Engineering (Seoul) > Department of Civil and Environmental Engineering (Seoul) > 1. Journal Articles

Related Researcher

Seo, Jongwon: College of Engineering (Department of Civil and Environmental Engineering)
