Multi-Modal Excavator Activity Recognition Using Two-Stream CNN-LSTM with RGB and Point Cloud Inputs

Cho, Hyuk Soo; Latif, Kamran; Sharafat, Abubakar; Seo, Jongwon

doi:10.3390/app15158505

Full metadata record

DC Field	Value	Language
dc.contributor.author	Cho, Hyuk Soo	-
dc.contributor.author	Latif, Kamran	-
dc.contributor.author	Sharafat, Abubakar	-
dc.contributor.author	Seo, Jongwon	-
dc.date.accessioned	2025-09-03T06:00:24Z	-
dc.date.available	2025-09-03T06:00:24Z	-
dc.date.issued	2025-07	-
dc.identifier.issn	2076-3417	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208627	-
dc.description.abstract	Recently, deep learning algorithms have been increasingly applied in construction for activity recognition, particularly for excavators, to automate processes and enhance safety and productivity through continuous monitoring of earthmoving activities. These deep learning algorithms analyze construction videos to classify excavator activities for earthmoving purposes. However, previous studies have solely focused on single-source external videos, which limits the activity recognition capabilities of the deep learning algorithm. This paper introduces a novel multi-modal deep learning-based methodology for recognizing excavator activities, utilizing multi-stream input data. It processes point clouds and RGB images using the two-stream long short-term memory convolutional neural network (CNN-LSTM) method to extract spatiotemporal features, enabling the recognition of excavator activities. A comprehensive dataset comprising 495,000 video frames of synchronized RGB and point cloud data was collected across multiple construction sites under varying conditions. The dataset encompasses five key excavator activities: Approach, Digging, Dumping, Idle, and Leveling. To assess the effectiveness of the proposed method, the performance of the two-stream CNN-LSTM architecture is compared with that of single-stream CNN-LSTM models on the same RGB and point cloud datasets, separately. The results demonstrate that the proposed multi-stream approach achieved an accuracy of 94.67%, outperforming existing state-of-the-art single-stream models, which achieved 90.67% accuracy for the RGB-based model and 92.00% for the point cloud-based model. These findings underscore the potential of the proposed activity recognition method, making it highly effective for automatic real-time monitoring of excavator activities, thereby laying the groundwork for future integration into digital twin systems for proactive maintenance and intelligent equipment management.	-
dc.format.extent	28	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	MDPI	-
dc.title	Multi-Modal Excavator Activity Recognition Using Two-Stream CNN-LSTM with RGB and Point Cloud Inputs	-
dc.type	Article	-
dc.publisher.location	Switzerland	-
dc.identifier.doi	10.3390/app15158505	-
dc.identifier.scopusid	2-s2.0-105013342258	-
dc.identifier.wosid	001549047200001	-
dc.identifier.bibliographicCitation	Applied Sciences-basel, v.15, no.15, pp 1 - 28	-
dc.citation.title	Applied Sciences-basel	-
dc.citation.volume	15	-
dc.citation.number	15	-
dc.citation.startPage	1	-
dc.citation.endPage	28	-
dc.type.docType	Article	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Chemistry	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalResearchArea	Materials Science	-
dc.relation.journalResearchArea	Physics	-
dc.relation.journalWebOfScienceCategory	Chemistry, Multidisciplinary	-
dc.relation.journalWebOfScienceCategory	Engineering, Multidisciplinary	-
dc.relation.journalWebOfScienceCategory	Materials Science, Multidisciplinary	-
dc.relation.journalWebOfScienceCategory	Physics, Applied	-
dc.subject.keywordPlus	CONSTRUCTION WORKERS	-
dc.subject.keywordPlus	INTELLIGENT	-
dc.subject.keywordPlus	TRACKING	-
dc.subject.keywordPlus	FEATURES	-
dc.subject.keywordPlus	NETWORK	-
dc.subject.keywordPlus	SAFETY	-
dc.subject.keywordPlus	MODEL	-
dc.subject.keywordAuthor	excavator	-
dc.subject.keywordAuthor	activity recognition	-
dc.subject.keywordAuthor	deep learning	-
dc.subject.keywordAuthor	multi-modal	-
dc.subject.keywordAuthor	two-stream CNN-LSTM	-
dc.subject.keywordAuthor	point cloud	-
dc.subject.keywordAuthor	data fusion	-
dc.identifier.url	https://www.mdpi.com/2076-3417/15/15/8505	-
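
The abstract describes combining an RGB stream and a point cloud stream to classify five excavator activities, but it does not say how the two streams are merged. As a rough illustration only, the sketch below uses late fusion (a weighted average of per-stream class probabilities), which is a swapped-in, generic technique and not necessarily the authors' method. The function names, the `rgb_weight` parameter, and the score values are all hypothetical; only the five activity labels come from the record.

```python
import math

# Activity labels as listed in the abstract; everything else is illustrative.
ACTIVITIES = ["Approach", "Digging", "Dumping", "Idle", "Leveling"]


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_late(rgb_logits, cloud_logits, rgb_weight=0.5):
    """Late fusion: weighted average of the two streams' class probabilities."""
    rgb_probs = softmax(rgb_logits)
    cloud_probs = softmax(cloud_logits)
    return [rgb_weight * r + (1.0 - rgb_weight) * c
            for r, c in zip(rgb_probs, cloud_probs)]


def predict(rgb_logits, cloud_logits, rgb_weight=0.5):
    """Return the most probable activity and the fused probabilities."""
    fused = fuse_late(rgb_logits, cloud_logits, rgb_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return ACTIVITIES[best], fused


# Hypothetical per-stream scores for one clip (assumed, not data from the paper).
rgb_scores = [0.2, 2.0, 1.8, -1.0, 0.5]    # RGB stream: Digging narrowly ahead of Dumping
cloud_scores = [0.1, 0.9, 2.6, -0.5, 0.3]  # point cloud stream: clearly favors Dumping

label, probs = predict(rgb_scores, cloud_scores)
print(label, [round(p, 3) for p in probs])
```

With these made-up scores the RGB stream alone would pick Digging, while the fused prediction is Dumping, which illustrates how a second modality can resolve an ambiguous single-stream decision. A two-stream CNN-LSTM could equally fuse intermediate features rather than output probabilities; the record does not settle which.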

Appears in Collections: Seoul College of Engineering > Seoul Department of Civil and Environmental Engineering > 1. Journal Articles

Related Researcher

Seo, Jong won: COLLEGE OF ENGINEERING (DEPARTMENT OF CIVIL AND ENVIRONMENTAL ENGINEERING)

222, Wangsimni-ro, Seongdong-gu, Seoul, 04763, Korea, +82-2-2220-1366

Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.