Multi-Modal Excavator Activity Recognition Using Two-Stream CNN-LSTM with RGB and Point Cloud Inputs
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Cho, Hyuk Soo | - |
| dc.contributor.author | Latif, Kamran | - |
| dc.contributor.author | Sharafat, Abubakar | - |
| dc.contributor.author | Seo, Jongwon | - |
| dc.date.accessioned | 2025-09-03T06:00:24Z | - |
| dc.date.available | 2025-09-03T06:00:24Z | - |
| dc.date.issued | 2025-07 | - |
| dc.identifier.issn | 2076-3417 | - |
| dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/208627 | - |
| dc.description.abstract | Recently, deep learning algorithms have been increasingly applied in construction for activity recognition, particularly for excavators, to automate processes and enhance safety and productivity through continuous monitoring of earthmoving activities. These deep learning algorithms analyze construction videos to classify excavator activities for earthmoving purposes. However, previous studies have solely focused on single-source external videos, which limits the activity recognition capabilities of the deep learning algorithm. This paper introduces a novel multi-modal deep learning-based methodology for recognizing excavator activities, utilizing multi-stream input data. It processes point clouds and RGB images using the two-stream long short-term memory convolutional neural network (CNN-LSTM) method to extract spatiotemporal features, enabling the recognition of excavator activities. A comprehensive dataset comprising 495,000 video frames of synchronized RGB and point cloud data was collected across multiple construction sites under varying conditions. The dataset encompasses five key excavator activities: Approach, Digging, Dumping, Idle, and Leveling. To assess the effectiveness of the proposed method, the performance of the two-stream CNN-LSTM architecture is compared with that of single-stream CNN-LSTM models on the same RGB and point cloud datasets, separately. The results demonstrate that the proposed multi-stream approach achieved an accuracy of 94.67%, outperforming existing state-of-the-art single-stream models, which achieved 90.67% accuracy for the RGB-based model and 92.00% for the point cloud-based model. These findings underscore the potential of the proposed activity recognition method, making it highly effective for automatic real-time monitoring of excavator activities, thereby laying the groundwork for future integration into digital twin systems for proactive maintenance and intelligent equipment management. | - |
| dc.format.extent | 28 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | MDPI | - |
| dc.title | Multi-Modal Excavator Activity Recognition Using Two-Stream CNN-LSTM with RGB and Point Cloud Inputs | - |
| dc.type | Article | - |
| dc.publisher.location | Switzerland | - |
| dc.identifier.doi | 10.3390/app15158505 | - |
| dc.identifier.scopusid | 2-s2.0-105013342258 | - |
| dc.identifier.wosid | 001549047200001 | - |
| dc.identifier.bibliographicCitation | Applied Sciences-basel, v.15, no.15, pp 1 - 28 | - |
| dc.citation.title | Applied Sciences-basel | - |
| dc.citation.volume | 15 | - |
| dc.citation.number | 15 | - |
| dc.citation.startPage | 1 | - |
| dc.citation.endPage | 28 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Chemistry | - |
| dc.relation.journalResearchArea | Engineering | - |
| dc.relation.journalResearchArea | Materials Science | - |
| dc.relation.journalResearchArea | Physics | - |
| dc.relation.journalWebOfScienceCategory | Chemistry, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Engineering, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Materials Science, Multidisciplinary | - |
| dc.relation.journalWebOfScienceCategory | Physics, Applied | - |
| dc.subject.keywordPlus | CONSTRUCTION WORKERS | - |
| dc.subject.keywordPlus | INTELLIGENT | - |
| dc.subject.keywordPlus | TRACKING | - |
| dc.subject.keywordPlus | FEATURES | - |
| dc.subject.keywordPlus | NETWORK | - |
| dc.subject.keywordPlus | SAFETY | - |
| dc.subject.keywordPlus | MODEL | - |
| dc.subject.keywordAuthor | excavator | - |
| dc.subject.keywordAuthor | activity recognition | - |
| dc.subject.keywordAuthor | deep learning | - |
| dc.subject.keywordAuthor | multi-modal | - |
| dc.subject.keywordAuthor | two-stream CNN-LSTM | - |
| dc.subject.keywordAuthor | point cloud | - |
| dc.subject.keywordAuthor | data fusion | - |
| dc.identifier.url | https://www.mdpi.com/2076-3417/15/15/8505 | - |
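
The abstract describes combining an RGB stream and a point cloud stream to classify five excavator activities, but this record does not say how the two streams are fused. As a rough illustration only, the sketch below assumes a simple decision-level (late) fusion: each stream is taken to output one score per activity, the scores are turned into probabilities, and the two probability vectors are averaged. The function names, the `rgb_weight` parameter, and the example scores are hypothetical and are not taken from the paper; the CNN-LSTM feature extractors themselves are not modelled here.

```python
import math

# The five activity classes named in the abstract.
ACTIVITIES = ["Approach", "Digging", "Dumping", "Idle", "Leveling"]


def softmax(logits):
    """Convert raw per-class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(rgb_logits, cloud_logits, rgb_weight=0.5):
    """Hypothetical late fusion: weighted average of per-stream probabilities.

    rgb_logits and cloud_logits stand in for the outputs of two separately
    trained streams; this is an assumed fusion rule, not the paper's method.
    """
    if len(rgb_logits) != len(ACTIVITIES) or len(cloud_logits) != len(ACTIVITIES):
        raise ValueError("each stream must provide one score per activity")
    rgb_probs = softmax(rgb_logits)
    cloud_probs = softmax(cloud_logits)
    fused = [
        rgb_weight * r + (1.0 - rgb_weight) * c
        for r, c in zip(rgb_probs, cloud_probs)
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return ACTIVITIES[best], fused


if __name__ == "__main__":
    # Made-up scores: the RGB stream weakly prefers "Idle", while the
    # point cloud stream strongly prefers "Leveling".
    label, probs = fuse_streams([0, 0, 0, 1, 0], [0, 0, 0, 0, 6])
    print(label, [round(p, 3) for p in probs])
```

With equal weights, a confident stream can override a hesitant one, which is one intuition for why a two-stream model might outperform either single-stream model; the reported accuracies (94.67% versus 90.67% and 92.00%) come from the paper's own architecture, not from this sketch.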
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
