Detailed Information


Semantic Recognition of Human-Object Interactions via Gaussian-based Elliptical Modelling and Pixel-Level Labeling (Open Access)

Authors
Khalid, Nida; Ghadi, Yazeed Yasin; Gochoo, Munkhjargal; Jalal, Ahmad; Kim, Kibum
Issue Date
Jul-2021
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
3D point cloud; fiducial points; human-object interaction; K-ary tree hashing; pixel labeling; semantic segmentation; super-pixels
Citation
IEEE Access, v.9, pp 111249 - 111266
Pages
18
Indexed
SCIE
SCOPUS
Journal Title
IEEE Access
Volume
9
Start Page
111249
End Page
111266
URI
https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/113871
DOI
10.1109/ACCESS.2021.3101716
ISSN
2169-3536
Abstract
Human-Object Interaction (HOI) recognition, due to its significance in many computer vision-based applications, requires in-depth and meaningful details from image sequences. Incorporating semantics into scene understanding leads to a deeper understanding of human-centric actions. Therefore, in this work we propose a semantic HOI recognition system based on multi-vision sensors. In the proposed system, RGB and depth images are de-noised via Bilateral Filtering (BLF) and segmented into multiple clusters using the Simple Linear Iterative Clustering (SLIC) algorithm. A skeleton is then extracted from the segmented RGB and depth images via the Euclidean Distance Transform (EDT). Human joints, extracted from the skeleton, provide the annotations for accurate pixel-level labeling. An elliptical human model is then generated via a Gaussian Mixture Model (GMM), and a Conditional Random Field (CRF) model is trained to assign a specific label to each pixel of the different human body parts and the interaction object. Two types of semantic features are extracted from each labeled human body part and labeled object: fiducial points and a 3D point cloud. The feature descriptors are quantized using Fisher's Linear Discriminant Analysis (FLDA) and classified using K-ary Tree Hashing (KATH). In the experimentation phase, the recognition accuracy achieved is 92.88% on the Sports dataset, 93.5% on the Sun Yat-Sen University (SYSU) 3D HOI dataset, and 94.16% on the Nanyang Technological University (NTU) RGB+D dataset. The proposed system is validated via extensive experimentation and should be applicable to many computer vision-based applications such as healthcare monitoring, security systems, and assisted living. © 2013 IEEE.
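As a rough illustration of the pre-processing front end described in the abstract, the Python sketch below de-noises an RGB frame with bilateral filtering, clusters it into super-pixels with SLIC, and applies the Euclidean Distance Transform to a foreground mask as a stand-in for the skeleton-extraction step. It assumes OpenCV, scikit-image, and SciPy; the function name preprocess_frame, the foreground-mask input, and all parameter values are illustrative assumptions, not taken from the paper.

import cv2
import numpy as np
from scipy.ndimage import distance_transform_edt
from skimage.segmentation import slic

def preprocess_frame(rgb, fg_mask):
    # rgb: H x W x 3 uint8 frame; fg_mask: H x W boolean person/object mask
    # (a hypothetical input; the paper derives regions from RGB and depth).
    # 1) Bilateral Filtering (BLF): suppress noise while preserving edges.
    denoised = cv2.bilateralFilter(rgb, 9, 75.0, 75.0)
    # 2) SLIC: group the de-noised frame into compact super-pixel clusters.
    superpixels = slic(denoised, n_segments=300, compactness=10.0, start_label=0)
    # 3) Euclidean Distance Transform (EDT) of the mask; its ridge of local
    #    maxima approximates a medial-axis skeleton for joint localization.
    edt = distance_transform_edt(fg_mask.astype(np.uint8))
    return denoised, superpixels, edt

In the full pipeline, joint extraction, the GMM-based elliptical body model, CRF pixel labeling, FLDA quantization, and KATH classification would operate on these intermediate outputs.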
Appears in Collections
COLLEGE OF COMPUTING > SCHOOL OF MEDIA, CULTURE, AND DESIGN TECHNOLOGY > 1. Journal Articles


Related Researcher

Kim, Kibum
COLLEGE OF COMPUTING (SCHOOL OF MEDIA, CULTURE, AND DESIGN TECHNOLOGY)
