Manli Zhu
A Skeleton-aware Graph Convolutional Network for Human-Object Interaction Detection
Zhu, Manli; Ho, Edmund S.L.; Shum, Hubert P.H.
Abstract
Detecting human-object interactions is essential for comprehensive understanding of visual scenes. In particular, spatial connections between humans and objects are important cues for reasoning about interactions. To this end, we propose a skeleton-aware graph convolutional network for human-object interaction detection, named SGCN4HOI. Our network exploits the spatial connections between human keypoints and object keypoints to capture their fine-grained structural interactions via graph convolutions. It fuses such geometric features with visual features and spatial configuration features obtained from human-object pairs. Furthermore, to better preserve the object structural information and facilitate human-object interaction detection, we propose a novel skeleton-based object keypoints representation. The performance of SGCN4HOI is evaluated on the public V-COCO benchmark dataset. Experimental results show that the proposed approach outperforms the state-of-the-art pose-based models and achieves competitive performance against other models.
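The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea of running a graph convolution over a joint graph of human and object keypoints. It is not the SGCN4HOI architecture. The toy graph, the edge lists, the coordinate features, the weight matrix, and the function names are all assumptions made for illustration, and the layer uses the common symmetric-normalized form ReLU(D^-1/2 (A + I) D^-1/2 X W), which may differ from what the paper uses.

```python
import math


def normalized_adjacency(num_nodes, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected graph with self-loops."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loop so each node keeps its own feature
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def matmul(x, y):
    """Plain matrix product of two list-of-lists matrices."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def graph_conv(a_hat, features, weight):
    """One graph convolution layer: ReLU(A_hat @ X @ W)."""
    out = matmul(matmul(a_hat, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical toy graph: nodes 0-2 are human keypoints, nodes 3-4 are object keypoints.
human_edges = [(0, 1), (1, 2)]   # assumed skeleton bones
object_edges = [(3, 4)]          # assumed object "skeleton" edge
cross_edges = [(2, 3), (2, 4)]   # assumed human-to-object spatial connections
edges = human_edges + object_edges + cross_edges

# Assumed node features: normalized (x, y) image coordinates of each keypoint.
features = [[0.50, 0.20], [0.50, 0.45], [0.62, 0.55], [0.70, 0.58], [0.80, 0.60]]

# Arbitrary 2 -> 3 projection standing in for learned weights.
weight = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]

a_hat = normalized_adjacency(5, edges)
hidden = graph_conv(a_hat, features, weight)  # 5 nodes x 3 geometric features
```

In a full model, per-node outputs like `hidden` would typically be pooled into one geometric feature vector and combined with visual and spatial configuration features, in line with the fusion the abstract describes. How that fusion is done is not specified here.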
Citation
Zhu, M., Ho, E. S., & Shum, H. P. (2022). A Skeleton-aware Graph Convolutional Network for Human-Object Interaction Detection. https://doi.org/10.1109/smc53654.2022.9945149
Conference Name | IEEE SMC 2022: International Conference on Systems, Man, and Cybernetics |
---|---|
Conference Location | Prague, Czech Republic |
Start Date | Oct 9, 2022 |
End Date | Oct 12, 2022 |
Acceptance Date | Jul 6, 2022 |
Online Publication Date | Nov 18, 2022 |
Publication Date | 2022 |
Deposit Date | Jul 11, 2022 |
Publicly Available Date | Oct 13, 2022 |
Series ISSN | 1062-922X,2577-1655 |
DOI | https://doi.org/10.1109/smc53654.2022.9945149 |
Files
Accepted Conference Proceeding (PDF, 988 Kb)
Copyright Statement
© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.