Hierarchical Graph Convolutional Networks for Action Quality Assessment
Zhou, Kanglei; Ma, Yue; Shum, Hubert P.H.; Liang, Xiaohui
Abstract
Action quality assessment (AQA) automatically evaluates how well humans perform actions in a given video, a technique widely used in fields such as rehabilitation medicine, athletic competitions, and specific skills assessment. However, existing works that uniformly divide the video sequence into small clips of equal length suffer from intra-clip confusion and inter-clip incoherence, hindering the further development of AQA. To address this issue, we propose a hierarchical graph convolutional network (GCN). First, semantic information confusion is corrected through clip refinement, generating the ‘shot’ as the basic action unit. We then construct a scene graph by combining several consecutive shots into meaningful scenes to capture local dynamics. These scenes can be viewed as different procedures of a given action, providing valuable assessment cues. The video-level representation is finally extracted via sequential action aggregation among scenes to regress the predicted score distribution, enhancing discriminative features and improving assessment performance. Experiments on the AQA-7, MTL-AQA, and JIGSAWS datasets demonstrate the superiority of the proposed hierarchical GCN over state-of-the-art methods.
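The shot, scene, and video hierarchy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the clip-refinement step is omitted (shot features are taken as given), and the chain-shaped adjacency, mean pooling, fixed scene length, softmax head, and all function and parameter names (`assess`, `shots_per_scene`, `w_scene`, `w_video`, `w_head`) are assumptions made for illustration only.

```python
import math

def normalized_chain_adjacency(n):
    """Symmetrically normalized adjacency (with self-loops) of a chain of n nodes."""
    a = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]

def gcn_layer(features, weight):
    """One graph convolution over consecutive nodes: ReLU(A_hat @ X @ W)."""
    a_hat = normalized_chain_adjacency(len(features))
    out = matmul(matmul(a_hat, features), weight)
    return [[max(0.0, v) for v in row] for row in out]

def mean_pool(rows):
    """Average a list of feature vectors into one vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def assess(shot_features, shots_per_scene, w_scene, w_video, w_head):
    """Hypothetical pipeline: shots -> scenes -> video vector -> score distribution."""
    # Assumption: each scene is a fixed-size group of consecutive shots.
    scenes = []
    for start in range(0, len(shot_features), shots_per_scene):
        group = shot_features[start:start + shots_per_scene]
        scenes.append(mean_pool(gcn_layer(group, w_scene)))
    # Aggregate the ordered scenes into a single video-level representation.
    video = mean_pool(gcn_layer(scenes, w_video))
    # Map the video vector to a distribution over discrete score bins.
    return softmax(matmul([video], w_head)[0])
```

Calling `assess` with, say, six two-dimensional shot features, `shots_per_scene=3`, and small weight matrices returns one probability per score bin. In a trained model the weights would be learned and the grouping of shots into scenes would come from the data rather than a fixed size.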
Citation
Zhou, K., Ma, Y., Shum, H. P., & Liang, X. (2023). Hierarchical Graph Convolutional Networks for Action Quality Assessment. IEEE Transactions on Circuits and Systems for Video Technology, 33(12), 7749–7763. https://doi.org/10.1109/TCSVT.2023.3281413
Journal Article Type | Article |
---|---|
Acceptance Date | May 28, 2023 |
Online Publication Date | May 30, 2023 |
Deposit Date | May 30, 2023 |
Publicly Available Date | May 30, 2023 |
Journal | IEEE Transactions on Circuits and Systems for Video Technology |
Print ISSN | 1051-8215 |
Electronic ISSN | 1558-2205 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 33 |
Issue | 12 |
Pages | 7749 - 7763 |
DOI | https://doi.org/10.1109/TCSVT.2023.3281413 |
Public URL | https://durham-repository.worktribe.com/output/1171198 |
Publisher URL | https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=76 |
Files
Accepted Journal Article (PDF, 3.1 MB)
Copyright Statement
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.