Qianhui Men
Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition
Men, Qianhui; Ho, Edmond S.L.; Shum, Hubert P.H.; Leung, Howard
Abstract
Learning view-invariant representation is a key to improving feature discrimination power for skeleton-based action recognition. Existing approaches cannot effectively remove the impact of viewpoint due to the implicit view-dependent representations. In this work, we propose a self-supervised framework called Focalized Contrastive View-invariant Learning (FoCoViL), which significantly suppresses the view-specific information on the representation space where the viewpoints are coarsely aligned. By maximizing mutual information with an effective contrastive loss between multi-view sample pairs, FoCoViL associates actions with common view-invariant properties and simultaneously separates the dissimilar ones. We further propose an adaptive focalization method based on pairwise similarity to enhance contrastive learning for a clearer cluster boundary in the learned space. Unlike many existing self-supervised representation learning works that rely heavily on supervised classifiers, FoCoViL performs well on both unsupervised and supervised classifiers with superior recognition performance. Extensive experiments also show that the proposed contrastive-based focalization generates a more discriminative latent representation.
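The abstract does not give the loss itself, so the following is only a minimal sketch of the general idea: a generic InfoNCE-style contrastive loss between two views of the same actions, with a focal-style weight derived from pairwise similarity. The function name `focal_info_nce`, the `temperature` and `gamma` parameters, and the weighting form `(1 - p) ** gamma` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def focal_info_nce(z_a, z_b, temperature=0.1, gamma=2.0):
    """Generic InfoNCE loss between two views with a hypothetical focal weight.

    z_a, z_b: arrays of shape (N, D). Row i of each array is assumed to be
    an embedding of the same action seen from a different viewpoint.
    """
    # L2-normalise so dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    logits = z_a @ z_b.T / temperature            # (N, N) pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Log-probability assigned to the true cross-view partner (the diagonal).
    log_p_pos = np.diag(log_prob)
    p_pos = np.exp(log_p_pos)

    # Assumed focal-style weight: pairs that are already well matched
    # contribute less, so harder pairs dominate the loss.
    weight = np.clip(1.0 - p_pos, 0.0, 1.0) ** gamma
    return float(np.mean(-weight * log_p_pos))
```

With `gamma=0.0` the weight is 1 everywhere and the function reduces to a plain InfoNCE loss, which makes it easy to compare the two variants on the same embeddings.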
Citation
Men, Q., Ho, E. S., Shum, H. P., & Leung, H. (2023). Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition. Neurocomputing, 537, 198-209. https://doi.org/10.1016/j.neucom.2023.03.070
Journal Article Type | Article |
---|---|
Acceptance Date | Mar 28, 2023 |
Online Publication Date | Mar 31, 2023 |
Publication Date | Jun 7, 2023 |
Deposit Date | Apr 3, 2023 |
Publicly Available Date | Apr 1, 2024 |
Journal | Neurocomputing |
Print ISSN | 0925-2312 |
Electronic ISSN | 1872-8286 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 537 |
Pages | 198-209 |
DOI | https://doi.org/10.1016/j.neucom.2023.03.070 |
Public URL | https://durham-repository.worktribe.com/output/1176534 |
Files
Accepted Journal Article (PDF, 2.1 MB)
Publisher Licence URL
http://creativecommons.org/licenses/by-nc-nd/4.0/
Copyright Statement
© 2023. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/