Deconfounding Causal Inference for Zero-shot Action Recognition
Wang, Junyan; Jiang, Yiqi; Long, Yang; Sun, Xiuyu; Pagnucco, Maurice; Song, Yang
Authors
Junyan Wang
Yiqi Jiang
Dr Yang Long yang.long@durham.ac.uk
Associate Professor
Xiuyu Sun
Maurice Pagnucco
Yang Song
Abstract
Zero-shot action recognition (ZSAR) aims to recognize unseen action categories in the test set without corresponding training examples. Most existing zero-shot methods follow the feature generation framework to transfer knowledge from seen action categories to model the feature distribution of unseen categories. However, due to the complexity and diversity of actions, it remains challenging to generate unseen feature distribution, especially for the cross-dataset scenario when there is potentially larger domain shift. This paper proposes a Deconfounding Causal GAN (DeCalGAN) for generating unseen action video features with the following technical contributions: 1) Our model unifies compositional ZSAR with traditional visual-semantic models to incorporate local object information with global semantic information for feature generation. 2) A GAN-based architecture is proposed for causal inference and unseen distribution discovery. 3) A deconfounding module is proposed to refine representations of local object and global semantic information confounder in the training data. Action descriptions and random object feature after causal inference are then used to discover unseen distributions of novel actions in different datasets. Our extensive experiments on Cross-Dataset Zero-Shot Action Recognition (CD-ZSAR) demonstrate substantial improvement over the UCF101 and HMDB51 standard benchmarks for this problem.
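The feature generation framework mentioned in the abstract can be illustrated with a toy sketch. This is not DeCalGAN and does not reproduce any part of the paper's architecture: a linear least-squares map stands in for the GAN generator, the data are synthetic, and every name, dimension, and noise level below is a hypothetical choice made only for illustration. It shows the general idea of fitting a generator on seen classes, synthesizing features for unseen classes from their semantic embeddings, and classifying real unseen samples against the synthesized features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 4-d semantic embeddings, 8-d "video" features.
sem_dim, feat_dim = 4, 8
# Hidden semantic-to-visual relation used only to simulate data.
true_map = rng.normal(size=(sem_dim, feat_dim))


def sample_features(sem, n, noise=0.1):
    """Simulate n real features for a class with semantic embedding `sem`."""
    return sem @ true_map + noise * rng.normal(size=(n, feat_dim))


seen_sem = rng.normal(size=(6, sem_dim))    # classes with training videos
unseen_sem = rng.normal(size=(3, sem_dim))  # classes with descriptions only

# Fit the stand-in "generator" on seen classes only (least squares, not a GAN).
X = np.repeat(seen_sem, 50, axis=0)
Y = np.vstack([sample_features(s, 50) for s in seen_sem])
W, *_ = np.linalg.lstsq(X, Y, rcond=None)


def generate(sem, n, noise=0.1):
    """Synthesize n features for a class from its semantic embedding."""
    return sem @ W + noise * rng.normal(size=(n, feat_dim))


# Synthesize features for unseen classes and summarize them as prototypes.
prototypes = np.stack([generate(s, 50).mean(axis=0) for s in unseen_sem])

# Classify real unseen-class samples by their nearest synthesized prototype.
test_x = np.vstack([sample_features(s, 20) for s in unseen_sem])
test_y = np.repeat(np.arange(len(unseen_sem)), 20)
dists = np.linalg.norm(test_x[:, None, :] - prototypes[None, :, :], axis=2)
pred = dists.argmin(axis=1)
accuracy = (pred == test_y).mean()
print(f"toy zero-shot accuracy: {accuracy:.2f}")
```

In this toy setting the semantic-to-visual relation is linear and shared across classes, so the generator transfers to unseen classes easily. The difficulty the abstract describes, generating unseen distributions under domain shift, arises precisely when that assumption fails.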
Citation
Wang, J., Jiang, Y., Long, Y., Sun, X., Pagnucco, M., & Song, Y. (2023). Deconfounding Causal Inference for Zero-shot Action Recognition. IEEE Transactions on Multimedia, 26, 3976 - 3986. https://doi.org/10.1109/tmm.2023.3318300
Journal Article Type | Article |
---|---|
Acceptance Date | Sep 1, 2023 |
Online Publication Date | Sep 22, 2023 |
Publication Date | 2023 |
Deposit Date | Oct 23, 2023 |
Publicly Available Date | Oct 24, 2023 |
Journal | IEEE Transactions on Multimedia |
Print ISSN | 1520-9210 |
Electronic ISSN | 1941-0077 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 26 |
Pages | 3976 - 3986 |
DOI | https://doi.org/10.1109/tmm.2023.3318300 |
Public URL | https://durham-repository.worktribe.com/output/1815181 |
Files
Accepted Journal Article
(4.9 Mb)
PDF
Copyright Statement
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.