Leveraging Self-Distillation and Disentanglement Network to Enhance Visual–Semantic Feature Consistency in Generalized Zero-Shot Learning
Liu, Xiaoming; Wang, Chen; Yang, Guan; Wang, Chunhua; Long, Yang; Liu, Jie; Zhang, Zhiyuan
Authors
Xiaoming Liu
Chen Wang
Guan Yang
Chunhua Wang
Dr Yang Long (Associate Professor) yang.long@durham.ac.uk
Jie Liu
Zhiyuan Zhang
Abstract
Generalized zero-shot learning (GZSL) aims to recognize both seen and unseen classes while training only on seen-class samples and auxiliary semantic descriptions. Recent state-of-the-art methods either infer unseen classes from semantic information or synthesize unseen-class features with generative models conditioned on semantic information; both approaches rely on correct alignment of visual–semantic features. However, they often overlook the inconsistency between the original visual features and the semantic attributes. In addition, because of cross-modal dataset biases, the visual features extracted and synthesized by the model may not match some semantic features, which can prevent the model from aligning visual–semantic features properly. To address this issue, this paper proposes a GZSL framework that enhances visual–semantic feature consistency using a self-distillation and disentanglement network (SDDN). The network aims to obtain semantically consistent, refined visual features and non-redundant semantic features. First, SDDN uses self-distillation to refine the visual features extracted and synthesized by the model. Next, a disentanglement network disentangles and aligns the visual–semantic features to enhance their consistency. Finally, the consistent visual–semantic features are fused to jointly train a GZSL classifier. Extensive experiments demonstrate that the proposed method achieves competitive results on four challenging benchmark datasets (AWA2, CUB, FLO, and SUN).
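This record does not give the loss functions used by SDDN. As a rough illustration of the general idea behind distillation only, the sketch below computes a temperature-softened KL divergence between a "teacher" and a "student" prediction; in self-distillation, both signals come from the same model. The function names, the temperature value, and the example logits are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    A generic distillation objective (assumption), not the loss from the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits from two branches of the same network.
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.8]

loss_same = distillation_loss(teacher, teacher)  # identical predictions
loss_diff = distillation_loss(teacher, student)  # mismatched predictions
```

The loss is zero when the two predictions agree and positive otherwise, so minimizing it pulls the student's output toward the teacher's.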
Citation
Liu, X., Wang, C., Yang, G., Wang, C., Long, Y., Liu, J., & Zhang, Z. (2024). Leveraging Self-Distillation and Disentanglement Network to Enhance Visual–Semantic Feature Consistency in Generalized Zero-Shot Learning. Electronics, 13(10), Article 1977. https://doi.org/10.3390/electronics13101977
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | May 14, 2024 |
Online Publication Date | May 18, 2024 |
Publication Date | May 2, 2024 |
Deposit Date | Jun 13, 2024 |
Publicly Available Date | Jun 13, 2024 |
Journal | Electronics |
Electronic ISSN | 2079-9292 |
Publisher | MDPI |
Peer Reviewed | Peer Reviewed |
Volume | 13 |
Issue | 10 |
Article Number | 1977 |
DOI | https://doi.org/10.3390/electronics13101977 |
Keywords | visual–semantic feature consistency, generalized zero-shot learning, self-distillation, disentanglement network |
Public URL | https://durham-repository.worktribe.com/output/2480516 |
Files
Published Journal Article (PDF, 1.1 MB)
Publisher Licence URL: http://creativecommons.org/licenses/by/4.0/