Haofeng Zhang
Modality independent adversarial network for generalized zero shot image classification
Zhang, Haofeng; Wang, Yinduo; Long, Yang; Yang, Longzhi; Shao, Ling
Abstract
Zero Shot Learning (ZSL) aims to classify images of unseen target classes by transferring knowledge from source classes through semantic embeddings. The core of ZSL research is to embed both the visual representation of an object instance and the semantic description of its class into a joint latent space and to learn cross-modal (visual and semantic) latent representations. However, the representations learned by existing methods often fail to fully capture the underlying cross-modal semantic consistency, and some of them are so similar to one another that they are poorly discriminative. To address these issues, we propose a novel deep framework, the Modality Independent Adversarial Network (MIANet), for Generalized Zero Shot Learning (GZSL). It is an end-to-end deep architecture with three submodules. First, both visual features and semantic descriptions are embedded into a latent hyper-spherical space, where two orthogonal constraints are employed to ensure that the learned latent representations are discriminative. Second, a modality adversarial submodule makes the latent representations independent of modality, so that the shared representations capture more cross-modal high-level semantic information during training. Third, a cross reconstruction submodule reconstructs each latent representation into its counterpart in the other modality rather than into itself, so that the representations capture more modality-irrelevant information. Comprehensive experiments on five widely used benchmark datasets are conducted under both GZSL and standard ZSL settings, and the results show the effectiveness of the proposed method.
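The first submodule in the abstract places latent representations on a hypersphere and applies orthogonal constraints. The abstract does not give the form of those constraints, so the following is only a minimal sketch under assumptions: the function names `l2_normalize` and `orthogonality_penalty` are hypothetical, and the penalty (sum of squared dot products between distinct unit vectors) is one common way to encourage orthogonality, not necessarily the loss used in MIANet.

```python
import math


def l2_normalize(vec):
    """Project a vector onto the unit hypersphere (assumes vec is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def orthogonality_penalty(latents):
    """Illustrative penalty, NOT the paper's loss (an assumption of this sketch).

    Normalizes each latent vector, then sums the squared dot products over
    all distinct pairs. The result is zero exactly when every pair of
    vectors is orthogonal, and grows as vectors become more aligned.
    """
    unit = [l2_normalize(v) for v in latents]
    penalty = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            dot = sum(a * b for a, b in zip(unit[i], unit[j]))
            penalty += dot * dot
    return penalty
```

In a trainable model, a term of this kind would typically be added to the classification objective so that representations of different classes are pushed apart on the sphere.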
Citation
Zhang, H., Wang, Y., Long, Y., Yang, L., & Shao, L. (2021). Modality independent adversarial network for generalized zero shot image classification. Neural Networks, 134, 11-22. https://doi.org/10.1016/j.neunet.2020.11.007
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Nov 15, 2020 |
Online Publication Date | Nov 21, 2020 |
Publication Date | 2021-02 |
Deposit Date | May 26, 2021 |
Publicly Available Date | Nov 21, 2021 |
Journal | Neural Networks |
Print ISSN | 0893-6080 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 134 |
Pages | 11-22 |
DOI | https://doi.org/10.1016/j.neunet.2020.11.007 |
Public URL | https://durham-repository.worktribe.com/output/1274590 |
Files
Accepted Journal Article (PDF, 3.3 Mb)
Publisher Licence URL
http://creativecommons.org/licenses/by-nc-nd/4.0/
Copyright Statement
© 2021 This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/