Dr Yang Long's Outputs (56)

Imaginary-Connected Embedding in Complex Space for Unseen Attribute-Object Discrimination (2024)
Journal Article
Jiang, C., Wang, S., Long, Y., Li, Z., Zhang, H., & Shao, L. (online). Imaginary-Connected Embedding in Complex Space for Unseen Attribute-Object Discrimination. IEEE Transactions on Pattern Analysis and Machine Intelligence, https://doi.org/10.1109/tpami.2024.3487631

Compositional Zero-Shot Learning (CZSL) aims to recognize novel compositions of seen primitives. Prior studies have attempted to either learn primitives individually (non-connected) or establish dependencies among them in the composition (fully-conne...

SID-NERF: Few-Shot Nerf Based on Scene Information Distribution (2024)
Presentation / Conference Contribution
Li, Y., Wan, F., & Long, Y. (2024, July). SID-NERF: Few-Shot Nerf Based on Scene Information Distribution. Presented at 2024 IEEE International Conference on Multimedia and Expo (ICME), Niagara Falls, ON, Canada

The novel view synthesis from a limited set of images is a significant research focus. Traditional NeRF methods, relying mainly on color supervision, struggle with accurate scene geometry reconstruction when faced with sparse input images, leading to...

CTNeRF: Cross-time Transformer for dynamic neural radiance field from monocular video (2024)
Journal Article
Miao, X., Bai, Y., Duan, H., Wan, F., Huang, Y., Long, Y., & Zheng, Y. (2024). CTNeRF: Cross-time Transformer for dynamic neural radiance field from monocular video. Pattern Recognition, 156, Article 110729. https://doi.org/10.1016/j.patcog.2024.110729

The goal of our work is to generate high-quality novel views from monocular videos of complex and dynamic scenes. Prior methods, such as DynamicNeRF, have shown impressive performance by leveraging time-varying dynamic radiation fields. However, thes...

Leveraging Self-Distillation and Disentanglement Network to Enhance Visual–Semantic Feature Consistency in Generalized Zero-Shot Learning (2024)
Journal Article
Liu, X., Wang, C., Yang, G., Wang, C., Long, Y., Liu, J., & Zhang, Z. (2024). Leveraging Self-Distillation and Disentanglement Network to Enhance Visual–Semantic Feature Consistency in Generalized Zero-Shot Learning. Electronics, 13(10), Article 1977. https://doi.org/10.3390/electronics13101977

Generalized zero-shot learning (GZSL) aims to simultaneously recognize both seen classes and unseen classes by training only on seen class samples and auxiliary semantic descriptions. Recent state-of-the-art methods infer unseen classes based on sema...

Towards Cognition-Aligned Visual Language Models via Zero-Shot Instance Retrieval (2024)
Journal Article
Ma, T., Organisciak, D., Ma, W., & Long, Y. (2024). Towards Cognition-Aligned Visual Language Models via Zero-Shot Instance Retrieval. Electronics, 13(9), Article 1660. https://doi.org/10.3390/electronics13091660

The pursuit of Artificial Intelligence (AI) that emulates human cognitive processes is a cornerstone of ethical AI development, ensuring that emerging technologies can seamlessly integrate into societal frameworks requiring nuanced understanding and...

Wearable-based behaviour interpolation for semi-supervised human activity recognition (2024)
Journal Article
Duan, H., Wang, S., Ojha, V., Wang, S., Huang, Y., Long, Y., …Zheng, Y. (2024). Wearable-based behaviour interpolation for semi-supervised human activity recognition. Information Sciences, 665, Article 120393. https://doi.org/10.1016/j.ins.2024.120393

While traditional feature engineering for Human Activity Recognition (HAR) involves a trial-and-error process, deep learning has emerged as a preferred method for high-level representations of sensor-based human activities. However, most deep learnin...

MRL-Seg: Overcoming Imbalance in Medical Image Segmentation With Multi-Step Reinforcement Learning (2023)
Journal Article
Yang, F., Li, X., Duan, H., Xu, F., Huang, Y., Zhang, X., …Zheng, Y. (2024). MRL-Seg: Overcoming Imbalance in Medical Image Segmentation With Multi-Step Reinforcement Learning. IEEE Journal of Biomedical and Health Informatics, 28(2), 858-869. https://doi.org/10.1109/jbhi.2023.3336726

Medical image segmentation is a critical task for clinical diagnosis and research. However, dealing with highly imbalanced data remains a significant challenge in this domain, where the region of interest (ROI) may exhibit substantial variations acro...

Deconfounding Causal Inference for Zero-shot Action Recognition (2023)
Journal Article
Wang, J., Jiang, Y., Long, Y., Sun, X., Pagnucco, M., & Song, Y. (2023). Deconfounding Causal Inference for Zero-shot Action Recognition. IEEE Transactions on Multimedia, 26, 3976 - 3986. https://doi.org/10.1109/tmm.2023.3318300

Zero-shot action recognition (ZSAR) aims to recognize unseen action categories in the test set without corresponding training examples. Most existing zero-shot methods follow the feature generation framework to transfer knowledge from seen action cat...

Feature fine-tuning and attribute representation transformation for zero-shot learning (2023)
Journal Article
Pang, S., He, X., Hao, W., & Long, Y. (2023). Feature fine-tuning and attribute representation transformation for zero-shot learning. Computer Vision and Image Understanding, 236, Article 103811. https://doi.org/10.1016/j.cviu.2023.103811

Zero-Shot Learning (ZSL) aims to generalize a pretrained classification model to unseen classes with the help of auxiliary semantic information. Recent generative methods are based on the paradigm of synthesizing unseen visual data from class attribu...

DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume (2023)
Journal Article
Miao, X., Bai, Y., Duan, H., Huang, Y., Wan, F., Xu, X., Long, Y., & Zheng, Y. (2024). DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume. IEEE Transactions on Circuits and Systems for Video Technology, 34(4), 2564-2576. https://doi.org/10.1109/tcsvt.2023.3305776

Self-supervised monocular depth estimation methods typically rely on the reprojection error to capture geometric relationships between successive frames in static environments. However, this assumption does not hold in dynamic objects in scenarios, l...

The Importance of Expert Knowledge for Automatic Modulation Open Set Recognition (2023)
Journal Article
Li, T., Wen, Z., Long, Y., Hong, Z., Zheng, S., Yu, L., …Shao, L. (2023). The Importance of Expert Knowledge for Automatic Modulation Open Set Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11), 13730-13748. https://doi.org/10.1109/tpami.2023.3294505

Automatic modulation classification (AMC) is an important technology for the monitoring, management, and control of communication systems. In recent years, machine learning approaches are becoming popular to improve the effectiveness of AMC for radio...

Dynamic Unary Convolution in Transformers (2023)
Journal Article
Duan, H., Long, Y., Wang, S., Zhang, H., Willcocks, C. G., & Shao, L. (2023). Dynamic Unary Convolution in Transformers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11), 12747 - 12759. https://doi.org/10.1109/tpami.2022.3233482

It is uncertain whether the power of transformer architectures can complement existing convolutional neural networks. A few recent attempts have combined convolution with transformer design through a range of structures in series, where the main cont...

EfficientTDNN: Efficient Architecture Search for Speaker Recognition (2022)
Journal Article
Wang, R., Wei, Z., Duan, H., Ji, S., Long, Y., & Hong, Z. (2022). EfficientTDNN: Efficient Architecture Search for Speaker Recognition. IEEE/ACM Transactions on Audio, Speech and Language Processing, 30, 2267-2279. https://doi.org/10.1109/taslp.2022.3182856

Convolutional neural networks (CNNs), such as the time-delay neural network (TDNN), have shown their remarkable capability in learning speaker embedding. However, they meanwhile bring a huge computational cost in storage size, processing, and memory....

Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models (2021)
Journal Article
Bond-Taylor, S., Leach, A., Long, Y., & Willcocks, C. G. (2021). Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(11), 7327-7347. https://doi.org/10.1109/tpami.2021.3116668

Deep generative models are a class of techniques that train deep neural networks to model the distribution of training samples. Research has fragmented into various interconnected approaches, each of which makes trade-offs including run-time, diversit...

Kernelized distance learning for zero-shot recognition (2021)
Journal Article
Zarei, M. R., Taheri, M., & Long, Y. (2021). Kernelized distance learning for zero-shot recognition. Information Sciences, 580, 801-818. https://doi.org/10.1016/j.ins.2021.09.032

Zero-Shot Learning (ZSL) has gained growing attention over the past few years mostly because it provides a significant scalability to recognition models for classifying instances from new unobserved classes. This scalability is achieved by providing...

A plug-in attribute correction module for generalized zero-shot learning (2020)
Journal Article
Zhang, H., Bai, H., Long, Y., Liu, L., & Shao, L. (2021). A plug-in attribute correction module for generalized zero-shot learning. Pattern Recognition, 112, Article 107767. https://doi.org/10.1016/j.patcog.2020.107767

While Zero Shot Learning models can recognize new classes without training examples, they often fail to incorporate both seen and unseen classes together at the test time, which is known as the Generalized Zero-shot Learning (GZSL) problem. This pap...

Modality independent adversarial network for generalized zero shot image classification (2020)
Journal Article
Zhang, H., Wang, Y., Long, Y., Yang, L., & Shao, L. (2021). Modality independent adversarial network for generalized zero shot image classification. Neural Networks, 134, 11-22. https://doi.org/10.1016/j.neunet.2020.11.007

Zero Shot Learning (ZSL) aims to classify images of unseen target classes by transferring knowledge from source classes through semantic embeddings. The core of ZSL research is to embed both visual representation of object instance and semantic descr...

Pseudo Distribution on Unseen Classes for Generalized Zero Shot Learning (2020)
Journal Article
Zhang, H., Liu, J., Yao, Y., & Long, Y. (2020). Pseudo Distribution on Unseen Classes for Generalized Zero Shot Learning. Pattern Recognition Letters, 135, 451-458. https://doi.org/10.1016/j.patrec.2020.05.021

Although Zero Shot Learning (ZSL) has attracted more and more attention due to its powerful ability of recognizing new objects without retraining, it has a serious drawback that it only focuses on unseen classes during prediction. To solve this issue...