Research Repository

Outputs (56)

Imaginary-Connected Embedding in Complex Space for Unseen Attribute-Object Discrimination (2024)
Journal Article
Jiang, C., Wang, S., Long, Y., Li, Z., Zhang, H., & Shao, L. (online). Imaginary-Connected Embedding in Complex Space for Unseen Attribute-Object Discrimination. IEEE Transactions on Pattern Analysis and Machine Intelligence, https://doi.org/10.1109/tpami.2024.3487631

Compositional Zero-Shot Learning (CZSL) aims to recognize novel compositions of seen primitives. Prior studies have attempted to either learn primitives individually (non-connected) or establish dependencies among them in the composition (fully-conne...

SID-NERF: Few-Shot Nerf Based on Scene Information Distribution (2024)
Presentation / Conference Contribution
Li, Y., Wan, F., & Long, Y. (2024, July). SID-NERF: Few-Shot Nerf Based on Scene Information Distribution. Presented at 2024 IEEE International Conference on Multimedia and Expo (ICME), Niagara Falls, ON, Canada.

The novel view synthesis from a limited set of images is a significant research focus. Traditional NeRF methods, relying mainly on color supervision, struggle with accurate scene geometry reconstruction when faced with sparse input images, leading to...

CTNeRF: Cross-time Transformer for dynamic neural radiance field from monocular video (2024)
Journal Article
Miao, X., Bai, Y., Duan, H., Wan, F., Huang, Y., Long, Y., & Zheng, Y. (2024). CTNeRF: Cross-time Transformer for dynamic neural radiance field from monocular video. Pattern Recognition, 156, Article 110729. https://doi.org/10.1016/j.patcog.2024.110729

The goal of our work is to generate high-quality novel views from monocular videos of complex and dynamic scenes. Prior methods, such as DynamicNeRF, have shown impressive performance by leveraging time-varying dynamic radiation fields. However, thes...

Leveraging Self-Distillation and Disentanglement Network to Enhance Visual–Semantic Feature Consistency in Generalized Zero-Shot Learning (2024)
Journal Article
Liu, X., Wang, C., Yang, G., Wang, C., Long, Y., Liu, J., & Zhang, Z. (2024). Leveraging Self-Distillation and Disentanglement Network to Enhance Visual–Semantic Feature Consistency in Generalized Zero-Shot Learning. Electronics, 13(10), Article 1977. https://doi.org/10.3390/electronics13101977

Generalized zero-shot learning (GZSL) aims to simultaneously recognize both seen classes and unseen classes by training only on seen class samples and auxiliary semantic descriptions. Recent state-of-the-art methods infer unseen classes based on sema...

Towards Cognition-Aligned Visual Language Models via Zero-Shot Instance Retrieval (2024)
Journal Article
Ma, T., Organisciak, D., Ma, W., & Long, Y. (2024). Towards Cognition-Aligned Visual Language Models via Zero-Shot Instance Retrieval. Electronics, 13(9), Article 1660. https://doi.org/10.3390/electronics13091660

The pursuit of Artificial Intelligence (AI) that emulates human cognitive processes is a cornerstone of ethical AI development, ensuring that emerging technologies can seamlessly integrate into societal frameworks requiring nuanced understanding and...

Wearable-based behaviour interpolation for semi-supervised human activity recognition (2024)
Journal Article
Duan, H., Wang, S., Ojha, V., Wang, S., Huang, Y., Long, Y., …Zheng, Y. (2024). Wearable-based behaviour interpolation for semi-supervised human activity recognition. Information Sciences, 665, Article 120393. https://doi.org/10.1016/j.ins.2024.120393

While traditional feature engineering for Human Activity Recognition (HAR) involves a trial-and-error process, deep learning has emerged as a preferred method for high-level representations of sensor-based human activities. However, most deep learnin...

MRL-Seg: Overcoming Imbalance in Medical Image Segmentation With Multi-Step Reinforcement Learning (2023)
Journal Article
Yang, F., Li, X., Duan, H., Xu, F., Huang, Y., Zhang, X., …Zheng, Y. (2024). MRL-Seg: Overcoming Imbalance in Medical Image Segmentation With Multi-Step Reinforcement Learning. IEEE Journal of Biomedical and Health Informatics, 28(2), 858-869. https://doi.org/10.1109/jbhi.2023.3336726

Medical image segmentation is a critical task for clinical diagnosis and research. However, dealing with highly imbalanced data remains a significant challenge in this domain, where the region of interest (ROI) may exhibit substantial variations acro...

Deconfounding Causal Inference for Zero-shot Action Recognition (2023)
Journal Article
Wang, J., Jiang, Y., Long, Y., Sun, X., Pagnucco, M., & Song, Y. (2023). Deconfounding Causal Inference for Zero-shot Action Recognition. IEEE Transactions on Multimedia, 26, 3976-3986. https://doi.org/10.1109/tmm.2023.3318300

Zero-shot action recognition (ZSAR) aims to recognize unseen action categories in the test set without corresponding training examples. Most existing zero-shot methods follow the feature generation framework to transfer knowledge from seen action cat...

Feature fine-tuning and attribute representation transformation for zero-shot learning (2023)
Journal Article
Pang, S., He, X., Hao, W., & Long, Y. (2023). Feature fine-tuning and attribute representation transformation for zero-shot learning. Computer Vision and Image Understanding, 236, Article 103811. https://doi.org/10.1016/j.cviu.2023.103811

Zero-Shot Learning (ZSL) aims to generalize a pretrained classification model to unseen classes with the help of auxiliary semantic information. Recent generative methods are based on the paradigm of synthesizing unseen visual data from class attribu...