Department of Computer Science

CTNeRF: Cross-time Transformer for dynamic neural radiance field from monocular video (2024)
Journal Article
Miao, X., Bai, Y., Duan, H., Wan, F., Huang, Y., Long, Y., & Zheng, Y. (2024). CTNeRF: Cross-time Transformer for dynamic neural radiance field from monocular video. Pattern Recognition, 156, Article 110729. https://doi.org/10.1016/j.patcog.2024.110729

The goal of our work is to generate high-quality novel views from monocular videos of complex and dynamic scenes. Prior methods, such as DynamicNeRF, have shown impressive performance by leveraging time-varying dynamic radiation fields. However, thes...

Leveraging Self-Distillation and Disentanglement Network to Enhance Visual–Semantic Feature Consistency in Generalized Zero-Shot Learning (2024)
Journal Article
Liu, X., Wang, C., Yang, G., Wang, C., Long, Y., Liu, J., & Zhang, Z. (2024). Leveraging Self-Distillation and Disentanglement Network to Enhance Visual–Semantic Feature Consistency in Generalized Zero-Shot Learning. Electronics, 13(10), Article 1977. https://doi.org/10.3390/electronics13101977

Generalized zero-shot learning (GZSL) aims to simultaneously recognize both seen classes and unseen classes by training only on seen class samples and auxiliary semantic descriptions. Recent state-of-the-art methods infer unseen classes based on sema...

Towards Cognition-Aligned Visual Language Models via Zero-Shot Instance Retrieval (2024)
Journal Article
Ma, T., Organisciak, D., Ma, W., & Long, Y. (2024). Towards Cognition-Aligned Visual Language Models via Zero-Shot Instance Retrieval. Electronics, 13(9), Article 1660. https://doi.org/10.3390/electronics13091660

The pursuit of Artificial Intelligence (AI) that emulates human cognitive processes is a cornerstone of ethical AI development, ensuring that emerging technologies can seamlessly integrate into societal frameworks requiring nuanced understanding and...

Wearable-based behaviour interpolation for semi-supervised human activity recognition (2024)
Journal Article
Duan, H., Wang, S., Ojha, V., Wang, S., Huang, Y., Long, Y., … Zheng, Y. (2024). Wearable-based behaviour interpolation for semi-supervised human activity recognition. Information Sciences, 665, Article 120393. https://doi.org/10.1016/j.ins.2024.120393

While traditional feature engineering for Human Activity Recognition (HAR) involves a trial-and-error process, deep learning has emerged as a preferred method for high-level representations of sensor-based human activities. However, most deep learnin...

MRL-Seg: Overcoming Imbalance in Medical Image Segmentation With Multi-Step Reinforcement Learning (2023)
Journal Article
Yang, F., Li, X., Duan, H., Xu, F., Huang, Y., Zhang, X., … Zheng, Y. (2024). MRL-Seg: Overcoming Imbalance in Medical Image Segmentation With Multi-Step Reinforcement Learning. IEEE Journal of Biomedical and Health Informatics, 28(2), 858-869. https://doi.org/10.1109/jbhi.2023.3336726

Medical image segmentation is a critical task for clinical diagnosis and research. However, dealing with highly imbalanced data remains a significant challenge in this domain, where the region of interest (ROI) may exhibit substantial variations acro...

Deconfounding Causal Inference for Zero-shot Action Recognition (2023)
Journal Article
Wang, J., Jiang, Y., Long, Y., Sun, X., Pagnucco, M., & Song, Y. (2023). Deconfounding Causal Inference for Zero-shot Action Recognition. IEEE Transactions on Multimedia. https://doi.org/10.1109/tmm.2023.3318300

Zero-shot action recognition (ZSAR) aims to recognize unseen action categories in the test set without corresponding training examples. Most existing zero-shot methods follow the feature generation framework to transfer knowledge from seen action cat...

Feature fine-tuning and attribute representation transformation for zero-shot learning (2023)
Journal Article
Pang, S., He, X., Hao, W., & Long, Y. (2023). Feature fine-tuning and attribute representation transformation for zero-shot learning. Computer Vision and Image Understanding, 236, Article 103811. https://doi.org/10.1016/j.cviu.2023.103811

Zero-Shot Learning (ZSL) aims to generalize a pretrained classification model to unseen classes with the help of auxiliary semantic information. Recent generative methods are based on the paradigm of synthesizing unseen visual data from class attribu...

DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume (2023)
Journal Article
Miao, X., Bai, Y., Duan, H., Huang, Y., Wan, F., Xu, X., … Zheng, Y. (2023). DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume. IEEE Transactions on Circuits and Systems for Video Technology. https://doi.org/10.1109/tcsvt.2023.3305776

Self-supervised monocular depth estimation methods typically rely on the reprojection error to capture geometric relationships between successive frames in static environments. However, this assumption does not hold in dynamic objects in scenarios, l...

The Importance of Expert Knowledge for Automatic Modulation Open Set Recognition (2023)
Journal Article
Li, T., Wen, Z., Long, Y., Hong, Z., Zheng, S., Yu, L., … Shao, L. (2023). The Importance of Expert Knowledge for Automatic Modulation Open Set Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11), 13730-13748. https://doi.org/10.1109/tpami.2023.3294505

Automatic modulation classification (AMC) is an important technology for the monitoring, management, and control of communication systems. In recent years, machine learning approaches are becoming popular to improve the effectiveness of AMC for radio...

Dynamic Unary Convolution in Transformers (2023)
Journal Article
Duan, H., Long, Y., Wang, S., Zhang, H., Willcocks, C. G., & Shao, L. (2023). Dynamic Unary Convolution in Transformers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11), 12747-12759. https://doi.org/10.1109/tpami.2022.3233482

It is uncertain whether the power of transformer architectures can complement existing convolutional neural networks. A few recent attempts have combined convolution with transformer design through a range of structures in series, where the main cont...

Outputs (38)