Research Repository

Outputs (93)

STGAE: Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising
Presentation / Conference Contribution
Zhou, K., Cheng, Z., Shum, H. P., Li, F. W., & Liang, X. (2021, October). STGAE: Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising. Presented at 2021 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Bari, Italy

Hand-object interaction in mixed reality (MR) relies on the accurate tracking and estimation of human hands, which provides users with a sense of immersion. However, raw captured hand motion data always contains errors such as joint occlusion, disloc...

Semantics-STGCNN: A Semantics-guided Spatial-Temporal Graph Convolutional Network for Multi-class Trajectory Prediction
Presentation / Conference Contribution
Rainbow, B. A., Men, Q., & Shum, H. P. (2021, October). Semantics-STGCNN: A Semantics-guided Spatial-Temporal Graph Convolutional Network for Multi-class Trajectory Prediction. Presented at 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Melbourne, Australia

Predicting the movement trajectories of multiple classes of road users in real-world scenarios is a challenging task due to their diverse trajectory patterns. While recent works on pedestrian trajectory prediction have successfully modelled the influence of...

A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition
Presentation / Conference Contribution
Men, Q., Hoy, E. S., Shum, H. P., & Leung, H. (2021, January). A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition. Presented at 25th International Conference on Pattern Recognition (ICPR 2020), Milan, Italy

This paper addresses the problem of recognizing human-human interaction from skeletal sequences. Existing methods are mainly designed to classify single-person actions. Many of them simply stack the movement features of two characters to deal with huma...

Correlation-Distance Graph Learning for Treatment Response Prediction from rs-fMRI
Presentation / Conference Contribution
Zhang, X., Zheng, S., Shum, H. P., Zhang, H., Song, N., Song, M., & Jia, H. (2023, November). Correlation-Distance Graph Learning for Treatment Response Prediction from rs-fMRI. Presented at ICONIP 2023: 2023 International Conference on Neural Information Processing, Changsha, China

Resting-state fMRI (rs-fMRI) functional connectivity (FC) analysis provides valuable insights into the relationships between different brain regions and their potential implications for neurological or psychiatric disorders. However, specific design...

Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient
Presentation / Conference Contribution
Lu, Z., Wang, H., Chang, Z., Yang, G., & Shum, H. P. (2023, October). Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient. Presented at ICCV 2023: 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France

Recently, methods for skeleton-based human activity recognition have been shown to be vulnerable to adversarial attacks. However, these attack methods require either the full knowledge of the victim (i.e. white-box attacks), access to training data (...

A Mixed Reality Training System for Hand-Object Interaction in Simulated Microgravity Environments
Presentation / Conference Contribution
Zhou, K., Chen, C., Ma, Y., Leng, Z., Shum, H. P., Li, F. W., & Liang, X. (2023, October). A Mixed Reality Training System for Hand-Object Interaction in Simulated Microgravity Environments. Presented at ISMAR 23: International Symposium on Mixed and Augmented Reality, Sydney, Australia

As human exploration of space continues to progress, the use of Mixed Reality (MR) for simulating microgravity environments and facilitating training in hand-object interaction holds immense practical significance. However, hand-object interaction in...

Enhancing Perception and Immersion in Pre-Captured Environments through Learning-Based Eye Height Adaptation
Presentation / Conference Contribution
Feng, Q., Shum, H. P., & Morishima, S. (2023, October). Enhancing Perception and Immersion in Pre-Captured Environments through Learning-Based Eye Height Adaptation. Presented at ISMAR 23: International Symposium on Mixed and Augmented Reality, Sydney, Australia

Pre-captured immersive environments using omnidirectional cameras provide a wide range of virtual reality applications. Previous research has shown that manipulating the eye height in egocentric virtual environments can significantly affect distance...

Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers
Presentation / Conference Contribution
Corona-Figueroa, A., Bond-Taylor, S., Bhowmik, N., Gaus, Y. F. A., Breckon, T. P., Shum, H. P., & Willcocks, C. G. (2023, October). Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers. Presented at ICCV23: 2023 IEEE/CVF International Conference on Computer Vision, Paris, France

Generating 3D images of complex objects conditionally from a few 2D views is a difficult synthesis problem, compounded by issues such as domain gap and geometric misalignment. For instance, a unified framework such as Generative Adversarial Networks...

U3DS3: Unsupervised 3D Semantic Scene Segmentation
Presentation / Conference Contribution
Liu, J., Yu, Z., Breckon, T. P., & Shum, H. P. H. (2024, January). U3DS3: Unsupervised 3D Semantic Scene Segmentation. Presented at 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, Hawaii, USA

Contemporary point cloud segmentation approaches largely rely on richly annotated 3D training data. However, it is both time-consuming and challenging to obtain consistently accurate annotations for such 3D scene data. Moreover, there is still a lac...

A Virtual Reality Framework for Human-Driver Interaction Research: Safe and Cost-Effective Data Collection
Presentation / Conference Contribution
Crosato, L., Wei, C., Ho, E. S. L., Shum, H. P. H., & Sun, Y. (2024, March). A Virtual Reality Framework for Human-Driver Interaction Research: Safe and Cost-Effective Data Collection. Presented at 2024 ACM/IEEE International Conference on Human Robot Interaction (HRI '24), Boulder, CO, USA

The advancement of automated driving technology has led to new challenges in the interaction between automated vehicles and human road users. However, there is currently no complete theory that explains how human road users interact with vehicles, an...