Professor Hubert Shum

Neural-code PIFu: High-fidelity Single Image 3D Human Reconstruction via Neural Code Integration (2024)
Presentation / Conference Contribution
Liu, R., Remagnino, P., & Shum, H. P. (2024, December). Neural-code PIFu: High-fidelity Single Image 3D Human Reconstruction via Neural Code Integration. Presented at 2024 International Conference on Pattern Recognition, Kolkata, India

We introduce neural-code PIFu, a novel implicit function for 3D human reconstruction. Leveraging neural codebooks, our approach learns recurrent patterns in the feature space and reuses them to improve current features. Many existing methods predict...

From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos (2024)
Presentation / Conference Contribution
Qiao, T., Li, R., Li, F. W. B., & Shum, H. P. H. (2024, December). From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos. Presented at ICPR 2024: International Conference on Pattern Recognition, Kolkata, India

Video-based Human-Object Interaction (HOI) recognition explores the intricate dynamics between humans and objects, which are essential for a comprehensive understanding of human behavior and intentions. While previous work has made significant stride...

MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment (2024)
Presentation / Conference Contribution
Zhou, K., Wang, L., Zhang, X., Shum, H. P. H., Li, F. W. B., Li, J., & Liang, X. (2024, September). MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment. Presented at ECCV 2024: The 18th European Conference on Computer Vision, Milan, Italy

Action Quality Assessment (AQA) evaluates diverse skills but models struggle with non-stationary data. We propose Continual AQA (CAQA) to refine models using sparse new data. Feature replay preserves memory without storing raw inputs. However, the mi...

SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM (2024)
Presentation / Conference Contribution
Chen, S., Zhang, H., Atapour-Abarghouei, A., & Shum, H. P. H. (2025, February). SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM. Presented at 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, Arizona

Chatbots and Art Critique: A Comparative Study of Chatbot and Human Experts in Traditional Chinese Painting Education (2024)
Presentation / Conference Contribution
Liu, J., Law, E. L.-C., & Shum, H. P. H. (2024, October). Chatbots and Art Critique: A Comparative Study of Chatbot and Human Experts in Traditional Chinese Painting Education. Presented at NordiCHI 2024: Nordic Conference on Human-Computer Interaction, Uppsala, Sweden

Driven by the recent incorporation of chatbots into art education, art critique as a key factor in this realm poses distinct challenges and opportunities for this technology intervention. This study investigates the efficacy of chatbot-generated crit...

RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation (2024)
Presentation / Conference Contribution
Li, L., Shum, H. P. H., & Breckon, T. P. (2024, September). RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for 3D LiDAR Segmentation. Presented at ECCV 2024: European Conference on Computer Vision, Milan, Italy

Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling (2024)
Presentation / Conference Contribution
Corona-Figueroa, A., Shum, H. P. H., & Willcocks, C. G. (2024, June). Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling. Presented at 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, Washington

Two-Person Interaction Augmentation with Skeleton Priors (2024)
Presentation / Conference Contribution
Li, B., Ho, E. S. L., Shum, H. P. H., & Wang, H. (2024, June). Two-Person Interaction Augmentation with Skeleton Priors. Presented at 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, Washington

Close and continuous interaction with rich contacts is a crucial aspect of human activities (e.g. hugging, dancing) and of interest in many domains like activity recognition, motion prediction, character animation, etc. However, acquiring such skelet...

MxT: Mamba x Transformer for Image Inpainting (2024)
Presentation / Conference Contribution
Chen, S., Atapour-Abarghouei, A., Zhang, H., & Shum, H. P. H. (2024, November). MxT: Mamba x Transformer for Image Inpainting. Presented at BMVC 2024: The 35th British Machine Vision Conference, Glasgow, UK

TraIL-Det: Transformation-Invariant Local Feature Networks for 3D LiDAR Object Detection with Unsupervised Pre-Training (2024)
Presentation / Conference Contribution
Li, L., Qiao, T., Shum, H. P. H., & Breckon, T. P. (2024, November). TraIL-Det: Transformation-Invariant Local Feature Networks for 3D LiDAR Object Detection with Unsupervised Pre-Training. Presented at BMVC'24: The 35th British Machine Vision Conference, Glasgow, UK

Depth-Aware Endoscopic Video Inpainting (2024)
Presentation / Conference Contribution
Xiatian Zhang, F., Chen, S., Xie, X., & Shum, H. P. (2024, October). Depth-Aware Endoscopic Video Inpainting. Presented at 27th International Conference on Medical Image Computing and Computer Assisted Intervention, Marrakesh, Morocco

Self-Regulated Sample Diversity in Large Language Models (2024)
Presentation / Conference Contribution
Liu, M., Frawley, J., Wyer, S., Shum, H. P. H., Uckelman, S. L., Black, S., & Willcocks, C. G. (2024, June). Self-Regulated Sample Diversity in Large Language Models. Presented at NAACL 2024: 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Mexico City

U3DS3: Unsupervised 3D Semantic Scene Segmentation (2024)
Presentation / Conference Contribution
Liu, J., Yu, Z., Breckon, T. P., & Shum, H. P. H. (2024, January). U3DS3: Unsupervised 3D Semantic Scene Segmentation. Presented at 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, Hawaii, USA

Contemporary point cloud segmentation approaches largely rely on richly annotated 3D training data. However, it is both time-consuming and challenging to obtain consistently accurate annotations for such 3D scene data. Moreover, there is still a lac...

A Virtual Reality Framework for Human-Driver Interaction Research: Safe and Cost-Effective Data Collection (2024)
Presentation / Conference Contribution
Crosato, L., Wei, C., Ho, E. S. L., Shum, H. P. H., & Sun, Y. (2024, March). A Virtual Reality Framework for Human-Driver Interaction Research: Safe and Cost-Effective Data Collection. Presented at 2024 ACM/IEEE International Conference on Human Robot Interaction (HRI '24), Boulder, CO, USA

The advancement of automated driving technology has led to new challenges in the interaction between automated vehicles and human road users. However, there is currently no complete theory that explains how human road users interact with vehicles, an...

Tackling Data Bias in Painting Classification with Style Transfer (2023)
Presentation / Conference Contribution
Vijendran, M., Li, F. W., & Shum, H. P. (2023, February). Tackling Data Bias in Painting Classification with Style Transfer. Presented at VISAPP '23: 2023 International Conference on Computer Vision Theory and Applications, Lisbon, Portugal

It is difficult to train classifiers on paintings collections due to model bias from domain gaps and data bias from the uneven distribution of artistic styles. Previous techniques like data distillation, traditional data augmentation and style transf...

Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models (2023)
Presentation / Conference Contribution
Chang, Z., Findlay, E. J., Zhang, H., & Shum, H. P. (2023, February). Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models. Presented at GRAPP 2023: 2023 International Conference on Computer Graphics Theory and Applications, Lisbon, Portugal

Generating realistic motions for digital humans is a core but challenging part of computer animations and games, as human motions are both diverse in content and rich in styles. While the latest deep learning approaches have made significant advancem...

Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient (2023)
Presentation / Conference Contribution
Lu, Z., Wang, H., Chang, Z., Yang, G., & Shum, H. P. (2023, October). Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient. Presented at ICCV 2023: 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France

Recently, methods for skeleton-based human activity recognition have been shown to be vulnerable to adversarial attacks. However, these attack methods require either the full knowledge of the victim (i.e. white-box attacks), access to training data (...

Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers (2023)
Presentation / Conference Contribution
Corona-Figueroa, A., Bond-Taylor, S., Bhowmik, N., Gaus, Y. F. A., Breckon, T. P., Shum, H. P., & Willcocks, C. G. (2023, October). Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers. Presented at ICCV23: 2023 IEEE/CVF International Conference on Computer Vision, Paris, France

Generating 3D images of complex objects conditionally from a few 2D views is a difficult synthesis problem, compounded by issues such as domain gap and geometric misalignment. For instance, a unified framework such as Generative Adversarial Networks...

A Mixed Reality Training System for Hand-Object Interaction in Simulated Microgravity Environments (2023)
Presentation / Conference Contribution
Zhou, K., Chen, C., Ma, Y., Leng, Z., Shum, H. P., Li, F. W., & Liang, X. (2023, October). A Mixed Reality Training System for Hand-Object Interaction in Simulated Microgravity Environments. Presented at ISMAR 23: International Symposium on Mixed and Augmented Reality, Sydney, Australia

As human exploration of space continues to progress, the use of Mixed Reality (MR) for simulating microgravity environments and facilitating training in hand-object interaction holds immense practical significance. However, hand-object interaction in...

Enhancing Perception and Immersion in Pre-Captured Environments through Learning-Based Eye Height Adaptation (2023)
Presentation / Conference Contribution
Feng, Q., Shum, H. P., & Morishima, S. (2023, October). Enhancing Perception and Immersion in Pre-Captured Environments through Learning-Based Eye Height Adaptation. Presented at ISMAR 23: International Symposium on Mixed and Augmented Reality, Sydney, Australia

Pre-captured immersive environments using omnidirectional cameras provide a wide range of virtual reality applications. Previous research has shown that manipulating the eye height in egocentric virtual environments can significantly affect distance...

Professor Hubert Shum's Outputs (45)