On the Design Fundamentals of Diffusion Models: A Survey
(2025)
Journal Article
Chang, Z., Koulieris, G. A., Chang, H. J., & Shum, H. P. H. (in press). On the Design Fundamentals of Diffusion Models: A Survey. Pattern Recognition.
Professor Hubert Shum's Outputs (105)
Geometric Visual Fusion Graph Neural Networks for Multi-Person Human-Object Interaction Recognition in Videos (2025)
Journal Article
Qiao, T., Li, R., Li, F. W. B., Kubotani, Y., Morishima, S., & Shum, H. P. H. (in press). Geometric Visual Fusion Graph Neural Networks for Multi-Person Human-Object Interaction Recognition in Videos. Expert Systems with Applications.
Integrating Human-in-the-loop AI to Tackle Space Communication Delay Challenges (2025)
Presentation / Conference Contribution
Mavrakis, N., Law, E. L.-C., & Shum, H. P. H. (2025, June). Integrating Human-in-the-loop AI to Tackle Space Communication Delay Challenges. Presented at SpaceCHI '25: 2025 Human Computer Interaction for Space Exploration, Cologne, Germany
Large-Scale Multi-Character Interaction Synthesis (2025)
Presentation / Conference Contribution
Chang, Z., Wang, H., Koulieris, G. A., & Shum, H. P. H. (2025, August). Large-Scale Multi-Character Interaction Synthesis. Presented at ACM SIGGRAPH 2025, Vancouver, Canada
SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM (2025)
Presentation / Conference Contribution
Chen, S., Zhang, H., Atapour-Abarghouei, A., & Shum, H. P. H. (2025, February). SEM-Net: Efficient Pixel Modelling for Image Inpainting with Spatially Enhanced SSM. Presented at 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, Arizona
Image inpainting aims to repair a partially damaged image based on the information from known regions of the images. Achieving semantically plausible inpainting results is particularly challenging because it requires the reconstructed regions to exhi...
FineCausal: A Causal-Based Framework for Interpretable Fine-Grained Action Quality Assessment (2025)
Presentation / Conference Contribution
Han, R., Zhou, K., Atapour-Abarghouei, A., Liang, X., & Shum, H. P. H. (2025, June). FineCausal: A Causal-Based Framework for Interpretable Fine-Grained Action Quality Assessment. Presented at Proceedings of the 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025, Music City Center, Nashville, TN
Action quality assessment (AQA) is critical for evaluating athletic performance, informing training strategies, and ensuring safety in competitive sports. However, existing deep learning approaches often operate as black boxes and are vulnerable to s...
BP-SGCN: Behavioral Pseudo-Label Informed Sparse Graph Convolution Network for Pedestrian and Heterogeneous Trajectory Prediction (2025)
Journal Article
Li, R., Katsigiannis, S., Kim, T.-K., & Shum, H. P. H. (online). BP-SGCN: Behavioral Pseudo-Label Informed Sparse Graph Convolution Network for Pedestrian and Heterogeneous Trajectory Prediction. IEEE Transactions on Neural Networks and Learning Systems, https://doi.org/10.1109/TNNLS.2025.3545268
Trajectory prediction allows better decision-making in applications of autonomous vehicles (AVs) or surveillance by predicting the short-term future movement of traffic agents. It is classified into pedestrian or heterogeneous trajectory prediction....
PHI: Bridging Domain Shift in Long-Term Action Quality Assessment via Progressive Hierarchical Instruction (2025)
Journal Article
Zhou, K., Shum, H. P. H., Li, F. W. B., Zhang, X., & Liang, X. (in press). PHI: Bridging Domain Shift in Long-Term Action Quality Assessment via Progressive Hierarchical Instruction. IEEE Transactions on Image Processing.
Using Fixed and Mobile Eye Tracking to Understand How Visitors View Art in a Museum: A Study at the Bowes Museum, County Durham, UK (2025)
Presentation / Conference Contribution
Warwick, C., Beresford, A., Casteau, S., Shum, H. P. H., Smith, D., & Zhang, F. X. (2025, July). Using Fixed and Mobile Eye Tracking to Understand How Visitors View Art in a Museum: A Study at the Bowes Museum, County Durham, UK. Paper presented at 2025 ADHO Digital Humanities Conference, Lisbon, Portugal
Unified Spatial-Temporal Edge-Enhanced Graph Networks for Pedestrian Trajectory Prediction (2025)
Journal Article
Li, R., Qiao, T., Katsigiannis, S., Zhu, Z., & Shum, H. P. H. (online). Unified Spatial-Temporal Edge-Enhanced Graph Networks for Pedestrian Trajectory Prediction. IEEE Transactions on Circuits and Systems for Video Technology, https://doi.org/10.1109/TCSVT.2025.3539522
Pedestrian trajectory prediction aims to forecast future movements based on historical paths. Spatial-temporal (ST) methods often separately model spatial interactions among pedestrians and temporal dependencies of individuals. They overlook the dire...
MxT: Mamba x Transformer for Image Inpainting (2024)
Presentation / Conference Contribution
Chen, S., Atapour-Abarghouei, A., Zhang, H., & Shum, H. P. H. (2024, November). MxT: Mamba x Transformer for Image Inpainting. Presented at BMVC 2024: The 35th British Machine Vision Conference, Glasgow, UK
Image inpainting, or image completion, is a crucial task in computer vision that aims to restore missing or damaged regions of images with semantically coherent content. This technique requires a precise balance of local texture replication and globa...
TraIL-Det: Transformation-Invariant Local Feature Networks for 3D LiDAR Object Detection with Unsupervised Pre-Training (2024)
Presentation / Conference Contribution
Li, L., Qiao, T., Shum, H. P. H., & Breckon, T. P. (2024, November). TraIL-Det: Transformation-Invariant Local Feature Networks for 3D LiDAR Object Detection with Unsupervised Pre-Training. Presented at BMVC'24: The 35th British Machine Vision Conference, Glasgow, UK
Artificial intelligence for geometry-based feature extraction, analysis and synthesis in artistic images: a survey (2024)
Journal Article
Vijendran, M., Deng, J., Chen, S., Ho, E. S. L., & Shum, H. P. H. (2025). Artificial intelligence for geometry-based feature extraction, analysis and synthesis in artistic images: a survey. Artificial Intelligence Review, 58(2), Article 64. https://doi.org/10.1007/s10462-024-11051-3
Artificial Intelligence significantly enhances the visual art industry by analyzing, identifying and generating digitized artistic images. This review highlights the substantial benefits of integrating geometric data into AI models, addressing challe...
Adaptive Graph Learning from Spatial Information for Surgical Workflow Anticipation (2024)
Journal Article
Zhang, F. X., Deng, J., Lieck, R., & Shum, H. P. H. (2025). Adaptive Graph Learning from Spatial Information for Surgical Workflow Anticipation. IEEE Transactions on Medical Robotics and Bionics, 7(1), 266-280. https://doi.org/10.1109/TMRB.2024.3517137
Surgical workflow anticipation is the task of predicting the timing of relevant surgical events from live video data, which is critical in Robotic-Assisted Surgery (RAS). Accurate predictions require the use of spatial information to model surgical i...
Neural-code PIFu: High-fidelity Single Image 3D Human Reconstruction via Neural Code Integration (2024)
Presentation / Conference Contribution
Liu, R., Remagnino, P., & Shum, H. P. H. (2024, December). Neural-code PIFu: High-fidelity Single Image 3D Human Reconstruction via Neural Code Integration. Presented at 2024 International Conference on Pattern Recognition, Kolkata, India
We introduce neural-code PIFu, a novel implicit function for 3D human reconstruction. Leveraging neural codebooks, our approach learns recurrent patterns in the feature space and reuses them to improve current features. Many existing methods predict...
From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos (2024)
Presentation / Conference Contribution
Qiao, T., Li, R., Li, F. W. B., & Shum, H. P. H. (2024, December). From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos. Presented at ICPR 2024: International Conference on Pattern Recognition, Kolkata, India
Video-based Human-Object Interaction (HOI) recognition explores the intricate dynamics between humans and objects, which are essential for a comprehensive understanding of human behavior and intentions. While previous work has made significant stride...
MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment (2024)
Presentation / Conference Contribution
Zhou, K., Wang, L., Zhang, X., Shum, H. P. H., Li, F. W. B., Li, J., & Liang, X. (2024, September). MAGR: Manifold-Aligned Graph Regularization for Continual Action Quality Assessment. Presented at Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Milan, Italy
Action Quality Assessment (AQA) evaluates diverse skills but models struggle with non-stationary data. We propose Continual AQA (CAQA) to refine models using sparse new data. Feature replay preserves memory without storing raw inputs. However, the mi...
Unraveling the brain dynamics of Depersonalization-Derealization Disorder: a dynamic functional network connectivity analysis (2024)
Journal Article
Zheng, S., Zhang, F. X., Shum, H. P. H., Zhang, H., Song, N., Song, M., & Jia, H. (2024). Unraveling the brain dynamics of Depersonalization-Derealization Disorder: a dynamic functional network connectivity analysis. BMC Psychiatry, 24, Article 685. https://doi.org/10.1186/s12888-024-06096-1
Background: Depersonalization-Derealization Disorder (DPD), a prevalent psychiatric disorder, fundamentally disrupts self-consciousness and could significantly impact the quality of life of those affected. While existing research has provided foundat...
Chatbots and Art Critique: A Comparative Study of Chatbot and Human Experts in Traditional Chinese Painting Education (2024)
Presentation / Conference Contribution
Liu, J., Law, E. L.-C., & Shum, H. P. H. (2024, October). Chatbots and Art Critique: A Comparative Study of Chatbot and Human Experts in Traditional Chinese Painting Education. Presented at NordiCHI 2024: Nordic Conference on Human-Computer Interaction, Uppsala, Sweden
Driven by the recent incorporation of chatbots into art education, art critique as a key factor in this realm poses distinct challenges and opportunities for this technology intervention. This study investigates the efficacy of chatbot-generated crit...