Prof. Hubert Shum

From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos (2024)
Presentation / Conference Contribution
Qiao, T., Li, R., Li, F. W. B., & Shum, H. P. H. (2024, December). From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos. Presented at the 2024 International Conference on Pattern Recognition, Kolkata, India

Video-based Human-Object Interaction (HOI) recognition explores the intricate dynamics between humans and objects, which are essential for a comprehensive understanding of human behavior and intentions. While previous work has made significant stride...

Depth-Aware Endoscopic Video Inpainting (2024)
Presentation / Conference Contribution
Xiatian Zhang, F., Chen, S., Xie, X., & Shum, H. P. (in press). Depth-Aware Endoscopic Video Inpainting.

Self-Regulated Sample Diversity in Large Language Models (2024)
Presentation / Conference Contribution
Liu, M., Frawley, J., Wyer, S., Shum, H. P. H., Uckelman, S. L., Black, S., & Willcocks, C. G. (2024). Self-Regulated Sample Diversity in Large Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics (1891–1899)

Geometric Features Enhanced Human-Object Interaction Detection (2024)
Journal Article
Zhu, M., Ho, E. S. L., Yang, L., Shum, H. P. H., & Chen, S. (in press). Geometric Features Enhanced Human-Object Interaction Detection. IEEE Transactions on Instrumentation and Measurement.

RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for LiDAR Semantic Segmentation (2024)
Presentation / Conference Contribution
Li, L., Shum, H. P. H., & Breckon, T. P. (2024, September). RAPiD-Seg: Range-Aware Pointwise Distance Distribution Networks for LiDAR Semantic Segmentation. Paper presented at European Conference on Computer Vision, Milan, Italy

U3DS3: Unsupervised 3D Semantic Scene Segmentation (2024)
Presentation / Conference Contribution
Liu, J., Yu, Z., Breckon, T. P., & Shum, H. P. H. (2024). U3DS3: Unsupervised 3D Semantic Scene Segmentation. In 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (3747-3756). https://doi.org/10.1109/WACV57701.2024.00372

Contemporary point cloud segmentation approaches largely rely on richly annotated 3D training data. However, it is both time-consuming and challenging to obtain consistently accurate annotations for such 3D scene data. Moreover, there is still a lac...

Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling (2024)
Presentation / Conference Contribution
Corona-Figueroa, A., Shum, H. P. H., & Willcocks, C. G. (in press). Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling.

Two-Person Interaction Augmentation with Skeleton Priors (2024)
Presentation / Conference Contribution
Li, B., Ho, E. S. L., Shum, H. P. H., & Wang, H. (2024, June). Two-Person Interaction Augmentation with Skeleton Priors. Presented at 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, Washington

A Virtual Reality Framework for Human-Driver Interaction Research: Safe and Cost-Effective Data Collection (2024)
Presentation / Conference Contribution
Crosato, L., Wei, C., Ho, E. S. L., Shum, H. P. H., & Sun, Y. (2024). A Virtual Reality Framework for Human-Driver Interaction Research: Safe and Cost-Effective Data Collection. In HRI '24: Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction (167-174). https://doi.org/10.1145/3610977.3634923

The advancement of automated driving technology has led to new challenges in the interaction between automated vehicles and human road users. However, there is currently no complete theory that explains how human road users interact with vehicles, an...

HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention (2024)
Journal Article
Chen, S., Atapour-Abarghouei, A., & Shum, H. P. H. (2024). HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention. IEEE Transactions on Multimedia, 26, 7649-7660. https://doi.org/10.1109/TMM.2024.3369897

Existing image inpainting methods leverage convolution-based downsampling approaches to reduce spatial dimensions. This may result in information loss from corrupted images where the available information is inherently sparse, especially for the scen...

Enhancing surgical performance in cardiothoracic surgery with innovations from computer vision and artificial intelligence: a narrative review (2024)
Journal Article
Constable, M. D., Shum, H. P. H., & Clark, S. (2024). Enhancing surgical performance in cardiothoracic surgery with innovations from computer vision and artificial intelligence: a narrative review. Journal of Cardiothoracic Surgery, 19(1), Article 94. https://doi.org/10.1186/s13019-024-02558-5

When technical requirements are high, and patient outcomes are critical, opportunities for monitoring and improving surgical skills via objective motion analysis feedback may be particularly beneficial. This narrative review synthesises work on techn...

Pose-based tremor type and level analysis for Parkinson’s disease from video (2024)
Journal Article
Zhang, H., Ho, E. S. L., Zhang, X., Del Din, S., & Shum, H. P. H. (2024). Pose-based tremor type and level analysis for Parkinson’s disease from video. International Journal of Computer Assisted Radiology and Surgery, 19(5), 831-840. https://doi.org/10.1007/s11548-023-03052-4

Current methods for diagnosis of PD rely on clinical examination. The accuracy of diagnosis ranges between 73 and 84%, and is influenced by the experience of the clinical assessor. Hence, an automatic, effective and interpretable supporting system fo...

Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models (2023)
Presentation / Conference Contribution
Chang, Z., Findlay, E. J., Zhang, H., & Shum, H. P. (2023). Unifying Human Motion Synthesis and Style Transfer with Denoising Diffusion Probabilistic Models. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - GRAPP (64-74). https://doi.org/10.5220/0011631000003417

Generating realistic motions for digital humans is a core but challenging part of computer animations and games, as human motions are both diverse in content and rich in styles. While the latest deep learning approaches have made significant advancem...

Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers (2023)
Presentation / Conference Contribution
Corona-Figueroa, A., Bond-Taylor, S., Bhowmik, N., Gaus, Y. F. A., Breckon, T. P., Shum, H. P., & Willcocks, C. G. (2023). Unaligned 2D to 3D Translation with Conditional Vector-Quantized Code Diffusion using Transformers. In ICCV '23: Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. https://doi.org/10.1109/ICCV51070.2023.01341

Generating 3D images of complex objects conditionally from a few 2D views is a difficult synthesis problem, compounded by issues such as domain gap and geometric misalignment. For instance, a unified framework such as Generative Adversarial Networks...

Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient (2023)
Presentation / Conference Contribution
Lu, Z., Wang, H., Chang, Z., Yang, G., & Shum, H. P. (2023). Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient. https://doi.org/10.1109/ICCV51070.2023.00424

Recently, methods for skeleton-based human activity recognition have been shown to be vulnerable to adversarial attacks. However, these attack methods require either the full knowledge of the victim (i.e. white-box attacks), access to training data (...

Tackling Data Bias in Painting Classification with Style Transfer (2023)
Presentation / Conference Contribution
Vijendran, M., Li, F. W., & Shum, H. P. (2023). Tackling Data Bias in Painting Classification with Style Transfer. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5 VISAPP: VISAPP (250-261). https://doi.org/10.5220/0011776600003417

It is difficult to train classifiers on paintings collections due to model bias from domain gaps and data bias from the uneven distribution of artistic styles. Previous techniques like data distillation, traditional data augmentation and style transf...

A Mixed Reality Training System for Hand-Object Interaction in Simulated Microgravity Environments (2023)
Presentation / Conference Contribution
Zhou, K., Chen, C., Ma, Y., Leng, Z., Shum, H. P., Li, F. W., & Liang, X. (2023). A Mixed Reality Training System for Hand-Object Interaction in Simulated Microgravity Environments. In 2023 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). https://doi.org/10.1109/ISMAR59233.2023.00031

As human exploration of space continues to progress, the use of Mixed Reality (MR) for simulating microgravity environments and facilitating training in hand-object interaction holds immense practical significance. However, hand-object interaction in...

Enhancing Perception and Immersion in Pre-Captured Environments through Learning-Based Eye Height Adaptation (2023)
Presentation / Conference Contribution
Feng, Q., Shum, H. P., & Morishima, S. (2023). Enhancing Perception and Immersion in Pre-Captured Environments through Learning-Based Eye Height Adaptation. In 2023 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). https://doi.org/10.1109/ISMAR59233.2023.00055

Pre-captured immersive environments using omnidirectional cameras provide a wide range of virtual reality applications. Previous research has shown that manipulating the eye height in egocentric virtual environments can significantly affect distance...

Social Interaction‐Aware Dynamical Models and Decision‐Making for Autonomous Vehicles (2023)
Journal Article
Crosato, L., Tian, K., Shum, H. P., Ho, E. S., Wang, Y., & Wei, C. (2023). Social Interaction‐Aware Dynamical Models and Decision‐Making for Autonomous Vehicles. Advanced Intelligent Systems. https://doi.org/10.1002/aisy.202300575

Interaction‐aware autonomous driving (IAAD) is a rapidly growing field of research that focuses on the development of autonomous vehicles (AVs) that are capable of interacting safely and efficiently with human road users. This is a challenging task,...

Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising (2023)
Journal Article
Zhou, K., Shum, H. P., Li, F. W., & Liang, X. (2023). Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/TVCG.2023.3337868

In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the inte...

Correlation-Distance Graph Learning for Treatment Response Prediction from rs-fMRI (2023)
Presentation / Conference Contribution
Zhang, X., Zheng, S., Shum, H. P., Zhang, H., Song, N., Song, M., & Jia, H. (2023). Correlation-Distance Graph Learning for Treatment Response Prediction from rs-fMRI. In Neural Information Processing: 30th International Conference, ICONIP 2023, Changsha, China, November 20–23, 2023, Proceedings, Part IX (298-312). https://doi.org/10.1007/978-981-99-8138-0_24

Resting-state fMRI (rs-fMRI) functional connectivity (FC) analysis provides valuable insights into the relationships between different brain regions and their potential implications for neurological or psychiatric disorders. However, specific design...

Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation (2023)
Presentation / Conference Contribution
Li, L., Shum, H. P., & Breckon, T. P. (2023). Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR52729.2023.00903

Whilst the availability of 3D LiDAR point cloud data has significantly grown in recent years, annotation remains expensive and time-consuming, leading to a demand for semisupervised semantic segmentation methods with application domains such as auton...

Region-based Appearance and Flow Characteristics for Anomaly Detection in Infrared Surveillance Imagery (2023)
Presentation / Conference Contribution
Gaus, Y., Bhowmik, N., Isaac-Medina, B., Atapour-Abarghouei, A., Shum, H., & Breckon, T. (2023). Region-based Appearance and Flow Characteristics for Anomaly Detection in Infrared Surveillance Imagery. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). https://doi.org/10.1109/CVPRW59228.2023.00301

Anomaly detection is a classical problem within automated visual surveillance, namely the determination of the normal from the abnormal when operational data availability is highly biased towards one class (normal) due to both insufficient sample siz...

Hierarchical Graph Convolutional Networks for Action Quality Assessment (2023)
Journal Article
Zhou, K., Ma, Y., Shum, H. P., & Liang, X. (2023). Hierarchical Graph Convolutional Networks for Action Quality Assessment. IEEE Transactions on Circuits and Systems for Video Technology. https://doi.org/10.1109/TCSVT.2023.3281413

Action quality assessment (AQA) automatically evaluates how well humans perform actions in a given video, a technique widely used in fields such as rehabilitation medicine, athletic competitions, and specific skills assessment. However, existing work...

INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network (2023)
Journal Article
Chen, S., Atapour-Abarghouei, A., Ho, E. S., & Shum, H. P. (2023). INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network. Software Impacts, 17, Article 100517. https://doi.org/10.1016/j.simpa.2023.100517

We present a software that predicts non-cleft facial images for patients with cleft lip, thereby facilitating the understanding, awareness and discussion of cleft lip surgeries. To protect patients’ privacy, we design a software framework using image...

Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition (2023)
Journal Article
Men, Q., Ho, E. S., Shum, H. P., & Leung, H. (2023). Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition. Neurocomputing, 537, 198-209. https://doi.org/10.1016/j.neucom.2023.03.070

Learning view-invariant representation is a key to improving feature discrimination power for skeleton-based action recognition. Existing approaches cannot effectively remove the impact of viewpoint due to the implicit view-dependent representations....

A Video-Based Augmented Reality System for Human-in-the-Loop Muscle Strength Assessment of Juvenile Dermatomyositis (2023)
Journal Article
Zhou, K., Cai, R., Ma, Y., Tan, Q., Wang, X., Li, J., …Liang, X. (2023). A Video-Based Augmented Reality System for Human-in-the-Loop Muscle Strength Assessment of Juvenile Dermatomyositis. IEEE Transactions on Visualization and Computer Graphics, 29(5), 2456-2466. https://doi.org/10.1109/tvcg.2023.3247092

As the most common idiopathic inflammatory myopathy in children, juvenile dermatomyositis (JDM) is characterized by skin rashes and muscle weakness. The childhood myositis assessment scale (CMAS) is commonly used to measure the degree of muscle invol...

Denoising Diffusion Probabilistic Models for Styled Walking Synthesis (2022)
Presentation / Conference Contribution
Findlay, E., Zhang, H., Chang, Z., & Shum, H. P. (2022). Denoising Diffusion Probabilistic Models for Styled Walking Synthesis. https://doi.org/10.1145/3561975

Generating realistic motions for digital humans is time-consuming for many graphics applications. Data-driven motion synthesis approaches have seen solid progress in recent years through deep generative models. These results offer high-quality motion...

UAV-ReID: A Benchmark on Unmanned Aerial Vehicle Re-Identification in Video Imagery (2022)
Presentation / Conference Contribution
Organisciak, D., Poyser, M., Alsehaim, A., Hu, S., Isaac-Medina, B. K., Breckon, T. P., & Shum, H. P. (2022). UAV-ReID: A Benchmark on Unmanned Aerial Vehicle Re-Identification in Video Imagery. https://doi.org/10.5220/0010836600003124

As unmanned aerial vehicles (UAV) become more accessible with a growing range of applications, the risk of UAV disruption increases. Recent development in deep learning allows vision-based counter-UAV systems to detect and track UAVs with a single ca...

3D Reconstruction of Sculptures from Single Images via Unsupervised Domain Adaptation on Implicit Models (2022)
Presentation / Conference Contribution
Chang, Z., Koulieris, G. A., & Shum, H. P. (2022). 3D Reconstruction of Sculptures from Single Images via Unsupervised Domain Adaptation on Implicit Models. https://doi.org/10.1145/3562939.3565632

A Skeleton-aware Graph Convolutional Network for Human-Object Interaction Detection (2022)
Presentation / Conference Contribution
Zhu, M., Ho, E. S., & Shum, H. P. (2022). A Skeleton-aware Graph Convolutional Network for Human-Object Interaction Detection. https://doi.org/10.1109/smc53654.2022.9945149

Detecting human-object interactions is essential for comprehensive understanding of visual scenes. In particular, spatial connections between humans and objects are important cues for reasoning interactions. To this end, we propose a skeleton-aware g...

A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip (2022)
Presentation / Conference Contribution
Chen, S., Atapour-Abarghouei, A., Kerby, J., Ho, E. S., Sainsbury, D. C., Butterworth, S., & Shum, H. P. (2022). A Feasibility Study on Image Inpainting for Non-cleft Lip Generation from Patients with Cleft Lip. https://doi.org/10.1109/bhi56158.2022.9926917

A cleft lip is a congenital abnormality requiring surgical repair by a specialist. The surgeon must have extensive experience and theoretical knowledge to perform surgery, and Artificial Intelligence (AI) methods have been proposed to guide surgeons in...

Towards Graph Representation Learning Based Surgical Workflow Anticipation (2022)
Presentation / Conference Contribution
Zhang, X., Al Moubayed, N., & Shum, H. P. (2022). Towards Graph Representation Learning Based Surgical Workflow Anticipation. https://doi.org/10.1109/bhi56158.2022.9926801

Surgical workflow anticipation can give predictions on what steps to conduct or what instruments to use next, which is an essential part of the computer-assisted intervention system for surgery, e.g. workflow reasoning in robotic surgery. However, cu...

Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos (2022)
Presentation / Conference Contribution
Qiao, T., Men, Q., Li, F. W., Kubotani, Y., Morishima, S., & Shum, H. P. (2022). Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos. https://doi.org/10.1007/978-3-031-19772-7_28

Human-Object Interaction (HOI) recognition in videos is important for analysing human activity. Most existing work focusing on visual features usually suffers from occlusion in real-world scenarios. Such a problem will be further complicated when...

Multiclass-SGCN: Sparse Graph-based Trajectory Prediction with Agent Class Embedding (2022)
Presentation / Conference Contribution
Li, R., Katsigiannis, S., & Shum, H. P. (2022). Multiclass-SGCN: Sparse Graph-based Trajectory Prediction with Agent Class Embedding. In 2022 IEEE International Conference on Image Processing (ICIP) Proceedings (2346-2350). https://doi.org/10.1109/icip46576.2022.9897644

Trajectory prediction of road users in real-world scenarios is challenging because their movement patterns are stochastic and complex. Previous pedestrian-oriented works have been successful in modelling the complex interactions among pedestrians, bu...

A Two-stream Convolutional Network for Musculoskeletal and Neurological Disorders Prediction (2022)
Journal Article
Zhu, M., Men, Q., Ho, E. S., Leung, H., & Shum, H. P. (2022). A Two-stream Convolutional Network for Musculoskeletal and Neurological Disorders Prediction. Journal of Medical Systems, 46(11), Article 76. https://doi.org/10.1007/s10916-022-01857-5

Musculoskeletal and neurological disorders are the most common causes of walking problems among older people, and they often lead to diminished quality of life. Analyzing walking motion data manually requires trained professionals and the evaluations...

CP-AGCN: Pytorch-based Attention Informed Graph Convolutional Network for Identifying Infants at Risk of Cerebral Palsy (2022)
Journal Article
Zhang, H., Ho, E. S., & Shum, H. P. (2022). CP-AGCN: Pytorch-based Attention Informed Graph Convolutional Network for Identifying Infants at Risk of Cerebral Palsy. Software Impacts, 14, Article 100419. https://doi.org/10.1016/j.simpa.2022.100419

Early prediction is clinically considered one of the essential parts of cerebral palsy (CP) treatment. We propose to implement a low-cost and interpretable classification system for supporting CP prediction based on General Movement Assessment (GMA)....

Pose-based Tremor Classification for Parkinson’s Disease Diagnosis from Video (2022)
Presentation / Conference Contribution
Zhang, X., Zhang, H., & Shum, H. P. (2022). Pose-based Tremor Classification for Parkinson’s Disease Diagnosis from Video. https://doi.org/10.1007/978-3-031-16440-8_47

Parkinson’s disease (PD) is a progressive neurodegenerative disorder that results in a variety of motor dysfunction symptoms, including tremors, bradykinesia, rigidity and postural instability. The diagnosis of PD mainly relies on clinical experience...

MedNeRF: Medical Neural Radiance Fields for Reconstructing 3D-aware CT-Projections from a Single X-ray (2022)
Presentation / Conference Contribution
Corona-Figueroa, A., Frawley, J., Bond-Taylor, S., Bethapudi, S., Shum, H. P., & Willcocks, C. G. (2022). MedNeRF: Medical Neural Radiance Fields for Reconstructing 3D-aware CT-Projections from a Single X-ray. https://doi.org/10.1109/embc48229.2022.9871757

Computed tomography (CT) is an effective medical imaging modality, widely used in the field of clinical medicine for the diagnosis of various pathologies. Advances in Multidetector CT imaging technology have enabled additional functionalities, inclu...

Cerebral Palsy Prediction with Frequency Attention Informed Graph Convolutional Networks (2022)
Presentation / Conference Contribution
Zhang, H., Shum, H. P., & Ho, E. S. (2022). Cerebral Palsy Prediction with Frequency Attention Informed Graph Convolutional Networks. https://doi.org/10.1109/embc48229.2022.9871230

Early diagnosis and intervention are clinically considered the paramount part of treating cerebral palsy (CP), so it is essential to design an efficient and interpretable automatic prediction system for CP. We highlight a significant difference betwe...

Interaction-aware Decision-making for Automated Vehicles using Social Value Orientation (2022)
Journal Article
Crosato, L., Shum, H. P., Ho, E. S., & Wei, C. (2023). Interaction-aware Decision-making for Automated Vehicles using Social Value Orientation. IEEE Transactions on Intelligent Vehicles, 8(2), 1339-1349. https://doi.org/10.1109/tiv.2022.3189836

Motion control algorithms in the presence of pedestrians are critical for the development of safe and reliable Autonomous Vehicles (AVs). Traditional motion control algorithms rely on manually designed decision-making policies which neglect the mutua...

Formation Control for UAVs Using a Flux Guided Approach (2022)
Journal Article
Hartley, J., Shum, H. P., Ho, E. S., Wang, H., & Ramamoorthy, S. (2022). Formation Control for UAVs Using a Flux Guided Approach. Expert Systems with Applications, 205, Article 117665. https://doi.org/10.1016/j.eswa.2022.117665

Existing studies on formation control for unmanned aerial vehicles (UAV) have not considered encircling targets where an optimum coverage of the target is required at all times. Such coverage plays a critical role in many real-world applications such...

RobIn: A Robust Interpretable Deep Network for Schizophrenia Diagnosis (2022)
Journal Article
Organisciak, D., Shum, H. P., Nwoye, E., & Woo, W. L. (2022). RobIn: A Robust Interpretable Deep Network for Schizophrenia Diagnosis. Expert Systems with Applications, 201, Article 117158. https://doi.org/10.1016/j.eswa.2022.117158

Schizophrenia is a severe mental health condition that requires a long and complicated diagnostic process. However, early diagnosis is vital to control symptoms. Deep learning has recently become a popular way to analyse and interpret medical data. P...

360 Depth Estimation in the Wild - The Depth360 Dataset and the SegFuse Network (2022)
Presentation / Conference Contribution
Feng, Q., Shum, H. P., & Morishima, S. (2022). 360 Depth Estimation in the Wild - The Depth360 Dataset and the SegFuse Network. https://doi.org/10.1109/vr51125.2022.00087

Single-view depth estimation from omnidirectional images has gained popularity with its wide range of applications such as autonomous driving and scene reconstruction. Although data-driven learning-based methods demonstrate significant potential in t...

DurLAR: A High-Fidelity 128-Channel LiDAR Dataset with Panoramic Ambient and Reflectivity Imagery for Multi-Modal Autonomous Driving Applications (2021)
Presentation / Conference Contribution
Li, L., Ismail, K. N., Shum, H. P., & Breckon, T. P. (2021). DurLAR: A High-Fidelity 128-Channel LiDAR Dataset with Panoramic Ambient and Reflectivity Imagery for Multi-Modal Autonomous Driving Applications. https://doi.org/10.1109/3dv53792.2021.00130

We present DurLAR, a high-fidelity 128-channel 3D LiDAR dataset with panoramic ambient (near infrared) and reflectivity imagery, as well as a sample benchmark task using depth estimation for autonomous driving applications. Our driving platform is eq...

Semantics-STGCNN: A Semantics-guided Spatial-Temporal Graph Convolutional Network for Multi-class Trajectory Prediction (2021)
Presentation / Conference Contribution
Rainbow, B. A., Men, Q., & Shum, H. P. (2021). Semantics-STGCNN: A Semantics-guided Spatial-Temporal Graph Convolutional Network for Multi-class Trajectory Prediction. https://doi.org/10.1109/smc52423.2021.9658781

Predicting the movement trajectories of multiple classes of road users in real-world scenarios is a challenging task due to the diverse trajectory patterns. While recent works of pedestrian trajectory prediction successfully modelled the influence of...

Bi-projection-based Foreground-aware Omnidirectional Depth Prediction (2021)
Presentation / Conference Contribution
Feng, Q., Shum, H. P., & Morishima, S. (2021). Bi-projection-based Foreground-aware Omnidirectional Depth Prediction.

Due to the increasing availability of commercial 360-degree cameras, accurate depth prediction for omnidirectional images can be beneficial to a wide range of applications including video editing and augmented reality. Regarding existing methods, so...

A Pose-based Feature Fusion and Classification Framework for the Early Prediction of Cerebral Palsy in Infants (2021)
Journal Article
McCay, K. D., Hu, P., Shum, H. P., Woo, W. L., Marcroft, C., Embleton, N. D., … Ho, E. S. (2022). A Pose-based Feature Fusion and Classification Framework for the Early Prediction of Cerebral Palsy in Infants. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 30, 8-19. https://doi.org/10.1109/tnsre.2021.3138185

The early diagnosis of cerebral palsy is an area which has recently seen significant multi-disciplinary research. Diagnostic tools such as the General Movements Assessment (GMA), have produced some very promising results. However, the prospect of aut... Read More about A Pose-based Feature Fusion and Classification Framework for the Early Prediction of Cerebral Palsy in Infants.

PyTorch-based Implementation of Label-aware Graph Representation for Multi-class Trajectory Prediction (2021)
Journal Article
Men, Q., & Shum, H. P. (2022). PyTorch-based Implementation of Label-aware Graph Representation for Multi-class Trajectory Prediction. Software impacts, 11, Article 100201. https://doi.org/10.1016/j.simpa.2021.100201

Trajectory Prediction under diverse patterns has attracted increasing attention in multiple real-world applications ranging from urban traffic analysis to human motion understanding, among which graph convolution network (GCN) is frequently adopted w... Read More about PyTorch-based Implementation of Label-aware Graph Representation for Multi-class Trajectory Prediction.

Unmanned Aerial Vehicle Visual Detection and Tracking using Deep Neural Networks: A Performance Benchmark (2021)
Presentation / Conference Contribution
Isaac-Medina, B. K., Poyser, M., Organisciak, D., Willcocks, C. G., Breckon, T. P., & Shum, H. P. (2021). Unmanned Aerial Vehicle Visual Detection and Tracking using Deep Neural Networks: A Performance Benchmark. https://doi.org/10.1109/iccvw54120.2021.00142

Unmanned Aerial Vehicles (UAV) can pose a major risk for aviation safety, due to both negligent and malicious use. For this reason, the automated detection and tracking of UAV is a fundamental task in aerial security systems. Common technologies for... Read More about Unmanned Aerial Vehicle Visual Detection and Tracking using Deep Neural Networks: A Performance Benchmark.

STGAE: Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising (2021)
Presentation / Conference Contribution
Zhou, K., Cheng, Z., Shum, H. P., Li, F. W., & Liang, X. (2021). STGAE: Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising. https://doi.org/10.1109/ismar52148.2021.00018

Hand object interaction in mixed reality (MR) relies on the accurate tracking and estimation of human hands, which provide users with a sense of immersion. However, raw captured hand motion data always contains errors such as joints occlusion, disloc... Read More about STGAE: Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising.

Human-centric Autonomous Driving in an AV-Pedestrian Interactive Environment Using SVO (2021)
Presentation / Conference Contribution
Crosato, L., Wei, C., Ho, E. S., & Shum, H. P. (2021). Human-centric Autonomous Driving in an AV-Pedestrian Interactive Environment Using SVO. https://doi.org/10.1109/ichms53169.2021.9582640

As Autonomous Vehicles (AV) are becoming a reality, the design of efficient motion control algorithms will have to deal with the unpredictable and interactive nature of other road users. Current AV motion planning algorithms suffer from the freezing... Read More about Human-centric Autonomous Driving in an AV-Pedestrian Interactive Environment Using SVO.

GAN-based Reactive Motion Synthesis with Class-aware Discriminators for Human-human Interaction (2021)
Journal Article
Men, Q., Shum, H. P., Ho, E. S., & Leung, H. (2022). GAN-based Reactive Motion Synthesis with Class-aware Discriminators for Human-human Interaction. Computers and Graphics, 102, 634-645. https://doi.org/10.1016/j.cag.2021.09.014

Creating realistic characters that can react to the users’ or another character’s movement can benefit computer graphics, games and virtual reality hugely. However, synthesizing such reactive motions in human-human interactions is a challenging task... Read More about GAN-based Reactive Motion Synthesis with Class-aware Discriminators for Human-human Interaction.

Interpreting Deep Learning based Cerebral Palsy Prediction with Channel Attention (2021)
Presentation / Conference Contribution
Zhu, M., Men, Q., Ho, E. S., Leung, H., & Shum, H. P. (2021). Interpreting Deep Learning based Cerebral Palsy Prediction with Channel Attention. https://doi.org/10.1109/bhi50953.2021.9508619

Early prediction of cerebral palsy is essential as it leads to early treatment and monitoring. Deep learning has shown promising results in biomedical engineering thanks to its capacity of modelling complicated data with its non-linear architecture.... Read More about Interpreting Deep Learning based Cerebral Palsy Prediction with Channel Attention.

Spoofing Detection on Hand Images Using Quality Assessment (2021)
Journal Article
Bera, A., Dey, R., Bhattacharjee, D., Nasipuri, M., & Shum, H. (2021). Spoofing Detection on Hand Images Using Quality Assessment. Multimedia Tools and Applications, 80(19), 28603-28626. https://doi.org/10.1007/s11042-021-10976-z

Recent research on biometrics focuses on achieving a high success rate of authentication and addressing the concern of various spoofing attacks. Although hand geometry recognition provides adequate security over unauthorized access, it is susceptible... Read More about Spoofing Detection on Hand Images Using Quality Assessment.

Stable Hand Pose Estimation under Tremor via Graph Neural Network (2021)
Presentation / Conference Contribution
Leng, Z., Chen, J., Shum, H. P., Li, F. W., & Liang, X. (2021). Stable Hand Pose Estimation under Tremor via Graph Neural Network. In 2021 IEEE Virtual Reality and 3D User Interfaces (VR) (226-234). https://doi.org/10.1109/vr50410.2021.00044

Hand pose estimation, which predicts the spatial location of hand joints, is a fundamental task in VR/AR applications. Although existing methods can recover hand pose competently, the tremor issue occurring in hand motion has not been completely solv... Read More about Stable Hand Pose Estimation under Tremor via Graph Neural Network.

Makeup Style Transfer on Low-quality Images with Weighted Multi-scale Attention (2021)
Presentation / Conference Contribution
Organisciak, D., Ho, E. S., & Shum, H. P. (2021). Makeup Style Transfer on Low-quality Images with Weighted Multi-scale Attention. https://doi.org/10.1109/icpr48806.2021.9412604

Facial makeup style transfer is an extremely challenging sub-field of image-to-image-translation. Due to this difficulty, state-of-the-art results are mostly reliant on the Face Parsing Algorithm, which segments a face into parts in order to easily e... Read More about Makeup Style Transfer on Low-quality Images with Weighted Multi-scale Attention.

A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition (2021)
Presentation / Conference Contribution
Men, Q., Ho, E. S., Shum, H. P., & Leung, H. (2021). A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition. https://doi.org/10.1109/icpr48806.2021.9412538

This paper addresses the problem of recognizing human-human interaction from skeletal sequences. Existing methods are mainly designed to classify single human action. Many of them simply stack the movement features of two characters to deal with huma... Read More about A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition.

3D car shape reconstruction from a contour sketch using GAN and lazy learning (2021)
Journal Article
Nozawa, N., Shum, H. P., Feng, Q., Ho, E. S., & Morishima, S. (2022). 3D car shape reconstruction from a contour sketch using GAN and lazy learning. Visual Computer, 38(4), 1317-1330. https://doi.org/10.1007/s00371-020-02024-y

3D car models are heavily used in computer games, visual effects, and even automotive designs. As a result, producing such models with minimal labour costs is increasingly more important. To tackle the challenge, we propose a novel system to reconstr... Read More about 3D car shape reconstruction from a contour sketch using GAN and lazy learning.

Two-stage human verification using HandCAPTCHA and anti-spoofed finger biometrics with feature selection (2021)
Journal Article
Bera, A., Bhattacharjee, D., & Shum, H. P. (2021). Two-stage human verification using HandCAPTCHA and anti-spoofed finger biometrics with feature selection. Expert Systems with Applications, 171. https://doi.org/10.1016/j.eswa.2021.114583

This paper presents a human verification scheme in two independent stages to overcome the vulnerabilities of attacks and to enhance security. At the first stage, a hand image-based CAPTCHA (HandCAPTCHA) is tested to avert automated bot-attacks on the... Read More about Two-stage human verification using HandCAPTCHA and anti-spoofed finger biometrics with feature selection.

Multi-task Deep Learning with Optical Flow Features for Self-Driving Cars (2020)
Journal Article
Hu, Y., Shum, H. P., & Ho, E. S. (2020). Multi-task Deep Learning with Optical Flow Features for Self-Driving Cars. IET Intelligent Transport Systems, 14(13), 1845-1854. https://doi.org/10.1049/iet-its.2020.0439

The control of self-driving cars has received growing attention recently. Although existing research shows promising results in the vehicle control using video from a monocular dash camera, there has been very limited work on directly learning vehicl... Read More about Multi-task Deep Learning with Optical Flow Features for Self-Driving Cars.

A Privacy-Preserving Efficient Location-Sharing Scheme for Mobile Online Social Network Applications (2020)
Journal Article
Bhattacharya, M., Roy, S., Mistry, K., Shum, H. P., & Chattopadhyay, S. (2020). A Privacy-Preserving Efficient Location-Sharing Scheme for Mobile Online Social Network Applications. IEEE Access, 8, 221330-221351. https://doi.org/10.1109/ACCESS.2020.3043621

The rapid development of mobile internet technology and the better availability of GPS have made mobile online social networks (mOSNs) more popular than traditional online social networks (OSNs) over the last few years. They necessitate fundamental s... Read More about A Privacy-Preserving Efficient Location-Sharing Scheme for Mobile Online Social Network Applications.

A Quadruple Diffusion Convolutional Recurrent Network for Human Motion Prediction (2020)
Journal Article
Men, Q., Ho, E. S., Shum, H. P., & Leung, H. (2021). A Quadruple Diffusion Convolutional Recurrent Network for Human Motion Prediction. IEEE Transactions on Circuits and Systems for Video Technology, 31(9), 3417-3432. https://doi.org/10.1109/tcsvt.2020.3038145

Recurrent neural network (RNN) has become popular for human motion prediction thanks to its ability to capture temporal dependencies. However, it has limited capacity in modeling the complex spatial relationship in the human skeletal structure. In th... Read More about A Quadruple Diffusion Convolutional Recurrent Network for Human Motion Prediction.

Facial reshaping operator for controllable face beautification (2020)
Journal Article
Hu, S., Shum, H. P., Liang, X., Li, F. W., & Aslam, N. (2021). Facial reshaping operator for controllable face beautification. Expert Systems with Applications, 167, Article 114067. https://doi.org/10.1016/j.eswa.2020.114067

Posting attractive facial photos is part of everyday life in the social media era. Motivated by the demand, we propose a lightweight method to automatically and efficiently beautify the shapes of both portrait and non-portrait faces in photos, while... Read More about Facial reshaping operator for controllable face beautification.

LMZMPM: Local Modified Zernike Moment Per-unit Mass for Robust Human Face Recognition (2020)
Journal Article
Kar, A., Pramanik, S., Chakraborty, A., Bhattacharjee, D., Ho, E. S., & Shum, H. P. (2020). LMZMPM: Local Modified Zernike Moment Per-unit Mass for Robust Human Face Recognition. IEEE Transactions on Information Forensics and Security, 16, 495-509. https://doi.org/10.1109/tifs.2020.3015552

In this work, we proposed a novel method, called Local Modified Zernike Moment per unit Mass (LMZMPM), for face recognition, which is invariant to illumination, scaling, noise, in-plane rotation, and translation, along with other orthogonal and inher... Read More about LMZMPM: Local Modified Zernike Moment Per-unit Mass for Robust Human Face Recognition.

Cumuliform Cloud Formation Control using Parameter-Predicting Convolutional Neural Network (2020)
Journal Article
Zhang, Z., Ma, Y., Li, Y., Li, F. W., Shum, H. P., Yang, B., … Liang, X. (2020). Cumuliform Cloud Formation Control using Parameter-Predicting Convolutional Neural Network. Graphical Models, 111, Article 101083. https://doi.org/10.1016/j.gmod.2020.101083

Physically-based cloud simulation is an effective approach for synthesizing realistic clouds. However, generating clouds with desired shapes requires a time-consuming process for selecting the appropriate simulation parameters. This paper addresses su... Read More about Cumuliform Cloud Formation Control using Parameter-Predicting Convolutional Neural Network.

Sparse Metric-based Mesh Saliency (2020)
Journal Article
Hu, S., Liang, X., Shum, H. P., Li, F. W., & Aslam, N. (2020). Sparse Metric-based Mesh Saliency. Neurocomputing, 400, 11-23. https://doi.org/10.1016/j.neucom.2020.02.106

In this paper, we propose an accurate and robust approach to salient region detection for 3D polygonal surface meshes. The salient regions of a mesh are those that geometrically stand out from their contexts and therefore are semantically important f... Read More about Sparse Metric-based Mesh Saliency.

A Unified Deep Metric Representation for Mesh Saliency Detection and Non-rigid Shape Matching (2019)
Journal Article
Hu, S., Shum, H., Aslam, N., Li, F. W., & Liang, X. (2020). A Unified Deep Metric Representation for Mesh Saliency Detection and Non-rigid Shape Matching. IEEE Transactions on Multimedia, 22(9), 2278-2292. https://doi.org/10.1109/tmm.2019.2952983

In this paper, we propose a deep metric for unifying the representation of mesh saliency detection and non-rigid shape matching. While saliency detection and shape matching are two closely related and fundamental tasks in shape analysis, previous met... Read More about A Unified Deep Metric Representation for Mesh Saliency Detection and Non-rigid Shape Matching.

Spatio-temporal Manifold Learning for Human Motions via Long-horizon Modeling (2019)
Journal Article
Wang, H., Ho, E. S., Shum, H. P., & Zhu, Z. (2019). Spatio-temporal Manifold Learning for Human Motions via Long-horizon Modeling. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/tvcg.2019.2936810

Data-driven modeling of human motions is ubiquitous in computer graphics and vision applications. Such problems can be approached by deep learning on a large amount of data. However, existing methods can be sub-optimal for two reasons. First, skeletal i... Read More about Spatio-temporal Manifold Learning for Human Motions via Long-horizon Modeling.

Interaction-Based Human Activity Comparison (2019)
Journal Article
Shen, Y., Yang, L., Ho, E. S., & Shum, H. P. (2020). Interaction-Based Human Activity Comparison. IEEE Transactions on Visualization and Computer Graphics, 26(8), 2620-2633. https://doi.org/10.1109/tvcg.2019.2893247

Traditional methods for motion comparison consider features from individual characters. However, the semantic meaning of many human activities is usually defined by the interaction between them, such as a high-five interaction of two characters. Ther... Read More about Interaction-Based Human Activity Comparison.

Automatic Musculoskeletal and Neurological Disorder Diagnosis With Relative Joint Displacement From Human Gait (2018)
Journal Article
Rueangsirarak, W., Zhang, J., Aslam, N., Ho, E. S., & Shum, H. P. (2018). Automatic Musculoskeletal and Neurological Disorder Diagnosis With Relative Joint Displacement From Human Gait. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(12), 2387-2396. https://doi.org/10.1109/tnsre.2018.2880871

Musculoskeletal and neurological disorders are common devastating companions of ageing, leading to a reduction in quality of life and increased mortality. Gait analysis is a popular method for diagnosing these disorders. However, manually analyzing t... Read More about Automatic Musculoskeletal and Neurological Disorder Diagnosis With Relative Joint Displacement From Human Gait.

Action Recognition From Arbitrary Views Using Transferable Dictionary Learning (2018)
Journal Article
Zhang, J., Shum, H. P., Han, J., & Shao, L. (2018). Action Recognition From Arbitrary Views Using Transferable Dictionary Learning. IEEE Transactions on Image Processing, 27(10), 4709-4723. https://doi.org/10.1109/tip.2018.2836323

Human action recognition is crucial to many practical applications, ranging from human-computer interaction to video surveillance. Most approaches either recognize the human action from a fixed view or require the knowledge of view angle, which is us... Read More about Action Recognition From Arbitrary Views Using Transferable Dictionary Learning.

Manifold Regularized Experimental Design for Active Learning (2016)
Journal Article
Zhang, L., Shum, H. P., & Shao, L. (2017). Manifold Regularized Experimental Design for Active Learning. IEEE Transactions on Image Processing, 26(2), 969-981. https://doi.org/10.1109/tip.2016.2635440

Various machine learning and data mining tasks in classification require abundant data samples to be labeled for training. Conventional active learning methods aim at labeling the most informative samples for alleviating the labor of the user. Many p... Read More about Manifold Regularized Experimental Design for Active Learning.

Validation of an ergonomic assessment method using Kinect data in real workplace conditions (2016)
Journal Article
Plantard, P., Shum, H. P., Le Pierres, A.-S., & Multon, F. (2017). Validation of an ergonomic assessment method using Kinect data in real workplace conditions. Applied Ergonomics: Human Factors in Technology and Society, 65, 562-569. https://doi.org/10.1016/j.apergo.2016.10.015

Evaluating potential musculoskeletal disorders risks in real workstations is challenging as the environment is cluttered, which makes it difficult to accurately assess workers' postures. Being marker-free and calibration-free, Microsoft Kinect is a p... Read More about Validation of an ergonomic assessment method using Kinect data in real workplace conditions.

Discriminative Semantic Subspace Analysis for Relevance Feedback (2016)
Journal Article
Zhang, L., Shum, H., & Shao, L. (2016). Discriminative Semantic Subspace Analysis for Relevance Feedback. IEEE Transactions on Image Processing, 25(3), 1275-1287. https://doi.org/10.1109/tip.2016.2516947

Content-based image retrieval (CBIR) has attracted much attention during the past decades for its potential practical applications to image database management. A variety of relevance feedback (RF) schemes have been designed to bridge the gap between... Read More about Discriminative Semantic Subspace Analysis for Relevance Feedback.

Kinect Posture Reconstruction Based on a Local Mixture of Gaussian Process Models (2015)
Journal Article
Liu, Z., Zhou, L., Leung, H., & Shum, H. P. (2016). Kinect Posture Reconstruction Based on a Local Mixture of Gaussian Process Models. IEEE Transactions on Visualization and Computer Graphics, 22(11), 2437-2450. https://doi.org/10.1109/tvcg.2015.2510000

Depth sensor based 3D human motion estimation hardware such as Kinect has made interactive applications more popular recently. However, it is still challenging to accurately recognize postures from a single depth camera due to the inherently noisy da... Read More about Kinect Posture Reconstruction Based on a Local Mixture of Gaussian Process Models.

Multi-layer Lattice Model for Real-Time Dynamic Character Deformation (2015)
Journal Article
Iwamoto, N., Shum, H. P., Yang, L., & Morishima, S. (2015). Multi-layer Lattice Model for Real-Time Dynamic Character Deformation. Computer Graphics Forum, 34(7), 99-109. https://doi.org/10.1111/cgf.12749

Real-Time Posture Reconstruction for Microsoft Kinect (2013)
Journal Article
Shum, H. P., Ho, E. S., Jiang, Y., & Takagi, S. (2013). Real-Time Posture Reconstruction for Microsoft Kinect. IEEE Transactions on Cybernetics, 43(5), 1357-1369. https://doi.org/10.1109/tcyb.2013.2275945

The recent advancement of motion recognition using Microsoft Kinect stimulates many new ideas in motion capture and virtual reality applications. Utilizing a pattern recognition algorithm, Kinect can determine the positions of different body parts fr... Read More about Real-Time Posture Reconstruction for Microsoft Kinect.

Interactive Formation Control in Complex Environments (2013)
Journal Article
Henry, J., Shum, H. P., & Komura, T. (2014). Interactive Formation Control in Complex Environments. IEEE Transactions on Visualization and Computer Graphics, 20(2), 211-222. https://doi.org/10.1109/tvcg.2013.116

The number of degrees of freedom of a crowd is much higher than that provided by a standard user input device. Typically, crowd-control systems require multiple passes to design crowd movements by specifying waypoints, and then defining character trajectories... Read More about Interactive Formation Control in Complex Environments.

Simulating Multiple Character Interactions with Collaborative and Adversarial Goals (2010)
Journal Article
Shum, H. P., Komura, T., & Yamazaki, S. (2012). Simulating Multiple Character Interactions with Collaborative and Adversarial Goals. IEEE Transactions on Visualization and Computer Graphics, 18(5), 741-752. https://doi.org/10.1109/tvcg.2010.257

This paper proposes a new methodology for synthesizing animations of multiple characters, allowing them to intelligently compete with one another in dense environments, while still satisfying requirements set by an animator. To achieve these two conf... Read More about Simulating Multiple Character Interactions with Collaborative and Adversarial Goals.

Interaction patches for multi-character animation (2008)
Journal Article
Shum, H. P., Komura, T., Shiraishi, M., & Yamazaki, S. (2008). Interaction patches for multi-character animation. ACM Transactions on Graphics, 27(5), Article 114. https://doi.org/10.1145/1409060.1409067

We propose a data-driven approach to automatically generate a scene where tens to hundreds of characters densely interact with each other. During off-line processing, the close interactions between characters are precomputed by expanding a game tree,... Read More about Interaction patches for multi-character animation.
