Research Repository

All Outputs (44)

From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos (2024)
Journal Article
Qiao, T., Li, R., Li, F. W. B., & Shum, H. P. H. (in press). From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos. International Conference on Pattern Recognition.

Video-based Human-Object Interaction (HOI) recognition explores the intricate dynamics between humans and objects, which are essential for a comprehensive understanding of human behavior and intentions. While previous work has made significant stride...

HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention (2024)
Journal Article
Chen, S., Atapour-Abarghouei, A., & Shum, H. P. H. (2024). HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention. IEEE Transactions on Multimedia, 26, 7649-7660. https://doi.org/10.1109/TMM.2024.3369897

Existing image inpainting methods leverage convolution-based downsampling approaches to reduce spatial dimensions. This may result in information loss from corrupted images where the available information is inherently sparse, especially for the scen...

Enhancing surgical performance in cardiothoracic surgery with innovations from computer vision and artificial intelligence: a narrative review (2024)
Journal Article
Constable, M. D., Shum, H. P. H., & Clark, S. (2024). Enhancing surgical performance in cardiothoracic surgery with innovations from computer vision and artificial intelligence: a narrative review. Journal of Cardiothoracic Surgery, 19(1), Article 94. https://doi.org/10.1186/s13019-024-02558-5

When technical requirements are high, and patient outcomes are critical, opportunities for monitoring and improving surgical skills via objective motion analysis feedback may be particularly beneficial. This narrative review synthesises work on techn...

Pose-based tremor type and level analysis for Parkinson’s disease from video (2024)
Journal Article
Zhang, H., Ho, E. S. L., Zhang, X., Del Din, S., & Shum, H. P. H. (2024). Pose-based tremor type and level analysis for Parkinson’s disease from video. International Journal of Computer Assisted Radiology and Surgery, https://doi.org/10.1007/s11548-023-03052-4

Current methods for diagnosis of PD rely on clinical examination. The accuracy of diagnosis ranges between 73 and 84%, and is influenced by the experience of the clinical assessor. Hence, an automatic, effective and interpretable supporting system fo...

Social Interaction‐Aware Dynamical Models and Decision‐Making for Autonomous Vehicles (2023)
Journal Article
Crosato, L., Tian, K., Shum, H. P., Ho, E. S., Wang, Y., & Wei, C. (2023). Social Interaction‐Aware Dynamical Models and Decision‐Making for Autonomous Vehicles. Advanced Intelligent Systems, https://doi.org/10.1002/aisy.202300575

Interaction‐aware autonomous driving (IAAD) is a rapidly growing field of research that focuses on the development of autonomous vehicles (AVs) that are capable of interacting safely and efficiently with human road users. This is a challenging task,...

Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising (2023)
Journal Article
Zhou, K., Shum, H. P., Li, F. W., & Liang, X. (2023). Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising. IEEE Transactions on Visualization and Computer Graphics, https://doi.org/10.1109/TVCG.2023.3337868

In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the inte...

Hierarchical Graph Convolutional Networks for Action Quality Assessment (2023)
Journal Article
Zhou, K., Ma, Y., Shum, H. P., & Liang, X. (2023). Hierarchical Graph Convolutional Networks for Action Quality Assessment. IEEE Transactions on Circuits and Systems for Video Technology, https://doi.org/10.1109/TCSVT.2023.3281413

Action quality assessment (AQA) automatically evaluates how well humans perform actions in a given video, a technique widely used in fields such as rehabilitation medicine, athletic competitions, and specific skills assessment. However, existing work...

INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network (2023)
Journal Article
Chen, S., Atapour-Abarghouei, A., Ho, E. S., & Shum, H. P. (2023). INCLG: Inpainting for Non-Cleft Lip Generation with a Multi-Task Image Processing Network. Software impacts, 17, Article 100517. https://doi.org/10.1016/j.simpa.2023.100517

We present a software that predicts non-cleft facial images for patients with cleft lip, thereby facilitating the understanding, awareness and discussion of cleft lip surgeries. To protect patients’ privacy, we design a software framework using image...

Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition (2023)
Journal Article
Men, Q., Ho, E. S., Shum, H. P., & Leung, H. (2023). Focalized Contrastive View-invariant Learning for Self-supervised Skeleton-based Action Recognition. Neurocomputing, 537, 198-209. https://doi.org/10.1016/j.neucom.2023.03.070

Learning view-invariant representation is a key to improving feature discrimination power for skeleton-based action recognition. Existing approaches cannot effectively remove the impact of viewpoint due to the implicit view-dependent representations....

A Video-Based Augmented Reality System for Human-in-the-Loop Muscle Strength Assessment of Juvenile Dermatomyositis (2023)
Journal Article
Zhou, K., Cai, R., Ma, Y., Tan, Q., Wang, X., Li, J., …Liang, X. (2023). A Video-Based Augmented Reality System for Human-in-the-Loop Muscle Strength Assessment of Juvenile Dermatomyositis. IEEE Transactions on Visualization and Computer Graphics, 29(5), 2456-2466. https://doi.org/10.1109/tvcg.2023.3247092

As the most common idiopathic inflammatory myopathy in children, juvenile dermatomyositis (JDM) is characterized by skin rashes and muscle weakness. The childhood myositis assessment scale (CMAS) is commonly used to measure the degree of muscle invol...

A Two-stream Convolutional Network for Musculoskeletal and Neurological Disorders Prediction (2022)
Journal Article
Zhu, M., Men, Q., Ho, E. S., Leung, H., & Shum, H. P. (2022). A Two-stream Convolutional Network for Musculoskeletal and Neurological Disorders Prediction. Journal of Medical Systems, 46(11), Article 76. https://doi.org/10.1007/s10916-022-01857-5

Musculoskeletal and neurological disorders are the most common causes of walking problems among older people, and they often lead to diminished quality of life. Analyzing walking motion data manually requires trained professionals and the evaluations...

CP-AGCN: Pytorch-based Attention Informed Graph Convolutional Network for Identifying Infants at Risk of Cerebral Palsy (2022)
Journal Article
Zhang, H., Ho, E. S., & Shum, H. P. (2022). CP-AGCN: Pytorch-based Attention Informed Graph Convolutional Network for Identifying Infants at Risk of Cerebral Palsy. Software impacts, 14, Article 100419. https://doi.org/10.1016/j.simpa.2022.100419

Early prediction is clinically considered one of the essential parts of cerebral palsy (CP) treatment. We propose to implement a low-cost and interpretable classification system for supporting CP prediction based on General Movement Assessment (GMA)....

Interaction-aware Decision-making for Automated Vehicles using Social Value Orientation (2022)
Journal Article
Crosato, L., Shum, H. P., Ho, E. S., & Wei, C. (2023). Interaction-aware Decision-making for Automated Vehicles using Social Value Orientation. IEEE Transactions on Intelligent Vehicles, 8(2), 1339-1349. https://doi.org/10.1109/tiv.2022.3189836

Motion control algorithms in the presence of pedestrians are critical for the development of safe and reliable Autonomous Vehicles (AVs). Traditional motion control algorithms rely on manually designed decision-making policies which neglect the mutua...

Formation Control for UAVs Using a Flux Guided Approach (2022)
Journal Article
Hartley, J., Shum, H. P., Ho, E. S., Wang, H., & Ramamoorthy, S. (2022). Formation Control for UAVs Using a Flux Guided Approach. Expert Systems with Applications, 205, Article 117665. https://doi.org/10.1016/j.eswa.2022.117665

Existing studies on formation control for unmanned aerial vehicles (UAV) have not considered encircling targets where an optimum coverage of the target is required at all times. Such coverage plays a critical role in many real-world applications such...

RobIn: A Robust Interpretable Deep Network for Schizophrenia Diagnosis (2022)
Journal Article
Organisciak, D., Shum, H. P., Nwoye, E., & Woo, W. L. (2022). RobIn: A Robust Interpretable Deep Network for Schizophrenia Diagnosis. Expert Systems with Applications, 201, Article 117158. https://doi.org/10.1016/j.eswa.2022.117158

Schizophrenia is a severe mental health condition that requires a long and complicated diagnostic process. However, early diagnosis is vital to control symptoms. Deep learning has recently become a popular way to analyse and interpret medical data. P...

A Pose-based Feature Fusion and Classification Framework for the Early Prediction of Cerebral Palsy in Infants (2021)
Journal Article
McCay, K. D., Hu, P., Shum, H. P., Woo, W. L., Marcroft, C., Embleton, N. D., …Ho, E. S. (2022). A Pose-based Feature Fusion and Classification Framework for the Early Prediction of Cerebral Palsy in Infants. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 30, 8-19. https://doi.org/10.1109/tnsre.2021.3138185

The early diagnosis of cerebral palsy is an area which has recently seen significant multi-disciplinary research. Diagnostic tools such as the General Movements Assessment (GMA), have produced some very promising results. However, the prospect of aut...

PyTorch-based Implementation of Label-aware Graph Representation for Multi-class Trajectory Prediction (2021)
Journal Article
Men, Q., & Shum, H. P. (2022). PyTorch-based Implementation of Label-aware Graph Representation for Multi-class Trajectory Prediction. Software impacts, 11, Article 100201. https://doi.org/10.1016/j.simpa.2021.100201

Trajectory Prediction under diverse patterns has attracted increasing attention in multiple real-world applications ranging from urban traffic analysis to human motion understanding, among which graph convolution network (GCN) is frequently adopted w...

GAN-based Reactive Motion Synthesis with Class-aware Discriminators for Human-human Interaction (2021)
Journal Article
Men, Q., Shum, H. P., Ho, E. S., & Leung, H. (2022). GAN-based Reactive Motion Synthesis with Class-aware Discriminators for Human-human Interaction. Computers and Graphics, 102, 634-645. https://doi.org/10.1016/j.cag.2021.09.014

Creating realistic characters that can react to the users’ or another character’s movement can benefit computer graphics, games and virtual reality hugely. However, synthesizing such reactive motions in human-human interactions is a challenging task...

Spoofing Detection on Hand Images Using Quality Assessment (2021)
Journal Article
Bera, A., Dey, R., Bhattacharjee, D., Nasipuri, M., & Shum, H. (2021). Spoofing Detection on Hand Images Using Quality Assessment. Multimedia Tools and Applications, 80(19), 28603-28626. https://doi.org/10.1007/s11042-021-10976-z

Recent research on biometrics focuses on achieving a high success rate of authentication and addressing the concern of various spoofing attacks. Although hand geometry recognition provides adequate security over unauthorized access, it is susceptible...