Research Repository

All Outputs (73)

From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos (2024)
Presentation / Conference Contribution
Qiao, T., Li, R., Li, F. W. B., & Shum, H. P. H. (2024, December). From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos. Presented at the 2024 International Conference on Pattern Recognition, Kolkata, India

Video-based Human-Object Interaction (HOI) recognition explores the intricate dynamics between humans and objects, which are essential for a comprehensive understanding of human behavior and intentions. While previous work has made significant stride...

Multi-Feature Fusion Enhanced Monocular Depth Estimation With Boundary Awareness (2024)
Journal Article
Song, C., Chen, Q., Li, F. W. B., Jiang, Z., Zheng, D., Shen, Y., & Yang, B. (2024). Multi-Feature Fusion Enhanced Monocular Depth Estimation With Boundary Awareness. Visual Computer, 40, 4955–4967. https://doi.org/10.1007/s00371-024-03498-w

Self-supervised monocular depth estimation has opened up exciting possibilities for practical applications, including scene understanding, object detection, and autonomous driving, without the need for expensive depth annotations. However, traditiona...

Color Theme Evaluation through User Preference Modeling (2024)
Journal Article
Yang, B., Wei, T., Li, F. W. B., Liang, X., Deng, Z., & Fang, Y. (2024). Color Theme Evaluation through User Preference Modeling. ACM Transactions on Applied Perception, 21(3), 1-35. https://doi.org/10.1145/3665329

Color composition (or color theme) is a key factor in determining how well a piece of artwork or graphical design is perceived by humans. Although a few color harmony models have been proposed, their results are often less satisfactory since they mostl...

Multi-Style Cartoonization: Leveraging Multiple Datasets With GANs (2024)
Journal Article
Cai, J., Li, F. W. B., Nan, F., & Yang, B. (2024). Multi-Style Cartoonization: Leveraging Multiple Datasets With GANs. Computer Animation and Virtual Worlds, 35(3), Article e2269. https://doi.org/10.1002/cav.2269

Scene cartoonization aims to convert photos into stylized cartoons. While GANs can generate high-quality images, previous methods focus on individual images or single styles, ignoring relationships between datasets. We propose a novel multi-style sce...

Laplacian Projection Based Global Physical Prior Smoke Reconstruction (2024)
Journal Article
Xiao, S., Tong, C., Zhang, Q., Cen, Y., Li, F. W. B., & Liang, X. (2024). Laplacian Projection Based Global Physical Prior Smoke Reconstruction. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/tvcg.2024.3358636

We present a novel framework for reconstructing fluid dynamics in real-life scenarios. Our approach leverages sparse view images and incorporates physical priors across long series of frames, resulting in reconstructed fluids with enhanced physical c...

HSE: Hybrid Species Embedding for Deep Metric Learning (2024)
Presentation / Conference Contribution
Yang, B., Sun, H., Li, F. W. B., Chen, Z., Cai, J., & Song, C. (2024). HSE: Hybrid Species Embedding for Deep Metric Learning. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV). https://doi.org/10.1109/ICCV51070.2023.01014

Deep metric learning is crucial for finding an embedding function that can generalize to training and testing data, including unknown test classes. However, limited training samples restrict the model's generalization to downstream tasks. While addin...

Tackling Data Bias in Painting Classification with Style Transfer (2023)
Presentation / Conference Contribution
Vijendran, M., Li, F. W., & Shum, H. P. (2023). Tackling Data Bias in Painting Classification with Style Transfer. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP (250-261). https://doi.org/10.5220/0011776600003417

It is difficult to train classifiers on painting collections due to model bias from domain gaps and data bias from the uneven distribution of artistic styles. Previous techniques like data distillation, traditional data augmentation and style transf...

Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model (2023)
Presentation / Conference Contribution
Wang, Y., Leng, Z., Li, F. W. B., Wu, S., & Liang, X. (2023). Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV). https://doi.org/10.1109/ICCV51070.2023.02014

Text-driven human motion generation in computer vision is both significant and challenging. However, current methods are limited to producing either deterministic or imprecise motion sequences, failing to effectively control the temporal and spatial...

DrawGAN: Multi-view Generative Model Inspired By The Artist's Drawing Method (2023)
Presentation / Conference Contribution
Yang, B., Chen, Z., Li, F. W. B., Sun, H., & Cai, J. (2023, August). DrawGAN: Multi-view Generative Model Inspired By The Artist's Drawing Method. Presented at CGI 2023: Advances in Computer Graphics, Shanghai, China

We present a novel approach for modeling artists' drawing processes using an architecture that combines an unconditional generative adversarial network (GAN) with a multi-view generator and multi-discriminator. Our method excels in synthesizing vario...

A Mixed Reality Training System for Hand-Object Interaction in Simulated Microgravity Environments (2023)
Presentation / Conference Contribution
Zhou, K., Chen, C., Ma, Y., Leng, Z., Shum, H. P., Li, F. W., & Liang, X. (2023). A Mixed Reality Training System for Hand-Object Interaction in Simulated Microgravity Environments. In 2023 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). https://doi.org/10.1109/ISMAR59233.2023.00031

As human exploration of space continues to progress, the use of Mixed Reality (MR) for simulating microgravity environments and facilitating training in hand-object interaction holds immense practical significance. However, hand-object interaction in...

Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising (2023)
Journal Article
Zhou, K., Shum, H. P., Li, F. W., & Liang, X. (2023). Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising. IEEE Transactions on Visualization and Computer Graphics. https://doi.org/10.1109/TVCG.2023.3337868

In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the inte...

A Differential Diffusion Theory for Participating Media (2023)
Journal Article
Cen, Y., Li, C., Li, F. W. B., Yang, B., & Liang, X. (2023). A Differential Diffusion Theory for Participating Media. Computer Graphics Forum, 42(7), Article e14956. https://doi.org/10.1111/cgf.14956

We present a novel approach to differentiable rendering for participating media, addressing the challenge of computing scene parameter derivatives. While existing methods focus on derivative computation within volumetric path tracing, they fail to si...

An end-to-end dynamic point cloud geometry compression in latent space (2023)
Journal Article
Jiang, Z., Wang, G., Tam, G. K. L., Song, C., Yang, B., & Li, F. W. B. (2023). An end-to-end dynamic point cloud geometry compression in latent space. Displays, 80, Article 102528. https://doi.org/10.1016/j.displa.2023.102528

Dynamic point clouds are widely used for 3D data representation in various applications such as immersive and mixed reality, robotics and autonomous driving. However, their irregularity and large scale make efficient compression and...

WDFSR: Normalizing Flow based on Wavelet-Domain for Super-Resolution (2023)
Journal Article
Song, C., Li, S., Li, F. W. B., & Yang, B. (in press). WDFSR: Normalizing Flow based on Wavelet-Domain for Super-Resolution. Computational Visual Media.

We propose a normalizing flow based on the wavelet framework for super-resolution called WDFSR. It learns the conditional distribution mapping between low-resolution images in the RGB domain and high-resolution images in the wavelet domain to generat...

C2SPoint: A classification-to-saliency network for point cloud saliency detection (2023)
Journal Article
Jiang, Z., Ding, L., Tam, G., Song, C., Li, F. W., & Yang, B. (2023). C2SPoint: A classification-to-saliency network for point cloud saliency detection. Computers and Graphics, 115, 274-284. https://doi.org/10.1016/j.cag.2023.07.003

Point cloud saliency detection is an important technique that supports downstream tasks in 3D graphics and vision, like 3D model simplification, compression, reconstruction and viewpoint selection. Existing approaches often rely on hand-crafted featur...

Gamifying Experiential Learning Theory (2023)
Presentation / Conference Contribution
Alsaqqaf, A., & Li, F. W. (2022, December). Gamifying Experiential Learning Theory. Presented at International Conference On Web-Based Learning (ICWL 2022), Tenerife, Spain

IAACS: Image Aesthetic Assessment Through Color Composition And Space Formation (2023)
Journal Article
Yang, B., Zhu, C., Li, F. W., Wei, T., Liang, X., & Wang, Q. (2023). IAACS: Image Aesthetic Assessment Through Color Composition And Space Formation. Virtual Reality & Intelligent Hardware, 5(1). https://doi.org/10.1016/j.vrih.2022.06.006

Judging how visually appealing an image is is a complicated and subjective task. This strongly motivates having a machine learning model automatically evaluate image aesthetics by matching the aesthetic judgments of the general public. Although deep learning meth...

A Video-Based Augmented Reality System for Human-in-the-Loop Muscle Strength Assessment of Juvenile Dermatomyositis (2023)
Journal Article
Zhou, K., Cai, R., Ma, Y., Tan, Q., Wang, X., Li, J., …Liang, X. (2023). A Video-Based Augmented Reality System for Human-in-the-Loop Muscle Strength Assessment of Juvenile Dermatomyositis. IEEE Transactions on Visualization and Computer Graphics, 29(5), 2456-2466. https://doi.org/10.1109/tvcg.2023.3247092

As the most common idiopathic inflammatory myopathy in children, juvenile dermatomyositis (JDM) is characterized by skin rashes and muscle weakness. The childhood myositis assessment scale (CMAS) is commonly used to measure the degree of muscle invol...