Department of Computer Science

From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos (2024)
Journal Article
Qiao, T., Li, R., Li, F. W. B., & Shum, H. P. H. (in press). From Category to Scenery: An End-to-End Framework for Multi-Person Human-Object Interaction Recognition in Videos. International Conference on Pattern Recognition.

Video-based Human-Object Interaction (HOI) recognition explores the intricate dynamics between humans and objects, which are essential for a comprehensive understanding of human behavior and intentions. While previous work has made significant stride...

Multi-Feature Fusion Enhanced Monocular Depth Estimation With Boundary Awareness (2024)
Journal Article
Song, C., Chen, Q., Li, F. W. B., Jiang, Z., Zheng, D., Shen, Y., & Yang, B. (2024). Multi-Feature Fusion Enhanced Monocular Depth Estimation With Boundary Awareness. Visual Computer.

Self-supervised monocular depth estimation has opened up exciting possibilities for practical applications, including scene understanding, object detection, and autonomous driving, without the need for expensive depth annotations. However, traditiona...

Color Theme Evaluation through User Preference Modeling (2024)
Journal Article
Yang, B., Wei, T., Li, F. W. B., Liang, X., Deng, Z., & Fang, Y. (2024). Color Theme Evaluation through User Preference Modeling. ACM Transactions on Applied Perception, https://doi.org/10.1145/3665329

Color composition (or color theme) is a key factor in determining how well a piece of artwork or graphical design is perceived by humans. Although a few color harmony models have been proposed, their results are often less satisfactory since they mostl...

Multi-Style Cartoonization: Leveraging Multiple Datasets With GANs (2024)
Journal Article
Cai, J., Li, F. W. B., Nan, F., & Yang, B. (2024). Multi-Style Cartoonization: Leveraging Multiple Datasets With GANs. Computer Animation and Virtual Worlds, 35(3), Article e2269. https://doi.org/10.1002/cav.2269

Scene cartoonization aims to convert photos into stylized cartoons. While GANs can generate high-quality images, previous methods focus on individual images or single styles, ignoring relationships between datasets. We propose a novel multi-style sce...

Laplacian Projection Based Global Physical Prior Smoke Reconstruction (2024)
Journal Article
Xiao, S., Tong, C., Zhang, Q., Cen, Y., Li, F. W. B., & Liang, X. (2024). Laplacian Projection Based Global Physical Prior Smoke Reconstruction. IEEE Transactions on Visualization and Computer Graphics, https://doi.org/10.1109/tvcg.2024.3358636

We present a novel framework for reconstructing fluid dynamics in real-life scenarios. Our approach leverages sparse view images and incorporates physical priors across long series of frames, resulting in reconstructed fluids with enhanced physical c...

Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising (2023)
Journal Article
Zhou, K., Shum, H. P. H., Li, F. W. B., & Liang, X. (2023). Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising. IEEE Transactions on Visualization and Computer Graphics, https://doi.org/10.1109/TVCG.2023.3337868

In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the inte...

A Differential Diffusion Theory for Participating Media (2023)
Journal Article
Cen, Y., Li, C., Li, F. W. B., Yang, B., & Liang, X. (2023). A Differential Diffusion Theory for Participating Media. Computer Graphics Forum, https://doi.org/10.1111/cgf.14956

We present a novel approach to differentiable rendering for participating media, addressing the challenge of computing scene parameter derivatives. While existing methods focus on derivative computation within volumetric path tracing, they fail to si...

An end-to-end dynamic point cloud geometry compression in latent space (2023)
Journal Article
Jiang, Z., Wang, G., Tam, G. K. L., Song, C., Yang, B., & Li, F. W. B. (2023). An end-to-end dynamic point cloud geometry compression in latent space. Displays, 80, Article 102528. https://doi.org/10.1016/j.displa.2023.102528

Dynamic point clouds are widely used for 3D data representation in various applications such as immersive and mixed reality, robotics and autonomous driving. However, their irregularity and large scale make efficient compression and...

WDFSR: Normalizing Flow based on Wavelet-Domain for Super-Resolution (2023)
Journal Article
Song, C., Li, S., Li, F. W. B., & Yang, B. (in press). WDFSR: Normalizing Flow based on Wavelet-Domain for Super-Resolution. Computational Visual Media.

We propose a normalizing flow based on the wavelet framework for super-resolution called WDFSR. It learns the conditional distribution mapping between low-resolution images in the RGB domain and high-resolution images in the wavelet domain to generat...

C2SPoint: A classification-to-saliency network for point cloud saliency detection (2023)
Journal Article
Jiang, Z., Ding, L., Tam, G. K. L., Song, C., Li, F. W. B., & Yang, B. (2023). C2SPoint: A classification-to-saliency network for point cloud saliency detection. Computers and Graphics, https://doi.org/10.1016/j.cag.2023.07.003

Point cloud saliency detection is an important technique that supports downstream tasks in 3D graphics and vision, such as 3D model simplification, compression, reconstruction and viewpoint selection. Existing approaches often rely on hand-crafted featur...

Outputs (47)