Multi-modal Dynamic Point Cloud Geometric Compression Based on Bidirectional Recurrent Scene Flow*

Nan, Fangzhe; Li, Frederick; Wang, Zhuoyue; Tam, Gary K. L.; Jiang, Zhaoyi; DongZheng, DongZheng; Yang, Bailin

doi:10.1109/icassp49660.2025.10888353

Authors

Fangzhe Nan

Dr Frederick Li frederick.li@durham.ac.uk
Associate Professor

Zhuoyue Wang

Gary K. L. Tam

Zhaoyi Jiang

DongZheng DongZheng

Bailin Yang

Abstract

Deep learning methods have recently shown significant promise in compressing the geometric features of point clouds. However, challenges arise when consecutive point clouds contain holes, resulting in incomplete information that complicates motion estimation. To our knowledge, most existing dynamic point cloud compression methods have largely overlooked this critical issue. Moreover, these methods typically employ a multi-scale single-pass approach for motion estimation, performing only one estimation at each scale. This limits accuracy and adversely impacts compression performance. To address these challenges, we propose a dynamic point cloud compression model called M2BR-DPCC (Multi-Modal Multi-Scale Bidirectional Recursion for Dynamic Point Cloud Compression). Our method introduces two key innovations. First, we integrate both point cloud and image data as inputs, leveraging a multi-modal feature representation completion (MFRepC) approach to align information across modalities. This addresses the issue of missing data in point clouds by using complementary information from images. Second, we implement a multi-scale bidirectional recursive (MSBR) motion estimation method. This module iteratively refines motion flows in both forward and backward directions, progressively enhancing point cloud features and improving motion estimation accuracy. Experimental results on widely used datasets, including MVUB and 8iVFB, demonstrate the effectiveness of our approach. Compared to existing methods, M2BR-DPCC achieves superior performance, with an average BD-rate improvement of 95.23% over V-PCC, 12.92% over D-DPCC, and 16.16% over patchDPCC. These results underscore the potential of leveraging multi-modal data and bidirectional refinement for dynamic point cloud compression.
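The BD-rate figures quoted in the abstract refer to the Bjøntegaard delta rate, a standard way of comparing two rate-distortion curves. The sketch below shows one common formulation (cubic fit of log-bitrate against quality, averaged over the shared quality range). It is an illustration only: the function name `bd_rate`, the argument names, and the sample numbers are hypothetical, and the record does not state the exact evaluation protocol or distortion metric the authors used.

```python
import numpy as np


def bd_rate(rate_anchor, quality_anchor, rate_test, quality_test):
    """Bjontegaard delta rate in percent (negative means the test codec saves bits).

    Assumption: each curve has at least four rate/quality points, and
    quality is a PSNR-like value in dB. This is a generic formulation,
    not necessarily the one used in the paper.
    """
    qa = np.asarray(quality_anchor, dtype=float)
    qt = np.asarray(quality_test, dtype=float)
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))

    # Fit log-rate as a cubic polynomial of quality for each codec.
    fit_a = np.polyfit(qa, log_ra, 3)
    fit_t = np.polyfit(qt, log_rt, 3)

    # Integrate only over the quality interval covered by both curves.
    lo = max(qa.min(), qt.min())
    hi = min(qa.max(), qt.max())
    int_a = np.polyint(fit_a)
    int_t = np.polyint(fit_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    # Average log-rate difference converted back to a percentage.
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0


# Hypothetical rate/quality points, not taken from the paper.
anchor_rate = [100.0, 200.0, 400.0, 800.0]
quality_db = [30.0, 33.0, 36.0, 39.0]
test_rate = [r / 2.0 for r in anchor_rate]  # half the bits at equal quality
print(bd_rate(anchor_rate, quality_db, test_rate, quality_db))
```

With these made-up inputs the test codec needs exactly half the bitrate at every quality level, so the result is approximately -50, that is, a 50% bitrate saving relative to the anchor.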

Citation

Nan, F., Li, F., Wang, Z., Tam, G. K. L., Jiang, Z., DongZheng, D., & Yang, B. (2025, April). Multi-modal Dynamic Point Cloud Geometric Compression Based on Bidirectional Recurrent Scene Flow*. Presented at ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India

Presentation Conference Type	Conference Paper (published)
Conference Name	ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Start Date	Apr 6, 2025
End Date	Apr 11, 2025
Acceptance Date	Jan 1, 2025
Online Publication Date	Mar 7, 2025
Publication Date	Mar 7, 2025
Deposit Date	Mar 27, 2025
Publicly Available Date	Mar 28, 2025
Publisher	Institute of Electrical and Electronics Engineers
Peer Reviewed	Peer Reviewed
Pages	1-5
Series ISSN	1520-6149
Book Title	ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI	https://doi.org/10.1109/icassp49660.2025.10888353
Public URL	https://durham-repository.worktribe.com/output/3742974

Files

Accepted Conference Paper (1.4 Mb)
PDF

Licence
http://creativecommons.org/licenses/by/4.0/

Copyright Statement
For the purpose of Open Access the author has applied a CC BY copyright licence to any Author Accepted Manuscript version arising from this submission.