Fg-T2M++: LLMs-Augmented Fine-Grained Text Driven Human Motion Generation
Wang, Yin; Li, Mu; Liu, Jiapeng; Leng, Zhiying; Li, Frederick W. B.; Zhang, Ziyao; Liang, Xiaohui
Authors
Yin Wang
Mu Li
Jiapeng Liu
Zhiying Leng
Dr Frederick Li (Associate Professor) frederick.li@durham.ac.uk
Ziyao Zhang
Xiaohui Liang
Abstract
We address the challenging problem of fine-grained text-driven human motion generation. Existing works generate imprecise motions that fail to capture the relationships specified in text, for two reasons: (1) they lack effective text parsing for detailed semantic cues about body parts, and (2) they do not fully model the linguistic structures between words, so their understanding of the text is incomplete. To tackle these limitations, we propose Fg-T2M++, a novel fine-grained framework that consists of: (1) an LLM-based semantic parsing module that extracts body part descriptions and semantics from text, (2) a hyperbolic text representation module that encodes relational information between text units by embedding the syntactic dependency graph into hyperbolic space, and (3) a multi-modal fusion module that hierarchically fuses text and motion features. Extensive experiments on the HumanML3D and KIT-ML datasets demonstrate that Fg-T2M++ outperforms state-of-the-art (SOTA) methods, validating its ability to generate motions that adhere to comprehensive text semantics.
Citation
Wang, Y., Li, M., Liu, J., Leng, Z., Li, F. W. B., Zhang, Z., & Liang, X. (online). Fg-T2M++: LLMs-Augmented Fine-Grained Text Driven Human Motion Generation. International Journal of Computer Vision, https://doi.org/10.1007/s11263-025-02392-9
Journal Article Type | Article |
---|---|
Acceptance Date | Feb 8, 2025 |
Online Publication Date | Feb 27, 2025 |
Deposit Date | Mar 27, 2025 |
Publicly Available Date | Mar 27, 2025 |
Journal | International Journal of Computer Vision |
Print ISSN | 0920-5691 |
Electronic ISSN | 1573-1405 |
Publisher | Springer |
Peer Reviewed | Peer Reviewed |
DOI | https://doi.org/10.1007/s11263-025-02392-9 |
Public URL | https://durham-repository.worktribe.com/output/3742944 |
Files
Accepted Journal Article (PDF, 9.4 Mb)