Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model

Wang, Yin; Leng, Zhiying; Li, Frederick W. B.; Wu, Shun-Cheng; Liang, Xiaohui

Authors

Yin Wang
Zhiying Leng
Dr Frederick Li frederick.li@durham.ac.uk (Associate Professor)
Shun-Cheng Wu
Xiaohui Liang
Abstract
Text-driven human motion generation in computer vision is both significant and challenging. However, current methods are limited to producing either deterministic or imprecise motion sequences, failing to effectively control the temporal and spatial relationships required to conform to a given text description. In this work, we propose a fine-grained method for generating high-quality, conditional human motion sequences that conform to precise text descriptions. Our approach consists of two key components: 1) a linguistics-structure assisted module that constructs accurate and complete language features to fully utilize the text information; and 2) a context-aware progressive reasoning module that learns neighborhood-level and overall semantic linguistic features from shallow and deep graph neural networks to achieve multi-step inference. Experiments show that our approach outperforms existing text-driven motion generation methods on the HumanML3D and KIT test sets and generates motions that visually conform better to the given text conditions.
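For readers who want a concrete picture of the kind of pipeline the abstract describes, below is a minimal, hypothetical PyTorch sketch (not the authors' released code): per-token language features are refined by stacked graph convolutions over a sentence graph, with residual connections keeping shallow neighborhood cues alongside deeper sentence-level context, and the pooled result conditions a diffusion denoiser that predicts the noise added to a motion sequence. All names and sizes here (GraphConv, TextConditionedDenoiser, motion_dim=263, the toy noise schedule) are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch, loosely mirroring the "shallow vs. deep GNN"
# text refinement and diffusion denoising described in the abstract.
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One graph convolution: average neighbour token features via a
    normalised adjacency matrix, then apply a linear projection."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, adj):
        # x: (B, T, D) token features, adj: (B, T, T) sentence graph
        deg = adj.sum(-1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.proj((adj / deg) @ x))

class TextConditionedDenoiser(nn.Module):
    """Predicts the noise added to a motion sequence, conditioned on
    graph-refined text features and the diffusion timestep."""
    def __init__(self, motion_dim=263, text_dim=256, n_gnn_layers=4):
        super().__init__()
        self.gnn = nn.ModuleList([GraphConv(text_dim) for _ in range(n_gnn_layers)])
        self.time_embed = nn.Sequential(nn.Linear(1, text_dim), nn.SiLU())
        self.denoise = nn.Sequential(
            nn.Linear(motion_dim + text_dim, 512), nn.SiLU(),
            nn.Linear(512, motion_dim),
        )

    def forward(self, noisy_motion, t, text_feats, adj):
        # Progressive refinement: early (shallow) layers mix local
        # neighbourhoods, later (deep) layers capture sentence-level context.
        h = text_feats
        for layer in self.gnn:
            h = layer(h, adj) + h                    # residual keeps shallow cues
        cond = h.mean(dim=1) + self.time_embed(t[:, None].float())
        cond = cond[:, None, :].expand(-1, noisy_motion.size(1), -1)
        return self.denoise(torch.cat([noisy_motion, cond], dim=-1))

# Toy usage: one denoising training step on random data.
B, F, T = 2, 60, 8                        # batch, motion frames, text tokens
model = TextConditionedDenoiser()
motion = torch.randn(B, F, 263)           # clean motion features
text = torch.randn(B, T, 256)             # per-token language features
adj = torch.eye(T).expand(B, T, T)        # placeholder sentence graph
t = torch.randint(0, 1000, (B,))
noise = torch.randn_like(motion)
alpha = 0.5                               # stand-in for the noise schedule
noisy = alpha**0.5 * motion + (1 - alpha)**0.5 * noise
loss = nn.functional.mse_loss(model(noisy, t, text, adj), noise)
loss.backward()
```

The residual stacking is one plausible way to expose both "neighborhood" and "overall" linguistic features to the denoiser; the actual Fg-T2M modules may differ substantially.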
Citation
Wang, Y., Leng, Z., Li, F. W. B., Wu, S., & Liang, X. (2023). Fg-T2M: Fine-Grained Text-Driven Human Motion Generation via Diffusion Model. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV). https://doi.org/10.1109/ICCV51070.2023.02014
| Field | Value |
| --- | --- |
| Presentation Conference Type | Conference Paper (Published) |
| Conference Name | 2023 IEEE/CVF International Conference on Computer Vision (ICCV) |
| Start Date | Oct 2, 2023 |
| End Date | Oct 6, 2023 |
| Acceptance Date | Aug 11, 2023 |
| Online Publication Date | Jan 15, 2024 |
| Publication Date | 2023 |
| Deposit Date | Sep 12, 2023 |
| Publicly Available Date | Dec 31, 2023 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Series ISSN | 1550-5499 |
| Book Title | 2023 IEEE/CVF International Conference on Computer Vision (ICCV) |
| ISBN | 9798350307191 |
| DOI | https://doi.org/10.1109/ICCV51070.2023.02014 |
| Public URL | https://durham-repository.worktribe.com/output/1735623 |
Files

Accepted Conference Paper (PDF, 4.3 MB)
You might also like
Advances in Web-Based Learning - ICWL 2015
(2015)
Book
Tackling Data Bias in Painting Classification with Style Transfer
(2023)
Presentation / Conference Contribution
Aesthetic Enhancement via Color Area and Location Awareness
(2022)
Presentation / Conference Contribution
STIT: Spatio-Temporal Interaction Transformers for Human-Object Interaction Recognition in Videos
(2022)
Presentation / Conference Contribution
STGAE: Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising
(2021)
Presentation / Conference Contribution