Federico Angelini
2D Pose-Based Real-Time Human Action Recognition With Occlusion-Handling
Angelini, Federico; Fu, Zeyu; Long, Yang; Shao, Ling; Naqvi, Syed Mohsen
Abstract
Human Action Recognition (HAR) for CCTV-oriented applications is still a challenging problem. Real-world scenarios HAR implementations is difficult because of the gap between Deep Learning data requirements and what the CCTV-based frameworks can offer in terms of data recording equipments. We propose to reduce this gap by exploiting human poses provided by the OpenPose, which has been already proven to be an effective detector in CCTV-like recordings for tracking applications. Therefore, in this work, we first propose ActionXPose: a novel 2D pose-based approach for pose-level HAR. ActionXPose extracts low- and high-level features from body poses which are provided to a Long Short-Term Memory Neural Network and a 1D Convolutional Neural Network for the classification. We also provide a new dataset, named ISLD, for realistic pose-level HAR in a CCTV-like environment, recorded in the Intelligent Sensing Lab. ActionXPose is extensively tested on ISLD under multiple experimental settings, e.g. Dataset Augmentation and Cross-Dataset setting, as well as revising other existing datasets for HAR. ActionXPose achieves state-of-the-art performance in terms of accuracy, very high robustness to occlusions and missing data, and promising results for practical implementation in real-world applications.
Citation
Angelini, F., Fu, Z., Long, Y., Shao, L., & Naqvi, S. M. (2020). 2D Pose-Based Real-Time Human Action Recognition With Occlusion-Handling. IEEE Transactions on Multimedia, 22(6), 1433-1446. https://doi.org/10.1109/tmm.2019.2944745
Journal Article Type | Article |
---|---|
Acceptance Date | Sep 25, 2019 |
Online Publication Date | Sep 30, 2019 |
Publication Date | 2020-06 |
Deposit Date | Jun 16, 2020 |
Publicly Available Date | Jun 16, 2020 |
Journal | IEEE Transactions on Multimedia |
Print ISSN | 1520-9210 |
Electronic ISSN | 1941-0077 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 22 |
Issue | 6 |
Pages | 1433-1446 |
DOI | https://doi.org/10.1109/tmm.2019.2944745 |
Public URL | https://durham-repository.worktribe.com/output/1299993 |
Files
Accepted Journal Article
(1.4 Mb)
PDF
Copyright Statement
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
You might also like
EfficientTDNN: Efficient Architecture Search for Speaker Recognition
(2022)
Journal Article
Kernelized distance learning for zero-shot recognition
(2021)
Journal Article
A plug-in attribute correction module for generalized zero-shot learning
(2020)
Journal Article
Semantic combined network for zero-shot scene parsing
(2019)
Journal Article
Downloadable Citations
About Durham Research Online (DRO)
Administrator e-mail: dro.admin@durham.ac.uk
This application uses the following open-source libraries:
SheetJS Community Edition
Apache License Version 2.0 (http://www.apache.org/licenses/)
PDF.js
Apache License Version 2.0 (http://www.apache.org/licenses/)
Font Awesome
SIL OFL 1.1 (http://scripts.sil.org/OFL)
MIT License (http://opensource.org/licenses/mit-license.html)
CC BY 3.0 ( http://creativecommons.org/licenses/by/3.0/)
Powered by Worktribe © 2024
Advanced Search