Proceedings Paper

Video action recognition based on improved 3D convolutional network and sparse representation classification

Paper Abstract

Typical convolutional neural networks fail to model actions at their full temporal extent. To address this problem, this study proposes a video action recognition algorithm based on an improved 3D Convolutional Network (iC3D) architecture combined with K-means keyframe extraction and sparse representation classification (SRC). During feature extraction, K-means keyframe extraction is applied to reduce the redundant information generated by consecutive video frames and to enlarge the temporal receptive field. Meanwhile, to improve noise immunity, sparse coding and its reconstruction errors are used for classification. The proposed method achieves 96.5% recognition accuracy on the standard video action classification dataset UCF101, outperforming competing methods. In addition, we built an in-the-wild test dataset to verify the generalization performance of the proposed model.
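
The abstract gives no implementation details, so the following Python sketch only illustrates the two ideas it names and should not be read as the authors' code. It assumes per-frame descriptors and clip-level feature vectors are already available as NumPy arrays. The function names `kmeans_keyframes` and `src_classify`, the evenly spaced centroid initialisation, the ISTA solver, and the parameters `lam` and `iters` are all hypothetical choices made for this illustration.

```python
import numpy as np


def kmeans_keyframes(frames, k, iters=20):
    """Return up to k keyframe indices: the frame nearest each K-means centroid.

    frames: (n_frames, n_features) array of per-frame descriptors (assumed input).
    """
    frames = np.asarray(frames, dtype=float)
    n = len(frames)
    # Assumption: deterministic start, centroids placed at evenly spaced frames.
    centroids = frames[np.linspace(0, n - 1, k).astype(int)].copy()
    for _ in range(iters):
        # Squared distance from every frame to every centroid, shape (n, k).
        d2 = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = frames[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    d2 = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # One representative per cluster, in temporal order (duplicates collapse).
    return sorted({int(i) for i in d2.argmin(axis=0)})


def src_classify(D, labels, y, lam=0.01, iters=500):
    """Classify y by class-wise reconstruction error of its sparse code.

    D: (n_features, n_atoms) dictionary whose columns are training features.
    labels: (n_atoms,) class label of each column.
    The sparse code minimises 0.5*||y - Dx||^2 + lam*||x||_1 using ISTA,
    which is an assumed solver, not one named in the abstract.
    """
    D = np.asarray(D, dtype=float)
    labels = np.asarray(labels)
    y = np.asarray(y, dtype=float)
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(iters):
        z = x - step * (D.T @ (D @ x - y))  # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    classes = np.unique(labels)
    # Residual when y is rebuilt from the coefficients of one class only.
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c]) for c in classes
    ]
    return classes[int(np.argmin(residuals))], x
```

In a pipeline of this shape, `kmeans_keyframes` would select the frames passed to the 3D network, and `src_classify` would replace a softmax layer by assigning the test feature to the class whose training atoms reconstruct it with the smallest error.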

Paper Details

Date Published: 27 November 2019
PDF: 6 pages
Proc. SPIE 11321, 2019 International Conference on Image and Video Processing, and Artificial Intelligence, 1132115 (27 November 2019); doi: 10.1117/12.2542195
Wang Liu, Shanghai Univ. (China)
Qi Fu, Shanghai Univ. (China)
Yuqiu Lu, Shanghai Univ. (China)
Jinyu Sun, Shanghai Univ. (China)
Shiwei Ma, Shanghai Univ. (China)

Published in SPIE Proceedings Vol. 11321:
2019 International Conference on Image and Video Processing, and Artificial Intelligence
Ruidan Su, Editor(s)

© SPIE.