
Proceedings Paper

Multi-dimension feature fusion for action recognition
Author(s): Pei Dong; Jie Li; Junyu Dong; Lin Qi

Paper Abstract

Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. The challenge for action recognition is to capture and fuse the multi-dimensional information in video data. To account for these characteristics simultaneously, we present a novel method that fuses features from multiple dimensions, such as chromatic images, depth maps, and optical flow fields. We build our model on multi-stream deep convolutional networks with the help of temporal segment networks, and extract discriminative spatial and temporal features by fusing the ConvNet towers across dimensions, assigning different weights to the features so as to take full advantage of this multi-dimensional information. Our architecture is trained and evaluated on the currently largest and most challenging benchmark, the NTU RGB-D dataset. The experiments demonstrate that our method outperforms state-of-the-art approaches.
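The weighted fusion the abstract describes can be illustrated with a minimal late-fusion sketch: each ConvNet tower (RGB, depth, optical flow) produces per-class scores, and the streams are combined with per-stream weights. The stream names, scores, and weight values below are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# Hypothetical per-stream softmax scores for one video clip, one array per
# ConvNet tower (RGB, depth, optical flow). Values are made up for illustration.
rgb_scores   = np.array([0.10, 0.70, 0.20])
depth_scores = np.array([0.20, 0.50, 0.30])
flow_scores  = np.array([0.05, 0.85, 0.10])

def weighted_fusion(score_list, weights):
    """Late fusion: weighted average of per-stream class-score vectors."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return sum(w * s for w, s in zip(weights, score_list))

# Assumed stream weights; in the paper these would be tuned to favor the
# more discriminative dimensions.
fused = weighted_fusion([rgb_scores, depth_scores, flow_scores],
                        weights=[1.0, 0.5, 1.5])
predicted_class = int(np.argmax(fused))
```

Because each input vector is a probability distribution, the normalized weighted average is again a distribution, and `argmax` over the fused scores gives the final action label.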

Paper Details

Date Published: 10 April 2018
PDF: 8 pages
Proc. SPIE 10615, Ninth International Conference on Graphic and Image Processing (ICGIP 2017), 106151C (10 April 2018); doi: 10.1117/12.2302485
Author Affiliations:
Pei Dong, Ocean Univ. of China (China)
Jie Li, Ocean Univ. of China (China)
Junyu Dong, Ocean Univ. of China (China)
Lin Qi, Ocean Univ. of China (China)

Published in SPIE Proceedings Vol. 10615:
Ninth International Conference on Graphic and Image Processing (ICGIP 2017)
Hui Yu; Junyu Dong, Editor(s)

© SPIE.