Proceedings Paper

Action recognition by mid-level discriminative spatial-temporal volume
Author(s): Feifei Chen; Nong Sang

Paper Abstract

Most recent work on action recognition in video employs action parts, attributes, etc. as mid- and high-level features to represent an action. However, these action parts and attributes suffer from weak discrimination and are difficult to obtain. In this paper, we present an approach that uses a mid-level discriminative spatial-temporal volume to recognize human actions. Each spatial-temporal volume is represented by a feature graph, which is constructed on a local collection of feature points (e.g., cuboids, STIP) located in the corresponding spatial-temporal volume. First, we densely sample spatial-temporal volumes from training videos and construct a feature graph for each volume. Then, all feature graphs are clustered using a spectral clustering method. We regard the feature graphs as video words and characterize videos with the bag-of-features framework, which we call the bag-of-feature-graphs framework. During clustering, the distance between two feature graphs is computed using an efficient spectral method. Final recognition is accomplished using a linear SVM classifier. We test our algorithm on a publicly available human action dataset, and the experimental results show the effectiveness of our method.
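
The bag-of-feature-graphs step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the codebook of video words has already been obtained by clustering, and it takes the graph distance as a parameter. The names `nearest_word`, `bag_of_feature_graphs`, and `euclidean` are hypothetical, and the Euclidean distance on fixed-length tuples is only a stand-in for the spectral graph distance, whose details the abstract does not give.

```python
from typing import Callable, List, Sequence, TypeVar

G = TypeVar("G")  # placeholder type for a feature graph (representation is an assumption)


def nearest_word(graph: G, codebook: Sequence[G],
                 distance: Callable[[G, G], float]) -> int:
    """Return the index of the codebook entry closest to `graph`."""
    return min(range(len(codebook)), key=lambda k: distance(graph, codebook[k]))


def bag_of_feature_graphs(graphs: Sequence[G], codebook: Sequence[G],
                          distance: Callable[[G, G], float]) -> List[float]:
    """Build an L1-normalized histogram of video-word counts for one video."""
    counts = [0] * len(codebook)
    for g in graphs:
        counts[nearest_word(g, codebook, distance)] += 1
    total = sum(counts)
    if total == 0:
        # A video with no sampled volumes yields an all-zero histogram.
        return [0.0] * len(codebook)
    return [c / total for c in counts]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in distance (assumption): the paper uses a spectral graph distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Toy usage: each "feature graph" is a 2-D tuple purely for illustration.
codebook = [(0.0, 0.0), (10.0, 10.0)]
video_graphs = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0), (0.0, 0.0)]
histogram = bag_of_feature_graphs(video_graphs, codebook, euclidean)
```

A histogram of this kind, computed per video, is the sort of fixed-length vector that would then be passed to a linear SVM classifier for the final recognition step.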

Paper Details

Date Published: 27 October 2013
PDF: 6 pages
Proc. SPIE 8919, MIPPR 2013: Pattern Recognition and Computer Vision, 89190H (27 October 2013); doi: 10.1117/12.2031129
Author Affiliations
Feifei Chen, Huazhong Univ. of Science and Technology (China)
Nong Sang, Huazhong Univ. of Science and Technology (China)

Published in SPIE Proceedings Vol. 8919:
MIPPR 2013: Pattern Recognition and Computer Vision
Zhiguo Cao, Editor(s)

© SPIE.