Proceedings Paper • new

Video segmentation using keywords
Author(s): Vinh Ton-That; Chi-Tai Vong; Xuan-Truong Nguyen-Dao; Minh-Triet Tran

Paper Abstract

In the DAVIS-2016 Challenge, many state-of-the-art video segmentation methods achieve promising results, but they still depend heavily on annotated frames to distinguish foreground from background. Creating these frames accurately takes a lot of time and effort. In this paper, we introduce a method to segment objects from video based on keywords given by the user. First, we use a real-time object detection system, YOLOv2, to identify regions in the first frame that contain objects whose labels match the given keywords. Then, for each region identified in the previous step, we use the Pyramid Scene Parsing Network to classify each pixel as foreground or background. The resulting frames can be used as input frames for the Object Flow algorithm to perform segmentation on the entire video. We conduct experiments on a subset of the DAVIS-2016 dataset at half its original size, which show that our method can handle many popular classes in the PASCAL VOC 2012 dataset with acceptable accuracy, about 75.03%. We suggest broader testing and combination with other methods to improve this result in the future.
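
The first two steps described in the abstract (keeping only detections whose label matches a keyword, then labelling pixels inside each kept region) can be illustrated with a minimal sketch. This is not the authors' implementation: the detection format (a dict with `label` and `box`), the function names `select_regions` and `compose_mask`, and the stub `fill_box` are all hypothetical. In a real pipeline the detections would come from YOLOv2 and the per-region masks from the Pyramid Scene Parsing Network; propagation to the rest of the video with Object Flow is not shown.

```python
def select_regions(detections, keywords):
    """Keep detections whose label matches a user keyword (case-insensitive)."""
    wanted = {k.strip().lower() for k in keywords}
    return [d for d in detections if d["label"].strip().lower() in wanted]


def compose_mask(height, width, regions, segment_region):
    """Merge per-region foreground masks into one full-frame binary mask.

    segment_region is a stand-in for a pixel-wise segmenter: given a box
    (x0, y0, x1, y1) it returns a 2D list of booleans of the box's size.
    """
    mask = [[False] * width for _ in range(height)]
    for region in regions:
        x0, y0, x1, y1 = region["box"]
        # Clip the box to the frame so indexing stays in bounds.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        if x1 <= x0 or y1 <= y0:
            continue  # box lies entirely outside the frame
        local = segment_region((x0, y0, x1, y1))
        for dy in range(y1 - y0):
            for dx in range(x1 - x0):
                if local[dy][dx]:
                    mask[y0 + dy][x0 + dx] = True
    return mask


def fill_box(box):
    """Hypothetical segmenter stub: marks every pixel in the box as foreground."""
    x0, y0, x1, y1 = box
    return [[True] * (x1 - x0) for _ in range(y1 - y0)]


# Hypothetical detector output for the first frame of a 4x4 video.
detections = [
    {"label": "dog", "box": (1, 1, 3, 3)},
    {"label": "car", "box": (0, 0, 2, 2)},
]
regions = select_regions(detections, ["Dog"])
first_frame_mask = compose_mask(4, 4, regions, fill_box)
```

Here only the "dog" region survives the keyword filter, so the resulting mask marks just the pixels inside that box; such a mask would play the role of the annotated first frame that the propagation step takes as input.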

Paper Details

Date Published: 13 April 2018
PDF: 8 pages
Proc. SPIE 10696, Tenth International Conference on Machine Vision (ICMV 2017), 106960U (13 April 2018); doi: 10.1117/12.2310102
Vinh Ton-That, Univ. of Science, VNU-HCM (Viet Nam)
Chi-Tai Vong, Univ. of Science, VNU-HCM (Viet Nam)
Xuan-Truong Nguyen-Dao, Univ. of Science, VNU-HCM (Viet Nam)
Minh-Triet Tran, Univ. of Science, VNU-HCM (Viet Nam)

Published in SPIE Proceedings Vol. 10696:
Tenth International Conference on Machine Vision (ICMV 2017)
Antanas Verikas; Petia Radeva; Dmitry Nikolaev; Jianhong Zhou, Editor(s)
