
Proceedings Paper

An image caption model incorporating high-level semantic features

Paper Abstract

Encoder-decoder frameworks have attracted great interest in image captioning. These frameworks focus on the extraction of low-level features and achieve good results, but performance can be further improved if high-level semantics are considered. In this work, we propose a new image captioning model that incorporates high-level semantic features through a revised Convolutional Neural Network (CNN). Both the low-level image features and the high-level semantic features are fed into Long Short-Term Memory networks (LSTMs) to generate natural-language sentence descriptions. Experiments on the Flickr8K and Flickr30K datasets show that our method outperforms most standard network baselines for image captioning.
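
The abstract does not say how the two feature types are combined before they reach the LSTM. As a minimal sketch, assuming simple concatenation of a low-level feature vector and a high-level semantic vector, the following pure-Python example runs one standard LSTM step on the fused input. The function names (`fuse_features`, `lstm_step`), the vector sizes, and the random stand-in features are all hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    # Small random weights; a trained model would learn these.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def fuse_features(low_level, semantic):
    # Assumption: fusion by concatenation. The paper may use another scheme.
    return list(low_level) + list(semantic)


def lstm_step(x, h, c, params):
    # One standard LSTM step. params maps a gate name to (W, b),
    # where W acts on the concatenation [x; h].
    z = x + h
    gates = {}
    for name in ("i", "f", "o", "g"):
        W, b = params[name]
        pre = [p + bias for p, bias in zip(matvec(W, z), b)]
        act = math.tanh if name == "g" else sigmoid
        gates[name] = [act(p) for p in pre]
    c_new = [f * c_old + i * g
             for f, c_old, i, g in zip(gates["f"], c, gates["i"], gates["g"])]
    h_new = [o * math.tanh(cn) for o, cn in zip(gates["o"], c_new)]
    return h_new, c_new


rng = random.Random(0)
low_level = [rng.random() for _ in range(8)]  # stand-in for CNN image features
semantic = [rng.random() for _ in range(4)]   # stand-in for semantic scores
hidden = 6                                    # hypothetical hidden size

x = fuse_features(low_level, semantic)
params = {
    name: (init_matrix(hidden, len(x) + hidden, rng), [0.0] * hidden)
    for name in "ifog"
}
h, c = [0.0] * hidden, [0.0] * hidden
h, c = lstm_step(x, h, c, params)
print(len(x), len(h))
```

In a full captioning model the hidden state `h` would be projected onto a vocabulary at each step to choose the next word; that decoding loop is omitted here.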

Paper Details

Date Published: 14 August 2019
PDF: 8 pages
Proc. SPIE 11179, Eleventh International Conference on Digital Image Processing (ICDIP 2019), 1117917 (14 August 2019); doi: 10.1117/12.2540579
Zhiwang Luo, Wuhan Univ. of Technology (China)
Jiwei Hu, Wuhan Univ. of Technology (China)
Quan Liu, Wuhan Univ. of Technology (China)
Jiamei Deng, Wuhan Univ. of Technology (China)


Published in SPIE Proceedings Vol. 11179:
Eleventh International Conference on Digital Image Processing (ICDIP 2019)
Jenq-Neng Hwang; Xudong Jiang, Editor(s)

© SPIE