
Proceedings Paper

Image caption generation method based on adaptive attention mechanism
Author(s): Huazhong Jin; Yu Wu; Fang Wan; Man Hu; Qingqing Li

Paper Abstract

An image caption generation model with an adaptive attention mechanism is proposed to address the weakness of caption models that rely only on local image features. Within an encoder-decoder framework, local and global image features are extracted at the encoder by the Inception V3 and VGG19 network models, respectively. Because the proposed adaptive attention mechanism automatically weighs the importance of local and global image information, the decoder can generate sentences that describe the image more intuitively and accurately. The model is trained and tested on the Microsoft COCO dataset. The experimental results show that, compared with an image caption model based only on local features, the proposed method extracts richer and more complete information from the image and generates more accurate sentences.
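The abstract describes a decoder step that adaptively mixes attended local features with a global image feature. A minimal sketch of one such step is below; the function names, projection matrices (`W_l`, `W_g`), and sigmoid gate `beta` are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of an adaptive attention step mixing local and global image
# features. Assumptions (not from the paper): dot-product attention
# scores, a sigmoid scalar gate, and linear projections W_l / W_g.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # shift for numerical stability
    return e / e.sum()

def adaptive_attention(local_feats, global_feat, hidden, W_l, W_g, w_beta):
    # local_feats: (k, d) region features (e.g. from Inception V3)
    # global_feat: (d,)   whole-image feature (e.g. from VGG19)
    # hidden:      (d,)   decoder hidden state at this time step
    scores = local_feats @ hidden          # (k,) relevance of each region
    alpha = softmax(scores)                # attention weights over regions
    context_local = alpha @ local_feats    # (d,) attended local context
    # scalar gate: how much to trust global vs. attended local information
    beta = 1.0 / (1.0 + np.exp(-(w_beta @ hidden)))
    context = beta * (W_g @ global_feat) + (1.0 - beta) * (W_l @ context_local)
    return context, alpha, beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k = 8, 5
    local = rng.normal(size=(k, d))
    g = rng.normal(size=d)
    h = rng.normal(size=d)
    W = np.eye(d)
    ctx, alpha, beta = adaptive_attention(local, g, h, W, W, rng.normal(size=d))
    print(ctx.shape, alpha.sum(), beta)
```

The gate `beta` lets the decoder lean on the global VGG19 feature for scene-level words and on the attended Inception V3 regions for object-level words, which is one plausible reading of "automatically weighs the importance of local and global image information."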

Paper Details

Date Published: 14 February 2020
PDF: 8 pages
Proc. SPIE 11430, MIPPR 2019: Pattern Recognition and Computer Vision, 114301C (14 February 2020); doi: 10.1117/12.2539338
Author Affiliations:
Huazhong Jin, Hubei Univ. of Technology (China)
Yu Wu, Hubei Univ. of Technology (China)
Fang Wan, Hubei Univ. of Technology (China)
Man Hu, Hubei Univ. of Technology (China)
Qingqing Li, Hubei Univ. of Technology (China)

Published in SPIE Proceedings Vol. 11430:
MIPPR 2019: Pattern Recognition and Computer Vision
Nong Sang; Jayaram K. Udupa; Yuehuan Wang; Zhenbing Liu, Editor(s)

© SPIE. Terms of Use