Proceedings Paper

A Lite Asymmetric DenseNet for effective object detection based on convolutional neural networks (CNN)
Paper Abstract

Recently, convolutional neural networks (CNN) have been widely used in object detection and image recognition for their effectiveness. Many highly accurate CNN-based classification models have been developed for various machine learning applications, but they are generally computationally costly and require a hardware platform with substantial computing power and memory resources. To perform object detection accurately and efficiently with a CNN on a resource-limited system such as a mobile device, we propose a new DenseNet variant, a lightweight convolutional neural network called Lite Asymmetric DenseNet (LA-DenseNet). To compress model complexity, we replace the 7 x 7 convolution and 3 x 3 max-pool in the initial down-sampling stage with multiple 3 x 3 convolutions and a 2 x 2 max-pool, which significantly reduces the computing cost. In the design of the dense blocks, channel splitting and channel shuffling are employed to enhance the information exchange among feature maps and improve the expressive ability of the network. We also decompose each 3 x 3 convolution in the dense blocks into a combination of 3 x 1 and 1 x 3 convolutions; these asymmetric convolutions speed up computation and extract more spatial features. To evaluate the proposed approach, we build an experimental system in which LA-DenseNet extracts features and the Single Shot MultiBox Detector (SSD) detects objects. With VOC2007+12 as the training and testing datasets, our model achieves detection accuracy comparable to YOLOv2 at a fraction of its computational cost and memory usage.
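The asymmetric decomposition the abstract describes trades one 3 x 3 kernel for a 3 x 1 followed by a 1 x 3, cutting per-layer weights from 9 · C_in · C_out to 6 · C_in · C_out. A minimal sketch of that arithmetic (the channel width of 64 is illustrative, not taken from the paper):

```python
def conv_params(kh, kw, c_in, c_out):
    """Weight count of a single conv layer (bias terms omitted)."""
    return kh * kw * c_in * c_out

c = 64  # hypothetical channel width for illustration

# One standard 3 x 3 convolution: 9 * c * c weights
standard = conv_params(3, 3, c, c)

# Asymmetric pair, 3 x 1 then 1 x 3: 6 * c * c weights
asymmetric = conv_params(3, 1, c, c) + conv_params(1, 3, c, c)

print(standard)                     # 36864
print(asymmetric)                   # 24576
print(1 - asymmetric / standard)    # one third fewer weights per layer
```

The same 6/9 ratio applies at any channel width, since both terms scale with C_in · C_out; the actual accuracy/speed trade-off reported in the paper additionally depends on the channel splitting and shuffling within the dense blocks.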

Paper Details

Date Published: 18 November 2019
PDF: 10 pages
Proc. SPIE 11187, Optoelectronic Imaging and Multimedia Technology VI, 111871T (18 November 2019); doi: 10.1117/12.2538755
Author Affiliations:
Long Huang, Beijing Univ. of Technology (China)
Ministry of Education (China)
Beijing Lab. for Urban Mass Transit (China)
Kun Ren, Beijing Univ. of Technology (China)
Ministry of Education (China)
Beijing Lab. for Urban Mass Transit (China)
Chunqi Fan, Beijing Univ. of Technology (China)
Ministry of Education (China)
Beijing Lab. for Urban Mass Transit (China)
Hai Deng, Florida International Univ. (United States)

Published in SPIE Proceedings Vol. 11187:
Optoelectronic Imaging and Multimedia Technology VI
Qionghai Dai; Tsutomu Shimura; Zhenrong Zheng, Editor(s)

© SPIE