Proceedings Paper

Speech enhancement based on spectrogram conditional generative adversarial networks
Author(s): Ru Han; Jianming Liu; Mingwen Wang

Paper Abstract

Voice is the main means of communicating and sharing information with others, and it brings great convenience to human life. Existing speech recognition classifiers suffer considerable performance degradation in the presence of environmental noise and accents. Most of these problems can be mitigated by training on large amounts of data. However, collecting large, high-quality datasets in real life is time-consuming and expensive. To solve this problem, this paper proposes a data enhancement method that is suitable for extending small-sample speech image datasets. S-GAN is used to generate datasets that conform to the real distribution of the samples, and the GAN-train and GAN-test methods are used to evaluate the quality and diversity of the generated images. Meanwhile, a spatial transformer network (STN) and a CNN framework are combined to extract the informative part of the data for classification. The results show that this method can significantly improve the classification accuracy of speech recognition and lays a foundation for small-sample data enhancement.
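
The abstract does not define GAN-train and GAN-test. The sketch below assumes their usual meaning: GAN-train trains a classifier on generated samples and tests it on real ones, while GAN-test trains on real samples and tests on generated ones. The nearest-centroid classifier, the toy 2-D features standing in for spectrogram images, and all function names are illustrative assumptions, not the authors' implementation.

```python
import random

def fit_centroids(samples, labels):
    """Return {label: mean feature vector} for a nearest-centroid classifier."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

def accuracy(centroids, samples, labels):
    hits = sum(predict(centroids, x) == y for x, y in zip(samples, labels))
    return hits / len(labels)

def gan_train_score(gen_x, gen_y, real_x, real_y):
    # Assumed definition: train on generated data, evaluate on real data.
    return accuracy(fit_centroids(gen_x, gen_y), real_x, real_y)

def gan_test_score(real_x, real_y, gen_x, gen_y):
    # Assumed definition: train on real data, evaluate on generated data.
    return accuracy(fit_centroids(real_x, real_y), gen_x, gen_y)

def toy_data(rng, n_per_class, spread):
    """Hypothetical stand-in for features: two well-separated 2-D classes."""
    centers = {0: (0.0, 0.0), 1: (5.0, 5.0)}
    xs, ys = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            xs.append([rng.gauss(cx, spread), rng.gauss(cy, spread)])
            ys.append(label)
    return xs, ys

rng = random.Random(0)
real_x, real_y = toy_data(rng, 50, 0.5)   # plays the role of real samples
gen_x, gen_y = toy_data(rng, 50, 0.5)     # plays the role of generator output
train_score = gan_train_score(gen_x, gen_y, real_x, real_y)
test_score = gan_test_score(real_x, real_y, gen_x, gen_y)
print(f"GAN-train accuracy: {train_score:.2f}")
print(f"GAN-test accuracy:  {test_score:.2f}")
```

Under these assumptions, a GAN-train score close to the accuracy of a classifier trained on real data suggests the generated samples are both realistic and diverse, while a high GAN-test score alone mainly indicates that the generated samples look realistic to a classifier trained on real data.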

Paper Details

Date Published: 3 January 2020
PDF: 10 pages
Proc. SPIE 11373, Eleventh International Conference on Graphics and Image Processing (ICGIP 2019), 113732S (3 January 2020); doi: 10.1117/12.2557256
Author Affiliations
Ru Han, Jiangxi Normal Univ. (China)
Jianming Liu, Jiangxi Normal Univ. (China)
Mingwen Wang, Jiangxi Normal Univ. (China)

Published in SPIE Proceedings Vol. 11373:
Eleventh International Conference on Graphics and Image Processing (ICGIP 2019)
Zhigeng Pan; Xun Wang, Editor(s)

© SPIE.