
Proceedings Paper

Speech enhancement with stacked frames and deep neural network for VoIP applications
Author(s): Jiantao Liu; Xiaoxiang Yang; Mingzhu Zhu; Bingwei He

Paper Abstract

Speech enhancement is a critical part of many types of communication systems and automatic speech recognition (ASR) applications. In this study, we propose a speech enhancement method for real-time VoIP applications based on stacked frames and a deep neural network; a novel data preparation approach is also introduced. In contrast to many state-of-the-art learning-based methods, we focus on real-time implementation in VoIP applications. Experiments were conducted on speech degraded by noise types and SNR levels that were not seen during training of the deep neural network, and the method achieved a significant improvement in PESQ. An important traditional real-time speech enhancement method and the most recent state-of-the-art learning-based method were also tested and compared with the proposed method. The results show that the proposed method effectively improves speech intelligibility and greatly outperforms the traditional real-time minimum mean-square error (MMSE) algorithm and a real-time learning-based CNN method in PESQ. It also achieves PESQ comparable to the most recent state-of-the-art learning-based method while outperforming it in time complexity, which makes the method attractive for VoIP communication systems, where low communication latency is in high demand.
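The abstract does not give the stacking configuration, so the following is only a minimal sketch of what frame stacking commonly looks like as input preparation for a frame-wise neural network. The function name `stack_frames`, the context sizes `left` and `right`, and the edge-padding choice are all assumptions for illustration, not the authors' settings. Using `right=0` keeps the window causal, which is one way to avoid look-ahead latency in a real-time setting.

```python
import numpy as np

def stack_frames(features, left=2, right=0):
    """Concatenate each frame with its neighbours (hypothetical helper).

    features: array of shape (num_frames, dim), e.g. per-frame spectral features.
    Returns an array of shape (num_frames, (left + 1 + right) * dim).
    """
    num_frames, dim = features.shape
    # Repeat the first/last frame so every frame has a full context window.
    padded = np.pad(features, ((left, right), (0, 0)), mode="edge")
    # Column block k holds the frame at offset (k - left) from the current frame.
    windows = [padded[k:k + num_frames] for k in range(left + 1 + right)]
    return np.concatenate(windows, axis=1)

# Toy input: 4 frames with 2 features each (values are placeholders).
feats = np.arange(8, dtype=np.float32).reshape(4, 2)
stacked = stack_frames(feats, left=2, right=0)
print(stacked.shape)
```

Each row of `stacked` would then serve as one input vector to the network, with the clean-speech target taken for the current (last) frame in the window.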

Paper Details

Date Published: 14 February 2019
PDF: 7 pages
Proc. SPIE 11048, 17th International Conference on Optical Communications and Networks (ICOCN2018), 1104808 (14 February 2019); doi: 10.1117/12.2518296
Author Affiliations
Jiantao Liu, Fuzhou Univ. (China)
Xiaoxiang Yang, Fuzhou Univ. (China); Quanzhou Normal Univ. (China)
Mingzhu Zhu, Fuzhou Univ. (China)
Bingwei He, Fuzhou Univ. (China)

Published in SPIE Proceedings Vol. 11048:
17th International Conference on Optical Communications and Networks (ICOCN2018)
Zhaohui Li, Editor(s)

© SPIE.