TK366 : Time and frequency domain feature extraction from Persian speech signals to improve performance of a Voice Activity Detection (VAD) system in human robot interaction
Thesis > Central Library of Shahrood University > Electrical Engineering > MSc > 2014
Authors:
Fahimeh Jomhoori [Author], Hosein Marvi [Supervisor], Alireza Ahmadifard [Advisor]
Abstract: The use of social robots in human life has increased, and the main mode of Human Robot Interaction (HRI) is verbal communication. Social robots are equipped with microphones to receive speech signals for interaction with people. Because environmental noise is also captured while human speech is being recorded, a system is needed to detect the speech segments in the recorded signals. Therefore, the goal of this work is to design a Voice Activity Detection (VAD) system that detects speech segments in noisy environments and improves the performance of the speech processing system in a social robot. In this work, different features are proposed for extraction from speech for the VAD system. These features combine energy with other features such as Root Mel Frequency Cepstral Coefficients (RMFCC), Bark Frequency Cepstral Coefficients (BFCC), Perceptual Linear Prediction (PLP) and Revised Perceptual Linear Prediction (RPLP). Another proposed method is based on the Wigner Ville Distribution (WVD) as a time-frequency feature. It performs better than the other proposed methods. Therefore, to improve the performance of the cepstral-based methods, we combined them with the WVD. To evaluate the performance of these methods, we used the FarsDat database, a standard Persian database. To measure the robustness of the proposed feature extraction methods to noise, we added different kinds of noise at different Signal to Noise Ratio (SNR) levels. Experimental results show that some of the proposed feature extraction methods perform better than MFCC in different noisy environments.
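The noise-robustness evaluation and the energy feature mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (mix_at_snr, frame_log_energy, measured_snr_db), the frame length, and the hop size are hypothetical, chosen only for illustration, and the SNR is assumed to be defined over the whole utterance as the ratio of mean speech power to mean noise power.

```python
import math


def mean_power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB.

    Assumption: SNR is measured over the whole utterance, not per frame.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Choose the gain g so that p_speech / (g^2 * p_noise) = 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR in dB of `noisy`, given the clean `speech` it was built from."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(mean_power(speech) / mean_power(residual))


def frame_log_energy(signal, frame_len, hop):
    """Short-time log energy in dB, one value per complete frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # A small floor avoids log10(0) on silent frames.
        energies.append(10 * math.log10(mean_power(frame) + 1e-12))
    return energies
```

In such a setup, a clean utterance would be mixed with each noise type at several SNR levels, and features such as the frame log energy would then be computed on the noisy signal before being passed to the detector.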
Keywords:
#Voice Activity Detection #Wigner Ville Distribution #Root Mel Frequency Cepstral Coefficients #Bark Frequency Cepstral Coefficients #Perceptual Linear Prediction #Revised Perceptual Linear Prediction #energy #feature extraction #Human Robot Interaction #social robots
Keeping place: Central Library of Shahrood University