TK553 : Speech Separation of Two Speakers Based on Appropriate Time-Frequency Features
Thesis > Central Library of Shahrood University > Electrical Engineering > MSc > 2016
Authors:
Rana Dehghani [Author], Hosein Marvi[Supervisor]
Abstract: Machine speech separation is one of the most important problems in the field of communications, and it remains a challenge because of the weaknesses of current systems. The human auditory system, by contrast, separates speech remarkably well. This gap between machine and human performance makes an efficient machine speech separation system indispensable. For instance, one disadvantage of automatic speech recognizers is that their performance degrades in noisy environments. Such a system therefore has to be equipped with a proper speech separator, not only to improve its performance under different conditions but also to improve speech quality and reduce the cost of transmitting non-speech signals. Because accurate selection of time-frequency units has a significant impact on separation results, this thesis proposes a supervised method for selecting appropriate time-frequency units. In the proposed approach, each speaker's speech samples are first modeled by training a dictionary and obtaining the basis vectors associated with that speaker. After the speech of the two simultaneous speakers has been modeled, each speaker's speech is separated according to its frequency components. In addition, a post-processing step is applied to the output speech to reduce noise and remove irrelevant frequency components. The system's output results show that the proposed method performs considerably better than other feature-based methods.
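The pipeline described in the abstract (a per-speaker dictionary, then separation of the mixture by time-frequency unit) can be illustrated with a minimal sketch. The abstract does not name the dictionary-learning algorithm, so the sketch assumes non-negative matrix factorization (NMF) with multiplicative updates on magnitude spectrograms and a soft ratio mask. The function names `train_dictionary` and `separate`, the synthetic data, and all parameter values are hypothetical and are not taken from the thesis. The post-processing step is omitted.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def train_dictionary(V, n_atoms, n_iter=200, seed=0):
    """Learn a non-negative dictionary W such that V ~= W @ H.

    ASSUMPTION: NMF with Euclidean multiplicative updates; the thesis
    abstract does not specify which dictionary-learning method it uses.
    V is a non-negative (n_freq, n_frames) magnitude spectrogram.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_atoms)) + 1e-3
    H = rng.random((n_atoms, V.shape[1])) + 1e-3
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    # Scale each atom (column) to unit L2 norm.
    return W / (np.linalg.norm(W, axis=0, keepdims=True) + EPS)


def separate(V_mix, W1, W2, n_iter=200, seed=0):
    """Split a two-speaker mixture using fixed per-speaker dictionaries."""
    W = np.hstack([W1, W2])  # joint dictionary, kept fixed
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_mix.shape[1])) + 1e-3
    for _ in range(n_iter):  # update activations only
        H *= (W.T @ V_mix) / (W.T @ W @ H + EPS)
    k = W1.shape[1]
    S1, S2 = W1 @ H[:k], W2 @ H[k:]  # per-speaker reconstructions
    mask1 = S1 / (S1 + S2 + EPS)     # soft mask per time-frequency unit
    return mask1 * V_mix, (1.0 - mask1) * V_mix


# Synthetic demo (hypothetical data): speaker 1 occupies the lower
# frequency bins and speaker 2 the upper ones.
rng = np.random.default_rng(1)
n_freq, n_atoms = 16, 2
B1 = np.zeros((n_freq, n_atoms))
B1[:8] = rng.random((8, n_atoms))
B2 = np.zeros((n_freq, n_atoms))
B2[8:] = rng.random((8, n_atoms))

W1 = train_dictionary(B1 @ rng.random((n_atoms, 50)), n_atoms)
W2 = train_dictionary(B2 @ rng.random((n_atoms, 50)), n_atoms)

V1 = B1 @ rng.random((n_atoms, 20))
V2 = B2 @ rng.random((n_atoms, 20))
V_mix = V1 + V2  # magnitudes treated as additive, an approximation

est1, est2 = separate(V_mix, W1, W2)
```

In practice the inputs would be magnitude spectrograms of real recordings, and the masked magnitudes would be combined with the mixture phase and inverted back to waveforms. The soft mask is one possible way to assign time-frequency units to speakers; a binary mask is a common alternative.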
Keywords:
Computational auditory scene analysis (CASA), cochannel speech separation, supervised segregation
Keeping place: Central Library of Shahrood University