TK701 : Speaker diarization based on mixed feature extraction in a multi-speaker environment
Thesis > Central Library of Shahrood University > Electrical Engineering > MSc > 2019
Authors:
Mitra Jahanian [Author], Hosein Marvi [Supervisor], Seyed Masoud Mirrezaei [Advisor]
Abstract: Fast and accurate data processing is very important today. Because speech data is so widely used, processing it contributes to all aspects of human life. Speaker diarization is the task of determining who speaks when. The goal is to design a speaker recognition system that identifies speaker changes in audio files and correctly labels and clusters each speaker's speech; this process is now known as speaker diarization. In this thesis, the aim is to design a system that uses MFCC acoustic features and their first- and second-order derivatives, along with energy and zero-crossing rate. Silence and music are then modeled using frames that are unambiguously non-speech or music, and speech is separated from the audio file in two steps: silence removal and music removal. Next, i-vectors are used in the feature space to reduce the dimensionality of the system. Integer linear programming (ILP) is used to label and cluster the speakers, and the parameters of these models are tuned. The designed system was evaluated on the AMI corpus, and good results were obtained, reported in terms of diarization error rate (DER).
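As an illustration of the frame-level features named in the abstract, the following is a minimal sketch of short-time energy, zero-crossing rate, and a first-order derivative (delta) computed with NumPy. It is not the thesis implementation: the function names, the 25 ms frame and 10 ms hop at 16 kHz, and the central-difference delta are all assumptions chosen for illustration, and MFCC extraction is omitted.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive (assumption)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def delta(features):
    """First-order derivative over time via a central difference.

    Edge frames are handled by repeating the first and last rows.
    """
    padded = np.pad(features, ((1, 1), (0, 0)), mode="edge")
    return (padded[2:] - padded[:-2]) / 2.0

# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440 * t)

frames = frame_signal(signal, frame_len=400, hop=160)  # 25 ms / 10 ms
static = np.column_stack([short_time_energy(frames),
                          zero_crossing_rate(frames)])
d1 = delta(static)                    # first-order derivatives
d2 = delta(d1)                        # second-order derivatives
features = np.hstack([static, d1, d2])  # one feature vector per frame
```

In a system of the kind described, the MFCCs and their derivatives would be stacked alongside these columns; frames with very low energy are the natural candidates for the silence model.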
Keywords:
#speaker diarization #integer linear programming #i-vector #speech processing
Keeping place: Central Library of Shahrood University