TK408 : Time-frequency feature extraction for visual recognition of Persian vowels
Thesis > Central Library of Shahrood University > Electrical Engineering > MSc > 2013
Authors:
Nasrin Yadegar Khosravieh [Author], Hosein Marvi [Supervisor], Alireza Ahmadifard [Advisor]
Abstract: Visual features have been widely used to improve the performance of speech recognition. In this thesis, time-frequency features are extracted from images of the speaker's mouth and used as input parameters to a neural network for recognition. Because the input is video, we had to work with a varying number of video frames. The frames were first separated manually, the region around the mouth was then selected, and the desired features were computed for that region in each frame. To improve performance and reduce the feature dimensionality, we applied the LSDA dimensionality reduction technique, which reduced the size of the feature vectors. The database consists of recordings of different individuals, each uttering monosyllabic words two or three times. A vowel recognition rate of 95.75% was achieved.
Keywords:
#Lip reading #Vowel recognition #Time-frequency features #Feature dimension reduction #Neural networks
Keeping place: Central Library of Shahrood University