TK408 : Time-frequency feature extraction for visual recognition of Persian vowels
Thesis > Central Library of Shahrood University > Electrical Engineering > MSc > 2013
Authors:
Abstract: Visual features have been widely used to improve the performance of speech
recognition. In this thesis, time-frequency features are extracted from images of the
speaker's mouth and used as input parameters to a neural network for recognition.
Because the input is video, each utterance yields a different number of frames. The
frames were first separated manually, the area around the mouth was then selected in
each frame, and the desired features were computed for that area. To improve
performance and reduce the dimensionality of the features, we applied the LSDA
dimensionality reduction technique. The database consists of different individuals,
each of whom uttered monosyllabic words 2 or 3 times. A vowel recognition rate of
95.75% was achieved.
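
As a rough illustration of how clips with differing frame counts can be mapped to a fixed-length time-frequency feature vector, here is a minimal sketch. It is not the thesis's method: the function names, the temporal resampling to 16 frames, the 4x4 block grid, and the use of FFT magnitudes along the time axis are all assumptions made for this example. The LSDA reduction and the neural network classifier are not shown.

```python
import numpy as np


def resample_frames(frames, target_len):
    """Linearly interpolate a (T, H, W) mouth-region clip to target_len frames."""
    frames = np.asarray(frames, dtype=float)
    t_src = np.linspace(0.0, 1.0, frames.shape[0])
    t_dst = np.linspace(0.0, 1.0, target_len)
    flat = frames.reshape(frames.shape[0], -1)
    out = np.empty((target_len, flat.shape[1]))
    for j in range(flat.shape[1]):
        # Interpolate each pixel's intensity trajectory over time.
        out[:, j] = np.interp(t_dst, t_src, flat[:, j])
    return out.reshape((target_len,) + frames.shape[1:])


def time_frequency_features(frames, target_len=16, grid=(4, 4)):
    """Hypothetical feature: FFT magnitudes of block-mean intensities over time."""
    clip = resample_frames(frames, target_len)
    n_frames, height, width = clip.shape
    gh, gw = grid
    # Crop so the region divides evenly into the block grid.
    clip = clip[:, : height - height % gh, : width - width % gw]
    blocks = clip.reshape(
        n_frames, gh, clip.shape[1] // gh, gw, clip.shape[2] // gw
    ).mean(axis=(2, 4))
    # Magnitude spectrum along the time axis, one spectrum per block.
    spectrum = np.abs(np.fft.rfft(blocks, axis=0))
    return spectrum.reshape(-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    short_clip = rng.random((7, 20, 24))   # 7 frames of a 20x24 mouth region
    long_clip = rng.random((12, 20, 24))   # 12 frames of the same region size
    print(time_frequency_features(short_clip).shape)
    print(time_frequency_features(long_clip).shape)
```

Both clips produce vectors of the same length, so they could feed a dimensionality reduction step and a classifier with a fixed input size.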
Keywords:
#Lip reading #Vowel recognition #Time-frequency features #Feature dimension reduction #Neural networks
Keeping place: Central Library of Shahrood University
Visitor: