TK260 : Offline Farsi Handwritten Word Recognition Using Hidden Markov Model
Thesis > Central Library of Shahrood University > Electrical Engineering > MSc > 2013
Authors:
Zahra Emani [Author], Alireza Ahmadifard [Supervisor], Hossein Khosravi [Advisor]
Abstract: This thesis addresses the recognition of offline Farsi handwritten words. Three methods are proposed. In the first method, the word image is divided into overlapping vertical strips, and the chain code of the word boundary within each strip is extracted as a feature vector. The feature vectors are quantized with a Self Organizing Map (SOM), and the resulting code sequences are used to train a discrete Hidden Markov Model (HMM) for each word in the database. In the second method, two types of features are extracted from each word in both the training and evaluation phases: the chain code of the boundary and the distribution of foreground density across the word image. In the third method, the confidence measure that the HMM provides for each test sample is used to improve classification: for test samples (feature vectors of input words) whose confidence measure falls below a predefined threshold, a k-nearest neighbor classifier is applied in tandem to decide which class the sample belongs to. The methods were evaluated on 198 word classes from FARSA, a newly prepared database. With a codebook size of 49 and a smoothing factor of 0.001, the first method achieved a recognition rate of 66.57%. With the same parameter values, the second method improved on the first by 2%. For the third method, the confidence threshold was set experimentally to 5, and the system was evaluated with the same parameters and several values of k. The best k-nearest neighbor result, 61.69%, was obtained with city-block distance and k equal to 11, giving a total recognition rate of 76.49%, reported as a 7% improvement over the first method.
These results indicate that the proposed systems perform considerably better than existing methods.
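The tandem decision rule of the third method can be sketched in Python as follows. This is an illustrative sketch only, not the thesis implementation. The abstract does not define the confidence measure, so the sketch assumes it is the gap between the best and second-best HMM log-likelihoods. The function names (`city_block`, `knn_predict`, `tandem_classify`) and the input layout are hypothetical. Only the threshold of 5, the city-block distance, and k = 11 come from the abstract.

```python
from collections import Counter


def city_block(a, b):
    """City-block (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(query, train_vectors, train_labels, k=11):
    """Majority vote among the k training vectors closest to `query`."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: city_block(query, train_vectors[i]))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def tandem_classify(hmm_scores, query, train_vectors, train_labels,
                    threshold=5.0, k=11):
    """Return (label, source), where source is "hmm" or "knn".

    hmm_scores maps each word class to the log-likelihood of the test
    word under that class's HMM (hypothetical input layout).
    ASSUMPTION: confidence = best score minus runner-up score; the
    thesis may define its confidence measure differently.
    """
    ranked = sorted(hmm_scores.items(), key=lambda kv: kv[1], reverse=True)
    best_word, best_score = ranked[0]
    if len(ranked) > 1:
        confidence = best_score - ranked[1][1]
    else:
        confidence = float("inf")  # a single class has no competitor

    # Keep the HMM decision when it is confident enough; otherwise
    # fall back to the k-nearest neighbor classifier.
    if confidence >= threshold:
        return best_word, "hmm"
    return knn_predict(query, train_vectors, train_labels, k), "knn"
```

In this sketch, a test word whose two best HMM scores differ by less than the threshold is handed to the k-nearest neighbor classifier, which votes over the stored training feature vectors. All other words keep the HMM label.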
Keywords:
#Handwritten Farsi word recognition #FARSA database #chain-code histogram #Self Organizing Map #Hidden Markov Models #Confidence measure #K-Nearest Neighbors
Keeping place: Central Library of Shahrood University