TK282 : Document image restoration for images scanned from Persian books
Thesis > Central Library of Shahrood University > Electrical Engineering > MSc > 2013
Authors:
Abstract: Document images produced by a scanner or digital camera usually suffer from two main kinds of distortion: geometric and photometric. Both degrade the performance of OCR systems. In this thesis, we present a novel method that compensates for undesirable geometric distortions in order to improve OCR results. Our methodology is based on adaptive document image binarization using a modified Sauvola method, followed by a transformation that projects curved text lines onto a 2-D rectangular area, combined with text line detection. We propose four methods, suitable for English and Farsi documents. To evaluate the proposed methods, we used two OCR software packages, Persian Reader and OmniPage. Experimental results on several document images indicate the effectiveness of the proposed methods.
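For orientation, the binarization step can be illustrated with a minimal sketch of the standard Sauvola threshold, T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation in a window around each pixel. This is not the modified variant used in the thesis, whose details are not given in this record. The function name, the window size, and the values k = 0.2 and R = 128 are assumptions chosen for illustration only.

```python
from math import sqrt


def sauvola_binarize(image, window=15, k=0.2, R=128.0):
    """Binarize a grayscale image with the standard Sauvola threshold.

    image is a list of rows of intensities in 0..255 (hypothetical input
    format for this sketch). Returns a list of rows with 0 for text
    (dark) pixels and 255 for background pixels.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Local neighborhood around (x, y), clipped at the image borders.
            values = [
                image[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            std = sqrt(variance)
            # Standard Sauvola threshold: lower in flat regions, closer to
            # the local mean in high-contrast (text) regions.
            threshold = mean * (1.0 + k * (std / R - 1.0))
            row.append(255 if image[y][x] > threshold else 0)
        result.append(row)
    return result


# Tiny example: a bright 5x5 page with one dark pixel in the middle.
page = [[200] * 5 for _ in range(5)]
page[2][2] = 20
binary = sauvola_binarize(page, window=3)
```

Because the threshold adapts to local statistics, the dark center pixel is classified as text while the surrounding bright pixels stay background, even though no single global threshold was chosen. The direct neighborhood loop is kept for readability; a practical implementation would normally use integral images to compute the local mean and variance.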
Keywords:
#Document image processing #document image rectification #image dewarping #text line detection #photometric distortion #geometric distortion #dataset #OCR.
Keeping place: Central Library of Shahrood University
Visitor: