Q22 : Persian Text Classification Using FarsNet Ontology
Thesis > Central Library of Shahrood University > Computer Engineering > MSc > 2012
Saba Sadat Madani [Author], Prof. Hamid Hassanpour [Supervisor], Morteza Zahedi [Advisor]
Abstract: Due to the rapid growth of electronic documents, the need for an efficient classifier in data mining is clear. Recently, in order to increase classification accuracy, the use of a lexical ontology as an external reference for extracting knowledge during text classification has been proposed. Hence, the aim of this project is to design and implement a system that classifies documents automatically and uses the FarsNet lexical ontology within its operations. As a result, the weights of words in the text that are associated with background knowledge are increased. The proposed approach uses the lexical ontology and focuses on a semantic feature vector, thereby improving the classification process. In this project, in addition to studying methods of using a lexical ontology in the text classification process, the FarsNet lexical ontology is used to extract semantic relationships. The proposed system covers all components of a text classification system, including word preprocessing, feature reduction, feature selection, feature weighting, and document classification. The χ^2 algorithm is applied as the feature selection method and normalized TFIDF as the weighting method. The results of our study showed a significant improvement in the efficiency and accuracy of classification algorithms when a lexical ontology is used.
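The abstract names χ^2 feature selection and normalized TFIDF weighting without giving formulas. The following is a minimal sketch of the textbook forms of both, not the thesis implementation: the function names `chi_square` and `tfidf_vectors`, the natural-log IDF, the L2 normalization, and the placeholder tokens are all assumptions made for illustration. The FarsNet-based semantic step is omitted because the record does not describe it in enough detail.

```python
import math
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square score of a term for a class from a 2x2 document count table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0  # degenerate table: the term or class never varies
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def tfidf_vectors(docs):
    """Return one L2-normalized TF-IDF weight dict per tokenized document.

    Assumes raw term frequency and idf = ln(N / df); other variants exist.
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        weights = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            weights = {t: w / norm for t, w in weights.items()}
        vectors.append(weights)
    return vectors


# Hypothetical toy corpus; the tokens are placeholders, not FarsNet entries.
docs = [["sport", "team", "goal"], ["economy", "market", "goal"]]
vectors = tfidf_vectors(docs)
score = chi_square(10, 0, 0, 10)  # term perfectly aligned with the class
```

In a pipeline of this kind, terms would typically be ranked by their χ^2 score per class and only the top-scoring ones kept before the TFIDF vectors are built.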
#Classification of Persian texts #FarsNet ontology #semantic feature vector #disambiguation #semantic relations #first concept method
Keeping place: Central Library of Shahrood University