Q51 : Persian News Classification Using Artificial Intelligence
Thesis > Central Library of Shahrood University > Computer Engineering > MSc > 2014
Authors:
Zahra Robati [Author], Morteza Zahedi [Supervisor]
Abstract: With the rapid growth of electronic texts such as news articles, developing an efficient classifier has become an important problem in many text-related applications, including news websites. The central issues in text classification are efficient feature selection and feature extraction. Previous work has proposed and used various feature validation criteria. This thesis proposes a feature validation criterion called E-Dominance, which significantly reduces the number of selected features. The features used for categorization are co-occurrence features, which have not previously been used to classify Persian texts. In English text classification, studies that use co-occurrence features usually apply a binary weighting method. This thesis proposes a weighting method for co-occurrence features called Co-occur TFIDF. In this study, developer features are used to develop texts; therefore, the problem of class overlap is partially solved. Experiments show a significant improvement in the efficiency and accuracy of classification algorithms when the E-Dominance criterion and co-occurrence features with the Co-occur TFIDF weighting method are used.
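The abstract does not define Co-occur TFIDF, so the following is only an illustrative sketch of one plausible reading, not the thesis's actual method. It assumes that a co-occurrence feature is an unordered pair of words appearing close together in a document, and that the weight is the ordinary TF-IDF formula applied to pairs instead of single words. The function names, the `window` parameter, and the toy English documents are hypothetical.

```python
import math
from collections import Counter

def cooccurrence_features(tokens, window=3):
    """Count unordered word pairs; each token is paired with the next window-1 tokens."""
    pairs = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            if left != right:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs

def cooccur_tfidf(docs, window=3):
    """Assumed weighting: pair count in the document times log(N / document frequency)."""
    counts = [cooccurrence_features(doc, window) for doc in docs]
    # Document frequency: number of documents in which each pair occurs at least once.
    df = Counter(pair for c in counts for pair in c)
    n = len(docs)
    return [{pair: tf * math.log(n / df[pair]) for pair, tf in c.items()} for c in counts]

# Toy corpus (hypothetical); a real experiment would use tokenized Persian news texts.
docs = [
    "team wins football match".split(),
    "team loses football match".split(),
    "bank raises interest rate".split(),
]
weights = cooccur_tfidf(docs)
```

Under this assumed weighting, a pair shared by several documents, such as ("football", "team"), receives a lower weight than a pair unique to one document, such as ("team", "wins"), which is the behaviour a binary weighting cannot express.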
Keywords:
#Persian text classification #feature selection #feature extraction #E-Dominance criterion #co-occurrence features #developer features #Co-occur TFIDF weighting method
Keeping place: Central Library of Shahrood University