Q112 : Probabilistic Topic Modeling: Incorporating Spatial Context
Thesis > Central Library of Shahrood University > Computer Engineering > PhD > 2017
Authors:
Marziea Rahimi [Author], Morteza Zahedi [Supervisor], Hoda Mashayekhi [Advisor]
Abstract: Probabilistic topic models are well-established text-analysis tools for dealing with the high dimensionality of text data, and the topics they produce are more meaningful than single words. A limitation of many probabilistic topic models, which work based on document-level word co-occurrences, is their inability to use local context and spatial information. Some models capture local context by integrating language models with topic models; however, because they take the exact word order into account, such models suffer severely from sparseness. On the other hand, in many applications word order does not play a critical role. Our purpose is to introduce a model that benefits from local and spatial word relationships without amplifying the sparseness problem. To this end, each word is assumed to correspond to a window that covers some of its surrounding words, and this window is used to encode spatial context. Evaluations are performed on real data using perplexity, topic coherence, and clustering performance, and the results are compared against several baseline models. According to these comparisons, the proposed models outperform the baselines in many cases; in the best case, topic coherence improves by 28 percent.
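The windowing idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation; the function name, the symmetric window shape, and the half-width parameter are assumptions chosen for illustration. Each word position is paired with an unordered bag of its neighbouring words, matching the abstract's point that exact word order is ignored while spatial proximity is kept.

```python
# Illustrative sketch (not the thesis code): associate every word position
# with a symmetric window of surrounding words, used as its spatial context.

def context_windows(tokens, half_width=2):
    """Return, for each position, the bag of neighbouring words inside a
    window of the given half-width. Order within the bag carries no meaning,
    so no extra sparseness from exact word-order modeling is introduced."""
    windows = []
    for i in range(len(tokens)):
        lo = max(0, i - half_width)
        hi = min(len(tokens), i + half_width + 1)
        # Exclude the center word itself; keep only its neighbours.
        windows.append(tokens[lo:i] + tokens[i + 1:hi])
    return windows

doc = ["topic", "models", "capture", "word", "cooccurrence", "patterns"]
for word, ctx in zip(doc, context_windows(doc)):
    print(word, "->", ctx)
```

In a topic model built on this idea, the topic assignment of a word could be conditioned on the words in its window rather than only on the whole document, which is one way to read the abstract's "spatial context" without committing to a specific graphical model.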
Keywords:
#Probabilistic topic modeling #Text analysis #LDA #Graphical models #Gibbs sampling #Generative models #Local word relationships #Co-occurrence
Keeping place: Central Library of Shahrood University