Q180 : Improving Sentence Similarity Measures Using Statistical Approaches and WordNet-based Metrics
Thesis > Central Library of Shahrood University > Computer Engineering > MSc > 2021
Authors:
Reza Javadzadeh [Author], Morteza Zahedi [Supervisor], [Advisor]
Abstract: A vast amount of information is published on the internet every day. Many books from different cultures have also been digitized and are now stored in digital media as natural-language text. It is therefore essential to have methods for extracting important or useful information from this massive amount of data. Computing sentence similarity, and improving the performance of systems that do so, is a delicate problem with great potential. Semantic search, extractive summarization, sentiment analysis, document classification, and plagiarism detection can all be considered special cases of similarity. Sentence similarity assesses how similar the two members of a sentence pair are. The topic of this research is similarity assessment for sentences, that is, short texts of less than two lines. In sentence pair similarity, two sentences are given to the system, and the system must produce an output between zero and one (zero means the pair shares no similarity, and one means the sentences are semantically equivalent). The purpose of this research is to design a system whose scores correlate better with the judgments of human annotators, and to study the key elements of state-of-the-art systems. In the proposed wavelet transform method, we first fine-tune the sentence encoder on the corresponding training data. Next, we decompose the "sentence description" using the wavelet transform. Finally, the cosine similarities of the corresponding channels of the two descriptors are used to compute sentence similarity. The results suggest a significant improvement over the RoBERTa-base method and all other base encoders. Most top-performing methods use a large encoder model; in comparison, our approach uses a base model, which means less training and test time.
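The channel-wise comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The abstract does not name the wavelet family, the number of decomposition levels, or how the per-channel cosine similarities are combined. The sketch therefore assumes a Haar wavelet, two levels, and a simple mean clipped to the range zero to one. The function names (`haar_channels`, `wavelet_similarity`) are hypothetical, and the input vectors are toy stand-ins for the output of a fine-tuned sentence encoder.

```python
import math


def haar_step(signal):
    """One Haar level: return (approximation, detail) at half the length."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    detail = [(a - b) / math.sqrt(2) for a, b in pairs]
    return approx, detail


def haar_channels(vector, levels):
    """Split a vector into `levels` detail channels plus one final approximation channel."""
    if len(vector) % (2 ** levels) != 0:
        raise ValueError("vector length must be divisible by 2**levels")
    channels = []
    current = list(vector)
    for _ in range(levels):
        current, detail = haar_step(current)
        channels.append(detail)
    channels.append(current)
    return channels


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def wavelet_similarity(emb_a, emb_b, levels=2):
    """Assumed combination rule: mean of per-channel cosines, clipped to [0, 1]."""
    sims = [
        cosine(ch_a, ch_b)
        for ch_a, ch_b in zip(haar_channels(emb_a, levels), haar_channels(emb_b, levels))
    ]
    mean_sim = sum(sims) / len(sims)
    return min(1.0, max(0.0, mean_sim))


# Toy "sentence descriptors"; a real system would take these from the encoder.
sentence_a = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0, 34.0]
sentence_b = [1.0, 2.5, 3.0, 4.0, 9.0, 12.0, 20.0, 30.0]
score = wavelet_similarity(sentence_a, sentence_b)
```

Under these assumptions, comparing a vector with itself gives a score of one up to floating-point rounding, and comparing it with its negation gives zero after clipping. A learned weighting of the channels, for example through a regression layer, would be an equally plausible reading of the abstract.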
Keywords:
#Text similarity #Word co-occurrence #perplexity #word pair similarity #regression
Keeping place: Central Library of Shahrood University