Q66 : Text Compression Using Artificial Intelligence Techniques
Thesis > Central Library of Shahrood University > Computer Engineering > MSc > 2015
Authors:
Mahbubeh Soleymanian [Author], Ali Pouyan [Supervisor], Hoda Mashayekhi [Advisor]
Abstract: The growth of digital data in recent years has drawn increasing attention to text compression. Text is a type of information that is sent and received every day. The need to reduce data volume and save storage space has made compression critical. As the volume of non-English and non-Latin text grows, compression algorithms also need to be developed for other languages. This thesis presents a technique for compressing Persian text. The aim is to exploit the rules and techniques of language modeling, which well-known, common compression tools such as zip do not take into account. Using a statistical N-gram model, the technique examines the probability of sequences of words and characters following one another, taking into account the number of repetitions and the length parameters. To evaluate the models and select the best-performing one, the perplexity criterion is used, which is independent of the system and corresponds to the probability assigned to expressions. A comparison of the results shows a compression rate of 82% when language information is used, relative to the output of zip compression. The later stages of the thesis, which analyze the approach based on language modeling, describe the procedures and the results obtained.
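The abstract names an N-gram model and the perplexity criterion but gives no implementation details, so the following is only a minimal sketch of those two ideas and not the method used in the thesis. It assumes a character-level bigram model (N = 2) with add-one smoothing, and it assumes the evaluated text uses only symbols seen in training. The function names `train_bigram`, `bigram_prob`, and `perplexity` are hypothetical. Lower perplexity means the model assigns higher probability to the text, which in turn means an entropy coder driven by that model would need fewer bits per symbol.

```python
import math
from collections import Counter


def train_bigram(text):
    """Count bigram contexts and character bigrams in the training text."""
    contexts = Counter(text[:-1])            # how often each symbol is followed by another
    bigrams = Counter(zip(text, text[1:]))   # how often each adjacent pair occurs
    vocab = set(text)
    return contexts, bigrams, vocab


def bigram_prob(prev, cur, contexts, bigrams, vocab_size):
    """Add-one (Laplace) smoothed estimate of P(cur | prev). Smoothing is an assumption here."""
    return (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)


def perplexity(text, contexts, bigrams, vocab_size):
    """Perplexity = 2 ** (average negative log2 probability per predicted symbol)."""
    pairs = list(zip(text, text[1:]))
    log_sum = sum(
        math.log2(bigram_prob(p, c, contexts, bigrams, vocab_size)) for p, c in pairs
    )
    return 2 ** (-log_sum / len(pairs))


# Toy usage: a string that follows the training pattern scores a lower
# perplexity than one that does not.
contexts, bigrams, vocab = train_bigram("abababab")
pp_regular = perplexity("abab", contexts, bigrams, len(vocab))
pp_irregular = perplexity("aabb", contexts, bigrams, len(vocab))
print(pp_regular < pp_irregular)
```

The base-2 logarithm of the perplexity is the average number of bits per symbol that an ideal entropy coder would spend under the model, which is why perplexity is a natural criterion for choosing among candidate models before any compressor is built.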
Keywords:
#Compression #Coding data #Persian literature #language model
Holding location: Central Library of Shahrood University