Q253 : Analysis of summarizing the timeline in Persian news articles
Thesis > Central Library of Shahrood University > Computer Engineering > MSc > 2023
Authors:
Mahla Hesamifar [Author], Hoda Mashayekhi [Supervisor]
Abstract: With the growing use of news websites and digital news agencies, and the proliferation of articles published on any given topic, tracking and understanding how a single issue develops has become challenging. This has prompted data mining researchers to generate and summarize timelines so that the general public can quickly grasp how news events unfold over time. A timeline is essentially a sequence of significant events leading to a particular outcome. Generating timelines manually is labor-intensive and demands considerable time and resources. A system that generates and summarizes timelines automatically can deliver news to readers effectively without that time and cost. Unfortunately, no research on timeline generation and summarization has been conducted for the Persian language, and consequently no Persian-language dataset is available for this purpose. We therefore lay the groundwork for such research by first collecting a Persian-language dataset of published digital news articles on four topics, "JCPOA" (Joint Comprehensive Plan of Action), "Metropol," "Recent Poisonings," and "Taliban," from four news agencies: "Aftab," "Khabar Online," "Fars," and "Young Journalists Club." Next, we develop and present algorithms that generate and summarize timelines automatically, without human intervention, with the aim of producing valid and effective timelines. Finally, we compare the timelines produced by the automated algorithms with those created by human experts to assess the algorithms' accuracy and effectiveness.
Observations and experiments indicate an average accuracy of about 60% for the algorithmically generated timelines relative to those created by human experts, a considerable level of accuracy and effectiveness. In addition, we compared our proposed algorithm with algorithms from recent articles by applying it to a benchmark dataset for timeline generation and summarization and scoring the results with the ROUGE metric. The results show an average improvement of 33% in the ROUGE-1, ROUGE-2, and ROUGE-S scores. We hope this research will serve as a foundation and reference for further work on timeline generation and summarization in the Persian language.
Keywords:
#Timeline #timeline summarization #news summarization #timeline evaluation
Keeping place: Central Library of Shahrood University