Q282 : Using Clustering For Timeline Summarization In News Articles
Thesis > Central Library of Shahrood University > Computer Engineering > MSc > 2024
Authors:
Abstract:
With the explosion of information in modern society, a huge number of news articles are continuously produced on the Internet by various news agencies. It is difficult for the average reader to summarize the huge volume of daily news articles, which may cover a variety of topics and contain redundant or overlapping information. Many users of news programs have the experience of being overloaded with information about a number of hot current events while still being unable to obtain information about the events they are really interested in. In addition, search engines retrieve documents from large collections based on user-entered queries, but they do not provide a logical way for users to view trending topics or breaking news.
An emerging alternative way to present news collections without predetermined queries is to organize and present news articles through timeline summarization. Timeline summarization is an effective way to help readers of online news articles keep track of long-term news. This method automatically detects important events for key dates and provides a short summary of what happened on those dates.
Most work in timeline summarization has focused on improving summarization performance. However, these methods have two drawbacks: (a) they generally work on only a homogeneous type of data set, and (b) the output is usually a single timeline regardless of the size and complexity of the input data set.
In this thesis, we aim to improve the flexibility and adaptability of timeline summarization by using a multiple timeline summarization method. Given a set of time-stamped news articles, our method discovers the important, distinct stories using two clustering stages and creates a timeline for each story. The method consists of two main modules: the event generation module and the timeline generation module.
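As an illustration only, the two-stage idea can be sketched in Python. The abstract does not name the clustering algorithms, features, or thresholds used in the thesis, so everything below is an assumption: a greedy single-pass clustering over word overlap (Jaccard similarity), the helper names `tokens`, `jaccard`, `greedy_cluster`, and `build_timelines`, and the threshold values are all hypothetical. The assignment of the first stage to event generation and the second to timeline generation is likewise a guess based on the module names.

```python
import re
from collections import defaultdict


def tokens(text):
    """Lowercased set of words longer than three letters (a crude content filter)."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(items, key, threshold):
    """Single-pass clustering: attach each item to the first cluster whose
    token set is similar enough, otherwise start a new cluster."""
    clusters = []
    for item in items:
        item_tokens = key(item)
        for cluster in clusters:
            if jaccard(item_tokens, cluster["tokens"]) >= threshold:
                cluster["members"].append(item)
                cluster["tokens"] |= item_tokens
                break
        else:
            clusters.append({"tokens": set(item_tokens), "members": [item]})
    return clusters


def build_timelines(articles, event_threshold=0.3, story_threshold=0.1):
    """articles: iterable of (date, text) pairs. Returns one timeline per story,
    each a list of (date, summary) entries in date order."""
    # Stage 1 (assumed event generation): group same-day articles into events.
    by_date = defaultdict(list)
    for date, text in articles:
        by_date[date].append((date, text))
    events = []
    for date in sorted(by_date):
        for cluster in greedy_cluster(by_date[date], lambda a: tokens(a[1]), event_threshold):
            # Placeholder summary: the first article of the event.
            events.append({"date": date,
                           "tokens": cluster["tokens"],
                           "summary": cluster["members"][0][1]})
    # Stage 2 (assumed timeline generation): group events into stories.
    stories = greedy_cluster(events, lambda e: e["tokens"], story_threshold)
    return [[(e["date"], e["summary"]) for e in story["members"]] for story in stories]
```

Given a few articles about two unrelated stories, this sketch returns two separate timelines rather than one merged timeline, which is the behavior the multiple timeline approach is meant to provide. A real system would replace the word-overlap measure and the first-article summary with stronger components.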
Finally, to assess the correctness and accuracy of our algorithm, we compared the timelines it produced with one or more reference summaries created manually by human experts. The experiments show that the proposed algorithm improves the ROUGE-1 score by 8.8% and the ROUGE-2 score by 1.3% compared to a similar algorithm (Yu et al. 2021).
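For readers unfamiliar with the metric, ROUGE-N measures n-gram overlap between a generated summary and a reference summary. The following is a minimal sketch of that computation, not the evaluation code used in the thesis: the function names `ngrams` and `rouge_n` are hypothetical, tokenization is plain whitespace splitting, and standard ROUGE tooling typically adds options such as stemming. Timeline evaluations often also use date-aware ROUGE variants, which are not shown here.

```python
from collections import Counter


def ngrams(text, n):
    """Multiset of word n-grams in a whitespace-tokenized, lowercased text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def rouge_n(candidate, reference, n):
    """Precision, recall, and F1 of clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, scoring the candidate "the quake hit the city" against the reference "the quake struck the city" gives a ROUGE-1 recall of 0.8 (four of five reference words matched) and a ROUGE-2 recall of 0.5 (two of four reference bigrams matched).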
Keywords:
#news articles #event #summarization #timeline #multiple timeline #clustering
Keeping place: Central Library of Shahrood University