|
Journal of Zhejiang University SCIENCE C
ISSN 1869-1951(Print), 1869-196x(Online), Monthly
2010 Vol.11 No.5 P.340-355
Online detection of bursty events and their evolution in news streams
Abstract: Online monitoring of temporally-sequenced news streams for interesting patterns and trends has gained popularity in the last decade. In this paper, we study a particular news stream monitoring task: timely detection of bursty events which have happened recently and discovery of their evolutionary patterns along the timeline. Here, a news stream is represented as feature streams of tens of thousands of features (i.e., keyword. Each news story consists of a set of keywords.). A bursty event therefore is composed of a group of bursty features, which show bursty rises in frequency as the related event emerges. In this paper, we give a formal definition to the above problem and present a solution with the following steps: (1) applying an online multi-resolution burst detection method to identify bursty features with different bursty durations within a recent time period; (2) clustering bursty features to form bursty events and associating each event with a power value which reflects its bursty level; (3) applying an information retrieval method based on cosine similarity to discover the event’s evolution (i.e., highly related bursty events in history) along the timeline. We extensively evaluate the proposed methods on the Reuters Corpus Volume 1. Experimental results show that our methods can detect bursty events in a timely way and effectively discover their evolution. The power values used in our model not only measure event’s bursty level or relative importance well at a certain time point but also show relative strengths of events along the same evolution.
Key words: Online event detection, Event’s evolution, News stream, Affinity propagation
References:
Open peer comments: Debate/Discuss/Question/Opinion
<1>
DOI:
10.1631/jzus.C0910245
CLC number:
TP391
Download Full Text:
Downloaded:
3977
Clicked:
8085
Cited:
8
On-line Access:
2024-08-27
Received:
2023-10-17
Revision Accepted:
2024-05-08
Crosschecked:
2010-04-09