Parallel Processing of Burst Detection in Large-Scale Document Streams and Its Performance Evaluation

Keiichi Tamura, Kaishi Hirahara, Hajime Kitakami, Shingo Tamura

Abstract


Online documents on the Internet are represented as
a document stream because the documents have a temporal order.
This has resulted in numerous studies on extracting a frequent
phenomenon (involving keywords, users, locations, etc.) known
as a burst. Recently, with the growth of interest in social media,
the number of documents created on the Internet has increased
exponentially. Therefore, the speed-up of burst detection in
a large-scale document stream is one of the most important
challenges. In this paper, we propose a novel parallelization
method for the parallel processing of Kleinberg’s burst detection
algorithm in a large-scale document stream. Specifically, we
present a technique to combine the inter-task parallelization
model with the intra-task parallelization model. This combination
can achieve seamless dynamic load balancing and detect bursts
in large-scale document streams in memory.

