A data stream is a continuous and high-speed flow of data items. High speed refers to the phenomenon that the data rate is high relative to the computational power. The increasing focus of applications that generate and receive data streams stimulates the need for online data stream analysis tools. Mining data streams is a real time process of extracting interesting patterns from high-speed data streams. Mining data streams raises new problems for the data mining community in terms of how to mine continuous high-speed data items that you can only have one look at. In this paper, we propose algorithm output granularity as a solution for mining data streams. Algorithm output granularity is the amount of mining results that fits in main memory before any incremental integration. We show the application of the proposed strategy to build efficient clustering, frequent items and classification techniques. The empirical results for our clustering algorithm are presented and discussed which demonstrate acceptable accuracy coupled with efficiency in running time.
|Number of pages||6|
|Publication status||Published - 2004|
|Event||Australasian Workshop on Data Mining and Web Intelligence - Dunedin, New Zealand|
Duration: 1 Jan 2004 → …
|Conference||Australasian Workshop on Data Mining and Web Intelligence|
|Period||1/01/04 → …|