Cost-efficient mining techniques for data streams

M. Gaber, S. Krishnaswamy, A. Zaslavsky

Research output: Contribution to conference › Paper › peer-review


Abstract

A data stream is a continuous, high-speed flow of data items. Here, high speed means that the data rate is high relative to the available computational power. The increasing focus on applications that generate and receive data streams creates a need for online data stream analysis tools. Mining data streams is the real-time process of extracting interesting patterns from high-speed data streams. It raises new problems for the data mining community, chiefly how to mine continuous, high-speed data items that can be examined only once. In this paper, we propose algorithm output granularity as a solution for mining data streams. Algorithm output granularity is the amount of mining results that fits in main memory before any incremental integration. We show how the proposed strategy can be applied to build efficient clustering, frequent-items, and classification techniques. Empirical results for our clustering algorithm are presented and discussed; they demonstrate acceptable accuracy coupled with efficient running time.
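The idea of keeping only as many mining results as fit in memory, and integrating them when that budget is exceeded, can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It is a generic one-pass, threshold-based clustering routine written under stated assumptions: the function names, the distance `threshold`, the `max_clusters` memory budget, and the merge-closest-pair integration step are all hypothetical choices made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def one_pass_cluster(stream, threshold, max_clusters):
    """Cluster a stream in a single pass, keeping at most max_clusters results.

    Illustrative assumption: 'max_clusters' stands in for the amount of
    output that fits in main memory, and merging the two closest clusters
    stands in for the incremental integration step.
    """
    clusters = []  # each entry is [centroid (tuple of floats), weight (int)]
    for point in stream:
        point = tuple(point)
        # Each item is examined exactly once: find the nearest centroid so far.
        nearest = min(clusters, key=lambda c: dist(c[0], point), default=None)
        if nearest is not None and dist(nearest[0], point) <= threshold:
            # Absorb the point by updating the running weighted mean.
            centroid, weight = nearest
            new_weight = weight + 1
            nearest[0] = tuple(
                (c * weight + p) / new_weight for c, p in zip(centroid, point)
            )
            nearest[1] = new_weight
        else:
            # Too far from every centroid: start a new cluster.
            clusters.append([point, 1])
        if len(clusters) > max_clusters:
            # Output no longer fits the budget: integrate existing results.
            _merge_closest_pair(clusters)
    return clusters


def _merge_closest_pair(clusters):
    """Merge the two clusters with the nearest centroids (hypothetical policy)."""
    best = None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            d = dist(clusters[i][0], clusters[j][0])
            if best is None or d < best[0]:
                best = (d, i, j)
    _, i, j = best
    (ci, wi), (cj, wj) = clusters[i], clusters[j]
    total = wi + wj
    merged = tuple((a * wi + b * wj) / total for a, b in zip(ci, cj))
    clusters[i] = [merged, total]
    del clusters[j]  # j > i, so index i is unaffected


if __name__ == "__main__":
    points = [(0, 0), (0.1, 0), (10, 10), (10.1, 10), (20, 20)]
    for centroid, weight in one_pass_cluster(points, threshold=1.0, max_clusters=2):
        print(centroid, weight)
```

The sketch never stores the raw items, only centroids and weights, so memory use is bounded by `max_clusters` regardless of stream length. The total weight across clusters always equals the number of items seen, which gives a simple sanity check.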
Original language: English
Pages: 109-114
Number of pages: 6
Publication status: Published - 2004
Event: Australasian Workshop on Data Mining and Web Intelligence - Dunedin, New Zealand
Duration: 1 Jan 2004 → …

Conference

Conference: Australasian Workshop on Data Mining and Web Intelligence
Abbreviated title: DMWI2004
Country/Territory: New Zealand
City: Dunedin
Period: 1/01/04 → …
