Parallel rule induction with information theoretic pre-pruning

F. Stahl, Max Bramer, Mo Adda

Research output: Chapter in Book/Report/Conference proceeding › Chapter (peer-reviewed)

Abstract

In a world where data is captured on a large scale, the major challenge for data mining algorithms is to scale up to large datasets. There are two main approaches to inducing classification rules: the divide and conquer approach, also known as the top down induction of decision trees, and the separate and conquer approach. A considerable amount of work has been done on scaling up the divide and conquer approach. However, very little work has been conducted on scaling up the separate and conquer approach. In this work we describe a parallel framework that allows the parallelisation of a certain family of separate and conquer algorithms, the Prism family. Parallelisation helps the Prism family of algorithms to harvest additional computer resources in a network of computers so that the induction of classification rules scales better on large datasets. Our framework also incorporates a pre-pruning facility for parallel Prism algorithms.
Original language: English
Title of host publication: Research and Development in Intelligent Systems XXVI
Editors: R. Ellis, M. Petridis
Publisher: Springer
Pages: 151-164
Number of pages: 14
Volume: 4
ISBN (Print): 978184882983111
Publication status: Published - 2010

  • Optimisation of extended generalised fat tree topologies

    Peratikou, A. & Adda, M., 2014, Distributed computer and communication networks: 17th international conference, DCCN 2013, Moscow, Russia, October 7-10, 2013. revised selected papers. Vishnevsky, V., Kozyrev, D. & Larionov, A. (eds.). Heidelberg: Springer, p. 82-90 (Communications in computer and information science ; vol. 279).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution