PMCRI: a parallel modular classification rule induction framework

F. Stahl, Max Bramer, Mo Adda

Research output: Chapter in Book/Report/Conference proceeding, Chapter (peer-reviewed)


Abstract

In a world where massive amounts of data are recorded on a large scale, we need data mining technologies to gain knowledge from the data in a reasonable time. The Top Down Induction of Decision Trees (TDIDT) algorithm is a very widely used technology to predict the classification of newly recorded data. However, alternative technologies have been derived that often produce better rules but do not scale well on large datasets. One such alternative to TDIDT is the PrismTCS algorithm. PrismTCS performs particularly well on noisy data but does not scale well on large datasets. In this paper we introduce Prism and investigate its scaling behaviour. We describe how we improved the scalability of the serial version of Prism and investigate its limitations. We then describe our work to overcome these limitations by developing a framework to parallelise algorithms of the Prism family and similar algorithms. We also present the scale-up results of a first prototype implementation.
Original language: English
Title of host publication: Machine learning and data mining in pattern recognition: 6th international conference, MLDM 2009, Leipzig, Germany, July 23-25, 2009. proceedings
Editors: P. Perner
Place of Publication: Berlin
Publisher: Springer
Pages: 148-162
Number of pages: 15
Volume: 5632
Edition: 5632
ISBN (Print): 9783642030697
Publication status: Published - 2009

Publication series

Name: Lecture notes in computer science
Publisher: Springer Verlag
Number: 5632
ISSN (Print): 0302-9743

  • Optimisation of extended generalised fat tree topologies

    Peratikou, A. & Adda, M., 2014, Distributed computer and communication networks: 17th international conference, DCCN 2013, Moscow, Russia, October 7-10, 2013. revised selected papers. Vishnevsky, V., Kozyrev, D. & Larionov, A. (eds.). Heidelberg: Springer, p. 82-90 (Communications in computer and information science ; vol. 279).

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

