Comparison of platforms for recommender algorithm on large datasets

Christina Diedhiou, Bryan Carpenter, Ramazan Esmeli

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

One of the challenges our society faces is the ever-increasing amount of data. Among existing platforms that address the system requirements, Hadoop is a framework widely used to store and analyze “big data”. On the human side, recommendation systems are one of the aids that help people find the things they really want. This paper evaluates highly scalable parallel algorithms for recommendation systems, with application to very large data sets. A particular goal is to evaluate MPJ Express, an open-source Java message-passing library for parallel computing that has been integrated with Hadoop. As a demonstration, we use MPJ Express to implement collaborative filtering on various data sets using the ALSWR algorithm (Alternating-Least-Squares with Weighted-λ-Regularization). We benchmark the performance and demonstrate parallel speedup on the MovieLens and Yahoo Music data sets, comparing our results with two other frameworks: Mahout and Spark. Our results indicate that the MPJ Express implementation of ALSWR has very competitive performance and scalability in comparison with the two other frameworks.

Original language: English
Title of host publication: 2018 Imperial College Computing Student Workshop, ICCSW 2018
Editors: Edoardo Pirovano, Eva Graversen
Publisher: Schloss Dagstuhl – Leibniz Center for Informatics
Pages: 4:1-4:10
ISBN (Electronic): 9783959770972
Publication status: Published - 24 Jan 2019
Event: 7th Imperial College Computing Student Workshop - London, United Kingdom
Duration: 20 Sep 2018 → 21 Sep 2018

Publication series

Name: Open Access Series in Informatics
Volume: 66
ISSN (Print): 2190-6807

Conference

Conference: 7th Imperial College Computing Student Workshop
Abbreviated title: ICCSW 2018
Country/Territory: United Kingdom
City: London
Period: 20/09/18 → 21/09/18

Keywords

  • Hadoop
  • HPC
  • Mahout
  • MPJ Express
  • Spark
