Distributed neural networks for missing big data imputation

Alessio Petrozziello, Ivan Jordanov, Christian Sommeregger

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



In this paper we investigate the use of Distributed Neural Networks for imputing missing values in a Big Data context. The presented imputation framework is implemented in Spark, so imputation can be added as an extra step in the data pre-processing pipeline. The Distributed Neural Networks model uses Mini-batch Stochastic Gradient Descent, which scales well with cluster size and minimizes communication among the workers. The model is tested on a real-world Recommender Systems dataset, where missing data is typically a problem for new items because the system's ranking is usually biased towards popular items. The model is compared with univariate (Mean and Median Imputation) and multivariate (K-Nearest Neighbours and Linear Regression) imputation techniques, and its performance is validated in terms of prediction accuracy and speed. Furthermore, we evaluate the speedup over a sequential implementation of Neural Networks with Stochastic Gradient Descent.
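The mini-batch idea in the abstract can be illustrated with a small sketch. It is not the paper's Spark implementation: it assumes synchronous gradient averaging, simulates the workers as list partitions in plain Python, and uses synthetic data. All function names and hyperparameters are hypothetical. Each simulated worker computes a gradient on a mini-batch drawn from its own partition, so only gradients (not data) would need to be exchanged per step. The trained network then imputes a held-out feature and is compared with Mean Imputation.

```python
import math
import random

def init_params(n_in, n_hidden, rng):
    # One hidden tanh layer and one linear output unit (assumed architecture).
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def predict(params, x):
    # Returns the prediction and the hidden activations (needed for backprop).
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(params["W1"], params["b1"])]
    return sum(w * hi for w, hi in zip(params["W2"], h)) + params["b2"], h

def batch_gradient(params, batch):
    # Gradient of the mean of 0.5 * squared error over one mini-batch.
    n_hidden, n_in = len(params["W1"]), len(params["W1"][0])
    g = {"W1": [[0.0] * n_in for _ in range(n_hidden)],
         "b1": [0.0] * n_hidden, "W2": [0.0] * n_hidden, "b2": 0.0}
    for x, y in batch:
        y_hat, h = predict(params, x)
        err = (y_hat - y) / len(batch)
        g["b2"] += err
        for j in range(n_hidden):
            g["W2"][j] += err * h[j]
            dh = err * params["W2"][j] * (1.0 - h[j] ** 2)  # tanh derivative
            g["b1"][j] += dh
            for i in range(n_in):
                g["W1"][j][i] += dh * x[i]
    return g

def apply_averaged_step(params, grads, lr):
    # "Driver" step: average the workers' gradients, then update the shared model.
    k = len(grads)
    params["b2"] -= lr * sum(g["b2"] for g in grads) / k
    for j in range(len(params["W1"])):
        params["W2"][j] -= lr * sum(g["W2"][j] for g in grads) / k
        params["b1"][j] -= lr * sum(g["b1"][j] for g in grads) / k
        for i in range(len(params["W1"][0])):
            params["W1"][j][i] -= lr * sum(g["W1"][j][i] for g in grads) / k

def train(partitions, n_in, n_hidden=4, steps=2000, batch_size=8, lr=0.1, seed=0):
    rng = random.Random(seed)
    params = init_params(n_in, n_hidden, rng)
    for _ in range(steps):
        # Each simulated worker samples a mini-batch from its own partition.
        grads = [batch_gradient(params, rng.sample(part, batch_size))
                 for part in partitions]
        apply_averaged_step(params, grads, lr)
    return params

# Synthetic data: the feature to impute depends on two observed features.
data_rng = random.Random(42)
rows = []
for _ in range(300):
    x = [data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)]
    rows.append((x, 0.8 * x[0] - 0.5 * x[1] + 0.1))

observed, missing = rows[:240], rows[240:]        # targets in `missing` are treated as unknown
partitions = [observed[i::4] for i in range(4)]   # 4 simulated workers
model = train(partitions, n_in=2)

def rmse(predictions):
    return math.sqrt(sum((p - y) ** 2 for p, (_, y) in zip(predictions, missing))
                     / len(missing))

mean_value = sum(y for _, y in observed) / len(observed)
nn_rmse = rmse([predict(model, x)[0] for x, _ in missing])
mean_rmse = rmse([mean_value] * len(missing))
print(f"network RMSE: {nn_rmse:.4f}, mean-imputation RMSE: {mean_rmse:.4f}")
```

In a real Spark job the per-partition gradient computation would run on the executors and the averaging would be an aggregation step; the sketch only mirrors that data flow on a single machine.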
Original language: English
Title of host publication: 2018 International Joint Conference on Neural Networks (IJCNN)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 9
ISBN (Electronic): 978-1-5090-6014-6
ISBN (Print): 978-1-5090-6015-3
Publication status: Published - 15 Oct 2018
Event: IEEE WCCI 2018, World Congress on Computational Intelligence - Rio de Janeiro, Brazil
Duration: 8 Jul 2018 to 13 Jul 2018

Publication series

Name: IEEE IJCNN Proceedings Series
ISSN (Electronic): 2161-4407


Conference: IEEE WCCI 2018, World Congress on Computational Intelligence
Abbreviated title: IJCNN


  • Distributed Computation
  • Big Data
  • Missing Data Imputation
  • Neural Networks


