Unified framework for control of machine learning tasks towards effective and efficient processing of big data

Research output: Chapter in Book/Report/Conference proceeding, Chapter (peer-reviewed)

Big data can generally be characterised by five Vs: Volume, Velocity, Variety, Veracity and Variability. Many studies have focused on using machine learning as a powerful tool for big data processing. In the machine learning context, learning algorithms are typically evaluated in terms of accuracy, efficiency, interpretability and stability. These four dimensions can be strongly related to veracity, volume, variety and variability, and are affected by both the nature of the learning algorithms and the characteristics of the data. This chapter analyses in depth how the quality of computational models can be affected by data characteristics as well as by the strategies involved in learning algorithms. It also introduces a unified framework for the control of machine learning tasks, aimed at the appropriate employment of algorithms and the efficient processing of big data. In particular, the framework is designed to support effective selection of data pre-processing techniques for selecting relevant attributes, sampling representative training and test data, and dealing appropriately with missing values and noise. More importantly, the framework allows suitable machine learning algorithms to be employed on the basis of the training data provided by the data pre-processing stage, with the aim of building accurate, efficient and interpretable computational models.
Original language: English
Title of host publication: Data science and big data
Subtitle of host publication: An environment of computational intelligence
Editors: Witold Pedrycz, Shyi-Ming Chen
Publisher: Springer
Pages: 123-140
ISBN (Electronic): 978-3-319-53474-9
ISBN (Print): 978-3-319-53473-2
Publication status: Published - 2017

Publication series

Name: Studies in Big Data
Publisher: Springer
Volume: 24
ISSN (Print): 2197-6503


ID: 5029290