Data analytics for online travelling recommendation system: a case study

Alessio Petrozziello, Ivan Jordanov

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Online travel agencies (OTAs) are now the main channel for booking holidays, business trips, accommodation, and similar services. As in all online services that involve users, items, and decisions, a Recommender System (RS) is needed to help users navigate catalogues and websites. For a travel RS, a pure collaborative filtering approach is not feasible because the user-item matrix is far too sparse. For this reason, a content-based filtering approach is investigated in this work, focusing on one of its main problems: missing features. An initial exploratory analysis helps to identify a class of poorly ranked properties (e.g., Vacation Rentals (VR)). To deal with the missingness in the data, several state-of-the-art imputation methods (K-NN, Random Forests, and Gradient-Boosted Trees) are investigated, and their performance is critically analysed and tested. These techniques are applied after dataset preprocessing that includes cleaning, feature scaling, and standardization. In addition, k-fold cross-validation is used to validate the imputation results and reduce the risk of overfitting. Three similarity measures (Jaccard, Weighted Hamming, and Fuzzy-C-Means rankings) based on engineered non-historical features (amenities and geographical position) are analysed and employed to determine the best proxy for unavailable features.
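As a rough illustration of the proxy idea described in the abstract, the sketch below scores candidate properties against a target property using Jaccard and weighted Hamming similarity over binary amenity flags. It uses the textbook definitions of both measures, which may differ from the exact formulations in the paper; the property names, amenity flags, weights, and the helper `best_proxy` are all hypothetical, and the Fuzzy-C-Means ranking is not shown.

```python
from typing import Callable, Dict, Sequence


def jaccard_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard similarity between two binary amenity vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero vectors are treated as identical by convention.
    return both / either if either else 1.0


def weighted_hamming_similarity(
    a: Sequence[int], b: Sequence[int], weights: Sequence[float]
) -> float:
    """One minus the weighted share of positions where the vectors differ."""
    total = sum(weights)
    mismatch = sum(w for x, y, w in zip(a, b, weights) if x != y)
    return 1.0 - mismatch / total


def best_proxy(
    target: Sequence[int],
    candidates: Dict[str, Sequence[int]],
    score: Callable[[Sequence[int], Sequence[int]], float],
) -> str:
    """Return the name of the candidate most similar to the target."""
    return max(candidates, key=lambda name: score(target, candidates[name]))


# Hypothetical amenity flags: [pool, wifi, parking, kitchen]
properties = {
    "hotel_a": [1, 1, 1, 0],
    "hotel_b": [0, 1, 1, 1],
    "hotel_c": [1, 1, 0, 0],
}
target = [1, 1, 0, 1]  # e.g. a property with missing historical features
weights = [2.0, 1.0, 1.0, 1.0]  # assumed importance of each amenity

proxy_jaccard = best_proxy(target, properties, jaccard_similarity)
proxy_hamming = best_proxy(
    target,
    properties,
    lambda a, b: weighted_hamming_similarity(a, b, weights),
)
```

In this toy data both measures select the same candidate. On real catalogues they can disagree, which is one reason to compare several measures before choosing a proxy.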
Original language: English
Title of host publication: Proceedings of Modelling, Identification and Control (MIC2017)
Editors: M. H. Hamza
Publisher: ACTA Press
Pages: 106-112
Number of pages: 7
ISBN (Print): 978-0-88986-988-2, 978-0-88986-989-9
Publication status: Published - 1 Mar 2017
Event: The 36th IASTED International Conference on Modelling, Identification and Control: MIC 2017 - Innsbruck, Austria
Duration: 20 Feb 2017 to 21 Feb 2017
https://www.iasted.org/conferences/pastinfo-848.html

Publication series

ISSN (Print): 1025-8973

Conference

Conference: The 36th IASTED International Conference on Modelling, Identification and Control
Country/Territory: Austria
City: Innsbruck
Period: 20/02/17 to 21/02/17

Keywords

  • Data Analytics
  • Big Data
  • Fuzzy-C-Means
  • Random Forests
  • K-NN
  • Missing Features
  • Recommender Systems
