Loan default prediction using spark machine learning algorithms

Aiman Muhammad Uwais, Hamidreza Khaleghzadeh*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



Lending is an important business activity for both individuals and financial institutions, and a lender's profit or loss depends in part on loan repayment. Although lending benefits both lenders and borrowers, it carries a significant risk that the borrower will be unable to repay the loan; this failure is termed loan default. Loan default prediction is therefore a crucial process that lenders should carry out to determine whether a loan is likely to default. Successful prediction can help financial institutions reduce the number of bad loans they issue and ultimately increase profit. The aim of this paper is to use data mining techniques to extract insight from loan data and then to build a loan default prediction model using machine learning algorithms on the Spark big data platform. Six supervised classification algorithms are applied to predict loan default: Logistic Regression, Decision Tree, Random Forest, Gradient-Boosted Trees (GBTs), Factorization Machines (FM) and Linear Support Vector Machine (LSVM). The models are evaluated and compared using accuracy, precision, recall, ROC curve and F-measure. The highest accuracy, 99.62%, is achieved by the Decision Tree and Random Forest models.
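As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch in plain Python of how accuracy, precision, recall and F-measure can be derived from predicted and true labels. It is not the authors' Spark code; the function name `evaluate`, the `Metrics` container and the toy labels are assumptions made for this example, with label 1 taken to mean "default".

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute binary classification metrics from two label sequences.

    `positive` is the label treated as the class of interest
    (assumed here to be 1 = loan default).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical toy labels: 3 defaults and 5 repaid loans.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
result = evaluate(y_true, y_pred)
print(result)
```

On a real Spark DataFrame of predictions, the same quantities would normally come from the platform's built-in evaluators rather than hand-written counting; the sketch only makes the definitions explicit.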
Original language: English
Title of host publication: Artificial Intelligence and Cognitive Science 2021
Subtitle of host publication: 29th Irish Conference on Artificial Intelligence and Cognitive Science (AICS)
Editors: Arjun Pakrashi, Ellen Rushe, Mehran Hossein Zadeh Bazargani, Brian Mac Namee
Publisher: CEUR Workshop Proceedings
Number of pages: 12
Publication status: Published - 10 Mar 2022
Event: AIAI 29th Irish Conference on Artificial Intelligence and Cognitive Science - University College Dublin Online Event, Dublin, Ireland
Duration: 9 Dec 2021 to 10 Dec 2021

Publication series

Name: CEUR Workshop Proceedings
ISSN (Print): 1613-0073


Conference: AIAI 29th Irish Conference on Artificial Intelligence and Cognitive Science
Abbreviated title: AICS 2021


  • loan default
  • prediction
  • machine learning
  • big data
  • spark

