Superclass Composition Approach to Remote Sensing Image Classification using Convolutional Neural Networks

  • Anas Tukur Balarabe

Student thesis: Doctoral Thesis


This thesis presents extensive work on remote sensing scene image classification and land use and land cover (LULC) classification using deep convolutional neural networks (CNNs). One traditional convolutional neural network and three transfer learning-based models have been developed, implemented, trained, evaluated, and tested on a wide range of remote sensing scene classification datasets, including a newly introduced dataset. The results obtained have been critically analysed and compared with some state-of-the-art models. A framework has been incorporated into each repurposed architecture to improve its classification efficiency.
The topic of land use and land cover classification has attracted the interest of many researchers in recent times. Various techniques have been proposed for LULC; while some are semantic segmentation-based, others classify an entire image to determine its class. The semantic segmentation approach labels objects as class members by assigning a different colour to each class.
The first case study investigates class heterogeneity in land use and land cover classification images, focusing on the prominent UCM dataset. The 21 classes of this dataset were carefully clustered into four superclasses based on their observable textural, spectral, or structural similarities, to investigate how merging overlapping classes into superclasses could reduce the misclassifications that deep learning models make because of inner-class variability and outer-class similarity. The efficiency and accuracy of the implemented approach have been demonstrated, with accuracy, precision, recall and F1 score of 92.90%, 92.30%, 92.60% and 92.60%, respectively.
As an improvement to the first case study, a framework for comparing and combining images of different classes into superclasses based on spatial, textural and colour similarities was developed for the second case study. This framework uses the Bhattacharyya measure for colour-based image similarity analysis; a combination of Local Binary Patterns (LBPs), the Earth Mover’s Distance and the Euclidean distance for texture and spatial similarity analysis; and the structural similarity index (SSIM). A pre-trained CNN model (Xception) is then fine-tuned to classify the superclasses and the original classes of the Aerial Image Dataset (AID), the UC Merced (UCM), and the Optical Image Analysis and Learning (OPTIMAL-31) datasets. The results show that combining these images into superclasses reduces the likelihood of misclassification and improves the performance of CNNs.
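The colour-similarity step of such a framework can be illustrated with a minimal sketch. It assumes the classic Bhattacharyya distance, -ln(sum of sqrt(p_i * q_i)), computed on normalised histograms; the abstract does not state which variant the thesis uses. The grouping rule (single linkage under a fixed threshold), the function names, the threshold value and the toy three-bin histograms are all hypothetical and only show how a pairwise distance could drive the merging of classes into superclasses.

```python
import math


def normalise(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def bhattacharyya_distance(hist_a, hist_b):
    """Classic Bhattacharyya distance between two histograms (assumed variant)."""
    p, q = normalise(hist_a), normalise(hist_b)
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    coefficient = min(coefficient, 1.0)  # guard against rounding slightly above 1
    if coefficient == 0.0:
        return math.inf  # the histograms share no occupied bin
    return max(0.0, -math.log(coefficient))


def group_into_superclasses(class_hists, threshold):
    """Merge classes whose distance is below `threshold` (hypothetical rule)."""
    names = list(class_hists)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the representative of the group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if bhattacharyya_distance(class_hists[a], class_hists[b]) < threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Toy per-class colour histograms (invented numbers, not thesis data).
toy_histograms = {
    "forest": [1, 8, 1],
    "meadow": [2, 7, 1],
    "river": [1, 2, 7],
    "beach": [6, 3, 1],
}
superclasses = group_into_superclasses(toy_histograms, threshold=0.05)
print(superclasses)
```

With these toy numbers only the two green-dominated classes fall under the threshold and are merged. A fuller comparison would also need the texture and structural measures named above, which this sketch leaves out.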
The model evaluation also indicates that it can boost the CNNs' performance and significantly reduce the impact of inner-class variability and outer-class similarity. The results have been discussed, analysed and compared with some baseline approaches. On the UCM dataset, the model achieved overall accuracies of 99.05%, 98.64% and 96.90% for the four superclasses, the ten superclasses and the 21 classes, respectively. For the AID, the model achieved 97.25%, 97.90% and 95.75%, respectively. Similarly, for the OPTIMAL-31 dataset, the model produced competitive results: overall accuracies of 97.31%, 97.04% and 92.20% for the 7 superclasses, the 17 superclasses and the 30 classes, respectively.
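One way to see why superclasses can raise overall accuracy is to collapse a fine-grained confusion matrix into a superclass matrix: confusions between classes that share a superclass stop counting as errors. The sketch below is only an illustration. The confusion counts, the two superclass names and the class-to-superclass mapping are invented, and collapsing predictions after the fact is not the same as retraining a model on superclass labels, which is what the thesis reports.

```python
def accuracy(confusion):
    """Overall accuracy: correctly classified samples divided by all samples."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def collapse(confusion, class_names, superclass_of):
    """Sum the rows and columns of a confusion matrix by superclass."""
    supers = sorted(set(superclass_of.values()))
    index = {name: k for k, name in enumerate(supers)}
    merged = [[0] * len(supers) for _ in supers]
    for i, true_name in enumerate(class_names):
        for j, pred_name in enumerate(class_names):
            row = index[superclass_of[true_name]]
            col = index[superclass_of[pred_name]]
            merged[row][col] += confusion[i][j]
    return supers, merged


# Invented counts: rows are true classes, columns are predicted classes.
class_names = ["dense_residential", "medium_residential", "forest", "river"]
fine_confusion = [
    [40, 8, 1, 1],
    [9, 39, 1, 1],
    [0, 1, 46, 3],
    [1, 0, 4, 45],
]
# Hypothetical mapping of classes to superclasses.
superclass_of = {
    "dense_residential": "built_up",
    "medium_residential": "built_up",
    "forest": "natural",
    "river": "natural",
}

super_names, super_confusion = collapse(fine_confusion, class_names, superclass_of)
print(accuracy(fine_confusion), accuracy(super_confusion))
```

In this toy case most errors occur between the two residential classes, so accuracy rises from 0.85 on the four classes to 0.97 on the two superclasses. Collapsing can never lower accuracy, so the interesting question in practice is how much label detail is given up for the gain.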
The third case study addresses land use and land cover classification in Lami town. As part of an effort to assist the governments of Fiji, the Solomon Islands, and Vanuatu in their bid to build a robust defence against the catastrophic impacts of climate change and improve access to climate support funds, a group of six independent entities, public and private, teamed up to provide technical and logistical support to these governments. As a stakeholder in the project, the University of Portsmouth was tasked with conducting a preliminary investigation into LULC classification in the region of interest. To that effect, a novel dataset consisting of four land use and land cover classes has been extracted, and a pre-trained VGGNet model has been modified, implemented, trained, evaluated and tested on this dataset. The results have been discussed, and all other necessary performance indicators have been tabulated and illustrated. As this is a new dataset with no recorded benchmark performance, no comparison with earlier results is possible; the repurposed VGG19 architecture recorded 96.70%, 96.80%, 96.50% and 96.70% for overall accuracy, precision, recall and F1 score, respectively.
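For readers unfamiliar with the four indicators quoted above, the following sketch shows one common way to derive them from a confusion matrix. It assumes macro-averaging (an unweighted mean over classes) for precision, recall and F1; the abstract does not say which averaging the thesis uses. The four-class matrix is invented and does not come from the Lami town dataset.

```python
def macro_metrics(confusion):
    """Overall accuracy plus macro-averaged precision, recall and F1."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    precisions, recalls, f1_scores = [], [], []
    for k in range(n):
        true_positive = confusion[k][k]
        predicted_as_k = sum(confusion[i][k] for i in range(n))  # column sum
        actually_k = sum(confusion[k])  # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return {
        "accuracy": sum(confusion[k][k] for k in range(n)) / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Invented counts: rows are true classes, columns are predicted classes.
toy_confusion = [
    [48, 1, 1, 0],
    [2, 47, 0, 1],
    [0, 1, 49, 0],
    [1, 0, 1, 48],
]
for name, value in macro_metrics(toy_confusion).items():
    print(f"{name}: {value:.4f}")
```

When the classes are balanced, as in this toy matrix, macro-averaged recall equals overall accuracy; the two diverge once some classes have far fewer samples than others.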
While fine-tuning a transfer learning model alleviates the need for large training data, it still comes with a few challenges. One of them is the range of image dimensions that the input layer of a model accepts. This issue is of particular interest in tasks that require a transfer learning model. In scene classification, for instance, images may come in sizes that are too large or too small to be fed into the first layer of the architecture. While resizing can bring images to the required shape, that is usually not feasible for images with tiny dimensions, such as those in the EuroSAT dataset. The fourth case study proposes an Xception model-based framework that accepts images of arbitrary size and then resizes or interpolates them before extracting and enhancing the discriminative features using an adaptive dilation module. This framework has been implemented, trained, evaluated and tested on the EuroSAT, UCM, AID and SIRI-WHU datasets. The results have been critically analysed, discussed and compared with the state-of-the-art models. The micro-average and macro-average ROC curve scores for all the datasets have also been monitored to further evaluate the proposed model’s effectiveness. The model outperformed all the baseline architectures on the SIRI-WHU dataset, producing an overall accuracy of 96.04%, and achieved 99.52% and 96.15% on the UCM and AID datasets, respectively.
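The two generic operations mentioned in this case study, interpolating a small image to a larger grid and applying a dilated convolution, can be sketched in plain Python as follows. This is not the thesis's adaptive dilation module, whose design the abstract does not describe. The function names, the corner-aligned bilinear scheme and the fixed dilation rate are assumptions chosen for illustration.

```python
def resize_bilinear(image, out_h, out_w):
    """Resize a 2-D grid with bilinear interpolation (corners aligned)."""
    in_h, in_w = len(image), len(image[0])

    def source(idx, out_n, in_n):
        # Map an output index to a fractional coordinate in the input grid.
        return idx * (in_n - 1) / (out_n - 1) if out_n > 1 else 0.0

    resized = []
    for i in range(out_h):
        y = source(i, out_h, in_h)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        fy = y - y0
        row = []
        for j in range(out_w):
            x = source(j, out_w, in_w)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            fx = x - x0
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        resized.append(row)
    return resized


def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2-D convolution whose kernel taps are `dilation` pixels apart."""
    k_h, k_w = len(kernel), len(kernel[0])
    span_h = (k_h - 1) * dilation + 1  # receptive field height
    span_w = (k_w - 1) * dilation + 1  # receptive field width
    out_h = len(image) - span_h + 1
    out_w = len(image[0]) - span_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel")
    return [
        [
            sum(
                kernel[a][b] * image[i + a * dilation][j + b * dilation]
                for a in range(k_h)
                for b in range(k_w)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A tiny 2x2 "image" is interpolated up to 5x5, then filtered.
tiny = [[0.0, 1.0], [2.0, 3.0]]
enlarged = resize_bilinear(tiny, 5, 5)
ones_kernel = [[1.0] * 3 for _ in range(3)]
features = dilated_conv2d(enlarged, ones_kernel, dilation=2)
print(len(enlarged), len(enlarged[0]), features)
```

A 3x3 kernel with dilation 2 covers a 5x5 window using only nine weights, which is the usual motivation for dilation: a wider receptive field without extra parameters. An adaptive module would presumably choose the rate from the input, but how the thesis does so is not stated here.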
Date of Award: 21 Nov 2023
Original language: English
Awarding Institution
  • University of Portsmouth
Supervisors: Ivan Jordanov, Rinat Khusainov & Ioannis Kagalidis
