Improving Video Encoding Using Deep Learning Super Resolution

  • Mohamed Saleh Abdelazim

Student thesis: Doctoral Thesis

Abstract

The digital landscape is rapidly evolving, marked by significant advancements in digital hardware and data processing. This progress has led to new requirements for consumer applications on both mobile and desktop platforms, as well as for commercial applications, particularly in servers and cloud systems. Concurrently, communication speeds have increased dramatically, encompassing both wired (fibre-optic) and wireless (5G) media, fuelling the demand for high-quality video processing.
Deep learning-based techniques have also shown remarkable effectiveness in tackling computer vision challenges, including image classification, segmentation, object detection, and super-resolution. This project concentrates on optimising a system for two key video processing operations: video encoding and super-resolution. It investigates how super-resolution techniques, built on the latest machine learning methods, can improve video encoding.
This involves using popular video encoders such as High Efficiency Video Coding (HEVC) and incorporating down-/up-sampling operations at both ends of the encoding process at the block level. The innovative approach selectively down-/up-samples certain pixels within the blocks while keeping others at high resolution. Machine learning techniques are applied both to select this pixel subset and to enhance the quality of the super-resolution operation. The novel approach is applied to both single-image encoding and video encoding.
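As a rough illustration of the block-level idea described above, the following sketch down-samples a block, up-samples it again, and keeps a chosen subset of pixels at full resolution. Everything in it is an assumption made for illustration: the 2x average-pooling filter, the nearest-neighbour up-sampling (a stand-in for a learned super-resolution model), the function names, and the hand-picked mask (which the thesis selects with machine learning) are not taken from the thesis.

```python
import numpy as np

def downsample(block, factor=2):
    """Average-pool a block by `factor` (assumed filter, for illustration only)."""
    h, w = block.shape
    return block.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(small, factor=2):
    """Nearest-neighbour upscaling; a placeholder for a learned super-resolution model."""
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)

def reconstruct(block, keep_mask, factor=2):
    """Keep pixels flagged in `keep_mask` at full resolution and rebuild
    the remaining pixels from the down-/up-sampled version of the block."""
    approx = upsample(downsample(block, factor), factor)
    return np.where(keep_mask, block, approx)

# Hypothetical 8x8 block and mask: the left half stays high-resolution.
block = np.arange(64, dtype=np.float64).reshape(8, 8)
keep_mask = np.zeros((8, 8), dtype=bool)
keep_mask[:, :4] = True
rec = reconstruct(block, keep_mask)
```

In this sketch the masked pixels are reproduced exactly, while the others carry only the information that survives the down-/up-sampling round trip.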
The development of this novel system involved a detailed analysis of the video coding process and super-resolution upscaling. The systems were then integrated using insights from this analysis and cutting-edge machine learning algorithms.
The system's proof of concept and testing were conducted in two phases: initial experiments with JPEG images followed by video performance assessments. The outcomes from both stages contribute significantly to the advancement of the field and its related applications.
The experimental results demonstrate superior performance compared with both the standard encoders and similar methods in terms of quality and bit rate, particularly in the video coding stage. In the initial experiments with JPEG images, approximately 3% of the blocks were encoded with higher efficiency than standard JPEG encoding. This efficiency was achieved regardless of the presence or absence of down-/up-sampling. Moreover, the average Peak Signal-to-Noise Ratio (PSNR) improvement for these blocks, at the same rate, amounted to 0.7%. Across all sequences, discernible improvements over standard JPEG encoding were observed in certain blocks. This suggests the potential for an advanced JPEG encoder capable of dynamically assessing and selectively employing the most efficient method for each block, thus maximising overall encoding efficiency and resulting video quality.
In the subsequent video performance assessments, the proposed method achieved a 0.21 dB improvement in PSNR and a 3% reduction in bit rate compared to the standard HEVC reference software. When compared with Deep Learning Video Coding (DLVC), notable bit rate enhancements of up to 0.19 were observed in certain sequences.
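For reference, the PSNR figures quoted above are conventionally computed from the mean squared error between an original and a reconstructed frame. The sketch below uses the standard definition and assumes 8-bit samples (peak value 255); the function name and parameters are illustrative and do not come from the thesis.

```python
import numpy as np

def psnr(reference, distorted, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized images."""
    ref = np.asarray(reference, dtype=np.float64)
    dist = np.asarray(distorted, dtype=np.float64)
    mse = np.mean((ref - dist) ** 2)  # mean squared error
    if mse == 0:
        return float("inf")           # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, a gain in dB corresponds to a multiplicative reduction in mean squared error at the same bit rate.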
These findings underscore the effectiveness of the proposed method in enhancing the performance and quality of video encoding processes. The inherent flexibility in the system design offers substantial adaptability to various applications and video content, suggesting further enhancements with increased sequence resolution and the potential for advanced encoding standards such as VVC (H.266).
Date of Award: 10 Dec 2024
Original language: English
Awarding Institution
  • University of Portsmouth
Supervisor: Djamel Ait-Boudaoud (Supervisor), Mo Adda (Supervisor) & Abdelrahman Abdelazim (Supervisor)
