Unsupervised video summarization using deep Non-Local video summarization networks

Sha Sha Zang, Hui Yu*, Yan Song, Ru Zeng

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

64 Downloads (Pure)


Video summarization is to extract effective information from videos to quickly obtain the most informative summary. Most of the existing video summarization methods use recurrent neural networks and their variants such as long and short-term memory (LSTM), to simulates the variable range time dependence between video frames. However, those methods can only process serial inputs of the video frames along with the hidden layer information from the previous time step, which affects the performance and the quality of video summarization. To tackle this issue, we present a deep non-local video summarization network (DN-VSN) for original video abstracts in this paper. Our unsupervised model treats video summarization as a sequence of decision problems. Given an input video, the probability that a video frame is selected as a part of the summary is obtained through a non-local convolutional network, and a strategy gradient algorithm of reinforcement learning is adopted for optimization in the training phase. The proposed method has been tested on four widely used datasets. The experimental results show the superiority of the proposed unsupervised model over the state-of-the-art methods.

Original languageEnglish
Pages (from-to)26-35
Number of pages10
Early online date23 Nov 2022
Publication statusPublished - 28 Jan 2023


  • LSTM
  • non-local convolutional network
  • non-local video summarization network
  • reinforcement learning
  • video summarization


Dive into the research topics of 'Unsupervised video summarization using deep Non-Local video summarization networks'. Together they form a unique fingerprint.

Cite this