A one-stage temporal detector with attentional LSTM for video object detection

Jiahui Yu, Zhaojie Ju, Hongwei Gao, Dalin Zhou

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Temporal object detection is more challenging than static image detection because of the rich context information it involves. Recent state-of-the-art works mine this context information with LSTM-based modules to detect objects in each frame. However, because they explore temporal information only to a limited extent, existing methods have not reported significant results in terms of accuracy and speed. In this paper, we propose a new one-stage temporal detector for online video object detection. A new structure with an improved spatiotemporal LSTM (STLSTM) is proposed to suppress useless background information. In addition, the SSD-based structure is improved to extract rich features and high-level semantic features. We evaluate the proposed model on the ImageNet benchmark and a space human-robot interaction database. Extensive comparisons show that the proposed detector achieves state-of-the-art performance.
Original language: English
Title of host publication: 27th International Conference on Mechatronics and Machine Vision in Practice (M2VIP)
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 464-468
Number of pages: 5
ISBN (Electronic): 9781665431538
ISBN (Print): 9781665431545
Publication status: Published - 7 Jan 2022
Event: 2021 27th International Conference on Mechatronics and Machine Vision in Practice - Shanghai, China
Duration: 26 Nov 2021 to 28 Nov 2021

Conference

Conference: 2021 27th International Conference on Mechatronics and Machine Vision in Practice
Abbreviated title: M2VIP
Country/Territory: China
City: Shanghai
Period: 26/11/21 to 28/11/21
