Attention mechanism and Bidirectional Long Short-Term Memory-Based real-time gaze tracking

Lihong Dai, Jinguo Liu, Zhaojie Ju*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

To improve the accuracy of real-time gaze tracking, this paper studies attention mechanisms and long short-term memory (LSTM) networks for dynamic, continuous video frames. It proposes a real-time gaze-tracking method, SpatiotemporalAM, based on an attention mechanism and a bidirectional LSTM (Bi-LSTM). First, convolutional neural networks (CNNs) extract the spatial features of each image. Next, the Bi-LSTM captures the dynamic temporal features between consecutive frames, drawing on both past and future context. The extracted spatiotemporal features are then fused by an output attention mechanism (OAM), which improves gaze-tracking accuracy. Models with OAM are compared against models with a self-attention mechanism (SAM), and the comparison confirms the advantages of OAM in both accuracy and real-time performance. Further measures are taken to improve accuracy, including a cosine-similarity term in the loss function and ResNet50 with bottleneck residual blocks as the baseline network. Extensive experiments on the public Gaze360 and GazeCapture gaze-tracking datasets verify the effectiveness, real-time performance, and generalization ability of the proposed approach.
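
The abstract names two ingredients that are easy to illustrate in isolation: a cosine-similarity term in the loss and an attention-weighted fusion of per-frame features. The sketch below is not the paper's implementation. It assumes gaze directions are non-zero 3D vectors, that the loss takes the common form 1 - cos(pred, target), and that fusion is a softmax-weighted sum over frames. The function names (cosine_loss, angular_error_deg, attention_pool) and the scalar per-frame scores are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_loss(pred, target):
    """Assumed loss form: 0 when directions coincide, 2 when opposite."""
    return 1.0 - cosine_similarity(pred, target)


def angular_error_deg(pred, target):
    """Angle between predicted and true gaze directions, in degrees."""
    # Clamp to [-1, 1] to guard acos against floating-point drift.
    c = max(-1.0, min(1.0, cosine_similarity(pred, target)))
    return math.degrees(math.acos(c))


def attention_pool(frame_features, scores):
    """Fuse per-frame feature vectors with softmax attention weights.

    frame_features: list of equal-length feature vectors, one per frame
                    (stand-ins for Bi-LSTM outputs).
    scores: one unnormalized relevance score per frame (hypothetical;
            in a trained model these would be learned).
    """
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_features[0])
    fused = [
        sum(w * f[i] for w, f in zip(weights, frame_features))
        for i in range(dim)
    ]
    return fused, weights


if __name__ == "__main__":
    pred, target = [0.1, -0.2, -1.0], [0.0, -0.25, -1.0]
    print("loss:", cosine_loss(pred, target))
    print("angular error (deg):", angular_error_deg(pred, target))
    fused, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.2, 1.3])
    print("weights:", weights, "fused:", fused)
```

A loss based on cosine similarity depends only on the direction of the predicted vector, not its length, which matches the usual angular-error metric for gaze estimation.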

Original language: English
Article number: 4599
Number of pages: 18
Journal: Electronics (Switzerland)
Volume: 13
Issue number: 23
Publication status: Published - 21 Nov 2024

Keywords

  • Attention mechanism
  • bidirectional LSTM
  • CNN
  • gaze tracking
