A two-stream CNN framework for American sign language recognition based on multimodal data fusion

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



Vision-based hand gesture recognition plays an important role in human-robot interaction (HRI). This non-contact method enables natural and friendly interaction between people and robots. To support it, a two-stream CNN framework (2S-CNN) is proposed to recognize American Sign Language (ASL) hand gestures based on multimodal (RGB and depth) data fusion. First, the hand gesture data is enhanced to remove the influence of background and noise. Second, RGB and depth features of the hand gestures are extracted for recognition, using one CNN per stream. Finally, a fusion layer is designed to fuse the recognition results of the two streams. The method uses multimodal data to increase the recognition accuracy of ASL hand gestures. Experiments show that the recognition accuracy of 2S-CNN can reach 92.08% on the ASL fingerspelling database, which is higher than that of the baseline methods.
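The final step, fusing the recognition results of the RGB and depth streams, can be illustrated with a minimal late-fusion sketch. The abstract does not specify how the fusion layer combines the two streams, so the weighted average of class probabilities used here is an assumption chosen for illustration, not the authors' method. The names `softmax`, `fuse_streams`, `predict`, and `rgb_weight` are hypothetical, and the per-class scores stand in for the outputs of the two CNNs.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(
    rgb_logits: Sequence[float],
    depth_logits: Sequence[float],
    rgb_weight: float = 0.5,
) -> List[float]:
    """Late fusion of two streams by a weighted average of class probabilities.

    Assumption: the weighted average is illustrative only; the abstract does
    not describe the fusion rule used in 2S-CNN.
    """
    if len(rgb_logits) != len(depth_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    rgb_probs = softmax(rgb_logits)
    depth_probs = softmax(depth_logits)
    return [
        rgb_weight * r + (1.0 - rgb_weight) * d
        for r, d in zip(rgb_probs, depth_probs)
    ]


def predict(fused_probs: Sequence[float]) -> int:
    """Return the index of the most probable class."""
    return max(range(len(fused_probs)), key=lambda i: fused_probs[i])


# Hypothetical scores for three gesture classes from each stream.
rgb_scores = [2.0, 1.0, 0.1]
depth_scores = [0.5, 3.0, 0.2]
fused = fuse_streams(rgb_scores, depth_scores)
label = predict(fused)
```

In this toy input the RGB stream alone favors class 0, while the depth stream is more confident about class 1, so the equally weighted fusion selects class 1. Setting `rgb_weight` to 1.0 reduces the result to the RGB stream on its own.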
Original language: English
Title of host publication: Advances in Computational Intelligence Systems
Editors: Zhaojie Ju, Longzhi Yang, Chenguang Yang, Alexander Gegov, Dalin Zhou
ISBN (Electronic): 978-3-030-29933-0
ISBN (Print): 978-3-030-29932-3
Publication status: Published - Sept 2019
Event: 19th UK Workshop on Computational Intelligence - Portsmouth, United Kingdom
Duration: 4 Sept 2019 → 5 Sept 2019
Conference number: 19

Publication series

Name: Advances in Computational Intelligence Systems
Publisher: Springer, Cham
ISSN (Print): 2194-5357
ISSN (Electronic): 2194-5365


Workshop: 19th UK Workshop on Computational Intelligence
Abbreviated title: UKCI 2019
Country/Territory: United Kingdom
Other: UKCI 2019 covers both theory and applications in computational intelligence. The topics of interest include:
Fuzzy Systems
Neural Networks
Evolutionary Computation
Evolving Systems
Machine Learning
Data Mining
Cognitive Computing
Intelligent Robotics
Hybrid Methods
Deep Learning
Applications of Computational Intelligence

