TY - JOUR
T1 - Robust detection method for improving small traffic sign recognition based on spatial pyramid pooling
AU - Dewi, Christine
AU - Chen, Rung Ching
AU - Yu, Hui
AU - Jiang, Xiaoyi
N1 - Funding Information:
This paper is supported by the Ministry of Science and Technology, Taiwan, under grant nos. MOST-110-2927-I-324-50, MOST-110-2221-E-324-010, and MOST-109-2622-E-324-004. Additionally, this study was partially funded by the EU Horizon 2020 program RISE Project ULTRACEPT under Grant 778062.
Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2021/11/19
Y1 - 2021/11/19
N2 - Traffic sign recognition plays a crucial role in driver guidance and remains a major challenge for real-world applications. In practical autonomous driving scenes, traffic signs are very difficult to detect with an approach that is both highly precise and real-time. This article reviews several object detection methods, including Yolo V3 and Densenet, in conjunction with spatial pyramid pooling (SPP). SPP is added to the Yolo V3 and Densenet backbone networks to improve feature extraction and to learn object features more completely. The models are evaluated and compared on key metrics such as mean average precision (mAP), workspace size, detection time, and billions of floating-point operations (BFLOPS). Based on the experimental results, Yolo V3 SPP outperforms state-of-the-art systems. Specifically, Yolo V3 SPP obtains 87.8% accuracy for the small (S) target group, 98.0% for the medium (M) target group, and 98.6% for the large (L) target group in the BTSD dataset. Yolo V3 SPP also obtains the highest total BFLOPS (66.111) and the highest mAP (99.28%). Overall, SPP improves the performance of all experimental models.
AB - Traffic sign recognition plays a crucial role in driver guidance and remains a major challenge for real-world applications. In practical autonomous driving scenes, traffic signs are very difficult to detect with an approach that is both highly precise and real-time. This article reviews several object detection methods, including Yolo V3 and Densenet, in conjunction with spatial pyramid pooling (SPP). SPP is added to the Yolo V3 and Densenet backbone networks to improve feature extraction and to learn object features more completely. The models are evaluated and compared on key metrics such as mean average precision (mAP), workspace size, detection time, and billions of floating-point operations (BFLOPS). Based on the experimental results, Yolo V3 SPP outperforms state-of-the-art systems. Specifically, Yolo V3 SPP obtains 87.8% accuracy for the small (S) target group, 98.0% for the medium (M) target group, and 98.6% for the large (L) target group in the BTSD dataset. Yolo V3 SPP also obtains the highest total BFLOPS (66.111) and the highest mAP (99.28%). Overall, SPP improves the performance of all experimental models.
KW - CNN
KW - Densenet
KW - object recognition and detection
KW - spatial pyramid pooling
KW - Yolo V3 SPP
UR - http://www.scopus.com/inward/record.url?scp=85119476725&partnerID=8YFLogxK
U2 - 10.1007/s12652-021-03584-0
DO - 10.1007/s12652-021-03584-0
M3 - Article
AN - SCOPUS:85119476725
SN - 1868-5137
JO - Journal of Ambient Intelligence and Humanized Computing
JF - Journal of Ambient Intelligence and Humanized Computing
ER -