AS-TransUnet: combining ASPP and transformer for semantic segmentation

Jinshuo Wang, Dongxu Gao, Xuna Wang, Hongwei Gao, Zhaojie Ju

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Semantic segmentation is the task of classifying each pixel in an image. Most recent semantic segmentation methods adopt the fully convolutional network (FCN), which uses an encoder-decoder architecture. The encoder extracts features, and the decoder takes the encoded features as input and decodes them into the final segmentation prediction. However, the convolutional kernels used for feature extraction are not large, so the model can only use local information to understand the input image, which limits its initial receptive field. In addition, semantic segmentation needs details beyond semantic information, such as contextual information. To solve these problems, we introduce the atrous spatial pyramid pooling (ASPP) structure into TransUnet, a model based on Transformers and U-Net, and call the result AS-TransUnet. The spatial pyramid module provides multiple receptive fields and thereby captures multi-scale information. In addition, we add an attention module to the decoder to help the model learn relevant features. To verify the performance and efficiency of the model, we conducted experiments on two common datasets and compared the model with the latest models. The experimental results show the superiority of the proposed model.
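The abstract's claim that a spatial pyramid module widens the receptive field can be illustrated with the standard dilated (atrous) convolution relation: a kernel of size k with dilation d spans k + (k - 1)(d - 1) pixels along each axis. The sketch below is illustrative only. The function names and the dilation rates (1, 6, 12, 18, a common DeepLab-style choice) are assumptions and are not taken from the AS-TransUnet paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Pixels spanned along one axis by a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Assumed dilation rates: a common DeepLab-style ASPP setting,
# NOT confirmed as the configuration used in AS-TransUnet.
ASSUMED_RATES = (1, 6, 12, 18)


def aspp_branch_spans(kernel_size: int = 3, rates=ASSUMED_RATES) -> dict:
    """Map each parallel branch's dilation rate to the span of its kernel."""
    return {rate: effective_kernel_size(kernel_size, rate) for rate in rates}


if __name__ == "__main__":
    for rate, span in aspp_branch_spans().items():
        print(f"dilation {rate:>2}: 3x3 kernel spans {span}x{span} pixels")
```

Under these assumed rates, the parallel branches cover spans from 3 to 37 pixels with the same nine weights per kernel, which is the sense in which an ASPP-style module gathers multi-scale context without enlarging the kernels.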
Original language: English
Title of host publication: Intelligent Robotics and Applications: 16th International Conference, ICIRA 2023, Hangzhou, China, July 5–7, 2023, Proceedings, Part II
Editors: Huayong Yang, Honghai Liu, Jun Zou, Zhouping Yin, Lianqing Liu, Geng Yang, Xiaoping Ouyang
Publisher: Springer
Pages: 147-158
Number of pages: 13
ISBN (Electronic): 9789819964864
ISBN (Print): 9789819964857
Publication status: Published - 10 Oct 2023
Event: International Conference on Intelligent Robotics and Applications - Hangzhou, China
Duration: 5 Jul 2023 to 7 Jul 2023

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 14268
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Name: Lecture Notes in Artificial Intelligence
Publisher: Springer
ISSN (Print): 2945-9133
ISSN (Electronic): 2945-9141

Conference

Conference: International Conference on Intelligent Robotics and Applications
Country/Territory: China
City: Hangzhou
Period: 5/07/23 to 7/07/23

Keywords

  • FCN
  • TransUnet
  • ASPP
