DeliteSeg: A Real-Time Semantic Segmentation Model for Predicting Small Objects and Object Contours
DOI:
https://doi.org/10.31577/cai_2025_1_71
Keywords:
Real-time performance, deformable convolution, deep pyramid aggregation module, lightweight attention decoder
Abstract
Semantic segmentation is one of the key technologies in the development of autonomous vehicles, and practical applications increasingly demand a balance between effectiveness and efficiency. Many current lightweight segmentation models struggle to predict small objects and the edges between different objects. In this work, we propose DeliteSeg, a model with an encoder-decoder structure. First, we add deformable convolutional layers to the encoder so that the model predicts object edges more accurately. Second, we propose a new deep context aggregation module, DLPPM, which improves context aggregation by repeatedly fusing low-resolution feature maps of different scales, enabling the model to better predict small objects. Finally, we design a new lightweight attention decoder (LMD) that uses a spatial-channel attention mechanism to refine feature maps at different levels, effectively recovering information. In extensive experiments, our network achieves 73.6 % mIoU at 123.7 FPS on the Cityscapes dataset and 73.9 % mIoU at 116.4 FPS on the CamVid dataset. These results confirm that the proposed model strikes an appropriate trade-off between accuracy and real-time performance.
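For readers unfamiliar with the accuracy metric reported above, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from predicted and ground-truth labels. It is not the evaluation code used for DeliteSeg; the function name `mean_iou`, the flat-list input format, and the ignore label 255 are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred, target: flat sequences of integer class labels of equal length.
    ignore_index: ground-truth label to skip (assumed here; 255 is a common
    convention for unlabeled pixels).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as c
    target_count = [0] * num_classes  # pixels labeled as c

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not contribute to any class
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: class 0 has IoU 1/2, class 1 has IoU 2/3.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
print(round(mean_iou(pred, target, num_classes=2), 4))  # prints 0.5833
```

Benchmark evaluations usually accumulate the per-class intersections and unions over the whole test set before dividing, rather than averaging per-image scores; the sketch applies the same formula to a single label sequence.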
