Abstract: Pavement defect images have low contrast and complex topological structure, and most existing segmentation algorithms still fall short both in receptive-field coverage and in extracting pavement defect features. This article therefore proposes an improved U-Net algorithm for road defect image segmentation. First, an SN-Disout residual block is introduced into the classic U-Net convolution block to make the model more robust against overfitting. Second, a criss-cross module is inserted between the encoder and the decoder to strengthen the model's ability to capture features across different positions in the feature map and to model defect boundaries more accurately. Third, a spatial-channel squeeze-and-excitation module is introduced in the decoder so that the network focuses on important features and depends less on irrelevant or noisy ones. Fourth, position-aware multi-head attention is added to the neck of the model to help it better understand and exploit the internal relationships of the input data, thereby improving its performance and expressive capability. Finally, a hybrid Dice+BCE loss function replaces a single loss function. On the Crack500 image dataset, the algorithm reaches an intersection over union (IoU) of 60.13% and an F1 score of 75.22%, both exceeding mainstream semantic segmentation networks such as U-Net, DeepLabV3+, PSPNet, TransU-Net, and UNet++. Experimental results show that the algorithm effectively improves the network's prediction accuracy and its segmentation of small targets, and that it meets real-time requirements while maintaining segmentation accuracy.
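For readers unfamiliar with the hybrid loss and the two reported metrics, the following is a minimal NumPy sketch of a Dice+BCE loss together with IoU and F1 for a binary defect mask. It is an illustration under stated assumptions, not the implementation used in this work: the equal weights `w_dice` and `w_bce`, the smoothing constant `smooth`, the 0.5 decision threshold, and all function names are hypothetical choices, since the abstract does not specify them.

```python
import numpy as np

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and a 0/1 mask."""
    p = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p)))

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2|P.T| + s) / (|P| + |T| + s). `smooth` is an assumed constant."""
    inter = np.sum(probs * targets)
    denom = np.sum(probs) + np.sum(targets)
    return float(1.0 - (2.0 * inter + smooth) / (denom + smooth))

def dice_bce_loss(probs, targets, w_dice=0.5, w_bce=0.5):
    """Hybrid loss as a weighted sum; the equal weighting is an assumption."""
    return w_dice * dice_loss(probs, targets) + w_bce * bce_loss(probs, targets)

def iou_and_f1(probs, targets, threshold=0.5):
    """IoU and F1 of the defect class after thresholding the probabilities."""
    pred = probs >= threshold
    gt = targets >= 0.5
    tp = np.sum(pred & gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    union = tp + fp + fn
    # Convention assumed here: an empty prediction on an empty mask scores 1.0.
    iou = tp / union if union > 0 else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if union > 0 else 1.0
    return float(iou), float(f1)

if __name__ == "__main__":
    mask = np.array([[1.0, 0.0], [1.0, 0.0]])   # toy ground-truth mask
    probs = np.array([[0.9, 0.8], [0.2, 0.1]])  # toy predicted probabilities
    print("loss:", dice_bce_loss(probs, mask))
    print("IoU, F1:", iou_and_f1(probs, mask))
```

The Dice term is computed over the whole mask and is therefore less sensitive to the heavy background-to-crack imbalance typical of pavement images, while the BCE term supplies a per-pixel gradient; combining the two is a common motivation for such hybrid losses.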