Cross-modal semantic segmentation algorithm based on dual-branch multi-scale feature fusion

Authors: Chen Guangqiu, Ren Tianrong, Duan Jin, Huang Dandan

Affiliation: School of Electronic Information Engineering, Changchun University of Science and Technology, Changchun 130022, China

CLC number: TP391.41; TN215

Fund project: Supported by the Major Scientific Instrument Special Project of the National Natural Science Foundation of China (62127813) and the Jilin Province Science and Technology Development Plan Project (20210203181SF)


Abstract:

Semantic segmentation of single-modal visible (RGB) images suffers from poor segmentation quality and blurred object boundaries at night or under changing illumination, and existing cross-modal semantic segmentation networks still fall short in capturing global context and fusing cross-modal features. To address these problems, this paper proposes a cross-modal semantic segmentation algorithm based on dual-branch multi-scale feature fusion. Segformer is adopted as the backbone network to extract features and capture long-range dependencies; a feature enhancement module improves the contrast of shallow feature maps and the discriminability of edge information; and an effective attention enhancement module, together with a cross-modal feature fusion module, models the relationships between pixels of the feature maps from different modalities, aggregates complementary information, and exploits the strengths of cross-modal features. Finally, a lightweight All-MLP decoder reconstructs the image and predicts the segmentation result. Compared with existing mainstream algorithms, the proposed algorithm achieves the best results on all evaluation metrics on the MFNet urban street scene dataset, reaching a mean accuracy (mAcc) of 76.9% and a mean intersection over union (mIoU) of 59.8%. Experimental results show that, in complex scenes, the algorithm effectively alleviates blurred segmentation of object edge contours and improves segmentation accuracy.
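The following is a minimal, illustrative PyTorch sketch of the dual-branch layout that the abstract describes: two encoder branches (RGB and thermal), per-stage cross-modal fusion, and a lightweight All-MLP-style decoder. It is not the paper's implementation: the Segformer (MiT) transformer stages are replaced by simple convolutional stand-ins, and the CrossModalFusion module (as well as the omitted feature enhancement and attention enhancement modules) is a hypothetical simplification used only to show how the pieces connect.

    # Minimal sketch of the dual-branch design described in the abstract.
    # Encoder stages and fusion modules are simplified stand-ins, not the paper's modules.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class StageStandIn(nn.Module):
        """Placeholder for one Segformer (MiT) encoder stage: downsample + features."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.block = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )

        def forward(self, x):
            return self.block(x)


    class CrossModalFusion(nn.Module):
        """Hypothetical fusion: channel gating over concatenated RGB/thermal features."""
        def __init__(self, ch):
            super().__init__()
            self.gate = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(2 * ch, ch, 1),
                nn.Sigmoid(),
            )
            self.proj = nn.Conv2d(2 * ch, ch, 1)

        def forward(self, f_rgb, f_th):
            cat = torch.cat([f_rgb, f_th], dim=1)
            return self.proj(cat) * self.gate(cat)


    class DualBranchSegNet(nn.Module):
        """Dual-branch encoders -> per-stage cross-modal fusion -> All-MLP-style decoder."""
        def __init__(self, num_classes=9, widths=(32, 64, 160, 256), embed_dim=256):
            super().__init__()
            rgb_chans, th_chans = (3,) + widths, (1,) + widths  # RGB branch, thermal branch
            self.rgb_stages = nn.ModuleList(
                StageStandIn(rgb_chans[i], rgb_chans[i + 1]) for i in range(4))
            self.th_stages = nn.ModuleList(
                StageStandIn(th_chans[i], th_chans[i + 1]) for i in range(4))
            self.fusions = nn.ModuleList(CrossModalFusion(w) for w in widths)
            # All-MLP-style decoder: project each fused scale to a common dimension,
            # upsample to the largest feature resolution, concatenate, classify.
            self.linear_c = nn.ModuleList(nn.Conv2d(w, embed_dim, 1) for w in widths)
            self.linear_fuse = nn.Conv2d(4 * embed_dim, embed_dim, 1)
            self.classifier = nn.Conv2d(embed_dim, num_classes, 1)

        def forward(self, rgb, thermal):
            fused, x_r, x_t = [], rgb, thermal
            for stage_r, stage_t, fuse in zip(self.rgb_stages, self.th_stages, self.fusions):
                x_r, x_t = stage_r(x_r), stage_t(x_t)
                fused.append(fuse(x_r, x_t))
            size = fused[0].shape[2:]
            feats = [F.interpolate(proj(f), size=size, mode="bilinear", align_corners=False)
                     for proj, f in zip(self.linear_c, fused)]
            out = self.classifier(self.linear_fuse(torch.cat(feats, dim=1)))
            # Upsample logits back to the input resolution.
            return F.interpolate(out, size=rgb.shape[2:], mode="bilinear", align_corners=False)


    if __name__ == "__main__":
        net = DualBranchSegNet(num_classes=9)   # MFNet: 8 object classes + background
        rgb = torch.randn(1, 3, 480, 640)       # MFNet frames are 480x640
        thermal = torch.randn(1, 1, 480, 640)
        print(net(rgb, thermal).shape)          # torch.Size([1, 9, 480, 640])

The decoder here mirrors the All-MLP idea the abstract mentions: each fused multi-scale feature is projected to a common embedding dimension, upsampled, concatenated, and mapped to per-class logits, keeping the prediction head lightweight relative to the encoders.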

Cite this article

Chen Guangqiu, Ren Tianrong, Duan Jin, Huang Dandan. Cross-modal semantic segmentation algorithm based on dual-branch multi-scale feature fusion [J]. 电子测量与仪器学报, 2025, 39(5): 144-154.

History
  • Online publication date: 2025-07-04