Li Tianfang, Sun Yichen, Yu Mingxin, Dong Mingli. Infrared and visible image fusion method integrating semantic segmentation and cross-modality differential feature compensation [J]. Journal of Electronic Measurement and Instrumentation, 2024, 38(7): 34-45.
Infrared and visible image fusion method integrating semantic segmentation and cross-modality differential feature compensation
  
Keywords: image fusion; semantic segmentation; attention mechanism; cross-modality differential feature compensation
Funding: Supported by the General Program of the Beijing Municipal Education Commission Science and Technology Plan (KM202011232007), the University Discipline Talent Introduction Program (D17021), and the Beijing Information Science and Technology University Connotation Development Project (2019KYNH204)
Author Affiliations
Li Tianfang School of Instrument Science and OptoElectronics Engineering, Beijing Information Science and Technology University, Beijing 100192, China 
Sun Yichen School of Instrument Science and OptoElectronics Engineering, Beijing Information Science and Technology University, Beijing 100192, China 
Yu Mingxin School of Instrument Science and OptoElectronics Engineering, Beijing Information Science and Technology University, Beijing 100192, China 
Dong Mingli School of Instrument Science and OptoElectronics Engineering, Beijing Information Science and Technology University, Beijing 100192, China 
Abstract:
      To address the issues of detail information loss and blurred salient target contours in existing infrared and visible image fusion models during deep feature extraction, we propose an infrared and visible image fusion method that combines semantic segmentation with cross-modality differential feature compensation (CMDFC). A cross-modality differential feature compensation module equipped with a convolutional block attention module (CBAM) superimposes complementary feature information from the other modality onto the original features during deep feature extraction. A semantic segmentation network is then introduced to perform pixel-level classification on the fused image, constructing a semantic loss that constrains the fusion network, and a decoder reconstructs the fused image. Fusion experiments on public datasets show that, compared with the best values among the reference models, all five selected metrics improve to varying degrees; in particular, mutual information (MI) and visual information fidelity (VIF) improve by 4.41% and 4.25%, respectively. These results indicate that the proposed model generates clearer fused images that are more strongly correlated with the source images, effectively mitigating the loss of feature detail during infrared and visible image fusion and enhancing the visual quality and contrast of the generated images.
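The following is a minimal PyTorch sketch of the kind of cross-modality differential feature compensation block with CBAM-style attention described in the abstract. The class names (CBAM, CMDFCBlock), channel sizes, reduction ratio, and the exact way differential features are weighted and added back are illustrative assumptions for this page only; the paper's actual architecture and layer configuration are not reproduced here.

import torch
import torch.nn as nn


class CBAM(nn.Module):
    """Convolutional block attention: channel attention followed by spatial attention."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Channel attention from globally average- and max-pooled descriptors.
        chan = torch.sigmoid(self.channel_mlp(x.mean(dim=(2, 3))) +
                             self.channel_mlp(x.amax(dim=(2, 3))))
        x = x * chan.view(b, c, 1, 1)
        # Spatial attention from channel-wise mean and max maps.
        spat = torch.cat([x.mean(dim=1, keepdim=True),
                          x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(spat))


class CMDFCBlock(nn.Module):
    """Cross-modality differential feature compensation (illustrative sketch).

    Each modality's features are compensated with attention-weighted differential
    features taken from the other modality, so complementary information is
    superimposed onto the original features before deeper extraction.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.attn_ir_only = CBAM(channels)   # weights information IR has but visible lacks
        self.attn_vis_only = CBAM(channels)  # weights information visible has but IR lacks

    def forward(self, feat_ir: torch.Tensor, feat_vis: torch.Tensor):
        ir_only = self.attn_ir_only(feat_ir - feat_vis)    # IR-specific residual
        vis_only = self.attn_vis_only(feat_vis - feat_ir)  # visible-specific residual
        comp_ir = feat_ir + vis_only   # IR branch gains visible-only detail
        comp_vis = feat_vis + ir_only  # visible branch gains IR-only saliency
        return comp_ir, comp_vis


if __name__ == "__main__":
    block = CMDFCBlock(channels=64)
    feat_ir = torch.randn(1, 64, 120, 160)
    feat_vis = torch.randn(1, 64, 120, 160)
    comp_ir, comp_vis = block(feat_ir, feat_vis)
    print(comp_ir.shape, comp_vis.shape)  # both torch.Size([1, 64, 120, 160])

Downstream, the semantic constraint mentioned in the abstract would amount to segmenting the fused image and adding a pixel-wise classification loss to the fusion objective, for instance a total loss of the form L_total = L_fusion + λ·L_semantic; the actual loss terms and weighting used in the paper are not specified on this page and the form above is only an assumption.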