Unsupervised monocular depth estimation based on stable photometric loss

Affiliation:

Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China

CLC number:

TN911.73

Fund Project:

Supported by the National Natural Science Foundation of China (62173160)

    Abstract:

    Photometric loss plays a central role in training video-based unsupervised monocular depth estimation models. However, it tends to produce large errors in special regions such as weakly textured regions and edges, which makes the supervision signal for network training highly unstable. To address this problem, a more robust unsupervised monocular depth estimation method is proposed. The method first combines a dual-branch encoder with a channel attention module to improve the ability of the single-frame depth network to extract depth features. The output of the single-frame depth network is then used to guide multi-frame depth estimation, improving the accuracy of the estimated depth. On this basis, a new photometric loss function is designed. Computing the photometric loss on image gradients eliminates the unreasonable supervision caused by local brightness changes. The differences between consecutive pixels are used to define blurry pixels, and a binary mask then excludes the erroneous supervision produced by blurry edge pixels in the target frame and the reconstructed target frame. On the KITTI dataset, several metrics improve, including the average relative error, the squared relative error, and the root mean square error; the average relative error and the squared relative error are reduced to 0.075 and 0.548, respectively. The experimental results show that, compared with other advanced methods, the proposed method further improves the performance of existing models.
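
    The abstract gives no formulas for the gradient-domain loss or the blurry-pixel mask, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes grayscale images stored as nested lists, forward-difference gradients, an L1 penalty, and a hypothetical blur rule: a pixel counts as blurry when it lies inside a horizontal intensity ramp. The names `grad_x`, `grad_y`, `blurry_mask`, `stable_photometric_loss`, `intensity_l1`, and the threshold `tau` are all illustrative.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of intensities


def grad_x(img: Image) -> Image:
    """Forward difference along each row; the last column is set to 0."""
    w = len(img[0])
    return [[row[x + 1] - row[x] if x + 1 < w else 0.0 for x in range(w)]
            for row in img]


def grad_y(img: Image) -> Image:
    """Forward difference along each column; the last row is set to 0."""
    h, w = len(img), len(img[0])
    return [[img[y + 1][x] - img[y][x] if y + 1 < h else 0.0 for x in range(w)]
            for y in range(h)]


def blurry_mask(img: Image, tau: float = 0.1) -> List[List[bool]]:
    """Assumed rule (not from the paper): a pixel is blurry when the steps to
    its left and right neighbours share a sign and both exceed tau, i.e. the
    pixel holds an intermediate value inside a smeared edge."""
    h, w = len(img), len(img[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(1, w - 1):
            left = img[y][x] - img[y][x - 1]
            right = img[y][x + 1] - img[y][x]
            if left * right > 0 and min(abs(left), abs(right)) > tau:
                mask[y][x] = True
    return mask


def stable_photometric_loss(target: Image, recon: Image, tau: float = 0.1) -> float:
    """Mean L1 difference of image gradients over pixels that are blurry in
    neither the target frame nor the reconstructed target frame."""
    gx_t, gx_r = grad_x(target), grad_x(recon)
    gy_t, gy_r = grad_y(target), grad_y(recon)
    blur_t, blur_r = blurry_mask(target, tau), blurry_mask(recon, tau)
    total, count = 0.0, 0
    for y in range(len(target)):
        for x in range(len(target[0])):
            if blur_t[y][x] or blur_r[y][x]:
                continue  # binary mask: skip unreliable pixels
            total += abs(gx_t[y][x] - gx_r[y][x]) + abs(gy_t[y][x] - gy_r[y][x])
            count += 1
    return total / count if count else 0.0


def intensity_l1(target: Image, recon: Image) -> float:
    """Plain intensity-domain L1 loss, for comparison."""
    diffs = [abs(t - r) for rt, rr in zip(target, recon) for t, r in zip(rt, rr)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    target = [[0.0, 0.0, 0.5, 1.0, 1.0],
              [0.0, 0.0, 0.5, 1.0, 1.0]]
    # Same scene with a uniform brightness offset.
    brighter = [[v + 0.25 for v in row] for row in target]
    print("intensity L1:", intensity_l1(target, brighter))
    print("gradient loss:", stable_photometric_loss(target, brighter))
```

    In this toy case the uniform brightness offset is penalized by the intensity-domain loss but cancels in the gradient domain, which is the effect the abstract attributes to computing the loss on image gradients. The pixel with value 0.5 sits inside the smeared edge and is excluded by the mask.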

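Cite this article

QU Yi (曲熠), CHEN Ying (陈莹). Unsupervised monocular depth estimation based on stable photometric loss[J]. Journal of Electronic Measurement and Instrumentation (电子测量与仪器学报), 2024, 38(11): 158-167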

History
  • Published online: 2025-01-13