Combined multi-attention and C-ASPP network for monocular 3D object detection

CLC number: TN966.6

Fund project: Supported by a Shaanxi Provincial Department of Science and Technology project (2018GY-173) and a Xi'an Science and Technology Bureau project (GXYD7.5)

Abstract:

Monocular 3D detection networks tend to be structurally complex, and the target depth obtained from depth estimation is often inaccurate. To address these two problems, this paper proposes CDCN-3D, an end-to-end monocular 3D object detection network that combines multi-attention with depth estimation. First, to capture the salient features of the target, an adaptive spatial attention mechanism aggregates pixel features, which enhances local features and improves the network's representational ability. Second, to reduce the loss of local information during depth estimation, an improved C-ASPP lets each piece of depth information capture more accurate direction-aware and position-sensitive information. Finally, an accurate P-BEV maps the target's three-dimensional information onto a two-dimensional plane, and a single-stage object detector produces the detection output. Experiments on the KITTI dataset show that, at an FPS on par with existing monocular 3D detection networks, CDCN-3D is more accurate than the other networks: detection accuracy on the Car, Pedestrian, and Cyclist classes improves by 2.31%, 1.48%, and 1.14%, respectively, showing that the network can carry out the 3D object detection task.
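
The abstract's final step, mapping 3D information onto a 2D plane, can be illustrated with a generic bird's-eye-view (BEV) projection. The sketch below is not the paper's P-BEV, whose details the abstract does not give. It is a minimal illustration that assumes camera-style axes (x right, y down, z forward) and simply counts points per ground-plane cell. The function name `points_to_bev`, the ranges, and the cell size are all hypothetical.

```python
def points_to_bev(points, x_range=(-20.0, 20.0), z_range=(0.0, 80.0), cell=0.5):
    """Count 3D points per cell of a top-down (bird's-eye-view) grid.

    points: iterable of (x, y, z) in metres. Assumed axes: x to the right,
            y down, z forward. The height y is dropped by the projection.
    Returns a list of rows; grid[row][col] is the number of points in that cell.
    """
    cols = int((x_range[1] - x_range[0]) / cell)
    rows = int((z_range[1] - z_range[0]) / cell)
    grid = [[0] * cols for _ in range(rows)]
    for x, _y, z in points:
        # Skip points that fall outside the chosen ground-plane window.
        if not (x_range[0] <= x < x_range[1] and z_range[0] <= z < z_range[1]):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        col = min(int((x - x_range[0]) / cell), cols - 1)
        row = min(int((z - z_range[0]) / cell), rows - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    # Two points about 10 m ahead share a cell; the third lies outside the window.
    bev = points_to_bev([(0.0, 1.5, 10.0), (0.0, -0.5, 10.2), (25.0, 0.0, 10.0)])
    print(len(bev), len(bev[0]), bev[20][40])
```

A 2D detector can then treat such a grid as an image. A real pipeline would typically store learned features or heights per cell rather than raw counts.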

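Cite this article

Zheng Zili, Xu Jian, Liu Xiuping, Liu Gaofeng, Zhao Yijian, Xia Daihong. Combined multi-attention and C-ASPP network for monocular 3D object detection [J]. Journal of Electronic Measurement and Instrumentation (电子测量与仪器学报), 2023, 37(8): 241-248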

History
  • Published online: 2023-11-23