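Zheng Zili, Xu Jian, Liu Xiuping, Liu Gaofeng, Zhao Yijian, Xia Daihong. Combined multi-attention and C-ASPP network for monocular 3D object detection[J]. Journal of Electronic Measurement and Instrumentation, 2023, 37(8): 241-248 |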
Combined multi-attention and C-ASPP network for monocular 3D object detection |
DOI: |
Keywords: monocular 3D object detection; depth estimation; multi-attention mechanism; machine vision; autonomous driving |
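Funding: Supported by a Shaanxi Provincial Department of Science and Technology project (2018GY-173) and a Xi'an Science and Technology Bureau project (GXYD7.5) |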
|
|
Abstract: |
Monocular 3D detection suffers from two problems: complex network structures and inaccurate target depth
information after depth estimation. To address both, we propose CDCN-3D, an end-to-end monocular 3D object
detection network that combines multi-attention with depth estimation. First, to capture the salient features of the
target, an adaptive spatial attention mechanism aggregates pixel features, which enhances local features and improves
the network's representational ability. Second, to reduce the loss of local information during depth estimation, an
improved C-ASPP lets each piece of depth information capture more accurate direction-aware and position-sensitive
information. Finally, an accurate P-BEV maps the three-dimensional information of the target onto a two-dimensional
plane, and a single-stage object detector produces the detection output. Experiments on the KITTI dataset show that,
at an FPS on par with existing monocular 3D detection networks, CDCN-3D is more accurate than the other networks:
detection accuracy improves by 2.31%, 1.48%, and 1.14% for the Car, Pedestrian, and Cyclist classes, respectively,
showing that the network can carry out the 3D object detection task. |
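The final stage in the abstract maps three-dimensional information onto a two-dimensional plane. As a rough illustration of that general idea only, the sketch below bins 3D points into a bird's-eye-view (BEV) grid by discarding the height axis. It is not the paper's P-BEV, whose details the abstract does not give. The function name `points_to_bev`, the camera-style axes (x right, y down, z forward), the ground ranges, and the cell size are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed convention for this sketch: (x, y, z) with x right, y down, z forward.
Point3D = Tuple[float, float, float]


def points_to_bev(
    points: List[Point3D],
    x_range: Tuple[float, float] = (-40.0, 40.0),  # assumed lateral extent, metres
    z_range: Tuple[float, float] = (0.0, 80.0),    # assumed forward extent, metres
    cell_size: float = 0.5,                        # assumed cell edge, metres
) -> List[List[int]]:
    """Count points per BEV cell; rows index depth (z), columns index x."""
    n_cols = int((x_range[1] - x_range[0]) / cell_size)
    n_rows = int((z_range[1] - z_range[0]) / cell_size)
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, _y, z in points:  # the height axis (y) is dropped
        # Skip points that fall outside the covered ground area.
        if not (x_range[0] <= x < x_range[1] and z_range[0] <= z < z_range[1]):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        col = min(int((x - x_range[0]) / cell_size), n_cols - 1)
        row = min(int((z - z_range[0]) / cell_size), n_rows - 1)
        grid[row][col] += 1
    return grid


# Hypothetical points: two near (0 m, 10 m) on the ground plane, one out of range.
example_points = [(0.0, 1.5, 10.0), (0.2, 1.0, 10.1), (100.0, 0.0, 5.0)]
bev = points_to_bev(example_points)
print(bev[20][80])  # -> 2
```

In a detection network the cells would typically hold learned feature vectors rather than point counts; counts are used here only to keep the example small.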
|
|
|