Target detection of workshop tools based on sparse learnable proposal

Affiliation:

School of Electronic Engineering, Jiangsu Ocean University, Lianyungang 222005, China

Fund Project:

National Natural Science Foundation of China (General Program, Key Program, Major Program)

Abstract:

To address the large size differences between models of workshop tools and the wide variety of tool shapes, a workshop tool detection algorithm based on sparse learnable proposals is proposed. First, sparse representation and a learnable proposal mechanism are incorporated to improve the robustness of the model and reduce the number of parameters required for detection. Second, a Swin-Transformer structure is introduced to strengthen the model's learning of both global context and fine detail, which addresses the weakness of traditional convolutional neural networks in fusing high-level semantic information. Third, an improved multi-scale feature fusion network fuses features from different scales, which improves the detection of targets of various sizes. Finally, multi-head attention is combined with dynamic convolution to establish more precise and fine-grained connections between feature layers, further improving detection accuracy. In addition, the CIoU loss function is adopted; by jointly considering position, scale, and shape information, it makes bounding-box regression more comprehensive and accurate. Experimental results show that the proposed algorithm reaches an average detection accuracy of 91% on the workshop tool detection task, at least 2.3% higher than current mainstream algorithms. The detection time for a single image is about 53 ms, which meets the requirement for real-time detection and indicates strong overall performance.
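The abstract says the CIoU loss combines position, scale, and shape information for bounding-box regression. The paper's own implementation is not shown on this page, so the following is only a minimal sketch of the commonly used CIoU definition, written in plain Python for a single pair of boxes. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabiliser are assumptions made for illustration.

```python
import math


def ciou_loss(box_pred, box_gt, eps=1e-9):
    """CIoU loss for two boxes given as (x1, y1, x2, y2). Hypothetical helper."""
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Overlap term: ordinary intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Position term: squared distance between box centres, normalised by
    # the squared diagonal of the smallest box enclosing both.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Shape term: penalises a mismatch in aspect ratio.
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    # Identical boxes give a loss close to zero; disjoint boxes exceed one.
    print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike plain IoU, this loss still changes when the boxes do not overlap, because the centre-distance term keeps varying with the predicted position. A training pipeline would normally use a batched tensor version rather than this scalar form.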

History
  • Received: 2024-01-04
  • Last revised: 2024-05-11
  • Accepted: 2024-05-14