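Zhou Yanzhen, Wu Ruidong, Yu Xiao, Fu Ping, Liu Bing, Li Junbao. Research and implementation of CNN-SVM algorithm based on FPGA [J]. Journal of Electronic Measurement and Instrumentation, 2021, 35(4): 90-98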
Research and implementation of CNN-SVM algorithm based on FPGA
  
DOI:
Keywords: CNN-SVM algorithm; FPGA implementation; hardware accelerator design; hardware and software co-design
Funding: Supported by the National Natural Science Foundation of China (61671170)
Author  Institution
Zhou Yanzhen  1. College of Electronic and Information Engineering, Harbin Institute of Technology
Wu Ruidong  1. College of Electronic and Information Engineering, Harbin Institute of Technology
Yu Xiao  2. Shenyang Aircraft D&R Institute
Fu Ping  1. College of Electronic and Information Engineering, Harbin Institute of Technology
Liu Bing  1. College of Electronic and Information Engineering, Harbin Institute of Technology
Li Junbao  1. College of Electronic and Information Engineering, Harbin Institute of Technology
Abstract views: 403
Full-text downloads: 6
Abstract:
      The convolutional neural network-support vector machine (CNN-SVM) hybrid algorithm combines the feature extraction capability of a CNN with the classification performance of an SVM. It offers advantages in computational complexity and in handling small-sample problems, and it has been applied in fields such as fault diagnosis and medical image processing. Because of its low computational complexity, it has also attracted attention in edge computing. To meet the performance and power requirements of edge computing scenarios, an optimization and implementation method for the CNN-SVM algorithm targeting FPGA platforms is proposed. First, taking the architectural characteristics of FPGAs into account, the CNN-SVM algorithm structure is optimized for hardware, including model compression and the selection of the classifier's kernel function. Second, a CNN-SVM accelerator is designed and implemented using hardware/software co-design and high-level synthesis (HLS). Experimental results show that, on the ZCU102, the accelerator reaches 18.33 K frames per second (FPS) and a computing speed of 1.474 GMAC/s. It achieves speedups of 23.57 and 4.92 times over the quad-core Cortex-A57 and Ryzen7 3700x CPU platforms, respectively, and energy-efficiency ratios of 33.24 and 50.27 relative to the Jetson Nano GPU and GTX750 platforms, respectively.
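The throughput figures in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: it divides the reported MAC rate by the reported frame rate to estimate the multiply-accumulate count per inference, and it derives baseline CPU frame rates under the assumption (not stated in the abstract) that the reported speedups are ratios of frame rates. The function and constant names are hypothetical.

```python
# Figures reported in the abstract for the ZCU102 accelerator.
FPS = 18.33e3        # frames per second (18.33 K)
MAC_RATE = 1.474e9   # multiply-accumulates per second (1.474 GMAC/s)

# Reported speedups over the two CPU platforms.
SPEEDUP = {"Cortex-A57 (quad-core)": 23.57, "Ryzen7 3700x": 4.92}


def macs_per_frame(mac_rate: float, fps: float) -> float:
    """Estimate the MAC operations needed for one inference."""
    return mac_rate / fps


def implied_baseline_fps(fps: float, speedup: float) -> float:
    """Baseline frame rate, ASSUMING the speedup is a ratio of frame rates."""
    return fps / speedup


if __name__ == "__main__":
    print(f"MACs per frame: about {macs_per_frame(MAC_RATE, FPS):,.0f}")
    for name, factor in SPEEDUP.items():
        baseline = implied_baseline_fps(FPS, factor)
        print(f"{name}: about {baseline:,.0f} FPS (implied)")
```

Under these assumptions the model costs on the order of 80 thousand MACs per frame, which is consistent with the abstract's description of a compressed, low-complexity network.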