Chinese character recognition based on convolutional neural network and character encoding
CLC number: TN91; TP391.1

Funding: Supported by the Anhui Provincial Science and Technology Research Program (1604a0902182)


Abstract:

Chinese character recognition is an important research topic in artificial intelligence and pattern recognition. Existing approaches still suffer from difficult parameter tuning, small training sets, and an inability to recognize all commonly used characters. To address these problems, we propose a Chinese character recognition method based on character encoding and a convolutional neural network. First, all character information is obtained by querying the character library; the characters are then encoded in UTF-8 and rendered with multiple font files to produce character images. Several image processing operations are applied to these images to build the dataset, and a deep convolutional neural network is trained on it for recognition. Training is improved with data augmentation, batch normalization, and RMSProp optimization, while regularization and dropout are added to prevent overfitting. Experimental results show that the proposed method reaches a recognition rate of 98.08% on Chinese characters; on the same dataset, its recognition accuracy is 9.37% and 21.14% higher than that of AlexNet and LeNet-5, respectively. The result is a neural network with a high recognition rate and strong feature extraction and generalization ability.
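The first step of the method, listing the target characters and recording their UTF-8 encoding before rendering them to images, might be sketched as below. This is only an illustration under stated assumptions: the abstract does not name the character library, so the GB2312 level-1 set (3755 commonly used characters) stands in for it, and the function names `gb2312_level1_chars` and `build_label_table` are hypothetical. Rendering with font files and training the network are not shown.

```python
# Sketch only: GB2312 level 1 is an ASSUMED stand-in for the paper's
# character library; the function names are hypothetical.

def gb2312_level1_chars():
    """Return the GB2312 level-1 Chinese characters in code order."""
    chars = []
    for high in range(0xB0, 0xD8):       # first byte: rows 16-55 (level 1)
        for low in range(0xA1, 0xFF):    # second byte: columns 1-94
            try:
                chars.append(bytes([high, low]).decode("gb2312"))
            except UnicodeDecodeError:
                continue                 # skip unassigned code points
    return chars


def build_label_table(chars):
    """Map each character to (class index, UTF-8 byte sequence)."""
    return {ch: (idx, ch.encode("utf-8")) for idx, ch in enumerate(chars)}


if __name__ == "__main__":
    chars = gb2312_level1_chars()
    labels = build_label_table(chars)
    # Show how many classes the classifier would need and one sample entry.
    print(len(chars), labels[chars[0]])
```

Each entry pairs a class index, usable as the training label, with the byte sequence that identifies the character when it is drawn with a given font file.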

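Cite this article

Liu Zhengqiong, Ding Li, Ling Lin, Li Xuefei, Zhou Wenxia. Chinese character recognition based on convolutional neural network and character encoding[J]. Journal of Electronic Measurement and Instrumentation, 2020, 34(2): 143-149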

History
  • Published online: 2023-06-15
  • Publication date: 2020-01-31