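Li Tian, Feng Zao, Zhu Xuefeng. Pipeline blockage recognition method based on active learning and optimum-path forest[J]. Journal of Electronic Measurement and Instrumentation, 2022, 36(12): 67-76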
Pipeline blockage recognition method based on active learning and optimum-path forest
  
Keywords: active learning; optimum-path forest; semi-supervised; fault classification
Funding: Supported by the National Natural Science Foundation of China (61563024)
Author  Institution
Li Tian  1. Faculty of Information Engineering & Automation, Kunming University of Science and Technology; 2. Yunnan Key Laboratory of Artificial Intelligence
Feng Zao  1. Faculty of Information Engineering & Automation, Kunming University of Science and Technology; 2. Yunnan Key Laboratory of Artificial Intelligence
Zhu Xuefeng  1. Faculty of Information Engineering & Automation, Kunming University of Science and Technology; 2. Yunnan Key Laboratory of Artificial Intelligence
Abstract:
      In industrial fault classification, labeled samples are scarce and manual labeling is costly, which makes classifier accuracy hard to improve, while large numbers of information-rich unlabeled samples go underused. To address this problem, this paper proposes a semi-supervised fault classification model (AL-OPF) that combines active learning (AL) with the optimum-path forest (OPF) algorithm. First, the value of each sample is measured jointly by the BvSB (best-versus-second-best) and cosine similarity criteria, high-value samples are selected in ranked batch mode, and their labels are obtained to expand the initial labeled sample set. Second, semi-supervised label propagation is achieved by constructing an optimum-path forest. Finally, the method is validated experimentally on a pipeline fault dataset collected in the laboratory. The experimental results show that the method achieves an overall recognition accuracy of 96.68% when 10% of the samples are labeled. Compared with active learning methods that sample one by one and with semi-supervised methods that extract global structural information from the training samples based on distance metrics, the proposed method attains higher Recall and F1-score values.
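The sample-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so everything here is an assumption made for illustration: the function names (`bvsb_uncertainty`, `cosine_similarity`, `select_batch`), the weighted sum of the two criteria, and the `weight` parameter are hypothetical and may differ from the authors' method. The sketch scores each unlabeled sample by its BvSB uncertainty and by how dissimilar it is (in cosine terms) to samples already labeled or already picked, then greedily builds a ranked batch.

```python
import math


def bvsb_uncertainty(probs):
    """Return 1 - (best - second-best class probability); higher means more uncertain."""
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_batch(probs, feats, labeled_feats, batch_size, weight=0.5):
    """Greedily pick a ranked batch of unlabeled sample indices.

    Assumed scoring (not taken from the paper):
    value = weight * uncertainty + (1 - weight) * (1 - redundancy),
    where redundancy is the highest cosine similarity to any sample
    that is already labeled or already chosen for this batch.
    """
    reference = list(labeled_feats)
    remaining = set(range(len(feats)))
    chosen = []
    while remaining and len(chosen) < batch_size:
        def value(i):
            redundancy = max(
                (cosine_similarity(feats[i], r) for r in reference), default=0.0
            )
            return weight * bvsb_uncertainty(probs[i]) + (1 - weight) * (1.0 - redundancy)

        # Iterate in index order so ties resolve to the lowest index.
        best = max(sorted(remaining), key=value)
        chosen.append(best)
        remaining.remove(best)
        reference.append(feats[best])  # later picks are penalized for resembling this one
    return chosen


# Toy data: samples 0 and 2 are equally uncertain but nearly identical in feature space.
probs = [[0.52, 0.48], [0.95, 0.05], [0.52, 0.48], [0.60, 0.40]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.01], [0.0, 1.0]]
batch = select_batch(probs, feats, labeled_feats=[], batch_size=2)
```

In this toy run, sample 2 is skipped in the second round because it is almost a duplicate of sample 0, which is the kind of redundancy a batch mode is meant to avoid compared with picking samples purely by uncertainty.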