Journal of Electronic Measurement and Instrumentation

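Li Tian, Feng Zao, Zhu Xuefeng. Pipeline fault classification and recognition method based on active learning and optimum-path forest [J]. Journal of Electronic Measurement and Instrumentation, 2022, 36(12): 67-76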

Pipeline fault classification and recognition method based on active learning and optimum-path forest

Pipeline blockage recognition method based on active learning and optimum-path forest

DOI：

English keywords: active learning; optimum-path forest; semi-supervised; fault classification

Funding: Supported by the National Natural Science Foundation of China (61563024)

Author (Chinese)	Institution (Chinese)
李恬	1. 昆明理工大学信息工程与自动化学院,2. 云南省人工智能重点实验室
冯早	1. 昆明理工大学信息工程与自动化学院,2. 云南省人工智能重点实验室
朱雪峰	1. 昆明理工大学信息工程与自动化学院,2. 云南省人工智能重点实验室

Author	Institution
Li Tian	1. Faculty of Information Engineering & Automation, Kunming University of Science and Technology, 2. Yunnan Key Laboratory of Artificial Intelligence
Feng Zao	1. Faculty of Information Engineering & Automation, Kunming University of Science and Technology, 2. Yunnan Key Laboratory of Artificial Intelligence
Zhu Xuefeng	1. Faculty of Information Engineering & Automation, Kunming University of Science and Technology, 2. Yunnan Key Laboratory of Artificial Intelligence

Abstract views: 1856

Full-text downloads: 2650

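Abstract (translated from Chinese):

In industrial fault classification, labeled samples are scarce and manual labeling is costly, which makes it difficult to improve classifier accuracy, while large numbers of unlabeled samples rich in information are not fully utilized. To address these problems, a semi-supervised fault classification model (AL-OPF) that combines active learning (AL) with the optimum-path forest (OPF) algorithm is proposed. The method first uses BvSB and a cosine similarity criterion to jointly measure the value of each sample, selects high-value samples in a ranked batch mode, and obtains their labels to expand the initial labeled sample set. It then performs semi-supervised label propagation by constructing an optimum-path forest. Finally, the method is validated experimentally on a pipeline fault sample set collected in the laboratory. The experimental results show that the method reaches an overall recognition accuracy of 96.68% when 10% of the samples are labeled. Compared with active learning methods that sample one at a time and with semi-supervised methods that extract global structural information of the training samples based on distance metrics, the proposed method achieves higher Recall and F1-score values.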
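The sample-selection step can be illustrated with a minimal Python sketch. It assumes BvSB refers to the usual best-versus-second-best margin between the two highest predicted class probabilities, and it combines that margin with cosine similarity to the labeled set through a weighted sum. The weight `alpha`, the function names, and the use of maximum similarity as a redundancy measure are illustrative assumptions; the abstract does not state the paper's actual scoring formula.

```python
import numpy as np

def bvsb_margin(proba):
    """Best-versus-second-best margin per sample; a small margin means high uncertainty."""
    ordered = np.sort(proba, axis=1)
    return ordered[:, -1] - ordered[:, -2]

def cosine_similarity_matrix(a, b, eps=1e-12):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b_unit = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    return a_unit @ b_unit.T

def select_batch(proba, x_unlabeled, x_labeled, batch_size, alpha=0.5):
    """Rank unlabeled samples by a combined value score and return the top batch.

    Assumption: value = alpha * uncertainty + (1 - alpha) * dissimilarity,
    where dissimilarity is 1 minus the highest cosine similarity to any
    labeled sample. This weighting is hypothetical, not the paper's formula.
    """
    uncertainty = 1.0 - bvsb_margin(proba)
    redundancy = cosine_similarity_matrix(x_unlabeled, x_labeled).max(axis=1)
    value = alpha * uncertainty + (1.0 - alpha) * (1.0 - redundancy)
    ranked = np.argsort(-value, kind="stable")  # highest value first
    return ranked[:batch_size]
```

The indices returned by `select_batch` would be sent to an annotator, and the newly labeled samples appended to the labeled set before the next round.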

English abstract:

The classification accuracy of industrial fault detection is difficult to improve because labeled training samples are few and manual labeling is costly, while large numbers of unlabeled samples containing rich information are not fully utilized. To address this problem, this paper proposes a semi-supervised fault classification model (AL-OPF) that combines active learning (AL) with the optimum-path forest (OPF) algorithm. First, the value of each sample is measured jointly by BvSB and a cosine similarity criterion, high-value samples are selected in a ranked batch mode, and their labels are obtained to expand the initial labeled sample set. Second, semi-supervised label propagation is achieved by constructing the optimum-path forest. Finally, the method is verified experimentally on pipe condition datasets collected in the laboratory. The experimental results show that the method achieves an overall recognition accuracy of 96.68% when 10% of the samples are labeled. Compared with active learning methods that use a one-by-one sampling mode and with semi-supervised methods that extract global structural information of the training samples based on distance metrics, the proposed method achieves higher Recall and F1-score values.
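The label-propagation step can be sketched in Python as follows. This is a simplified illustration of the optimum-path forest idea, not the paper's implementation: it assumes a complete graph with Euclidean edge weights, treats every labeled sample as a zero-cost prototype, and uses the maximum edge weight along a path as the path cost. The function name `opf_propagate` and the convention of marking unlabeled samples with `-1` are hypothetical.

```python
import heapq
import numpy as np

def opf_propagate(x, labels):
    """Propagate labels from labeled seeds to unlabeled samples (label -1).

    Each sample takes the label of the seed that reaches it through the
    path whose largest edge weight is smallest (f_max path cost).
    Returns (propagated_labels, path_costs).
    """
    n = len(x)
    # Pairwise Euclidean distances on the complete graph (assumed metric).
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    cost = np.where(labels >= 0, 0.0, np.inf)
    out = labels.copy()
    heap = [(0.0, i) for i in range(n) if labels[i] >= 0]
    heapq.heapify(heap)
    done = np.zeros(n, dtype=bool)
    while heap:
        c, s = heapq.heappop(heap)
        if done[s]:
            continue  # stale heap entry; s was already finalized at lower cost
        done[s] = True
        for t in range(n):
            if done[t]:
                continue
            new_cost = max(c, float(dist[s, t]))
            if new_cost < cost[t]:
                # s offers t a cheaper path, so t joins the tree rooted at s's seed
                cost[t] = new_cost
                out[t] = out[s]
                heapq.heappush(heap, (new_cost, t))
    return out, cost
```

On two well-separated clusters with one labeled sample each, every unlabeled point inherits the label of the seed in its own cluster, because any path that crosses between clusters has a large maximum edge weight.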
