Abstract: To address the scarcity of labeled fault samples, the class imbalance of pipeline operation-state data, and the high cost of sample labeling in blockage fault detection for urban buried drainage pipelines, a classification and recognition method for drainage pipeline blockage faults based on active learning is proposed. The method adopts an improved committee-based sample query strategy and establishes an active learning model based on consensus entropy to learn from imbalanced data sets. The uncertainty of each sample is taken into account so that the most informative unlabeled samples are selected for labeling, and a committee composed of several random forest classifiers is then used to classify and identify the unlabeled samples. The consensus entropy strategy is compared with the vote entropy, uniform entropy, and random sample query strategies on a pipeline operation data set collected by the laboratory. The experimental results show that the committee query strategy based on consensus entropy converges faster and is more stable when the initial training set has a balanced class distribution, and that it also achieves good recognition performance when the initial training set has an imbalanced class distribution.
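To make the difference between the consensus entropy and vote entropy query strategies concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the common definitions: consensus entropy is the Shannon entropy of the class probabilities averaged over the committee, and vote entropy is the Shannon entropy of the committee's hard votes. The "improved" strategy proposed here may differ in detail. All function names and the toy probabilities are hypothetical; in practice each member's probabilities would come from a trained classifier, such as a random forest.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def consensus_entropy(member_probs):
    """Entropy of the committee's averaged class probabilities for one sample.

    member_probs: one class-probability list per committee member.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    consensus = [
        sum(member[c] for member in member_probs) / n_members
        for c in range(n_classes)
    ]
    return shannon_entropy(consensus)


def vote_entropy(member_probs):
    """Entropy of the committee's hard votes (argmax per member) for one sample."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    votes = [0] * n_classes
    for member in member_probs:
        winner = max(range(n_classes), key=lambda c: member[c])
        votes[winner] += 1
    return shannon_entropy([v / n_members for v in votes])


def select_query(pool_probs, score=consensus_entropy):
    """Index of the unlabeled sample with the highest disagreement score."""
    scores = [score(sample) for sample in pool_probs]
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical pool: 3 unlabeled samples, 3 committee members,
# 2 classes (normal, blockage). Values are illustrative only.
pool = [
    [[0.90, 0.10], [0.80, 0.20], [0.95, 0.05]],  # confident and unanimous
    [[0.55, 0.45], [0.60, 0.40], [0.52, 0.48]],  # unanimous vote, low confidence
    [[0.90, 0.10], [0.20, 0.80], [0.85, 0.15]],  # split vote, confident members
]

chosen_by_consensus = select_query(pool, consensus_entropy)
chosen_by_vote = select_query(pool, vote_entropy)
```

On this toy pool the two strategies pick different samples. Vote entropy scores the second sample as zero because every member votes for the same class, whereas consensus entropy ranks it highest because the averaged probabilities are close to uniform. This illustrates how a probability-based score can capture sample uncertainty that hard votes discard.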