Abstract: Feature selection algorithms in existing traffic anomaly detection models tend to discard sparse features. To address this problem, a traffic anomaly detection method based on feature coupling generalization (FCG) is proposed. First, the DBSCAN density-based clustering algorithm is used to remove outliers from the data, reducing their impact on the subsequent FCG algorithm. Second, the minimal-redundancy-maximal-relevance (mRMR) algorithm is used to rank the features, and the features most influential for classification are selected as the class-distinguishing features (CDF) of the FCG algorithm to enhance classification ability. The K-nearest neighbors (KNN) algorithm is used to fill in missing values in the CDF to maintain data integrity. Then, the data are grouped by attack category, the features within each group are ranked separately with the mRMR algorithm, and the sparse features that can distinguish instances within each attack category are selected as the example-distinguishing features (EDF) of the FCG algorithm. The degree of coupling between the two feature types in the anomaly detection data, together with the higher-level concepts of the EDF, is used to transform the EDF into more generalized features. Finally, the processed data are fed into a random forest (RF) model whose hyperparameters are tuned by Bayesian optimization (BO) for classification. In simulation experiments on the NSL-KDD dataset, the method reaches an accuracy of 91.79%, indicating good detection performance.
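The mRMR ranking step used for both CDF and EDF selection can be illustrated with a minimal sketch. It assumes discretized features and the difference form of the criterion (relevance minus mean redundancy); the abstract does not state which mRMR variant or discretization is used, so this is not necessarily the authors' implementation. The function names, feature names, and toy values below are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def mrmr_rank(features, labels, k):
    """Greedy mRMR: repeatedly pick the feature that maximizes
    relevance to the labels minus mean redundancy with features already picked."""
    relevance = {name: mutual_information(col, labels)
                 for name, col in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(mutual_information(features[name], features[s])
                             for s in selected) / len(selected)
            return relevance[name] - redundancy
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 connections, binary label (0 = normal, 1 = attack).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "burst_flag":     [0, 0, 0, 1, 1, 1, 1, 1],  # fairly relevant
    "burst_flag_dup": [0, 0, 0, 1, 1, 1, 1, 1],  # exact duplicate, fully redundant
    "login_flag":     [0, 0, 1, 0, 0, 1, 1, 1],  # weaker, but adds new information
}

ranked = mrmr_rank(features, labels, k=2)
print(ranked)  # ['burst_flag', 'login_flag']
```

In this toy case the duplicated feature is skipped in favor of a weaker but less redundant one. This is the behavior that lets mRMR keep complementary features rather than several copies of the strongest signal.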