Feature Selection Based on Linear Twin Support Vector Machine and Application
|School||Zhejiang University of Technology|
|Keywords||Machine learning; F-score; Twin support vector machine; Feature selection|
In classification problems, feature selection is generally done in one of two ways. The first is algorithm-independent, as with F-score, a simple and effective criterion that measures how well two sets of real numbers are discriminated. A known deficiency of F-score is that it cannot reveal mutual information among features. The second is algorithm-dependent, as with SVM-RFE. This approach is computationally expensive, but it selects features more effectively.

Recently, Jayadeva et al. proposed the twin support vector machine (TWSVM). Unlike SVM, the TWSVM decision function involves two different weight vectors, so feature selection cannot be carried out directly in the same way as SVM-RFE. To address this problem, we propose two TWSVM feature selection algorithms. The first, based on linear TWSVM, is called sort-TWSVM. It merges the two weight vectors into a single vector and then uses the merged vector to delete redundant features in a way similar to F-score; the algorithm converges and performs feature selection. The second, also based on linear TWSVM, is called TWSVM-RFE. It likewise merges the two weight vectors into one and then performs feature selection in a way similar to SVM-RFE; this algorithm reveals the mutual information among features.

Experiments on several benchmark datasets show that sort-TWSVM and TWSVM-RFE are feasible and effective for feature selection. Finally, we apply the two algorithms to the identification of wine quality, where the results confirm that both algorithms perform feature selection effectively.
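The F-score criterion mentioned above can be illustrated with a minimal sketch. The abstract does not state the formula, so this assumes the commonly used two-class definition: for each feature, the squared deviations of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The function name `f_score`, the +1/-1 label encoding, and the toy data are assumptions made for illustration only and are not taken from the thesis.

```python
import numpy as np

def f_score(X, y):
    """Per-feature F-score for a two-class problem (labels assumed +1 / -1)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == -1]
    mean_all = X.mean(axis=0)
    # Between-class term: how far each class mean sits from the overall mean.
    numer = (pos.mean(axis=0) - mean_all) ** 2 + (neg.mean(axis=0) - mean_all) ** 2
    # Within-class term: unbiased sample variance of each class (ddof=1).
    denom = pos.var(axis=0, ddof=1) + neg.var(axis=0, ddof=1)
    return numer / denom

# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X = np.array([[1.0, 5.0],
              [2.0, 1.0],
              [9.0, 4.0],
              [10.0, 2.0]])
y = np.array([1, 1, -1, -1])

scores = f_score(X, y)
ranking = np.argsort(scores)[::-1]  # feature indices, most discriminative first
print(scores, ranking)
```

Each feature is scored on its own, which is why a ranking of this kind cannot reflect mutual information among features. The sketch does not guard against a zero denominator, which occurs when a feature is constant within both classes.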