Introduction to LIBSVM: A Supervised Machine Learning Library

Supervised Machine Learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples. An optimal scenario will allow for the algorithm to correctly determine the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a “reasonable” way.

There are two canonical supervised learning problems:

1. Classification

  • Estimate a class label, e.g. handwritten digit classification.

2. Regression

  • Estimate a continuous value, e.g. predicting weight from height.

LIBSVM is an open source machine learning library for support vector classification (C-SVC, nu-SVC), regression (epsilon-SVR, nu-SVR) and distribution estimation (one-class SVM). It supports multi-class classification. You can download the zip file or the tar.gz file at https://www.csie.ntu.edu.tw/~cjlin/libsvm/index.html.

Data Format of LIBSVM:

The format of training and testing data files is:

<label> <index1>:<value1> <index2>:<value2> ...
.
.
.

Each line contains one instance and ends with a '\n' character. In the training set, <label> is interpreted as follows:

* For classification, <label> is an integer indicating the class label. Multi-class is supported: a multi-class data set simply uses more than two distinct label values, and each line still carries exactly one label.

* For regression, <label> is the target value, which can be any real number. Each line has exactly one target.

* For one-class SVM, <label> is not used and can be any number. Each line still has exactly one label field.

In the test set, <label> is used only to calculate accuracy or errors. If it’s unknown, any number is fine. For one-class SVM, if non-outliers/outliers are known, their labels in the test file must be +1/-1 for evaluation.

The pair <index>:<value> gives a feature (attribute) value: <index> is an integer starting from 1 and <value> is a real number. The only exception is the precomputed kernel (-t 4), where <index> starts from 0; see the precomputed kernels section of the LIBSVM README. Indices must be in ASCENDING order, but they need not be contiguous: features whose value is zero can simply be omitted.
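To make the format concrete, here is a minimal Python sketch that parses one line of the form described above and writes it back out. The function names `parse_libsvm_line` and `format_libsvm_line` are made up for this illustration and are not part of LIBSVM; the sketch also assumes indices are strictly ascending.

```python
def parse_libsvm_line(line):
    """Parse one '<label> <index>:<value> ...' line into (label, {index: value})."""
    tokens = line.split()
    label = float(tokens[0])
    features = {}
    previous_index = None
    for token in tokens[1:]:
        index_text, value_text = token.split(":", 1)
        index = int(index_text)
        # Indices must be ascending; gaps are allowed because the format is sparse.
        if previous_index is not None and index <= previous_index:
            raise ValueError(f"indices must be ascending: {index} after {previous_index}")
        features[index] = float(value_text)
        previous_index = index
    return label, features


def format_libsvm_line(label, features):
    """Write (label, {index: value}) back out, omitting zero-valued features."""
    parts = [f"{label:g}"]
    for index in sorted(features):
        if features[index] != 0:
            parts.append(f"{index}:{features[index]:g}")
    return " ".join(parts)


label, features = parse_libsvm_line("+1 1:0.5 3:-1.2 7:4")
print(label, features)
print(format_libsvm_line(label, features))
```

Storing the features in a dictionary keyed by index mirrors the sparse nature of the format: indices 2, 4, 5 and 6 are absent from the example line and are treated as zero.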

Command-Line Usage of LIBSVM:

  • `svm-train' Usage

          Usage:  svm-train [options] training_set_file [model_file]

          Parameter explanation:

     options:
         -s svm_type : set type of SVM (default 0)
            0 -- C-SVC (multi-class classification)
            1 -- nu-SVC (multi-class classification)
            2 -- one-class SVM
            3 -- epsilon-SVR (regression)
            4 -- nu-SVR (regression)
         -t kernel_type : set type of kernel function (default 2)
            0 -- linear: u'*v
            1 -- polynomial: (gamma*u'*v + coef0)^degree
            2 -- radial basis function: exp(-gamma*|u-v|^2)
            3 -- sigmoid: tanh(gamma*u'*v + coef0)
            4 -- precomputed kernel (kernel values in training_set_file)
         -d degree : set degree in kernel function (default 3)
         -g gamma : set gamma in kernel function (default 1/num_features)
         -r coef0 : set coef0 in kernel function (default 0)
         -c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
         -n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
         -p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
         -m cachesize : set cache memory size in MB (default 100)
         -e epsilon : set tolerance of termination criterion (default 0.001)
         -h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
         -b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
         -wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
         -v n: n-fold cross validation mode
         -q : quiet mode (no outputs)
     training_set_file: the data file to train on.
     model_file: the model file generated by svm-train. The default name is the training set file's name with .model appended.
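The kernel formulas listed under -t can be written out in a few lines of Python. This is only an illustrative sketch on dense lists, not LIBSVM's own implementation; the function names are made up here, and gamma is set to 1/num_features to match the documented default.

```python
import math


def dot(u, v):
    """Inner product u'*v of two dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    # -t 0: u'*v
    return dot(u, v)


def polynomial_kernel(u, v, gamma, coef0=0.0, degree=3):
    # -t 1: (gamma*u'*v + coef0)^degree
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma):
    # -t 2: exp(-gamma*|u-v|^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma, coef0=0.0):
    # -t 3: tanh(gamma*u'*v + coef0)
    return math.tanh(gamma * dot(u, v) + coef0)


u = [1.0, 2.0]
v = [3.0, 4.0]
gamma = 1.0 / len(u)  # default gamma: 1/num_features

print(linear_kernel(u, v))
print(polynomial_kernel(u, v, gamma))
print(rbf_kernel(u, v, gamma))
print(sigmoid_kernel(u, v, gamma))
```

Only the polynomial kernel takes a degree, and only the polynomial and sigmoid kernels take coef0, which is why -d and -r have no influence on the other kernel types.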

Note:

The -d option only takes effect with the polynomial kernel (-t 1). With any other kernel type the degree is ignored, even if it is specified.

  • `svm-scale' Usage

          Usage: svm-scale [options] data_filename

          Parameter explanation:

    options:
       -l lower : x scaling lower limit (default -1)
       -u upper : x scaling upper limit (default +1)
       -y y_lower y_upper : y scaling limits (default: no y scaling)
       -s save_filename : save scaling parameters to save_filename
       -r restore_filename : restore scaling parameters from restore_filename
    data_filename: the data file to scale. The scaled data is written to standard output.
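The idea behind -l, -u, -s and -r can be sketched in Python as a linear min-max rescaling: the ranges are computed once from the training data and then reused for the test data. This is a simplified sketch on dense rows, not the svm-scale source. The names `fit_scaling` and `apply_scaling` are hypothetical, and mapping a constant column to 0 is an assumption made for this example.

```python
def fit_scaling(rows):
    """Return per-column (min, max) pairs computed from the training rows."""
    columns = list(zip(*rows))
    return [(min(column), max(column)) for column in columns]


def apply_scaling(rows, ranges, lower=-1.0, upper=1.0):
    """Linearly map each column from [min, max] to [lower, upper]."""
    scaled = []
    for row in rows:
        new_row = []
        for value, (col_min, col_max) in zip(row, ranges):
            if col_max == col_min:
                # Assumption for this sketch: a constant column carries no information.
                new_row.append(0.0)
            else:
                ratio = (value - col_min) / (col_max - col_min)
                new_row.append(lower + (upper - lower) * ratio)
        scaled.append(new_row)
    return scaled


train = [[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]]
test = [[2.5, 25.0]]

ranges = fit_scaling(train)          # plays the role of "-s save_filename"
print(apply_scaling(train, ranges))
print(apply_scaling(test, ranges))   # plays the role of "-r restore_filename"
```

Reusing the training ranges for the test set keeps both sets on the same scale, which is the reason for saving the scaling parameters with -s and restoring them with -r.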

Tips on Practical Use:

* Scale your data. For example, scale each attribute to [0,1] or [-1,+1].
* For C-SVC, consider using the model selection tool in the tools directory.
* nu in nu-SVC/one-class-SVM/nu-SVR approximates the fraction of training errors and support vectors.
* If data for classification are unbalanced (e.g. many positive and few negative), try different penalty parameters C by -wi.
* Specify larger cache size (i.e., larger -m) for huge problems.

  • `svm-predict' Usage

          Usage: svm-predict [options] test_file model_file output_file

          Parameter explanation:

     options:
         -b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); for one-class SVM only 0 is supported
     test_file: the test data to predict.
     model_file: the model file generated by svm-train.
     output_file: the file to which svm-predict writes its output.

Required svm_model parameters in different cases (svm_model itself is described in the model structure section):

1. For multi-class classification:

  required parameters: labels, svm_type, kernel_type, gamma, nSV, rho, SV
      labels: "x1 x2 x3 x4..." ## k values
      svm_type: 0|1
      kernel_type: 0|1|2|3|4
      gamma: x
      nSV: "x1 x2 x3 x4..." ## k values
      rho: "x1 x2 x3 x4 x5 x6..." ## k*(k-1)/2 values
      SV: [l][count(index)]

  (nr_class, l and sv_coef can be derived from the fields above; the other parameters use their default values.)
      nr_class: labels.length ## 2 for one-class classification
      l: SV.length
      sv_coef: [k-1][l] ## [1][l] for one-class classification and regression

2. For one-class classification:

  required parameters: svm_type, kernel_type, gamma, rho, SV
      svm_type: 2
      kernel_type: 0|1|2|3|4
      gamma: x
      rho: 1 value
      SV: [l][count(index)]

  (nr_class, l and sv_coef can be derived from the fields above; the other parameters use their default values.)
      nr_class: 2 ## always 2 for one-class classification
      l: SV.length
      sv_coef: [1][l] ## [1][l] for one-class classification and regression

3. For regression:

  required parameters: svm_type, kernel_type, gamma, rho, SV
      svm_type: 3|4
      kernel_type: 0|1|2|3|4
      gamma: x
      rho: 1 value
      SV: [l][count(index)]

  (nr_class, l and sv_coef can be derived from the fields above; the other parameters use their default values.)
      nr_class: 2
      l: SV.length
      sv_coef: [1][l] ## [1][l] for one-class classification and regression
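The size relations in the three cases above can be summarised in a small Python helper. This is only a sketch of the bookkeeping: `derived_model_sizes` is a hypothetical name, and the dictionary it returns is an illustration, not a LIBSVM data structure.

```python
def derived_model_sizes(svm_type, labels, support_vectors):
    """Compute the derived sizes for a model description.

    svm_type follows the -s numbering: 0/1 classification, 2 one-class, 3/4 regression.
    """
    # Classification uses one class per label; one-class SVM and regression use k = 2.
    k = len(labels) if svm_type in (0, 1) else 2
    l = len(support_vectors)
    return {
        "nr_class": k,
        "l": l,
        "rho_length": k * (k - 1) // 2,  # one constant per pair of classes
        "sv_coef_shape": (k - 1, l),
    }


# Three-class classification with five support vectors.
print(derived_model_sizes(0, [1, 2, 3], [{} for _ in range(5)]))

# Regression with four support vectors: a single rho and a [1][l] coefficient array.
print(derived_model_sizes(3, [], [{} for _ in range(4)]))
```

With k = 2 the formulas give a single rho and a [1][l] coefficient array, which is why the one-class and regression cases look the same.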

Model structure of LIBSVM:

  • structure of svm_parameter:

struct svm_parameter
{
    int svm_type;
    int kernel_type;
    int degree; /* for poly */
    double gamma; /* for poly/rbf/sigmoid */
    double coef0; /* for poly/sigmoid */

    /* these are for training only */
    double cache_size; /* in MB */
    double eps; /* stopping criteria */
    double C; /* for C_SVC, EPSILON_SVR, and NU_SVR */
    int nr_weight; /* for C_SVC */
    int *weight_label; /* for C_SVC */
    double* weight; /* for C_SVC */
    double nu; /* for NU_SVC, ONE_CLASS, and NU_SVR */
    double p; /* for EPSILON_SVR */
    int shrinking; /* use the shrinking heuristics */
    int probability; /* do probability estimates */
};

Here, svm_type can be one of C_SVC, NU_SVC, ONE_CLASS, EPSILON_SVR, NU_SVR:
                     C_SVC: C-SVM classification
                     NU_SVC: nu-SVM classification
                     ONE_CLASS: one-class-SVM
                     EPSILON_SVR: epsilon-SVM regression
                     NU_SVR: nu-SVM regression

kernel_type can be one of LINEAR, POLY, RBF, SIGMOID, PRECOMPUTED:
                     LINEAR: u'*v
                     POLY: (gamma*u'*v + coef0)^degree
                     RBF: exp(-gamma*|u-v|^2)
                     SIGMOID: tanh(gamma*u'*v + coef0)
                     PRECOMPUTED: kernel values in training_set_file

  • structure of svm_model:

struct svm_model
{
    struct svm_parameter param; /* parameter */
    int nr_class; /* number of classes, = 2 in regression/one class svm */
    int l; /* total #SV */
    struct svm_node **SV; /* SVs (SV[l]) */
    double **sv_coef; /* coefficients for SVs in decision functions (sv_coef[k-1][l]) */
    double *rho; /* constants in decision functions (rho[k*(k-1)/2]) */
    double *probA; /* pairwise probability information */
    double *probB;
    int *sv_indices; /* sv_indices[0,...,nSV-1] are values in [1,...,num_training_data] to indicate SVs in the training set */

    /* for classification only */
    int *label; /* label of each class (label[k]) */
    int *nSV; /* number of SVs for each class (nSV[k]) */
    /* nSV[0] + nSV[1] + ... + nSV[k-1] = l */
    /* XXX */
    int free_sv; /* 1 if svm_model is created by svm_load_model*/
    /* 0 if svm_model is created by svm_train */
};
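To show how the fields SV, sv_coef and rho work together, here is a minimal Python sketch of the decision value for the k = 2 case (two-class classification, one-class SVM or regression) with an RBF kernel. It is a simplified illustration, not LIBSVM's prediction code. The dictionary layout of `toy_model` and the names `rbf` and `decision_value` are assumptions made for this example, and support vectors are stored as sparse {index: value} dictionaries.

```python
import math


def rbf(x, y, gamma):
    """RBF kernel exp(-gamma*|x-y|^2) on sparse {index: value} vectors."""
    indices = set(x) | set(y)
    squared_distance = sum((x.get(i, 0.0) - y.get(i, 0.0)) ** 2 for i in indices)
    return math.exp(-gamma * squared_distance)


def decision_value(model, x):
    """Sum of sv_coef[0][i] * K(SV[i], x) over all support vectors, minus rho[0]."""
    total = 0.0
    for coef, sv in zip(model["sv_coef"][0], model["SV"]):
        total += coef * rbf(sv, x, model["gamma"])
    return total - model["rho"][0]


# Hypothetical toy model with l = 2 support vectors and a single rho.
toy_model = {
    "gamma": 0.5,
    "rho": [0.0],
    "SV": [{1: 1.0}, {1: -1.0}],
    "sv_coef": [[1.0, -1.0]],
}

print(decision_value(toy_model, {1: 1.0}))   # positive: closer to the first SV
print(decision_value(toy_model, {1: -1.0}))  # negative: closer to the second SV
```

For regression the decision value is the prediction itself, while for classification and one-class SVM its sign decides the outcome. With more than two classes there is one such function per pair of classes, which is where the rho[k*(k-1)/2] and sv_coef[k-1][l] sizes come from.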
