
Tutorial

Test your installation

If πSvM was installed properly as described here, running pisvm-train on a single processor with no command-line arguments should print usage instructions:

$ mpirun -np 1 ./pisvm-train
Usage: svm-train [options] training_set_file [model_file]
options:
-s svm_type : set type of SVM (default 0)
0 -- C-SVC
1 -- nu-SVC
2 -- one-class SVM
3 -- epsilon-SVR
4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
0 -- linear: u'*v
1 -- polynomial: (gamma*u'*v + coef0)^degree
2 -- radial basis function: exp(-gamma*|u-v|^2)
3 -- sigmoid: tanh(gamma*u'*v + coef0)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/k)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 40)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking: whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight: set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n: n-fold cross validation mode
-o n: max. size of working set
-q n: max. number of new variables entering working set
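
The last two switches, -o and -q, control the working set of the decomposition solver (they are discussed further below). As an illustration only, here is a simplified Python sketch of the bookkeeping they imply, with a hypothetical next_working_set helper; the real solver selects the most violating variables, not simply the front of the previous set:

```python
# Illustrative sketch only -- NOT pisvm's actual code. It shows the
# bookkeeping implied by -o (working set size) and -q (new variables
# per iteration): after each step, o-q variables are retained and q
# fresh ones enter, so the working set always has size o.
def next_working_set(prev, candidates, o, q):
    kept = prev[:o - q]                      # retain o-q old variables
    kept_set = set(kept)
    fresh = [i for i in candidates if i not in kept_set][:q]
    return kept + fresh                      # new working set, size o again

ws = list(range(512))                        # initial working set, o = 512
ws = next_working_set(ws, range(512, 2048), o=512, q=256)
print(len(ws))                               # prints: 512
```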

So besides these options we need to feed some training data to πSvM, which we will obtain next.
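
A note on the input format: πSvM is derived from LIBSVM and, to my knowledge, reads training data in the sparse LIBSVM text format — one pattern per line, starting with the label followed by whitespace-separated index:value pairs for the non-zero features. A minimal two-class example (the indices and values here are made up for illustration):

```
+1 1:0.158 7:0.044 576:0.91
-1 3:0.25 12:1.0
+1 2:0.66 5:0.13 40:0.8
```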

Download some datasets

In the Download section you will find the training and test datasets that were used to benchmark πSvM. For example, you might download the mnist-576-rbf-8vr training and test datasets. It is recommended to also download the corresponding MD5 checksum files and place everything in the same directory. Then verify the integrity of the datasets by running the following commands in your download directory:

$ md5sum -c mnist_train_576_rbf_8vr.dat.bz2.md5sum
mnist_train_576_rbf_8vr.dat.bz2: OK
$ md5sum -c mnist_test_576_rbf_8vr.dat.bz2.md5sum
mnist_test_576_rbf_8vr.dat.bz2: OK

If the check fails, try downloading the datasets again; if that does not help, contact me (beeblbrox at users.sourceforge.net). Finally, decompress the datasets before using them:

$ bunzip2 mnist_train_576_rbf_8vr.dat.bz2
$ bunzip2 mnist_test_576_rbf_8vr.dat.bz2

Running πSvM on a single processor

To run πSvM in parallel you use the mpirun script provided by your MPI library. The -np switch of mpirun specifies the number of processors the parallel program runs on. For example, the following command means "train πSvM on 7 processors":

$ mpirun -np 7 ./pisvm-train

But let's first check that everything works correctly on a single processor. Besides the usual SVM training options, such as the choice of SVM type, kernel type, and kernel parameters, the most important options for πSvM are -o and -q. The switch -o specifies the size of the working set, i.e. the number of variables optimized over in each iteration of the decomposition approach. Since it is important to retain a certain number of variables in the working set after each optimization step, the switch -q tells πSvM how many new variables should enter the working set. If you are unsure what to pick, -o 512 -q 256 or -o 1024 -q 512 work well on most datasets I have tested. This of course assumes that your dataset contains more than 512 or 1024 patterns, respectively. If you are getting impatient by now, just run the following:

$ mpirun -np 1 ./pisvm-train -o 512 -q 256 -s 0 -t 2 -g 1.667 -c 10 -m 256 mnist_train_576_rbf_8vr.dat
Initializing gradient...done.
it | setup time | solver it | solver time | gradient time | kkt violation
1 | 0.30 | 513 | 0.01 | 38.79 | 3.02341
2 | 0.36 | 821 | 0.02 | 27.48 | 3.6475
3 | 0.29 | 782 | 0.02 | 21.24 | 3.12844
4 | 0.24 | 892 | 0.02 | 26.39 | 2.99326
5 | 0.27 | 903 | 0.02 | 29.44 | 3.06529
...
51 | 0.07 | 5 | 0.00 | 1.00 | 0.000977987

optimization finished, #iter = 51
nu = 0.002356
obj = -706.795212, rho = -0.946953
nSV = 3289, nBSV = 0
Total nSV = 3289
I/O time = 46.97

This trains a C-SVC on the mnist-576-rbf-8vr dataset using an RBF kernel with γ=1.667, C=10, and a 256 MB kernel cache. Training on a single processor may take up to an hour, depending on your hardware. The resulting SVM model is written to the file mnist_train_576_rbf_8vr.dat.model. To get predictions for the test data based on this model, use pisvm-predict:

$ mpirun -np 1 ./pisvm-predict mnist_test_576_rbf_8vr.dat mnist_train_576_rbf_8vr.dat.model out.txt
Accuracy = 99.82% (9982/10000) (classification)
Mean squared error = 0.0072 (regression)
Squared correlation coefficient = 0.979562 (regression)

The predicted labels are written to the file out.txt, and the output above shows that the trained C-SVC model achieves a test error of 0.18%.
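
The test error is simply the complement of the accuracy reported by pisvm-predict:

```python
# pisvm-predict reports classification accuracy; the test error quoted
# in the text is its complement.
correct, total = 9982, 10000
accuracy = 100.0 * correct / total        # 99.82%
test_error = 100.0 - accuracy
print(f"test error = {test_error:.2f}%")  # prints: test error = 0.18%
```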

Running πSvM on multiple processors

Since we already used mpirun to test πSvM on a single processor, moving to multiple processors is as easy as passing a different number to the -np option. If you want to use MPI across a set of different host computers, first create a text file listing each host's name on a separate line. For example, if the machines 'asterix' and 'obelix' should do the work, create a machine file machines.txt with:

$ echo -e "asterix\nobelix" >machines.txt

Now to train πSvM on dataset mnist-576-rbf-8vr using two processors run:

$ mpirun -machinefile machines.txt -np 2 ./pisvm-train -o 512 -q 256 -s 0 -t 2 -g 1.667 -c 10 -m 256 mnist_train_576_rbf_8vr.dat
Initializing gradient...Initializing gradient...done.
done.
it | setup time | solver it | solver time | gradient time | kkt violation
1 | 0.30 | 513 | 0.01 | 38.65 | 3.02341
2 | 0.34 | 821 | 0.02 | 26.11 | 3.6475
3 | 0.29 | 782 | 0.01 | 20.32 | 3.12844
4 | 0.27 | 892 | 0.02 | 25.76 | 2.99326
5 | 0.30 | 903 | 0.02 | 28.92 | 3.06529
...
51 | 0.18 | 5 | 0.00 | 0.89 | 0.000977987

optimization finished, #iter = 51
nu = 0.002356
obj = -706.795212, rho = -0.946953
nSV = 3289, nBSV = 0
Total nSV = 3289
I/O time = 76.76

optimization finished, #iter = 51
nu = 0.002356
obj = -706.795212, rho = -0.946953
nSV = 3289, nBSV = 0
Total nSV = 3289
I/O time = 76.46

Note that the -machinefile option of mpirun passes the name of the machine file prepared in the previous step. The training time of πSvM should now be cut down by more than a factor of two, i.e. a superlinear speedup — most likely because the aggregate kernel cache grows with the number of processors, so fewer kernel rows have to be recomputed. To increase the number of processors, simply add more host computers to machines.txt. The number of processors passed via the -np switch may also exceed the number of hosts listed in machines.txt, but then more than one instance of πSvM is started on some hosts and you may run into load-balancing problems. Prediction can, of course, be parallelized in the same way:

$ mpirun -machinefile machines.txt -np 2 ./pisvm-predict mnist_test_576_rbf_8vr.dat mnist_train_576_rbf_8vr.dat.model out.txt
Accuracy = 99.82% (9982/10000) (classification)
Mean squared error = 0.0072 (regression)
Squared correlation coefficient = 0.979562 (regression)

Again the predicted labels are placed in the file out.txt and the test error is 0.18%, but prediction should now be roughly twice as fast.
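
The speedup claims above can be quantified as S(p) = T(1)/T(p). A tiny sketch with made-up timings (measure your own wall-clock times to get real figures):

```python
# Speedup and parallel efficiency from wall-clock training times.
# The timings here are hypothetical, purely for illustration.
p = 2                       # number of processors
t1 = 3600.0                 # time on 1 processor (seconds)
tp = 1500.0                 # time on p processors (seconds)
speedup = t1 / tp           # S(p) = T(1) / T(p)
efficiency = speedup / p    # E(p) > 1 means superlinear speedup
print(speedup, efficiency)  # prints: 2.4 1.2
```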

What to do next?

I would appreciate feedback on the speedups you achieve with the datasets from the Download section and on the hardware you used. Just send an e-mail to beeblbrox at users.sourceforge.net.

Last modified: Sun Dec 17 12:13:03 CET 2006