Monday, July 10, 2006

Cat   Precision   Recall
16    6.25%       20%
17    0%          0%
48    7.41%       5%
5     6.38%       12%
6     9.09%       10%
14    11.29%      9.33%
37    5%          7.69%
58    5%          20%
60    3.45%       50%
61    2.86%       9.09%
64    8.33%       5.41%
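
To compare categories on a single number, precision and recall can be combined into an F1 score and averaged across categories. The snippet below is a minimal sketch using only the figures from the table; F1 and macro-averaging are not part of the original results, and the names `results` and `f1` are just illustrative.

```python
# Per-category (precision, recall) in percent, copied from the table.
results = {
    16: (6.25, 20.0),
    17: (0.0, 0.0),
    48: (7.41, 5.0),
    5: (6.38, 12.0),
    6: (9.09, 10.0),
    14: (11.29, 9.33),
    37: (5.0, 7.69),
    58: (5.0, 20.0),
    60: (3.45, 50.0),
    61: (2.86, 9.09),
    64: (8.33, 5.41),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Macro average: every category counts equally, regardless of its size.
macro_precision = sum(p for p, _ in results.values()) / len(results)
macro_recall = sum(r for _, r in results.values()) / len(results)

f1_by_cat = {cat: f1(p, r) for cat, (p, r) in results.items()}
best_cat = max(f1_by_cat, key=f1_by_cat.get)

print(f"macro precision: {macro_precision:.2f}%")
print(f"macro recall:    {macro_recall:.2f}%")
print(f"best F1: category {best_cat} ({f1_by_cat[best_cat]:.2f}%)")
```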

Sunday, July 09, 2006

Transductive Learner

For Category 16, I decided to use the transductive learner in SVM-light (svm_learn).

For a sample of 5000 unlabeled and 178 labeled examples:
Number of switches: 40
Moving training errors to inconsistent examples...done.
done. (46619 iterations)
Optimization finished (0 misclassified, maxdiff=0.00100).
Runtime in cpu-seconds: 261.28
Number of SV: 611 (plus 0 inconsistent examples)
L1 loss: loss=1.54322
Norm of weight vector: |w|=7.02006
Norm of longest example vector: |x|=9.57293
Estimated VCdim of classifier: VCdim<=3772.04057
Number of kernel evaluations: 246909403
Writing model file...done

Accuracy on test set: 97.88% (783 correct, 17 incorrect, 800 total)
Precision/recall on test set: 7.14%/20.00%
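
The accuracy is high while the precision is very low, which usually points to a heavily imbalanced test set. As a rough check, the reported summary can be turned back into confusion-matrix counts. This is a minimal sketch, not part of SVM-light; the function name `confusion_from_report` and the `max_tp` search bound are assumptions made for the illustration.

```python
def confusion_from_report(total, incorrect, precision_pct, recall_pct, max_tp=50):
    """Search for integer (TP, FP, FN, TN) matching a reported summary.

    The errors are split into false negatives and false positives
    (incorrect = FP + FN), and a split is accepted when the implied
    precision and recall match the reported values to two decimals.
    Returns the first match, or None if nothing fits.
    """
    for tp in range(1, max_tp + 1):
        for fn in range(incorrect + 1):
            fp = incorrect - fn
            precision = 100 * tp / (tp + fp)
            recall = 100 * tp / (tp + fn)
            if abs(precision - precision_pct) < 0.005 and abs(recall - recall_pct) < 0.005:
                tn = total - tp - fp - fn
                if tn >= 0:
                    return tp, fp, fn, tn
    return None


# Figures from the 5000-unlabeled run: 800 total, 17 incorrect, 7.14% / 20.00%.
print(confusion_from_report(800, 17, 7.14, 20.00))  # -> (1, 13, 4, 782)
```

Under this reading the test set holds only 5 positive examples out of 800: one is found, four are missed, and 13 negatives are flagged by mistake. That would also explain why recall stays at 20.00% across runs while precision moves with the number of false positives.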

svm_learn -d 1 -j 58 -t 1 -i 1 trans_5000.txt trans_model_5000
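
Going by SVM-light's help text, `-t 1` selects the polynomial kernel, `-d 1` sets its degree, `-j 58` makes training errors on positive examples cost 58 times as much as errors on negatives, and `-i 1` removes inconsistent training examples and retrains. The training file mixes labeled and unlabeled examples: each line is `<target> <feature>:<value> ...`, and a target of 0 marks an example as unlabeled, which is what puts svm_learn into transductive mode. The sketch below shows one way such a file could be assembled. The helper names and the toy data are hypothetical; how trans_5000.txt was actually built is not shown in this post.

```python
def to_svmlight_line(label, features):
    """Format one example as '<target> <index>:<value> ...'.

    label: +1 or -1 for labeled examples, 0 for unlabeled ones.
    features: dict mapping feature index (starting at 1) to value.
    """
    if label not in (1, -1, 0):
        raise ValueError("label must be +1, -1, or 0 (unlabeled)")
    target = "+1" if label == 1 else str(label)
    # SVM-light expects feature indices in increasing order; zeros are omitted.
    pairs = [f"{idx}:{val:g}" for idx, val in sorted(features.items()) if val != 0]
    return " ".join([target] + pairs)


def build_transductive_file(labeled, unlabeled):
    """Return file contents: labeled examples first, then unlabeled (target 0)."""
    lines = [to_svmlight_line(y, x) for y, x in labeled]
    lines += [to_svmlight_line(0, x) for x in unlabeled]
    return "\n".join(lines) + "\n"


# Toy data, purely illustrative (not taken from the experiments).
labeled = [(1, {3: 0.5, 1: 2.0}), (-1, {2: 1.0})]
unlabeled = [{1: 0.25, 4: 1.0}]
text = build_transductive_file(labeled, unlabeled)
print(text, end="")
```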

For 6000
Writing prediction file...done
Number of switches: 40
Moving training errors to inconsistent examples...done.
done. (58557 iterations)
Optimization finished (0 misclassified, maxdiff=0.00097).
Runtime in cpu-seconds: 411.56
Number of SV: 699 (plus 0 inconsistent examples)
L1 loss: loss=4.84448
Norm of weight vector: |w|=7.87144
Norm of longest example vector: |x|=7.85789
Estimated VCdim of classifier: VCdim<=2311.07587
Number of kernel evaluations: 449172648
Writing model file...done
Writing prediction file...done
Number of switches: 40
Moving training errors to inconsistent examples...done.
done. (58557 iterations)
Optimization finished (0 misclassified, maxdiff=0.00097).
Runtime in cpu-seconds: 394.93
Number of SV: 699 (plus 0 inconsistent examples)
L1 loss: loss=4.84448
Norm of weight vector: |w|=7.87144
Norm of longest example vector: |x|=7.85789
Estimated VCdim of classifier: VCdim<=2311.07587
Number of kernel evaluations: 449172648
Writing model file...done
$ svm_classify valdn.txt trans_model_6000 results_6000
Reading model...OK. (699 support vectors read)
Classifying test examples..100..200..300..400..500..600..700..800..done
Runtime (without IO) in cpu-seconds: 0.23
Accuracy on test set: 97.00% (776 correct, 24 incorrect, 800 total)
Precision/recall on test set: 4.76%/20.00%

For 7000
Writing prediction file...done
Number of switches: 72
Moving training errors to inconsistent examples...done.
done. (68516 iterations)
Optimization finished (0 misclassified, maxdiff=0.00097).
Runtime in cpu-seconds: 549.12
Number of SV: 844 (plus 0 inconsistent examples)
L1 loss: loss=4.34731
Norm of weight vector: |w|=7.02687
Norm of longest example vector: |x|=9.73493
Estimated VCdim of classifier: VCdim<=1690.04401
Number of kernel evaluations: 681462397
Writing model file...done
$ svm_classify valdn.txt trans_model_10000 results_10000
Reading model...OK. (3910 support vectors read)
Classifying test examples..100..200..300..400..500..600..
$ svm_classify valdn.txt trans_model_7000 results_7000
Reading model...OK. (844 support vectors read)
Classifying test examples..100..200..300..400..500..600..700..800..done
Runtime (without IO) in cpu-seconds: 0.28
Accuracy on test set: 98.00% (784 correct, 16 incorrect, 800 total)
Precision/recall on test set: 7.69%/20.00%

For 8000
done. (78815 iterations)
Optimization finished (0 misclassified, maxdiff=0.00096).
Runtime in cpu-seconds: 730.46
Number of SV: 799 (plus 0 inconsistent examples)
L1 loss: loss=3.10087
Norm of weight vector: |w|=7.47849
Norm of longest example vector: |x|=10.23067
Estimated VCdim of classifier: VCdim<=3914.93053
Number of kernel evaluations: 949252765
Writing model file...done
$ svm_classify valdn.txt trans_model_7000 results_7000
Reading model...OK. (2966 support vectors read)
Classifying test examples..100..200..300..400..500..600..700..800..done
Runtime (without IO) in cpu-seconds: 0.88
Accuracy on test set: 96.62% (773 correct, 27 incorrect, 800 total)
Precision/recall on test set: 4.17%/20.00%
$ svm_classify valdn.txt trans_model_8000 results_8000
Reading model...OK. (799 support vectors read)
Classifying test examples..100..200..300..400..500..600..700..800..done
Runtime (without IO) in cpu-seconds: 0.27
Accuracy on test set: 97.50% (780 correct, 20 incorrect, 800 total)
Precision/recall on test set: 5.88%/20.00%

For 10000
Number of switches: 108
Moving training errors to inconsistent examples...done.
done. (99177 iterations)
Optimization finished (0 misclassified, maxdiff=0.00096).
Runtime in cpu-seconds: 1116.92
Number of SV: 863 (plus 0 inconsistent examples)
L1 loss: loss=1.63717
Norm of weight vector: |w|=7.91010
Norm of longest example vector: |x|=8.84691
Estimated VCdim of classifier: VCdim<=4835.62639
Number of kernel evaluations: 1617217915
Writing model file...done
$ svm_classify valdn.txt trans_model_10000 results_10000
Reading model...OK. (863 support vectors read)
Classifying test examples..100..200..300..400..500..600..700..800..done
Runtime (without IO) in cpu-seconds: 0.28
Accuracy on test set: 97.50% (780 correct, 20 incorrect, 800 total)
Precision/recall on test set: 5.88%/20.00%

Retraining..........done
Writing prediction file...done
Number of switches: 90
Moving training errors to inconsistent examples...done.
done. (135014 iterations)
Optimization finished (0 misclassified, maxdiff=0.00098).
Runtime in cpu-seconds: 1582.78
Number of SV: 2015 (plus 0 inconsistent examples)
L1 loss: loss=6.76363
Norm of weight vector: |w|=4.16550
Norm of longest example vector: |x|=78.26780
Estimated VCdim of classifier: VCdim<=97798.94683
Number of kernel evaluations: 2105171198
Writing model file...done
$ svm_classify valdn.txt trans_model_10000 results_10000
Reading model...OK. (2015 support vectors read)
Classifying test examples..100..200..300..400..500..600..700..800..done
Runtime (without IO) in cpu-seconds: 0.58
Accuracy on test set: 96.50% (772 correct, 28 incorrect, 800 total)
Precision/recall on test set: 4.00%/20.00%
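
Since the runs are easier to compare side by side than in raw terminal output, the svm_classify summaries can be pulled out with a few regular expressions. This is a minimal sketch that assumes the log lines look exactly like the ones pasted in this post; `parse_runs` and the field names are hypothetical, and `sample_log` is a short excerpt rather than the full log.

```python
import re

MODEL_RE = re.compile(r"svm_classify\s+\S+\s+(\S+)\s+\S+")
ACC_RE = re.compile(
    r"Accuracy on test set: ([\d.]+)% "
    r"\((\d+) correct, (\d+) incorrect, (\d+) total\)"
)
PR_RE = re.compile(r"Precision/recall on test set: ([\d.]+)%/([\d.]+)%")


def parse_runs(log_text):
    """Return one dict per svm_classify call found in the log.

    A run that was interrupted before printing its summary keeps
    only the "model" key.
    """
    runs = []
    current = None
    for line in log_text.splitlines():
        m = MODEL_RE.search(line)
        if m:
            current = {"model": m.group(1)}
            runs.append(current)
            continue
        if current is None:
            continue  # output that precedes the first svm_classify call
        m = ACC_RE.search(line)
        if m:
            current["accuracy"] = float(m.group(1))
            current["correct"] = int(m.group(2))
            current["incorrect"] = int(m.group(3))
            current["total"] = int(m.group(4))
        m = PR_RE.search(line)
        if m:
            current["precision"] = float(m.group(1))
            current["recall"] = float(m.group(2))
    return runs


sample_log = """\
$ svm_classify valdn.txt trans_model_7000 results_7000
Reading model...OK. (844 support vectors read)
Accuracy on test set: 98.00% (784 correct, 16 incorrect, 800 total)
Precision/recall on test set: 7.69%/20.00%
$ svm_classify valdn.txt trans_model_8000 results_8000
Reading model...OK. (799 support vectors read)
Accuracy on test set: 97.50% (780 correct, 20 incorrect, 800 total)
Precision/recall on test set: 5.88%/20.00%
"""

runs = parse_runs(sample_log)
for run in runs:
    print(run["model"], run.get("accuracy"), run.get("precision"), run.get("recall"))
```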