Add Normalisation Option #5

Open
wants to merge 1 commit into
base: master

hsinyuan-huang
Contributor

When the -n option is passed to train, every instance is normalized to unit length in the Euclidean norm, and the .model file contains the line "normalization 1". (If -n is not passed, the line is "normalization 0".) When predict is run with that .model file, it applies the same normalization to the test data. (As a result, old .model files would be deprecated.)
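To make the transformation concrete, here is a minimal sketch of per-instance Euclidean normalization on a sparse instance in LIBSVM text format. It is written in Python for illustration and is not the code from this PR; the helper names `parse_libsvm_line` and `normalize_instance` are hypothetical, and leaving all-zero instances unchanged is an assumption, since the description does not say how they are handled.

```python
import math


def parse_libsvm_line(line):
    """Parse 'label idx:val idx:val ...' into (label, {idx: val})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for token in parts[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return label, features


def normalize_instance(features):
    """Scale a sparse instance so that its Euclidean norm is 1.

    Assumption: an all-zero instance is returned unchanged, because the
    PR description does not specify that case.
    """
    norm = math.sqrt(sum(v * v for v in features.values()))
    if norm == 0.0:
        return dict(features)
    return {i: v / norm for i, v in features.items()}


label, x = parse_libsvm_line("1 1:3 4:4")
x_unit = normalize_instance(x)
print(x_unit)  # {1: 0.6, 4: 0.8}
```

The same function would have to be applied to each test instance at prediction time, which is why the flag is stored in the .model file.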

Example:
train -s 7 -n -e 1e-6 covtype.libsvm.binary
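As a rough illustration of how a consumer of the .model file could pick up the new flag, the sketch below scans the header for the "normalization" line. It assumes the usual LIBLINEAR layout of "key value" header lines terminated by a "w" line; the position of the new line within the header, the sample model text, and the helper name `read_normalization_flag` are all assumptions made for this example.

```python
def read_normalization_flag(model_text):
    """Return True/False from a 'normalization 1|0' header line.

    Returns None when the header has no such line, which is what an
    old .model file (written before this option existed) would look like.
    """
    for line in model_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "w":
            # The weight section starts here, so the header is over.
            break
        if fields[0] == "normalization" and len(fields) > 1:
            return fields[1] == "1"
    return None


# Hypothetical model file contents; only the "normalization" line is
# taken from the PR description, and its placement here is a guess.
sample_model = """solver_type L2R_LR_DUAL
nr_class 2
label 1 -1
nr_feature 3
normalization 1
bias -1
w
0.5
-0.25
0.1
"""

print(read_normalization_flag(sample_model))  # True
```

Returning None for a missing line lets the caller decide whether to reject an old model file or fall back to no normalization; the PR description suggests old files are simply no longer supported.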

Some comparisons (run time of train without and with -n):
train -s 0 -e 1e-6 covtype.libsvm.binary: 0m39.492s, with -n: 0m04.001s
train -s 1 -e 1e-6 covtype.libsvm.binary: 2m47.380s, with -n: 0m10.116s
train -s 2 -e 1e-6 covtype.libsvm.binary: 0m38.217s, with -n: 0m05.072s
train -s 7 -e 1e-6 covtype.libsvm.binary: 3m42.433s, with -n: 0m07.034s
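The speedups implied by these timings can be worked out with a few lines of Python. This is only arithmetic on the four rows above; the `to_seconds` helper is a hypothetical name, and the figures are assumed to be wall-clock times in the "XmY.ZZZs" format shown.

```python
def to_seconds(stamp):
    """Convert a time string such as '2m47.380s' to seconds."""
    minutes, rest = stamp.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


# (solver type, time without -n, time with -n), copied from the rows above.
timings = [
    (0, "0m39.492s", "0m04.001s"),
    (1, "2m47.380s", "0m10.116s"),
    (2, "0m38.217s", "0m05.072s"),
    (7, "3m42.433s", "0m07.034s"),
]

speedups = {
    solver: to_seconds(before) / to_seconds(after)
    for solver, before, after in timings
}

for solver, ratio in speedups.items():
    print(f"-s {solver}: {ratio:.1f}x faster with -n")
```

On covtype.libsvm.binary this gives ratios of roughly 7.5x to 32x, with the largest gain for -s 7.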

train -s 1 -e 1e-6 splice.txt
predict splice.t splice.txt.model out.txt: 84.2299%, with -n: 84.9655%

train -s 1 -e 1e-6 a9a.txt
predict a9a.t a9a.txt.model out.txt: 84.9395%, with -n: 85.0132%

train -s ALPHA -e 1e-6 w1a.txt
predict w1a.t w1a.txt.model out.txt: 96.9221%, with -n: 97.6625%
ALPHA (the -s solver type) from 0 to 7, one line per value: (without -n)
Accuracy = 97.2902% (45991/47272)
Accuracy = 96.8903% (45802/47272)
Accuracy = 96.9221% (45817/47272)
Accuracy = 97.1019% (45902/47272)
Accuracy = 96.8523% (45784/47272)
Accuracy = 97.3959% (46041/47272)
Accuracy = 97.5736% (46125/47272)
Accuracy = 97.2902% (45991/47272)
(with -n)
Accuracy = 97.5144% (46097/47272)
Accuracy = 97.6646% (46168/47272)
Accuracy = 97.6625% (46167/47272)
Accuracy = 97.6455% (46159/47272)
Accuracy = 97.635% (46154/47272)
Accuracy = 97.745% (46206/47272)
Accuracy = 97.3473% (46018/47272)
Accuracy = 97.5144% (46097/47272)

train -s ALPHA -e 1e-6 svmguide1.txt
predict svmguide1.t svmguide1.txt.model out.txt
ALPHA (the -s solver type) from 0 to 7, one line per value: (without -n)
Accuracy = 79.025% (3161/4000)
Accuracy = 78.95% (3158/4000)
Accuracy = 78.925% (3157/4000)
Accuracy = 59.125% (2365/4000)
Accuracy = 76.625% (3065/4000)
Accuracy = 78.9% (3156/4000)
Accuracy = 79.025% (3161/4000)
Accuracy = 80.125% (3205/4000)
(with -n)
Accuracy = 78.4% (3136/4000)
Accuracy = 78.425% (3137/4000)
Accuracy = 78.425% (3137/4000)
Accuracy = 78.3% (3132/4000)
Accuracy = 78.5% (3140/4000)
Accuracy = 79.025% (3161/4000)
Accuracy = 78.225% (3129/4000)
Accuracy = 78.4% (3136/4000)

Overall, training with -n is much faster (roughly 7.5x to 32x on covtype.libsvm.binary) with similar accuracy.
