Speed up training #5
Comments
Hey Heather, you're right: precomputing the data kernel matrix is a good way to speed things up, especially if your D is very large and N is not TOO crazy, so that the NxN kernel matrix fits in memory. I would expect this to provide a very large speedup, and it is straightforward to implement. One can also play with some parameters. Most notably, numpasses can be lowered (down to 1 or 2, possibly without TOO much of a hit on classification performance), and tol can maybe be increased a bit too, up to 1e-3 or 1e-2 or so, but that's sketchy. maxiter can be lowered too. I'll have a closer look into this tomorrow or so. What is the data size and dimension? Is it text data? Is it sparse?
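For reference, a minimal sketch of what turning those knobs down might look like; the toy data is purely illustrative, and the option names (kernel, rbfsigma, numpasses, tol, maxiter) are assumed to match svmjs's train() options, so check the README for the exact names and defaults:

```js
// Toy 2D dataset, purely illustrative; assumes the svmjs global is already
// loaded (e.g. via a <script> tag).
var data   = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]];
var labels = [-1, -1, 1, 1];

var svm = new svmjs.SVM();
svm.train(data, labels, {
  kernel: 'rbf',
  rbfsigma: 0.5,   // kernel width, problem dependent
  numpasses: 2,    // lowered as suggested above; fewer full SMO passes
  tol: 1e-3,       // looser convergence tolerance (trades accuracy for speed)
  maxiter: 10000   // hard cap on iterations
});
```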
As a quick hack, by the way, try to simply precompute kernel(data[i], data[j]) for all i, j into a 2D array and then use that instead of calling the kernel function. Can you achieve a significant performance boost?
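Here is a rough sketch of that hack, assuming the trainer can be handed a custom kernel function; rbfKernel, precomputeKernel and the kidx index-tagging trick are hypothetical names used only for illustration:

```js
// Same toy data as above, purely illustrative.
var data = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]];

function rbfKernel(a, b) {
  var sigma = 0.5, s = 0;
  for (var q = 0; q < a.length; q++) { var d = a[q] - b[q]; s += d * d; }
  return Math.exp(-s / (2 * sigma * sigma));
}

// Precompute K[i][j] = kernel(data[i], data[j]) once.
function precomputeKernel(data, kernel) {
  var N = data.length;
  var K = [];
  for (var i = 0; i < N; i++) { K.push(new Float64Array(N)); }
  for (var i = 0; i < N; i++) {
    for (var j = i; j < N; j++) {
      var v = kernel(data[i], data[j]);
      K[i][j] = v;  // the kernel matrix is symmetric,
      K[j][i] = v;  // so each pair is computed only once
    }
  }
  return K;
}

// One way to plug it in: tag every vector with its row index and hand the
// trainer a kernel that just looks values up instead of recomputing them.
var K = precomputeKernel(data, rbfKernel);
for (var i = 0; i < data.length; i++) { data[i].kidx = i; }
var lookupKernel = function (a, b) { return K[a.kidx][b.kidx]; };
```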
ftr, kernel precomputation is implemented in #6. One thing that definitely speeds up training is making each input vector a
Any way to speed up training would be great. From the code, it looks like the result of the kernel function could be memoized for each data pair after it's been computed once. Avoiding that computation could be a big deal on my dataset, where each vector has a length in the thousands.
Are there any other improvements that could be made?
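For what it's worth, a small sketch of that memoization idea, assuming the caller can identify each vector by its index; memoizedKernel and its (i, j, xi, xj) signature are hypothetical, not part of the library:

```js
// Cache each pairwise kernel result the first time it is computed, keyed by
// the (i, j) pair, and return the cached value on later calls.
function memoizedKernel(kernel) {
  var cache = {};
  return function (i, j, xi, xj) {
    var key = i < j ? i + ',' + j : j + ',' + i;  // symmetric pair key
    if (!(key in cache)) { cache[key] = kernel(xi, xj); }
    return cache[key];
  };
}

// Usage sketch: wrap the expensive kernel once, then call it with indices.
var cachedRbf = memoizedKernel(function (a, b) {
  var s = 0;
  for (var q = 0; q < a.length; q++) { var d = a[q] - b[q]; s += d * d; }
  return Math.exp(-s);  // RBF with gamma = 1, illustrative only
});
```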