Sort-means gives slightly different results from all the other algorithms in some of my testing. I have a test case where every other algorithm converges in 45 iterations, but sort-means converges in 42 and produces slightly different centers.
Steps to reproduce:
Run naive k-means and sort-means on the attached input with k=10, using the library's k-means++ initialization and a maximum of 1000 iterations.
Expected result: the same number of iterations and the same centers.
Actual result: different centers and a different number of iterations.
The initial centers chosen by k-means++ on this dataset are:
I've started looking at this; so far, I've replicated these results, but nothing strikes me as obviously wrong about the implementation. I'll run some more detailed tests to see if I can find anything.
I would probably tackle this by printing out detailed information about each iteration (both for sort-means and for the naive algorithm), identifying the earliest iteration where something goes wrong (some assignment is incorrect), and drilling into why that happened.
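To make that concrete, here is a rough sketch of such a comparison in Python with NumPy. It is not the library's code: `naive_assign`, `update_centers`, and `first_divergence` are hypothetical names made up for illustration, and the routine under test is assumed to be wrapped as a function of `(points, centers)` (a stateful algorithm like sort-means could keep its previous assignments in a closure). Both routines are fed the same centers at every iteration, so the first mismatch points at the assignment step rather than at accumulated drift.

```python
import numpy as np

def naive_assign(points, centers):
    # Squared distance from every point to every center, then pick the closest.
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def update_centers(points, labels, centers):
    # Move each center to the mean of its members; empty clusters stay put.
    new = centers.copy()
    for j in range(len(centers)):
        members = points[labels == j]
        if len(members):
            new[j] = members.mean(axis=0)
    return new

def first_divergence(points, init_centers, assign_ref, assign_test, max_iters=1000):
    """Return (iteration, indices of mismatched points) for the earliest
    iteration where the two assignment routines disagree, or None if they
    agree until the reference run converges or max_iters is reached."""
    centers = np.asarray(init_centers, dtype=float).copy()
    for it in range(max_iters):
        ref = assign_ref(points, centers)
        test = assign_test(points, centers)
        bad = np.flatnonzero(ref != test)
        if bad.size:
            return it, bad
        # Advance using the reference assignments only.
        new_centers = update_centers(points, ref, centers)
        if np.array_equal(new_centers, centers):
            return None
        centers = new_centers
    return None
```

Once an iteration and a set of point indices come back, printing the distances from those points to every center at that iteration should show whether the disagreement is a genuine mis-assignment or a near-tie decided differently by floating-point rounding.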
bad_input.txt.gz