What
I found that after upgrading from numpy 1 to numpy 2, the clustering results differ across platforms. This did not happen on numpy 1. I also tried setting the numpy seed and `PYTHONHASHSEED`; neither helped.

How to reproduce
poetry dependency:
The issue appeared when I upgraded from numpy `1.26.4` to numpy `2.1.1` while keeping all other packages the same.

You can reproduce it with this data by reading it into a dataframe and running `HDBSCAN.fit(df)` with `cluster_selection_epsilon = 0.15` plus the parameters in the json file.

data.json

The platform name is printed with `platform.platform()`.

On `Linux-6.5.11-linuxkit-x86_64-with-glibc2.36`, the exemplars for cluster 4 have 10 items (this is running on Apple M2).

On `Linux-5.10.223-212.873.amzn2.x86_64-x86_64-with-glibc2.36`, the exemplars for cluster 4 have only 5 items (this is running on one of the AWS machines, but it seems to happen on all EC2 instances we have).

Both returned the same clusters; only the exemplars are different. Also, on numpy 1 they returned the same exemplars.
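
For anyone who wants to compare platforms, here is a minimal sketch of the reproduction steps. It is not the exact script from this report. It assumes the `hdbscan` package (the one that exposes `exemplars_`), that `data.json` can be loaded directly with `pandas.read_json`, and that the position of each entry in `exemplars_` corresponds to the cluster label. The helper names and the `extra_params` argument are hypothetical; `extra_params` stands in for the parameters from the json file.

```python
import platform
from importlib import metadata


def environment_report():
    """Collect the platform string and installed numpy version for the report."""
    try:
        numpy_version = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        numpy_version = None  # numpy is not installed in this environment
    return {"platform": platform.platform(), "numpy": numpy_version}


def exemplar_counts(exemplars):
    """Map each position in an exemplars list to the number of points it holds."""
    return {index: len(points) for index, points in enumerate(exemplars)}


def run_repro(data_path, extra_params=None):
    """Fit HDBSCAN on the data and return the exemplar count per cluster.

    Assumptions: `data_path` is loadable with pandas.read_json, and
    `extra_params` (hypothetical) carries the parameters from the json file.
    """
    # Imported lazily so the two helpers above work without these packages.
    import pandas as pd
    import hdbscan

    df = pd.read_json(data_path)
    clusterer = hdbscan.HDBSCAN(
        cluster_selection_epsilon=0.15, **(extra_params or {})
    )
    clusterer.fit(df)
    return exemplar_counts(clusterer.exemplars_)
```

Running `environment_report()` and `run_repro("data.json")` on each machine and comparing the two dictionaries should show whether the count for cluster 4 differs between numpy versions and platforms.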