[BUG] Brute force python save_load test fail randomly #704
Occasionally, the distances from one query to two different samples are very close:
E array([[7356, 6315, 3579, 2860, 3790, 1747, 9880, 2584, 4067, 2456],
E [7520, 6982, 5815, 929, 2325, 2250, 1439, 9866, 8289, 4179],
E [5404, 6802, 2551, 5991, 1892, 1024, 2428, 6699, 2606, 7739],
E [5774, 8981, 6457, 7510, 8410, 9036, 9010, 2026, 2463, 8314],
E [1369, 9634, 338, 8415, 9194, 3971, 1448, 9331, 1082, 1115],
E [7754, 1067, 7113, 3789, 1614, 6765, 7434, 541, 8518, 2720],
E [8104, 8804, 6654, 1776, 4759, 4700, 9157, 744, 9911, 9321],
E [8735, 3937, 5238, 5307, 9597, 4964, 8644, 3688, 3545, 8054],
E [5948, 1218, 8975, 3946, 7593, 2818, 9065, 1631, 5087, 9053],
E [6484, 8709, 7806, 9591, 6222, 3882, 3098, 5923, 1283, 2439],
E [8394, 3987, 7022, 7663, 2785, 4737, 2056, 4049, 8812, 7542],
E [4657, 7893, 332, 8275, 6701, 6561, 3746, 2839, 9894, 5603],
E [8009, 2137, 2013, 1234, 9554, 249, 4547, 9061, 8111, 6968],
E [5326, 512, 9282, 4583, 3257, 654, 8371, 7634, 2745, 3021],
E [ 543, 6044, 5574, 4648, 3479, 7515, 2366, 8803, 5344, 7147],
E [5148, 3850, 4997, 2023, 2537, 2392, 3603, 8942, 8239, 3979],
E [4257, 4958, 3282, 1352, 3671, 1301, 1036, 3618, 4553, 3117],
E [4014, 7698, 4968, 7374, 6330, 3304, 3840, 2815, 4923, 9442],
E [7005, 1510, 9989, 9899, 5163, 7446, 9105, 6494, 8157, 5861],
E [ 593, 5258, 3633, 2708, 7588, 2196, 7364, 7779, 7036, 5612],
E [2448, 1863, 100, 3025, 1347, 4543, 7865, 4251, 251, 6939],
E [ 339, 7729, 8217, 5146, 5502, 2968, 8170, 7342, 5736, 3631],
E [9443, 9989, 5761, 9708, 9475, 9838, 2443, 641, 6118, 4325],
E [6073, 1128, 2027, 1558, 8969, 59, 7697, 3103, 4433, 7554],
E [4006, 9111, 8843, 4222, 8114, 509, 402, 3178, 5061, 1806],
E [ 609, 7838, 5915, 8270, 871, 9903, 7710, 8054, 29, 2906],
E [5312, 9984, 1499, 6424, 8703, 3908, 1291, 6405, 9563, 2469],
E [8868, 6144, 8657, 6008, 2755, 2112, 6999, 2860, 5169, 4849],
E [9227, 6329, 7620, 1352, 7751, 8577, 5430, 4524, 8615, 8882],
E [6497, 6595, 3627, 7979, 1118, 3082, 3075, 1519, 7416, 9992],
E [4499, 7839, 3740, 6926, 2839, 5849, 4634, 9760, 6142, 9553],
E [6603, 2519, 4735, 2744, 6931, 9085, 4633, 8498, 2755, 4033],
E [ 175, 2110, 7, 1412, 1798, 1481, 8614, 1371, 910, 4410],
E [5100, 9721, 7463, 6862, 5599, 2337, 7774, 1352, 8642, 9823],
E [7126, 6686, 5524, 5590, 3049, 1207, 1657, 1299, 8063, 1240],
E [4630, 4522, 6362, 3991, 4703, 404, 4806, 3987, 8907, 2668],
E [1806, 3429, 5282, 2632, 9513, 6054, 1176, 2030, 8223, 4034],
E [1682, 2312, 8311, 4145, 2564, 9377, 9194, 7276, 6398, 8011],
E [3354, 2311, 3432, 3160, 8459, 2196, 2540, 7471, 9730, 346],
E [5835, 4479, 2361, 398, 8541, 2573, 4455, 9601, 6844, 2813],
E [8434, 2691, 8829, 7797, 9241, 119, 2212, 7958, 5488, 5151],
E [5737, 9222, 2020, 1136, 4479, 2518, 9206, 3687, 4255, 107],
E [6184, 6582, 3734, 3876, 9532, 730, 8907, 7171, 3682, 7437],
E [3581, 9124, 3998, 3021, 3323, 2810, 7940, 6185, 7069, 9403],
E - [8882, 9900, 8757, 107, 8479, 5171, 9484, 9775, 3933, 2111],
E ? ------
E + [8882, 9900, 8757, 107, 8479, 9484, 5171, 9775, 3933, 2111],
E ? ++++++ ++++++
[3.3649712 3.8245468 4.1913643 4.262558 4.3329716 4.3587875 4.3587875
4.4026165 4.408066 4.4113426]
It can be reproduced with the following code:
import numpy as np
dataset_5171 = np.array( [9.32142138e-01, 8.72813582e-01, 7.71189928e-01, 5.85802495e-01,
2.38001600e-01, 7.06278622e-01, 8.78981054e-01, 4.03263688e-01,
4.75592285e-01, 1.45135760e-01, 3.21364909e-01, 6.87913895e-01,
8.30989003e-01, 2.19078675e-01, 6.42280996e-01, 4.20177460e-01,
8.86989295e-01, 3.68644297e-01, 9.93755877e-01, 6.99248761e-02,
9.09213126e-02, 2.24599376e-01, 8.28429461e-01, 3.59329879e-01,
5.71323633e-01, 6.59366027e-02, 4.18484896e-01, 8.31805766e-01,
2.35715076e-01, 8.27406108e-01, 8.21960211e-01, 4.06000704e-01,
4.96624112e-01, 4.26898181e-01, 2.14131534e-01, 6.58396363e-01,
3.63032669e-01, 9.13158238e-01, 8.36311519e-01, 2.79704154e-01,
4.96733159e-01, 9.39586386e-02, 8.37650478e-01, 7.54839361e-01,
4.97722834e-01, 5.02949297e-01, 8.90806139e-01, 5.37597716e-01,
7.28471994e-01, 7.22921133e-01], dtype=float)
dataset_9484 = np.array( [9.81457651e-01, 5.93166947e-01, 1.56116426e-01, 7.87705481e-01,
6.65047050e-01, 6.26070380e-01, 1.89543471e-01, 8.99864018e-01,
4.08117533e-01, 2.98274849e-02, 8.38999808e-01, 9.41052794e-01,
7.70968139e-01, 1.60131723e-01, 9.21209812e-01, 4.37662721e-01,
7.82714367e-01, 9.02948081e-01, 9.07859057e-02, 4.08284068e-01,
4.53266650e-01, 6.05524331e-03, 9.13958311e-01, 7.49397278e-01,
4.31984991e-01, 1.09489582e-01, 4.56626981e-01, 3.62299412e-01,
1.01631679e-01, 6.10530853e-01, 9.31025982e-01, 7.03031659e-01,
4.18992788e-01, 7.79154778e-01, 6.04481623e-03, 1.96646154e-01,
8.79877806e-01, 4.18404818e-01, 1.33909434e-01, 6.40034139e-01,
6.06465399e-01, 2.39080772e-01, 7.01063633e-01, 7.51568615e-01,
9.95701730e-01, 5.62685132e-01, 4.60158437e-01, 2.16199681e-01,
4.73403454e-01, 6.09784663e-01], dtype=float)
query = np.array([
6.98716998e-01, 7.96642601e-01, 4.80482608e-01, 4.85115111e-01,
4.14034247e-01, 6.72952533e-01, 2.37295032e-01, 7.65312135e-01,
9.61143553e-01, 4.36596006e-01, 6.77114785e-01, 4.44995224e-01,
6.83229864e-01, 3.40059429e-01, 6.11794770e-01, 6.90815866e-01,
9.56425905e-01, 8.06521237e-01, 9.50217426e-01, 9.30451825e-02,
1.24300353e-01, 6.84115410e-01, 5.58719575e-01, 9.52109814e-01,
7.08390355e-01, 2.19248887e-02, 7.92785466e-01, 8.06614876e-01,
2.92044669e-01, 8.83844793e-01, 8.60216200e-01, 9.07594740e-01,
2.09351406e-01, 9.41360295e-01, 4.94379848e-02, 4.60062832e-01,
8.38162124e-01, 5.27859628e-01, 3.16174507e-01, 2.18997210e-01,
5.47150195e-01, 1.86117634e-01, 6.49071097e-01, 6.42057657e-01,
4.42520112e-01, 7.30912447e-01, 5.05176723e-01, 2.86207590e-02,
5.84955156e-01, 3.34418684e-01], dtype=float)
def squared_euclidean_distance(vec1, vec2):
    return np.sum((vec1 - vec2) ** 2)
print("Squared Euclidean Distance (5171, Query):", squared_euclidean_distance(dataset_5171, query))
print("Squared Euclidean Distance (9484, Query):", squared_euclidean_distance(dataset_9484, query))
# Squared Euclidean Distance (5171, Query): 4.358781377215945
# Squared Euclidean Distance (9484, Query): 4.358787387166914
The current conclusion is that this is a random failure caused by two distances being too close to each other. One open question remains: why the precision appears degraded. As the Python code shows, the two distances differ only in the 6th decimal place, while the logged distances above print as identical (4.3587875). In any case, it does not seem to be a serious issue.
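To put the "6th decimal place" in perspective, here is a rough sketch that expresses the gap between the two float64 distances in units of float32 spacing. It assumes the search computes distances in float32, which the eight-digit printed values suggest but which is not confirmed here; the variable names are only illustrative.

```python
import numpy as np

# Float64 reference distances copied from the output above.
d_5171 = 4.358781377215945
d_9484 = 4.358787387166914

# Spacing between adjacent float32 values near these distances (one ulp).
# Assumption: the search stores and compares distances as float32.
ulp32 = float(np.spacing(np.float32(d_5171)))

gap = d_9484 - d_5171
gap_in_ulps = gap / ulp32

print(f"gap between the two distances: {gap:.3e}")
print(f"float32 spacing near 4.36:     {ulp32:.3e}")
print(f"gap in float32 ulps:           {gap_in_ulps:.1f}")
```

The gap works out to roughly a dozen float32 ulps. A few accumulated rounding errors, for example from summing 50 terms or from an expanded-form distance computation (an assumption, not something verified against the implementation), could plausibly be enough to erase or flip the ordering of these two neighbors.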
The log shows that the set of neighbors is correct, but the order of some of them is swapped.
This can be reproduced with very high probability using: https://github.com/rhdong/cuvs/tree/rhdong/bf-py-test-fail-reproduce
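Since the neighbor set is correct and only the order differs, one possible way to make such a check robust is to compare results up to ties. The following is a minimal sketch, not the project's actual test code: `match_up_to_ties` is a hypothetical helper, the tolerance is an arbitrary choice, and it assumes the distances printed in the log belong to the failing row.

```python
import numpy as np

def match_up_to_ties(idx_a, dist_a, idx_b, dist_b, rtol=1e-5):
    """Return True if both results hold the same neighbors per row and each
    neighbor has (nearly) the same distance, regardless of tie order."""
    idx_a, idx_b = np.atleast_2d(idx_a), np.atleast_2d(idx_b)
    dist_a, dist_b = np.atleast_2d(dist_a), np.atleast_2d(dist_b)
    # Reorder each row by neighbor id so equal ids line up column by column.
    order_a = np.argsort(idx_a, axis=1)
    order_b = np.argsort(idx_b, axis=1)
    ids_a = np.take_along_axis(idx_a, order_a, axis=1)
    ids_b = np.take_along_axis(idx_b, order_b, axis=1)
    if not np.array_equal(ids_a, ids_b):
        return False
    # Each neighbor must carry the same distance in both results.
    return bool(np.allclose(
        np.take_along_axis(dist_a, order_a, axis=1),
        np.take_along_axis(dist_b, order_b, axis=1),
        rtol=rtol, atol=0.0,
    ))

# Failing row from the log: 5171 and 9484 are swapped.
expected_idx = [8882, 9900, 8757, 107, 8479, 5171, 9484, 9775, 3933, 2111]
actual_idx = [8882, 9900, 8757, 107, 8479, 9484, 5171, 9775, 3933, 2111]
# Assumption: these logged distances belong to that row.
dist = np.array([3.3649712, 3.8245468, 4.1913643, 4.262558, 4.3329716,
                 4.3587875, 4.3587875, 4.4026165, 4.408066, 4.4113426],
                dtype=np.float32)

print(match_up_to_ties(expected_idx, dist, actual_idx, dist))
```

A swap between neighbors whose distances are clearly different still fails this check, because the distances are compared per neighbor id rather than per position.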