-
Hey! Say I have a step_key = fold_in(key, step)
step_device_key = fold_in(step_key, device_id) Is there a more efficient way in which both step_device_key = fold_in(key, (step, device_id)) Thanks! |
Beta Was this translation helpful? Give feedback.
Replies: 2 comments 7 replies
-
I suspect that iterating You could use a pairing function, which enumerates the integer grid, and compose it with def cantor(a, b):
a, b = a + 1, b + 1
return (a * a + 2 * a * b + b * b - a - 3 * b + 2) // 2
def fold_in2(key, a, b):
return jax.random.fold_in(key, cantor(a, b)) Here's how the enumeration looks: >>> import numpy as np
>>> np.array([cantor(i, j) for i in range(5) for j in range(5)]).reshape(5, 5)
array([[ 1, 2, 4, 7, 11],
[ 3, 5, 8, 12, 17],
[ 6, 9, 13, 18, 24],
[10, 14, 19, 25, 32],
[15, 20, 26, 33, 41]]) The Rosenberg-Strong pairing function might be better: >>> def rs(a, b):
... m = np.maximum(a, b)
... return m * m + m + a - b
...
>>> np.array([rs(i, j) for i in range(5) for j in range(5)]).reshape(5, 5)
array([[ 0, 1, 4, 9, 16],
[ 3, 2, 5, 10, 17],
[ 8, 7, 6, 11, 18],
[15, 14, 13, 12, 19],
[24, 23, 22, 21, 20]]) The R-S function is bijective between [n] x [n] and [n*n] for any finite positive n (where by [n] here I mean {0, ..., n-1}). That's efficient for the uint32 that One downside to this approach over iterated hashing is that collisions follow simple patterns. Collisions are inevitable when mapping two N bit integers to one. With the R-S function, we have simple cycles like the following (taking N = 32 to simulate uint32): >>> [rs(i, 2**16) % 2 ** 32 for i in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> [rs(i, 2**18) % 2 ** 32 for i in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> [rs(i, 2**20) % 2 ** 32 for i in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] To the extent that this is an issue, you could say that the R-S pairing approach only works over pairs of N/2-bit integers. |
Beta Was this translation helpful? Give feedback.
-
(a dumber but probably simpler answer for this particular case:
or possibly this is better even for your case? though I dunno about perf.
|
Beta Was this translation helpful? Give feedback.
I suspect that iterating
fold_in
(the hash) is often fine, and won't often present a bottleneck. If this does need optimizing, we could have some fun:You could use a pairing function, which enumerates the integer grid, and compose it with
fold_in
. Here is an example using Cantor's pairing function:Here's how the enumeration looks: