How to avoid double random.fold_in when indexing into two different sources? #15240

cgarciae · 2023-03-27T17:00:49Z

cgarciae
Mar 27, 2023
Collaborator

Hey! Say I have step and device_id int scalars, and I want to fold both into a source key. It could be done just by repeated use of fold_in, e.g.:

step_key = fold_in(key, step)
step_device_key = fold_in(step_key, device_id)

Is there a more efficient way in which both step and device_id could be "hashed" into a single key to avoid running fold_in twice? E.g.

step_device_key = fold_in(key, (step, device_id))

Thanks!

Answered by froystig


froystig · 2023-03-27T21:07:14Z

froystig
Mar 27, 2023
Maintainer

I suspect that iterating fold_in (the hash) is often fine, and won't often present a bottleneck. If this does need optimizing, we could have some fun:

You could use a pairing function, which enumerates the integer grid, and compose it with fold_in. Here is an example using Cantor's pairing function:

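import jax

def cantor(a, b):
  # Shift to 1-based inputs so that (0, 0) maps to 1.
  a, b = a + 1, b + 1
  return (a * a + 2 * a * b + b * b - a - 3 * b + 2) // 2

def fold_in2(key, a, b):
  # Pair (a, b) into a single integer, then hash once.
  return jax.random.fold_in(key, cantor(a, b))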

Here's how the enumeration looks:

>>> import numpy as np
>>> np.array([cantor(i, j) for i in range(5) for j in range(5)]).reshape(5, 5)
array([[ 1,  2,  4,  7, 11],
       [ 3,  5,  8, 12, 17],
       [ 6,  9, 13, 18, 24],
       [10, 14, 19, 25, 32],
       [15, 20, 26, 33, 41]])

The Rosenberg-Strong pairing function might be better:

>>> def rs(a, b):
...   m = np.maximum(a, b)
...   return m * m + m + a - b
... 
>>> np.array([rs(i, j) for i in range(5) for j in range(5)]).reshape(5, 5)
array([[ 0,  1,  4,  9, 16],
       [ 3,  2,  5, 10, 17],
       [ 8,  7,  6, 11, 18],
       [15, 14, 13, 12, 19],
       [24, 23, 22, 21, 20]])

The R-S function is bijective between [n] x [n] and [n*n] for any finite positive n (where by [n] here I mean {0, ..., n-1}). That's efficient for the uint32 that fold_in wants (n * n = 2 ** 32, i.e. n = 2 ** 16).

One downside to this approach over iterated hashing is that collisions follow simple patterns. Collisions are inevitable when mapping two N bit integers to one. With the R-S function, we have simple cycles like the following (taking N = 32 to simulate uint32):

>>> [rs(i, 2**16) % 2 ** 32 for i in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> [rs(i, 2**18) % 2 ** 32 for i in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> [rs(i, 2**20) % 2 ** 32 for i in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

To the extent that this is an issue, you could say that the R-S pairing approach only works over pairs of N/2-bit integers.
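Both claims can be sanity-checked by brute force on small grids. The following is only a sketch in plain Python: it uses the built-in `max` in place of `np.maximum` to stay dependency-free, and `is_bijection` is a hypothetical helper name, not something from JAX.

```python
def rs(a, b):
    # Rosenberg-Strong pairing, as defined above, on plain Python ints.
    m = max(a, b)
    return m * m + m + a - b

def is_bijection(n):
    # True if rs maps {0..n-1} x {0..n-1} one-to-one onto {0..n*n-1}.
    values = sorted(rs(a, b) for a in range(n) for b in range(n))
    return values == list(range(n * n))

UINT32 = 2 ** 32

# With 16-bit inputs the largest output is n*n - 1, attained at (n - 1, 0),
# so every result still fits in a uint32.
max_16bit = rs(2 ** 16 - 1, 0)

# Once an input needs more than 16 bits, results wrap modulo 2**32 and
# different pairs land on the same value.
wrapped_a = rs(5, 2 ** 16) % UINT32
wrapped_b = rs(5, 2 ** 18) % UINT32
```

Here `max_16bit` equals `2 ** 32 - 1`, while `wrapped_a` and `wrapped_b` both come out as 5, which is also the value of `rs(1, 2)` in the grid above.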

6 replies

cgarciae Mar 29, 2023
Collaborator Author

It would seem you can generalize this to more numbers simply by calling the pairing function multiple times? E.g.

pairing(n1, n2, n3) = pairing(n1, pairing(n2, n3))

In case this grows too large what would be the best way of handling overflow?

froystig Mar 29, 2023
Maintainer

In case this grows too large what would be the best way of handling overflow?

I'm not sure. Especially at the point of overflows, it really seems best to iterate the hash? I also mentioned in the replies to @dlwh: maybe we should change fold_in to explicitly accept 64 bits (at least) so that we're much less likely to even need to think about collisions/overflows. We typically have at least 64 bits to work with, but we're throwing half away!

For the setting you seem to have, @dlwh's response is really the most efficient encoding. I originally took your question as a general one, but I overlooked that the name of one of your variables (device_id) means you probably know its extent (device count). And I bet that device count is much smaller than 2 ** 16! When you know extents, raveling is ideal, and will last much longer before overflowing and cycling.

Thinking again: if we're avoiding overflows, then we're effectively setting max values on inputs regardless. We could then just ravel on those (e.g. lambda a, b: a * 2 ** 16 + b in the 2d case). We'd reach for pairings if we truly didn't know extents (e.g. unknown max integer width), or don't want to depend on them for some reason, but that's not our world. So this was fun, but not quite necessary!
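As a rough sketch of that raveling idea for any number of indices with known extents: `ravel_index` and `fits_uint32` below are hypothetical helper names, and the row-major order is an assumption (any fixed order would do).

```python
def ravel_index(indices, extents):
    # Row-major (mixed-radix) encoding. It is injective as long as
    # 0 <= index < extent holds for every (index, extent) pair.
    flat = 0
    for i, extent in zip(indices, extents):
        if not 0 <= i < extent:
            raise ValueError(f"index {i} out of range for extent {extent}")
        flat = flat * extent + i
    return flat

def fits_uint32(extents):
    # The largest raveled value is prod(extents) - 1.
    total = 1
    for extent in extents:
        total *= extent
    return total <= 2 ** 32

# The 2d case from above: a * 2 ** 16 + b.
packed = ravel_index((3, 5), (2 ** 16, 2 ** 16))
```

With extents `(num_steps, num_devices)` this reduces to `step * num_devices + device_id`, and `fits_uint32` shows how many steps are available before the counter would wrap.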

[...] by calling the pairing function multiple times?

Yeah that seems like something to do "in math," namely if we weren't thinking about fixed-width and overflows.
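To put rough numbers on why nesting mostly works "in math": with the Rosenberg-Strong function, each level of nesting roughly squares the output range, so pairing(n1, pairing(n2, n3)) on inputs below n can reach nearly n ** 4. A plain-Python sketch (the names `rs3` and `largest_rs3` are hypothetical):

```python
def rs(a, b):
    # Rosenberg-Strong pairing on plain Python ints.
    m = max(a, b)
    return m * m + m + a - b

def rs3(n1, n2, n3):
    # pairing(n1, n2, n3) = pairing(n1, pairing(n2, n3))
    return rs(n1, rs(n2, n3))

def largest_rs3(n):
    # For inputs in {0..n-1} with n >= 2, the inner pairing peaks at
    # rs(n - 1, 0) == n*n - 1, and the outer pairing is largest when
    # n1 == n - 1 is combined with that inner value.
    return rs3(n - 1, n - 1, 0)

fits_8bit = largest_rs3(2 ** 8) < 2 ** 32      # three 8-bit inputs
fits_9bit = largest_rs3(2 ** 8 + 1) < 2 ** 32  # one value past 8 bits
```

Under these assumptions, three inputs fit into a uint32 only while each stays within 8 bits (`fits_8bit` is true, `fits_9bit` is false), which is far tighter than raveling with known extents.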

cgarciae Mar 29, 2023
Collaborator Author

I've been working on hashing a varying-length sequence of Python integers into a jnp.uint32 to pass to fold_in. In Python I can just use hashlib, so I was wondering how to do something equivalent in pure JAX. I chose the example because it seemed like a more common case, but @froystig's answer is closer to what I was looking for.

Thanks for all the info! Enjoyed the conversation :)

froystig Mar 29, 2023
Maintainer

Ah, good! In that case I think there really is a tradeoff here. If the sequence length is varying and extents aren't known then we may hit overflow with pairings or "max-int raveling," especially in 32 bits. String hashes (e.g. from hashlib) may require more computation but will avoid the kind of regularity that we'd see when ravelings/pairings reach overflow. I think of iterating fold_in as a sort of (jittable) string hash, though I expect that there are more efficient string hashes that can be written jittably.
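A minimal sketch of "iterating fold_in as a jittable string hash", assuming the sequence arrives as a 1-D integer array: `fold_in_sequence` is a hypothetical helper, and using `jax.lax.scan` is just one way to write the loop.

```python
import jax
import jax.numpy as jnp

def fold_in_sequence(key, values):
    # Fold each element of a 1-D integer array into the key, in order.
    def step(k, v):
        return jax.random.fold_in(k, v), None
    key, _ = jax.lax.scan(step, key, values)
    return key

key = jax.random.PRNGKey(0)
values = jnp.array([3, 1, 4], dtype=jnp.uint32)
hashed = jax.jit(fold_in_sequence)(key, values)

# The same result written out by hand.
by_hand = jax.random.fold_in(
    jax.random.fold_in(jax.random.fold_in(key, 3), 1), 4)
```

Note that under `jit` each distinct sequence length triggers its own compilation, so truly varying lengths may call for padding or for running the loop outside of `jit`.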

I think either way, we've motivated taking (at least) 64-bit inputs in fold_in. Overflow would be less of a concern, and we have the space!

froystig Mar 29, 2023
Maintainer

Filed #15296

dlwh · 2023-03-28T22:40:34Z

dlwh
Mar 28, 2023

A dumber but probably simpler answer for this particular case:

step_device_key = fold_in(key, step * num_devices + device_id)

or possibly this is better even for your case? though I dunno about perf.

step_device_keys = split(fold_in(key, step), num_devices)

1 reply

froystig Mar 28, 2023
Maintainer

Hah – I guess it's worth paying attention to what variable names suggest. :)

More generally, if we know the grid extents then yeah, we can use that to ravel any index. Often fold_in comes up when you don't know them, or when they're very large, otherwise you'd have split. But indeed, the device count is known (!), and maybe worth just splitting over in that case.

There's still the eventual question of cycling through a large enough counter. We probably won't exhaust the uint32 space just by scaling steps up by device count. Still, jax's default hash (Threefry) actually accepts 64 bit messages; maybe we should always accept at least that much in fold_in.
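For concreteness, here is how the two options from this sub-thread might look side by side. This is a sketch under the assumption that `num_devices` is known ahead of time; `ravel_key` and `split_keys` are hypothetical names, and the values of `num_devices` and `step` are made up.

```python
import jax
import jax.numpy as jnp

num_devices = 8  # assumed known, e.g. from the device count
key = jax.random.PRNGKey(0)
step = 10

def ravel_key(key, step, device_id):
    # Option 1: ravel (step, device_id) into one integer, then hash once.
    return jax.random.fold_in(key, step * num_devices + device_id)

def split_keys(key, step):
    # Option 2: hash the step once, then split into one key per device.
    return jax.random.split(jax.random.fold_in(key, step), num_devices)

raveled = jnp.stack([ravel_key(key, step, d) for d in range(num_devices)])
per_device = split_keys(key, step)
```

The two options produce different key values from each other, so a program should pick one and use it consistently. Both yield one key per device for a given step.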
