persistent_dict: add complex hashing, numpy scalar hashing #184

matthiasdiener · 2023-09-25T21:39:26Z

Please squash

matthiasdiener · 2023-09-25T22:22:23Z

This is ready for review.

inducer · 2023-09-26T19:52:03Z

pytools/persistent_dict.py

@@ -301,6 +315,10 @@ def update_for_int(key_hash, key):
                return
            except OverflowError:
                sz *= 2
+            except AttributeError:
+                # Numpy scalars don't have to_bytes()
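For readers without the full file open: the hunk appears to sit inside a retry loop that widens the byte buffer until int.to_bytes() fits. Below is a minimal sketch of that pattern. Only the `return` / `except OverflowError: sz *= 2` part is visible in the hunk; the starting width of 8 bytes, the little-endian signed encoding, and the sha256 hash object are assumptions made for illustration.

```python
import hashlib


def update_for_int(key_hash, key: int) -> None:
    """Feed a Python int of arbitrary size into a hashlib-style object."""
    sz = 8  # assumed starting width in bytes
    while True:
        try:
            # byteorder/signed are assumptions, chosen so negative ints work
            key_hash.update(key.to_bytes(sz, byteorder="little", signed=True))
            return
        except OverflowError:
            # Value does not fit in sz bytes: double the width and retry
            sz *= 2


h = hashlib.sha256()
update_for_int(h, 2**200)  # needs more than 8 bytes, so the loop retries
print(h.hexdigest())
```

As the comment in the hunk notes, numpy scalars have no to_bytes(), so passing one to a loop like this raises AttributeError, which is what the new except branch catches.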


I think a better approach would be to have all numpy scalars go to the same method, in which you convert to a shape-() array and then use tobytes() on that.
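A rough sketch of what that suggestion could look like, not the code that ended up in the PR. The function name `update_for_numpy_scalar` is hypothetical, and mixing the dtype string into the hash is an added assumption so that scalars with identical bytes but different types do not collide.

```python
import hashlib

import numpy as np


def update_for_numpy_scalar(key_hash, key: np.generic) -> None:
    """Hash any numpy scalar through a zero-dimensional array."""
    arr = np.array(key)  # zero-dimensional array, shape == ()
    # Assumption: include the dtype so np.int64(1) and np.uint64(1),
    # which share the same byte pattern, hash differently
    key_hash.update(arr.dtype.str.encode("ascii"))
    key_hash.update(arr.tobytes())


def digest(key) -> str:
    h = hashlib.sha256()
    update_for_numpy_scalar(h, key)
    return h.hexdigest()


print(np.array(np.float32(1.5)).shape)               # ()
print(digest(np.int64(1)) == digest(np.uint64(1)))   # False
```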

Should we add similar functionality for ndarrays as well? Maybe something like (tobytes, shape, dtype)?

No, definitely not!

They're mutable.

With that implementation, it would violate the fundamental property a hash satisfies, i.e. a == b implies hash(a) == hash(b). That's because the array's byte pattern depends on storage order (e.g. F/C ordering), for arrays that Numpy would otherwise consider equal.
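To make both objections concrete, here is a small illustration (not part of the PR). One caveat: tobytes() with default arguments copies into C order first, so the layout dependence shows up with order="A" or when the underlying buffer is hashed directly.

```python
import numpy as np

c_arr = np.arange(6, dtype=np.int64).reshape(2, 3)  # C (row-major) layout
f_arr = np.asfortranarray(c_arr)                    # same values, F layout

# Numpy considers the two arrays equal
print(np.array_equal(c_arr, f_arr))                           # True

# order="A" follows the in-memory layout, so the byte strings differ
print(c_arr.tobytes(order="A") == f_arr.tobytes(order="A"))   # False

# The default order="C" copies into row-major order first
print(c_arr.tobytes() == f_arr.tobytes())                     # True

# Mutability: the same object yields different bytes after an in-place write
before = c_arr.tobytes()
c_arr[0, 0] = 42
print(before == c_arr.tobytes())                              # False
```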

It seems that tobytes() for complex256 (only that dtype) isn't stable:

import numpy as np

for i in range(10):
    k = np.complex256(1.1+2.2j)
    print(k.tobytes())

$ python t.py b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x00\x00\x00\x00\x00\x00\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@\x00\x00\x00\x00\x00\x00' b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x00\x00\x00\x00\x00\x00\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@\x00\x00\x00\x00\x00\x00' b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x00\x00\x00\x00\x00\x00\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@\x00\x00\x00\x00\x00\x00' b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x00\x00\x00\x00\x00\x00\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@\x00\x00\x00\x00\x00\x00' b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x00\x00\x00\x00\x00\x00\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@\x00\x00\x00\x00\x00\x00' b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x00\x00\x00\x00\x00\x00\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@\x00\x00\x00\x00\x00\x00' b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x00\x00\x00\x00\x00\x00\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@\x00\x00\x00\x00\x00\x00' b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x00\x00\x00\x00\x00\x00\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@\x00\x00\x00\x00\x00\x00' b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x86\xc5\xeel\x16\xfe\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@`w\x00\x00\x00\x00' b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x86\xc5\xeel\x16\xfe\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@`w\x00\x00\x00\x00'

(same issue with an array of complex256)

That makes a little bit of sense to me, as (at least on x86), float128 (where complex256 is just two of those) is just the x87 (not a typo) 80-bit FP type. So there are six bytes that aren't accounted for and possibly arbitrary.
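The byte strings printed above are consistent with that explanation. The sketch below takes the first and ninth outputs verbatim and compares them after dropping the last six bytes of each 16-byte half. The layout constants (16 bytes stored, 10 bytes used, data at the start of each half) are assumptions that match little-endian x86 and would not carry over to other platforms.

```python
# Two of the 32-byte strings printed above (first run and ninth run)
clean = (b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x00\x00\x00\x00\x00\x00'
         b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@\x00\x00\x00\x00\x00\x00')
noisy = (b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\xff?\x86\xc5\xeel\x16\xfe'
         b'\x00\xd0\xcc\xcc\xcc\xcc\xcc\x8c\x00@`w\x00\x00\x00\x00')

ITEM = 16  # bytes stored per float128 (assumed x86 layout)
USED = 10  # 80 bits of actual x87 data at the start of each item


def strip_padding(raw: bytes) -> bytes:
    """Keep only the first USED bytes of every ITEM-byte chunk."""
    return b"".join(raw[i:i + USED] for i in range(0, len(raw), ITEM))


print(len(clean), len(noisy))                        # 32 32
print(clean == noisy)                                # False
print(strip_padding(clean) == strip_padding(noisy))  # True
```

Masking the padding like this would tie the hash to one platform's memory layout, which is one reason a conversion-based approach may be preferable.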

Maybe specially handle those? (Cast to float (yes, Python float, both parts for complex), call repr and hash that)?

> That makes a little bit of sense to me, as (at least on x86), float128 (where complex256 is just two of those) is just the x87 (not a typo) 80-bit FP type. So there are six bytes that aren't accounted for and possibly arbitrary.

Oh 😞 In other words, this might also affect float128 itself?

> Maybe specially handle those? (Cast to float (yes, Python float, both parts for complex), call repr and hash that)?

If you prefer that idea to f1b766c, I can implement that.

I implemented your suggestion in aeb824a (using Python's complex() for converting np.complex256 because that seemed a bit simpler).
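For illustration, the conversion approach could look roughly like the sketch below. This is not the code from aeb824a; the function names are hypothetical, and plain Python values stand in for np.float128 / np.complex256 so the snippet runs without numpy.

```python
import hashlib


def update_for_extended_complex(key_hash, key) -> None:
    # complex() converts to a Python complex (two C doubles), so padding
    # bytes and any precision beyond a double never reach the hash
    key_hash.update(repr(complex(key)).encode("utf8"))


def update_for_extended_float(key_hash, key) -> None:
    # Same idea for the real-valued extended type
    key_hash.update(repr(float(key)).encode("utf8"))


h = hashlib.sha256()
update_for_extended_complex(h, 1.1 + 2.2j)  # stand-in for np.complex256(1.1+2.2j)
print(repr(complex(1.1 + 2.2j)))            # (1.1+2.2j)
print(h.hexdigest())
```

The trade-off is that two extended-precision values that round to the same double produce the same hash input.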

inducer · 2023-10-11T14:48:47Z

Thanks!

persistent_dict: add complex hashing, numpy scalar hashing

ee1691c

matthiasdiener force-pushed the persistent-dict-scalar-hashing branch from ffa92cf to ee1691c Compare September 25, 2023 21:39

matthiasdiener added 3 commits September 25, 2023 16:49

simplify

326a3b5

fix

64f2239

better update_for_complex

5964b1c

matthiasdiener mentioned this pull request Sep 25, 2023

add PytatoKeyBuilder, persistent_dict test inducer/pytato#459

Merged

inducer reviewed Sep 26, 2023

convert numpy scalars to 1D array

c443e89

inducer reviewed Sep 26, 2023

matthiasdiener added 7 commits September 26, 2023 15:15

use np.array instead

1d710ee

tobytes() is not stable, try str+dtype

f1b766c

fix number detection

078c75a

Revert "tobytes() is not stable, try str+dtype"

af5cb35

This reverts commit f1b766c.

Revert "fix number detection"

fc4350a

This reverts commit 078c75a.

convert large float types to python types

aeb824a

Revert "Revert "fix number detection""

c4906b1

This reverts commit fc4350a.

matthiasdiener requested a review from inducer September 28, 2023 17:31

Merge branch 'main' into persistent-dict-scalar-hashing

a255a1a

inducer merged commit 47dfe4f into inducer:main Oct 11, 2023
12 checks passed

matthiasdiener deleted the persistent-dict-scalar-hashing branch October 11, 2023 14:49
