fix: Only consider salient bytes in sharecommon eq, hash #5840
Change "lean_sharecommon_{eq,hash}" to only consider the salient bytes of an object, and not any bytes of unspecified/uninitialized unused capacity. Accessing uninitialized storage results in undefined behaviour. This does not seem to have any semantic disadvantages: if objects compare equal after this change, their salient bytes are still equal. By contrast, if the actual identity of allocations needs to be distinguished, that can be done by just comparing pointers to the storage. If we wanted to retain the current logic, we would need to initialize the otherwise unused parts to some specific value to avoid the undefined behaviour.
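To illustrate the idea outside the Lean runtime, here is a minimal sketch in C. The `toy_buf` type and its helper functions are hypothetical and are not part of the Lean code base; they only model an object whose allocation is larger than its initialized contents. Equality and hashing read exactly `size` bytes, so the uninitialized tail is never touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a runtime object with spare capacity. */
typedef struct {
    size_t size;          /* number of initialized (salient) bytes */
    size_t capacity;      /* number of allocated bytes, capacity >= size */
    unsigned char *data;  /* bytes [size, capacity) are left uninitialized */
} toy_buf;

/* Allocates `capacity` bytes but initializes only the first `size`. */
static toy_buf toy_buf_make(const void *src, size_t size, size_t capacity) {
    assert(size <= capacity);
    toy_buf b;
    b.size = size;
    b.capacity = capacity;
    b.data = (unsigned char *)malloc(capacity ? capacity : 1);
    assert(b.data != NULL);
    memcpy(b.data, src, size);
    return b;
}

/* Equality over salient bytes only; the unused tail is never read. */
static bool toy_buf_eq(const toy_buf *a, const toy_buf *b) {
    if (a->size != b->size) return false;
    return memcmp(a->data, b->data, a->size) == 0;
}

/* FNV-1a over salient bytes only, so equal values hash equally. */
static uint64_t toy_buf_hash(const toy_buf *b) {
    uint64_t h = 14695981039346656037ULL;  /* FNV-1a 64-bit offset basis */
    for (size_t i = 0; i < b->size; i++) {
        h ^= b->data[i];
        h *= 1099511628211ULL;             /* FNV-1a 64-bit prime */
    }
    return h;
}
```

With these definitions, two buffers holding the same three bytes compare equal and hash equal even if one was allocated with capacity 3 and the other with capacity 64.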
-    size_t sz1 = lean_object_byte_size(o1);
-    size_t sz2 = lean_object_byte_size(o2);
+    size_t sz1 = lean_object_data_size(o1);
+    size_t sz2 = lean_object_data_size(o2);
     if (sz1 != sz2) return false;
It's possible that this should reject objects as unequal if they have the same 'value' but differing capacities.
That used to be the case, e.g. on account of L18, but is now no longer so. As long as the "value" is the same, we now compare equal regardless of capacity (and also hash equal).
Right, I was pointing this out for other reviewers; perhaps we should be more conservative and only fix the undefined behavior without changing the behavior for objects with differing numbers of unused bytes.
So you'd consider the total size (but not contents) as salient, too? We could do that. It'd be good to then also include that size in the hash.
I'm not sure if this is overall desirable, though. I.e. why would you expect equal objects to have the same capacity? Wouldn't the capacity be checked locally by any operation that grows the contents, and not something you'd assume upfront?
That is, yes, we could make this change, but I'd hope that nothing would depend on that.
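For comparison, the more conservative alternative discussed above (treating the total size, but not the contents of the unused part, as salient) could look roughly like the following sketch. The `cap_buf` type and both functions are hypothetical names used only for illustration; they are not the Lean runtime's API. The capacity is compared in the equality check and mixed into the hash, while only the first `size` bytes of the contents are ever read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical object model: `size` initialized bytes out of `capacity`. */
typedef struct {
    size_t size;
    size_t capacity;
    const unsigned char *data;
} cap_buf;

/* Strict equality: same size, same capacity, same salient bytes. */
static bool cap_buf_eq_strict(const cap_buf *a, const cap_buf *b) {
    if (a->size != b->size || a->capacity != b->capacity) return false;
    return memcmp(a->data, b->data, a->size) == 0;
}

/* FNV-1a over the capacity value followed by the salient bytes. */
static uint64_t cap_buf_hash_strict(const cap_buf *b) {
    uint64_t h = 14695981039346656037ULL;  /* FNV-1a 64-bit offset basis */
    uint64_t cap = (uint64_t)b->capacity;
    for (int i = 0; i < 8; i++) {          /* mix in the capacity first */
        h ^= (cap >> (8 * i)) & 0xffu;
        h *= 1099511628211ULL;             /* FNV-1a 64-bit prime */
    }
    for (size_t i = 0; i < b->size; i++) { /* then the salient bytes only */
        h ^= b->data[i];
        h *= 1099511628211ULL;
    }
    return h;
}
```

Under this variant, objects with the same contents but different capacities are no longer shared, which matches the previous behaviour without reading uninitialized storage.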
Closes #5831