Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Only consider salient bytes in sharecommon eq, hash #5840

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

tkoeppe
Copy link
Contributor

@tkoeppe tkoeppe commented Oct 25, 2024

Change lean_sharecommon_{eq,hash} to only consider the salient bytes of an object, and not any bytes of any unspecified/uninitialized unused capacity.

Accessing uninitialized storage results in undefined behaviour.

This does not seem to have any semantics disadvantages: If objects compare equal after this change, their salient bytes are still equal. By contrast, if the actual identity of allocations needs to be distinguished, that can be done by just comparing pointers to the storage.

If we wanted to retain the current logic, we would need initialize the otherwise unused parts to some specific value to avoid the undefined behaviour.

Closes #5831

Change "lean_sharecommon_{eq,hash}" to only consider the salient bytes
of an object, and not any bytes of any unspecified/uninitialized unused
capacity.

Accessing uninitialized storage results in undefined behaviour.

This does not seem to have any semantics disadvantages: If objects
compare equal after this change, their salient bytes are still
equal. By contrast, if the actual identity of allocations needs to be
distinguished, that can be done by just comparing pointers to the
storage.

If we wanted to retain the current logic, we would need initialize the
otherwise unused parts to some specific value to avoid the undefined
behaviour.
Comment on lines -16 to 18
size_t sz1 = lean_object_byte_size(o1);
size_t sz2 = lean_object_byte_size(o2);
size_t sz1 = lean_object_data_size(o1);
size_t sz2 = lean_object_data_size(o2);
if (sz1 != sz2) return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible that this should reject objects as unequal if they have the same 'value' but differing capacities.

Copy link
Contributor Author

@tkoeppe tkoeppe Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That used to be the case, e.g. on account of L18, but is now no longer so. As long as the "value" is the same, we now compare equal regardless of capacity (and also hash equal).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I was pointing this out for other reviewers; perhaps we should be more conservative and only fix the undefined behavior without changing the behavior for objects with differing numbers of unused bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you'd consider the total size (but not contents) as salient, too? We could do that. It'd be good to then also include that size in the hash.

I'm not sure if this is overall desirable, though. I.e. why would you expect equal objects to have the same capacity? Wouldn't the capacity be checked locally by any operation that grows the contents, and not something you'd assume upfront?

That is, yes, we could make this change, but I'd hope that nothing would depend on that.

@tkoeppe tkoeppe changed the title [sharecommon] Only consider salient bytes in eq, hash fix: Only consider salient bytes in sharecommon eq, hash Oct 25, 2024
@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Oct 25, 2024
@leanprover-community-bot
Copy link
Collaborator

Mathlib CI status (docs):

  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 4c0d12b3f1e8fdb4002eb01f8f5ff9be9ec4b25b --onto 193b6f2bec332ac0bce33e10856a96163d4be456. (2024-10-25 16:07:27)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ShareCommon.Object.hash assigns mismatching hashes to equal objects
3 participants