Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

findLastChangeRevision() returns the wrong result in a multi-node env #3643

Closed
wacker opened this issue Oct 16, 2024 · 5 comments
Closed

findLastChangeRevision() returns the wrong result in a multi-node env #3643

wacker opened this issue Oct 16, 2024 · 5 comments
Assignees
Labels
type: regression A regression from a previous release

Comments

@wacker
Copy link

wacker commented Oct 16, 2024

There is a regression in v3.3.4 introduced by #3579, which leads to the wrong revision returned by a call to findLastChangeRevision().

The assumption of strictly increasing revision numbers over time holds true for a single instance but not in a multi-node environment, as Hibernate sequence numbers are allocated per node in batches of 50.

Cases where the latest revision actually has the highest revision number become increasingly rare as the number of nodes increases. To illustrate this, here are real-world example contents of a revinfo table:

image

In this light, whether two transactions are from two differing nodes of your application or from two differing sessions within the same application instance is in fact not as irrelevant as the forum answer (referred to in the other issue thread) suggests in its conclusion.

The 'order by' inside the implementation should probably be composed of timestamp AND rev in order to handle the issue of exactly the same timestamps.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Oct 16, 2024
@mp911de
Copy link
Member

mp911de commented Oct 17, 2024

Thanks for reaching out. If neither timestamp nor the revision number (when batched) are good candidates to get the latest change, what is then the solution?

In the same sense as timestamps might deviate across multiple nodes, neither is ideal. It seems that the previous variant has been less of an issue, still with a batched revision number, we cannot guarantee that the first item sorted by timestamp and number is the most recent one.

@mp911de mp911de added the status: waiting-for-feedback We need additional information before we can continue label Oct 17, 2024
@quaff
Copy link
Contributor

quaff commented Oct 18, 2024

as Hibernate sequence numbers are allocated per node in batches of 50.

Have you tried to disable pre-allocating sequence numbers?

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Oct 18, 2024
@wacker
Copy link
Author

wacker commented Oct 18, 2024

If neither timestamp nor the revision number (when batched) are good candidates to get the latest change, what is then the solution?

To undo the change.

Previously, the result of findLastChangeRevision() was basically correct, except for the corner case where two nodes make changes at exactly the same time and the accuracy of the timestamps is not sufficient to distinguish the changes correctly. As long as the workload is well partitioned between the nodes and one entity is usually processed by the same node, this corner case is unlikely to occur.

Currently, findLastChangeRevision() only returns the correct change revision if no node that made an earlier change did have a more recent revision sequence number batch allocated at the time, which is a gamble. In a distributed environment, the return value is therefore essentially useless unless pre-allocation/pooling of the revision sequence numbers is disabled, which has performance implications.

@mp911de mp911de added type: regression A regression from a previous release and removed status: waiting-for-triage An issue we've not yet triaged status: feedback-provided Feedback has been provided labels Oct 21, 2024
schauder added a commit that referenced this issue Oct 21, 2024
Sorting by revision number alone failed for distributed systems using batched sequences for revision numbers.

Closes #3643
See #3579
schauder added a commit that referenced this issue Oct 21, 2024
Sorting by revision number alone failed for distributed systems using batched sequences for revision numbers.

Closes #3643
See #3579
@schauder
Copy link
Contributor

The assumption of strictly increasing revision numbers over time holds true for a single instance but not in a multi-node environment, as Hibernate sequence numbers are allocated per node in batches of 50.

It's not really an assumption, but something that probably should be a requirement and might actually be intended as a requirement, and one might argue that use of a batched sequence is a mistake on your/Hibernates side.

As long as the workload is well partitioned between the nodes and one entity is usually processed by the same node, this corner case is unlikely to occur.

Here you are kinda arguing against yourself. if and entity is processed by the same node, it should work with the same batch of ids and not have a problem.
But since you obviously have a problem, this does not hold.
This isn't surprising since there really isn't a reason why an entity should be processed by the same node.
Even when one uses some partitioning, e.g. by users, they will still often access entity that cross these borders.
For examples users might place orders which then affect stock levels of products.

An additional potential complication is that clocks on different nodes aren't synchronized. But AFAIK nobody has complained about that so far, so I guess it's not an issue for most users.

All that said: We decided to go with sorting by timestamp first and revision second. Given that sorting by timestamp worked for a long time without complains, I'm somewhat confident that this solution is good enough

@schauder schauder added this to the 3.2.12 (2023.1.12) milestone Oct 21, 2024
@wacker
Copy link
Author

wacker commented Oct 21, 2024

Thanks for taking care of the issue. To be honest, I was surprised when I discovered non-continuous revision numbers with Envers and only then learned about pooling sequence numbers in Hibernate.

It's not really an assumption, but something that probably should be a requirement and might actually be intended as a requirement, and one might argue that use of a batched sequence is a mistake on your/Hibernates side.

I don't see how this 'mistake' could be avoided as the mapping of the revinfo table is not explicitly defined or configurable(?) Sure, hibernate.id.optimizer.pooled.preferred=none would probably help, but this affects the whole application and not just Envers.

Anyway, IMO it should be mentioned in the Envers docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: regression A regression from a previous release
Projects
None yet
Development

No branches or pull requests

5 participants