-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
findLastChangeRevision() returns the wrong result in a multi-node env #3643
Comments
Thanks for reaching out. If neither timestamp nor the revision number (when batched) are good candidates to get the latest change, what is then the solution? In the same sense as timestamps might deviate across multiple nodes, neither is ideal. It seems that the previous variant has been less of an issue, still with a batched revision number, we cannot guarantee that the first item sorted by timestamp and number is the most recent one. |
Have you tried to disable pre-allocating sequence numbers? |
To undo the change. Previously, the result of findLastChangeRevision() was basically correct, except for the corner case where two nodes make changes at exactly the same time and the accuracy of the timestamps is not sufficient to distinguish the changes correctly. As long as the workload is well partitioned between the nodes and one entity is usually processed by the same node, this corner case is unlikely to occur. Currently, findLastChangeRevision() only returns the correct change revision if no node that made an earlier change did have a more recent revision sequence number batch allocated at the time, which is a gamble. In a distributed environment, the return value is therefore essentially useless unless pre-allocation/pooling of the revision sequence numbers is disabled, which has performance implications. |
It's not really an assumption, but something that probably should be a requirement and might actually be intended as a requirement, and one might argue that use of a batched sequence is a mistake on your/Hibernates side.
Here you are kinda arguing against yourself. if and entity is processed by the same node, it should work with the same batch of ids and not have a problem. An additional potential complication is that clocks on different nodes aren't synchronized. But AFAIK nobody has complained about that so far, so I guess it's not an issue for most users. All that said: We decided to go with sorting by timestamp first and revision second. Given that sorting by timestamp worked for a long time without complains, I'm somewhat confident that this solution is good enough |
Thanks for taking care of the issue. To be honest, I was surprised when I discovered non-continuous revision numbers with Envers and only then learned about pooling sequence numbers in Hibernate.
I don't see how this 'mistake' could be avoided as the mapping of the revinfo table is not explicitly defined or configurable(?) Sure, Anyway, IMO it should be mentioned in the Envers docs. |
There is a regression in v3.3.4 introduced by #3579, which leads to the wrong revision returned by a call to
findLastChangeRevision()
.The assumption of strictly increasing revision numbers over time holds true for a single instance but not in a multi-node environment, as Hibernate sequence numbers are allocated per node in batches of 50.
Cases where the latest revision actually has the highest revision number become increasingly rare as the number of nodes increases. To illustrate this, here are real-world example contents of a revinfo table:
In this light, whether two transactions are from two differing nodes of your application or from two differing sessions within the same application instance is in fact not as irrelevant as the forum answer (referred to in the other issue thread) suggests in its conclusion.
The 'order by' inside the implementation should probably be composed of timestamp AND rev in order to handle the issue of exactly the same timestamps.
The text was updated successfully, but these errors were encountered: