fix: failing test test_restore_by_datetime #2000
Signed-off-by: Nikolay Ulmasov <[email protected]>
It wasn't meant to pass the CI 🤦 as I'm still trying to trace, but I think I know where the issue is: during the version search a blank table is initiated which has no timestamp stored and
@ion-elgreco are you able to start the workflows here?
There is good in every bad: different tests are failing now. Of course I didn't run them locally 🤦
Hmm, this was an unexpected overflow failure, but it is only a problem with the test itself, not the library. It looks like utime::set_file_times expects timestamps in seconds whereas our timestamps are in milliseconds (for some reason it only broke on Windows).
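To illustrate the unit mismatch described above, here is a minimal sketch of converting an epoch-millisecond timestamp to whole seconds before handing it to an API that expects seconds. The helper name `millis_to_secs` is an assumption for illustration and is not part of this PR; the actual fix to the test may look different.

```rust
/// Convert a Unix timestamp in milliseconds to whole seconds.
/// `div_euclid` floors toward negative infinity, so pre-epoch values
/// round consistently instead of toward zero.
/// Hypothetical helper for illustration; not taken from the PR.
fn millis_to_secs(ts_millis: i64) -> i64 {
    ts_millis.div_euclid(1000)
}

/// Example use: derive a seconds value that a seconds-based file-time
/// setter could accept from a millisecond commit timestamp.
fn example() -> i64 {
    let commit_ts_millis: i64 = 1_700_000_000_123;
    millis_to_secs(commit_ts_millis)
}
```

Passing the millisecond value unchanged makes the number roughly a thousand times larger than intended, which is consistent with the overflow seen in the test.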
.await?;
let ts = meta.last_modified.timestamp_millis();
// Load the version specified
if self.version() != version {
Instead of loading the version each time, we could also just do this
self.state.commit_infos().get(version).unwrap().timestamp
@ion-elgreco Setting aside the concerns I have shared with reliance on commitInfo, I believe that invocation is not guaranteed to return the right versioned information if the version has not been loaded already.
Just thinking aloud here, but in general you shouldn't be able to restore to a version of a table that's above the currently loaded table version.
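The guard suggested in the comment above could be sketched roughly as follows. The function name, the plain `i64` version numbers, and the `String` error type are all assumptions for illustration and do not reflect the library's actual restore code.

```rust
/// Validate a restore target against the currently loaded version.
/// Hypothetical helper: rejects negative targets and targets that are
/// newer than the version the table handle currently has loaded.
fn check_restore_target(loaded_version: i64, target_version: i64) -> Result<i64, String> {
    if target_version < 0 {
        return Err(format!("invalid target version {target_version}"));
    }
    if target_version > loaded_version {
        return Err(format!(
            "cannot restore to version {target_version}: loaded table is at version {loaded_version}"
        ));
    }
    Ok(target_version)
}
```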
let timestamp: Option<i64> = if !self.state.commit_infos().is_empty() {
self.state.commit_infos().last().unwrap().timestamp
This is a concept I've discussed with @ion-elgreco in Slack a bit. The value of the commitInfo is not governed by the protocol and therefore any timestamp field should not be relied upon for anything.
Since the protocol doesn't dictate the format here, this could be epoch, or any other random timestamp, and we have no guarantees other than Delta/Spark's convention that it will actually represent the timestamp of the last written version.
All that said, I like that this is a sort of guarded optimization, I'm really not sure how to make this much safer though 🤔
Maybe we just check the clientVersion and only take the timestamp from engines that we know write the timestamp in the same format.
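A rough sketch of the guarded approach discussed in this thread: take the commitInfo timestamp only when the writing engine is on an allowlist, and otherwise fall back to the log file's last-modified time. `CommitInfoLite`, the allowlist contents, and the function names are hypothetical and are not the library's types; the prefix `example-engine/` is a placeholder, not a real clientVersion value.

```rust
/// Hypothetical, simplified view of a commitInfo action.
#[derive(Debug, Clone, Default)]
struct CommitInfoLite {
    /// Timestamp as written by the engine; format is not protocol-governed.
    timestamp: Option<i64>,
    /// Engine identifier as written by the engine, if any.
    client_version: Option<String>,
}

/// Placeholder allowlist of clientVersion prefixes whose `timestamp`
/// is assumed to be epoch milliseconds.
const TRUSTED_ENGINE_PREFIXES: &[&str] = &["example-engine/"];

/// True only when a clientVersion is present and matches the allowlist.
fn is_trusted_engine(client_version: Option<&str>) -> bool {
    client_version
        .map(|cv| TRUSTED_ENGINE_PREFIXES.iter().any(|p| cv.starts_with(p)))
        .unwrap_or(false)
}

/// Prefer the commitInfo timestamp from a trusted engine; otherwise
/// fall back to the log file's last-modified time in milliseconds.
fn version_timestamp(commit_info: Option<&CommitInfoLite>, last_modified_millis: i64) -> i64 {
    commit_info
        .filter(|ci| is_trusted_engine(ci.client_version.as_deref()))
        .and_then(|ci| ci.timestamp)
        .unwrap_or(last_modified_millis)
}
```

The fallback keeps the previous behaviour for tables written by unknown engines, so the optimization only applies where the timestamp format is known.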
let ts = meta.last_modified.timestamp_millis();
// Load the version specified
if self.version() != version {
self.load_version(version).await?;
@r3stl355 I am not as familiar with the callers of get_version_timestamp, but this call will actually reset the self table state to the version specified. That seems like it would be a concerning side effect.
I'm honestly not sure if load_version is more/less work than just creating a new DeltaTable at that specific version, and then returning newtable.get_version_timestamp(version) to preserve self here 🤔
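The side effect described above can be shown with a toy model. `ToyTable` below is a hypothetical stand-in, not the real DeltaTable, and its synchronous methods only mimic the shape of the calls discussed here. The lookup loads the requested version on a scratch copy so the caller's handle stays at its current version.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a table handle (hypothetical; not the real DeltaTable).
#[derive(Debug, Clone)]
struct ToyTable {
    /// Version currently loaded into this handle.
    loaded_version: i64,
    /// Hypothetical log: version -> commit timestamp in epoch milliseconds.
    log: BTreeMap<i64, i64>,
}

impl ToyTable {
    /// Mutating load: moves this handle to `version`.
    fn load_version(&mut self, version: i64) -> Result<(), String> {
        if !self.log.contains_key(&version) {
            return Err(format!("version {version} not found"));
        }
        self.loaded_version = version;
        Ok(())
    }

    /// Timestamp lookup that leaves `self` untouched by loading the
    /// requested version on a scratch copy instead.
    fn get_version_timestamp(&self, version: i64) -> Result<i64, String> {
        let mut scratch = self.clone();
        scratch.load_version(version)?;
        Ok(scratch.log[&scratch.loaded_version])
    }
}
```

Whether cloning or constructing a fresh table is cheaper than `load_version` in the real library is exactly the open question raised in the comment; the sketch only shows how the caller's state can be preserved.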
Closing this because a simpler fix for the test only was implemented in #2010
Description
This started as a fix for an issue that occurred only on the Mac CI. It turns out it is not Mac-only and could also happen on other systems if a commit takes more than a millisecond. The fix changes the following:
Previously, get_version_timestamp would get the timestamp from the last_modified property of the log entry file.
Now, get_version_timestamp will first attempt to get the timestamp from CommitInfo and will revert to the previous logic of getting the timestamp from the file's last_modified if CommitInfo does not contain a timestamp. This will also help restore the correct version for tables which get copied to a different location (e.g. in case of DR).
Related Issue(s)
#1925