-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Kernel][Metrics][PR#2] Add SnapshotReport for reporting snapshot construction #3903
base: master
Are you sure you want to change the base?
Conversation
e1dbe30
to
ceb5a56
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Reviewed the production code, not the tests yet
kernel/kernel-api/src/main/java/io/delta/kernel/internal/metrics/SnapshotContext.java
Outdated
Show resolved
Hide resolved
kernel/kernel-api/src/main/java/io/delta/kernel/internal/metrics/SnapshotContext.java
Outdated
Show resolved
Hide resolved
kernel/kernel-api/src/main/java/io/delta/kernel/internal/metrics/SnapshotContext.java
Outdated
Show resolved
Hide resolved
@@ -105,8 +107,11 @@ public Transaction build(Engine engine) { | |||
// Table doesn't exist yet. Create an initial snapshot with the new schema. | |||
Metadata metadata = getInitialMetadata(); | |||
Protocol protocol = getInitialProtocol(); | |||
LogReplay logReplay = getEmptyLogReplay(engine, metadata, protocol); | |||
snapshot = new InitialSnapshot(table.getDataPath(), logReplay, metadata, protocol); | |||
SnapshotContext snapshotContext = SnapshotContext.forVersionSnapshot(tablePath, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the semantics of the SnapshotContext version
? i.e. is it the read version? the attempted write version? Is there a differnce or does it matter?
For example, with this code that uses forVersionSnapshot(..., 0)
... we get a snapshotContext that is identical to the case where a table has 3 commits (0, 1, 2) and you time travel to read the table at version 0.
Curious what you think about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this, my mistake this should've been -1 (which matches the version returned by the log segment & log replay).
In general it is the intended read version. For this specific case, we treat a non-existent table as read version -1. Note, we don't generate a snapshot report for this case since there is no real snapshot to load!
kernel/kernel-api/src/main/java/io/delta/kernel/internal/metrics/SnapshotReportImpl.java
Outdated
Show resolved
Hide resolved
kernel/kernel-api/src/main/java/io/delta/kernel/internal/metrics/SnapshotReportImpl.java
Outdated
Show resolved
Hide resolved
kernel/kernel-api/src/main/java/io/delta/kernel/internal/snapshot/SnapshotManager.java
Show resolved
Hide resolved
snapshotContext); | ||
|
||
// Push snapshot report to engine | ||
engine.getMetricsReporters().forEach(reporter -> reporter.report(snapshot.getSnapshotReport())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: in general, when should we be pushing to the engine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as soon as the given operation is completed. In this case, as soon as a snapshot is successfully constructed. I viewed this to essentially mean as soon as the metadata/protocol is loaded.
kernel/kernel-api/src/main/java/io/delta/kernel/metrics/SnapshotMetricsResult.java
Outdated
Show resolved
Hide resolved
ceb5a56
to
db5c16c
Compare
db5c16c
to
8e1f651
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just posting what we discussed in a sync! Looks great!
-
Would be great to use factory methods to create
SnapshotReportImpl
s. One for success/normal cases, one for error case. This should take in theSnapshotQueryContext
. -
Ensure you have the same exact method / value names between SnapshotMetrics and SnapshotMetricsResult. Also, add docs to both that explain the relationship. i.e. SnapshotMetrics creates SnapshotMetricsResult; SnapshotMetricsResult is created by SnapshotMetrics
path, | ||
expectException = false, | ||
expectedVersion = Optional.of(0), | ||
expectedProvidedTimestamp = Optional.empty(), // No time travel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't we time travel to version 0? Would // No time travel by timestamp
be a more accurate comment? And I think this would apply to all the other such instances of this comment, right?
} | ||
} | ||
|
||
test("SnapshotReport valid queries") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also test: try and time travel by a bad version (e.g. version 55) or by a bad timestamp (e.g. the year 1985) but on a valid table?
e.g. you could rename this test to SnapshotReport queries on valid table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So originally I did not add these tests because my main goal with the exception cases was to cover the 3 possible places snapshot construction could fail (outlined here in design doc) instead of trying to cover all possible errors/exceptions during snapshot construction.
- Error resolving timestamp --> version (only applicable for time-travel by timestamp)
- Error building the log segment
- Error constructing the snapshot (loading table protocol and metadata)
But I'm happy to add these two cases anyways. I can just add an additional test for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So originally I did not add these tests because my main goal with the exception cases was to cover the 3 possible places snapshot construction could fail (outlined here in design doc) instead of trying to cover all possible errors/exceptions during snapshot construction.
Sorry, what's unclear to me is if you didn't plan to include those tests in this PR or if there's no plan to write them in the near future
class MetricsReportSuite extends AnyFunSuite with TestUtils { | ||
|
||
/////////////////////////// | ||
// SnapshotReport tests // |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How many different types of reports do you anticipate we will have? If test + helper code for just SnapshotReport is ~300 LOC, I wonder if we should split this up into different files?
e.g. SnapshotMetricsReportSuite extends MetricsReportTestUtils
Which Delta project/connector is this regarding?
Description
Adds a
SnapshotReport
for reporting snapshot construction.We record a
SnapshotReport
after successfully constructing a snapshot or if an exception is thrown during construction. We useSnapshotContext
to propagate and update information about the snapshot construction. For example, we update the "version" as soon as it's resolved for time-travel by timestamp or load latest snapshot queries. This means in the case of the exception we include as much information as is available.How was this patch tested?
Adds a test suite
MetricsReportSuite
with unit tests.Does this PR introduce any user-facing changes?
No.