[Kernel][Metrics][PR#3] Metrics report JSON serializer and LoggingMetricsReporter for the default engine #3904
Looks great! 2 minor comments
LGTM with two minor comments!
 *
 * @throws JsonProcessingException
 */
public static String serializeSnapshotReport(SnapshotReport snapshotReport)
Seems like we would have to do this for every type of report, right?
Any way to avoid that? (Since they would all just call OBJECT_MAPPER.writeValueAsString(inputVariable).)
Could we just take in a DeltaOperationReport?
Yeah, I wasn't sure about this because I also think there's a use case where you are only serializing a specific report type. Maybe we can have both serializeSnapshotReport and serializeDeltaOperationReport. But then I need to make DeltaOperationReport serializable (which I currently did not). What do you think?
Every report we create will implement DeltaOperationReport, right?
And by default we will log every report we create with our default engine, right?
Then it seems fair that DeltaOperationReport be serializable 👍
Hmm I have some thoughts that I haven't fully fleshed out.
Every operation-type report we create will implement DeltaOperationReport, but possibly not every report. But yes, we probably plan to log all report types in our default engine (I can't think of a reason not to currently). But that would argue that we should make MetricsReport serializable...
If we add a new report type, e.g. XXReport extends DeltaOperationReport, but don't make it serializable, it will still be serialized but will be missing the additional information in XXReport.
Do we ever expect a report that only extends DeltaOperationReport? Maybe we just want to report that an operation occurred, or success vs. failure?
Need to verify that if both DeltaOperationReport and SnapshotReport are serializable, Jackson will use the lowest ancestor.
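To make the concern about a missing XXReport concrete, here is a minimal, dependency-free sketch. It does not use Jackson and makes no claim about which annotated ancestor Jackson actually picks; it only illustrates what "serialized as the base type" would lose. The interface members, `XXReportImpl`, and `ViewSerializer` are all hypothetical names introduced for illustration.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins: the real Kernel interfaces expose different members.
interface DeltaOperationReport {
    String getOperationType();
}

// "XXReport" is the placeholder name used in the discussion above.
interface XXReport extends DeltaOperationReport {
    long getDurationNs(); // hypothetical extra field that only the subtype knows about
}

final class XXReportImpl implements XXReport {
    @Override public String getOperationType() { return "Snapshot"; }
    @Override public long getDurationNs() { return 42L; }
}

final class ViewSerializer {
    private ViewSerializer() {}

    /** Renders only the zero-argument getters visible on {@code view} as a flat JSON object. */
    static String serializeAs(Class<?> view, Object report) {
        Map<String, Object> fields = new TreeMap<>(); // sorted keys keep the output deterministic
        for (Method m : view.getMethods()) {
            String n = m.getName();
            if (n.length() > 3 && n.startsWith("get") && m.getParameterCount() == 0
                    && m.getDeclaringClass() != Object.class) {
                try {
                    fields.put(Character.toLowerCase(n.charAt(3)) + n.substring(4), m.invoke(report));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Could not read " + n, e);
                }
            }
        }
        StringBuilder json = new StringBuilder("{");
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (json.length() > 1) {
                json.append(',');
            }
            Object v = e.getValue();
            boolean bare = v instanceof Number || v instanceof Boolean;
            // No string escaping: enough for this sketch, not for real JSON.
            json.append('"').append(e.getKey()).append("\":")
                .append(bare ? String.valueOf(v) : "\"" + v + "\"");
        }
        return json.append('}').toString();
    }
}

final class ViewDemo {
    public static void main(String[] args) {
        XXReport report = new XXReportImpl();
        // Base-type view: the subtype's durationNs is silently dropped.
        System.out.println(ViewSerializer.serializeAs(DeltaOperationReport.class, report));
        // Subtype view: both fields are present.
        System.out.println(ViewSerializer.serializeAs(XXReport.class, report));
    }
}
```

With the base-type view the output contains only `operationType`; with the `XXReport` view it also contains `durationNs`. Whether Jackson behaves the same way when both interfaces are marked serializable is exactly the point that still needs verifying.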
I wonder if this is something we can flesh out later? These are all internal APIs and for now, I think, making sure the main report types are serializable is enough.
I wonder if this is something we can flesh out later?
Yup! SGTM!
Which Delta project/connector is this regarding?
Kernel
Description
This PR is based off of #3903.
See the diff for just this PR here: https://github.com/delta-io/delta/pull/3904/files/aec95cf3dc0086c37f4c45e2b3e192b7b881768c..678ac473f4de65a8f7fd770696aad2d31a15aef7
Adds a JSON serializer for metrics reports with serialization logic for SnapshotReport. Also adds a LoggingMetricsReporter to the default implementation, which simply logs the JSON-serialized reports using Log4J.
How was this patch tested?
Adds a test suite.
Does this PR introduce any user-facing changes?
No.
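As a rough illustration of the reporter described above, the sketch below serializes each report to JSON and logs the result. It is not the code from this PR: the `MetricsReport` and `MetricsReporter` signatures are assumed, `LoggingReporterSketch` is a hypothetical name, the serializer is passed in as a plain function, and `java.util.logging` stands in for Log4J so the example has no dependencies. Swallowing serialization failures is likewise an assumption about the desired behavior.

```java
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-ins for the Kernel types; signatures are assumed, not copied from the PR.
interface MetricsReport {}

interface MetricsReporter {
    void report(MetricsReport report);
}

final class LoggingReporterSketch implements MetricsReporter {
    // java.util.logging keeps the sketch dependency-free; the PR itself uses Log4J.
    static final Logger LOG = Logger.getLogger(LoggingReporterSketch.class.getName());

    // Stand-in for the JSON serializer added in this PR.
    private final Function<MetricsReport, String> toJson;

    LoggingReporterSketch(Function<MetricsReport, String> toJson) {
        this.toJson = toJson;
    }

    @Override
    public void report(MetricsReport report) {
        try {
            LOG.info(toJson.apply(report));
        } catch (RuntimeException e) {
            // Assumption: a reporting failure should not fail the operation that produced the report.
            LOG.log(Level.WARNING, "Could not serialize metrics report", e);
        }
    }
}

final class ReporterDemo {
    public static void main(String[] args) {
        MetricsReporter reporter = new LoggingReporterSketch(r -> "{\"reportType\":\"demo\"}");
        reporter.report(new MetricsReport() {});
    }
}
```

Injecting the serializer as a function keeps the reporter independent of any particular report type, which fits the open question in the review about which report interfaces should be serializable.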