Support snapshots configurations in stats collector, fix bugs where snapshot retention didn't support past DAY format, fix false positive edge case in snapshot event #291
very clean refactor Will, thanks
left a couple of questions regarding the old code
excellent, thanks for adding secondOldestSnapshot, which will solve frequent false positives when monitoring the availability of the Snapshot Expiration job.
Summary
We want to emit events for user tables that have a history configuration, so that monitoring systems can accurately detect when snapshot expiration fails for configured tables.
This PR also refactors the parsing of policy JSONs to make it more robust and easier to manage as more policies are added.
This PR also fixes a bug where not all date granularities were supported in retention: TimeUnit tops out at DAYS, whereas the snapshot policy can technically be kept at the coarser granularities of MONTH and YEAR, although we currently still limit it to 3 DAYS.
We also add the secondOldestSnapshot field to IcebergTableStats to handle an edge case for tables with very infrequent writes: snapshots can fall out of the retention policy without having been cleaned up yet. If the secondOldestSnapshot is still recent, we should avoid being alerted.
Changes
For all the boxes checked, please include additional details of the changes made in this pull request.
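To illustrate the two fixes from the summary, here is a minimal sketch and not the code in this PR. The class name, the Granularity enum, and both method names are hypothetical. It assumes that TimeUnit refers to java.util.concurrent.TimeUnit (whose largest unit is DAYS) and that "still recent" means the second-oldest snapshot is still inside the retention window.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

public class SnapshotRetentionSketch {

  /** Hypothetical granularity enum; the real policy model may differ. */
  public enum Granularity { HOUR, DAY, MONTH, YEAR }

  /** Maps a granularity to ChronoUnit, which covers MONTHS and YEARS. */
  public static ChronoUnit toChronoUnit(Granularity granularity) {
    switch (granularity) {
      case HOUR: return ChronoUnit.HOURS;
      case DAY: return ChronoUnit.DAYS;
      case MONTH: return ChronoUnit.MONTHS;
      case YEAR: return ChronoUnit.YEARS;
      default: throw new IllegalArgumentException("Unsupported: " + granularity);
    }
  }

  /** Snapshots committed before the returned instant are outside retention. */
  public static Instant retentionCutoff(Instant now, int count, Granularity granularity) {
    // Instant cannot subtract MONTHS or YEARS directly, so go through UTC date-time.
    return now.atZone(ZoneOffset.UTC).minus(count, toChronoUnit(granularity)).toInstant();
  }

  /**
   * Assumed alerting rule: alert only if the oldest AND the second-oldest
   * snapshots are both past the cutoff. secondOldest may be null when the
   * table has a single snapshot; this sketch does not alert in that case.
   */
  public static boolean shouldAlert(Instant oldest, Instant secondOldest, Instant cutoff) {
    if (!oldest.isBefore(cutoff)) {
      return false; // every snapshot is still within retention
    }
    if (secondOldest == null) {
      return false; // assumption: a lone snapshot is not treated as a failure
    }
    return secondOldest.isBefore(cutoff);
  }
}
```

With this rule, a table whose only stale snapshot was superseded recently does not raise an alert, while a table with two or more snapshots past the cutoff does.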
Testing Done
Ran unit tests, tested the snapshot version, and ran the tablestatscollector job on a Spark cluster; did not see any errors.
Additional Information
For all the boxes checked, include additional details of the changes made in this pull request.