[b/356461225] Add HDFS progress logging #534

Open
wants to merge 3 commits into
base: main

@dawidxc dawidxc commented Aug 23, 2024

No description provided.

@shevek-google shevek-google left a comment

Could this use https://github.com/google/dwh-migration-tools/blob/main/dumper/lib-common/src/main/java/com/google/edwmigration/dumper/plugin/ext/jdk/progress/ConcurrentRecordProgressMonitor.java instead, particularly if the 'total' field were changed to a mutable LongAdder? If we want an ETA, I think adding that to ConcurrentProgressMonitor would be a lovely feature update.

Note that System.currentTimeMillis() is a very expensive call, which does not show up in a stack-sampling profiler, because it's not a safepoint; BlockProgressMonitor is designed to reduce that overhead.
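
To make the overhead point concrete, here is a minimal sketch of the block-based idea: keep the expected total in a `LongAdder` so it can grow while the scan runs, and read the clock only once per block of records rather than on every record. The class `BlockThrottledProgress`, its `blockSize` parameter, and all of its methods are hypothetical names for illustration; this is not the actual `BlockProgressMonitor` or `ConcurrentRecordProgressMonitor` code, whose details may differ.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch only; not the dumper's BlockProgressMonitor. */
final class BlockThrottledProgress {
  // Expected total; a LongAdder lets callers raise it as new work is discovered.
  private final LongAdder total = new LongAdder();
  // Number of records processed so far.
  private final AtomicLong done = new AtomicLong();
  // How many times the clock was actually read (kept here for illustration).
  private final AtomicLong clockReads = new AtomicLong();
  private final long blockSize;
  private volatile long lastClockMillis;

  BlockThrottledProgress(long blockSize) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    this.blockSize = blockSize;
  }

  /** Raises the expected total, e.g. when more directories are discovered. */
  void addToTotal(long n) {
    total.add(n);
  }

  /** Counts one record; reads the clock only on every blockSize-th record. */
  boolean count() {
    long n = done.incrementAndGet();
    if (n % blockSize != 0) {
      return false; // cheap path: no clock read
    }
    lastClockMillis = System.currentTimeMillis();
    clockReads.incrementAndGet();
    return true; // a caller could decide here whether to log
  }

  /** Percentage completed, capped at 99 and guarded against a zero total. */
  long percentDone() {
    long t = total.sum();
    return t <= 0 ? 0 : Math.min(99, done.get() * 100 / t);
  }

  long clockReads() {
    return clockReads.get();
  }

  long lastClockMillis() {
    return lastClockMillis;
  }
}
```

With a block size of 100, counting 250 records reads the clock twice instead of 250 times; the trade-off is that log timing becomes only as fine-grained as one block.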

.append("\nTotal time: ")
.append(timeSinceScanBegin.getSeconds() + "s")
.append("\nTotal time in listStatus(..): ")
.append(timeSpentInListStatus.getSeconds() + "s")

This should probably be a chained append(), not a String-+ in the middle, as that creates an extra StringBuilder and String.
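
A sketch of the chained form, assuming `timeSinceScanBegin` and `timeSpentInListStatus` are `java.time.Duration` values (the `getSeconds()` calls suggest so, but the declarations are not shown in this excerpt). The wrapper class `ScanSummary` and method `format` are hypothetical and exist only to make the snippet self-contained:

```java
import java.time.Duration;

/** Hypothetical wrapper so the snippet compiles on its own. */
final class ScanSummary {
  static String format(Duration timeSinceScanBegin, Duration timeSpentInListStatus) {
    return new StringBuilder()
        .append("\nTotal time: ")
        // append(long) writes the number directly; no temporary String is built.
        .append(timeSinceScanBegin.getSeconds())
        .append('s')
        .append("\nTotal time in listStatus(..): ")
        .append(timeSpentInListStatus.getSeconds())
        .append('s')
        .toString();
  }
}
```

The output text is the same as with the `+ "s"` form; only the intermediate allocations go away.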

lastLogTime = now;
long percentFinished =
Longs.constrainToRange(this.numDirsWalked * 100 / totalDirectoryCount, 0, 99);
partialProgressLogger.logProgress("HDFS scan " + percentFinished + "% completed");

A lot of this looks very similar to the shared/common code in [Concurrent]RecordProgressMonitor. Would it be possible to use that?

3 participants