HBASE-29103 Avoid excessive allocations during reverse scanning when seeking to next row #6643

Open. jbewing wants to merge 1 commit into master.

jbewing
Contributor

@jbewing jbewing commented Jan 28, 2025

What

This PR updates the reverse scanning flow to call HFileScanner#getKey instead of HFileScanner#getCell for Cells that later have PrivateCellUtil.createFirstOnRow() called on them. This should reduce memory usage and allocations because:

  • If the instance field previousRow is in use, it will be a KeyOnlyKeyValue rather than a full ExtendedCell
  • When calling HFileScanner#getKey, we only need to deep-copy the row key bytes rather than the entire cell as required by HFileScanner#getCell

This should be safe because we only call HFileScanner#getKey in contexts where the HFileScanner has already been seeked, so it won't throw.
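
To make the copying difference concrete, here is a minimal sketch using a toy model. It does not use HBase's actual classes: `ToyScanner`, `ToyCell`, `ToyKey`, and the `bytesCopied` counter are hypothetical stand-ins, and the key is simplified to just the row bytes. It only illustrates the idea that a key-only accessor copies far fewer bytes than a full-cell accessor, and that both require a seeked scanner.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a full cell: owns copies of both the row key and the value. */
final class ToyCell {
    final byte[] row;
    final byte[] value;

    ToyCell(byte[] row, byte[] value) {
        this.row = row;
        this.value = value;
    }
}

/** Hypothetical stand-in for a key-only object: owns a copy of the row key and nothing else. */
final class ToyKey {
    final byte[] row;

    ToyKey(byte[] row) {
        this.row = row;
    }
}

/** Toy scanner over a single entry; counts how many bytes each accessor deep-copies. */
final class ToyScanner {
    private final byte[] row;
    private final byte[] value;
    private boolean seeked = false;
    long bytesCopied = 0;

    ToyScanner(byte[] row, byte[] value) {
        this.row = row;
        this.value = value;
    }

    void seek() {
        seeked = true;
    }

    /** Deep-copies the whole entry (row key and value). */
    ToyCell getCell() {
        requireSeeked();
        bytesCopied += row.length + value.length;
        return new ToyCell(row.clone(), value.clone());
    }

    /** Deep-copies only the row key. */
    ToyKey getKey() {
        requireSeeked();
        bytesCopied += row.length;
        return new ToyKey(row.clone());
    }

    /** Both accessors are only valid once the scanner has been positioned. */
    private void requireSeeked() {
        if (!seeked) {
            throw new IllegalStateException("scanner is not seeked");
        }
    }
}

final class ReverseScanAllocationSketch {
    public static void main(String[] args) {
        byte[] row = "row-0001".getBytes(StandardCharsets.UTF_8); // 8 bytes
        byte[] value = new byte[1024];                             // 1 KB value

        ToyScanner viaCell = new ToyScanner(row, value);
        viaCell.seek();
        viaCell.getCell();

        ToyScanner viaKey = new ToyScanner(row, value);
        viaKey.seek();
        viaKey.getKey();

        System.out.println("getCell copied " + viaCell.bytesCopied + " bytes");
        System.out.println("getKey copied " + viaKey.bytesCopied + " bytes");
    }
}
```

In this toy run the full-cell path copies 1032 bytes and the key-only path copies 8. The real savings depend on the cell layout and the data block encoding in use, so treat these numbers as an illustration only.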

Implementation Notes

While I was making updates here, I noticed some invalid javadoc in the reverse scanning code path which I fixed up as a part of this issue. I can separate those changes out into a different issue if that's preferred.

Testing

I ran many of the reverse-scan-specific tests locally and they passed. However, I would like to run the entire test suite against this patch to ensure that it's correct. Since this affects the meta read hot path, we want to ensure that no edge cases are missed.

Performance Impact

On a region server that was under extreme CPU load (100%), an allocation profile (see linked JIRA) revealed that about 3% of allocations for a reverse-scan-heavy workload over a table using FAST_DIFF data block encoding came from the BufferedDataBlockEncoder#getCell method. This table had moderately sized rows: about 1 KB on average with 5-6 columns and a block size of 128 KB.

I would expect this optimization to materially improve performance only on tables with reverse scan workloads whose column values are non-trivial in size (more than a couple of bytes). I still need to do more in-depth profiling to come up with concrete numbers here.

Benchmarks

I've benchmarked this patch vs. the master branch and posted the results in the linked JIRA.

HBASE-29103

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|:----------|--------:|:--------|:--------|
| +0 🆗 | reexec | 0m 34s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ master Compile Tests _ |
| +1 💚 | mvninstall | 4m 6s | | master passed |
| +1 💚 | compile | 3m 47s | | master passed |
| +1 💚 | checkstyle | 0m 45s | | master passed |
| +1 💚 | spotbugs | 1m 56s | | master passed |
| +1 💚 | spotless | 1m 3s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 4m 3s | | the patch passed |
| +1 💚 | compile | 3m 47s | | the patch passed |
| +1 💚 | javac | 3m 47s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 43s | | the patch passed |
| +1 💚 | spotbugs | 2m 13s | | the patch passed |
| +1 💚 | hadoopcheck | 15m 21s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 1m 22s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 21s | | The patch does not generate ASF License warnings. |
| | | 48m 35s | | |

| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6643/1/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #6643 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux e0d893a8e8d3 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 7cd58f6 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 83 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6643/1/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|:----------|--------:|:--------|:--------|
| +0 🆗 | reexec | 0m 27s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ master Compile Tests _ |
| +1 💚 | mvninstall | 3m 15s | | master passed |
| +1 💚 | compile | 0m 55s | | master passed |
| +1 💚 | javadoc | 0m 28s | | master passed |
| +1 💚 | shadedjars | 5m 50s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 3m 3s | | the patch passed |
| +1 💚 | compile | 0m 55s | | the patch passed |
| +1 💚 | javac | 0m 55s | | the patch passed |
| +1 💚 | javadoc | 0m 28s | | the patch passed |
| +1 💚 | shadedjars | 5m 51s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 220m 3s | /patch-unit-hbase-server.txt | hbase-server in the patch failed. |
| | | 246m 30s | | |

| Subsystem | Report/Notes |
|:----------|:-------------|
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6643/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #6643 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux 8093b5e31457 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 7cd58f6 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6643/1/testReport/ |
| Max. process+thread count | 5488 (vs. ulimit of 30000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6643/1/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@jbewing
Contributor Author

jbewing commented Jan 28, 2025

^ I believe the above test failure is due to a flaky test. When I run it locally, the test is flaky for me as well. I can investigate more.

@jbewing
Contributor Author

jbewing commented Jan 29, 2025

I've confirmed that the above test failure is because the org.apache.hadoop.hbase.master.assignment.TestRollbackSCP.testFailAndRollback test is flaky. It's been flaky on other CI builds recently too.

That's to say: all test failures are unrelated.

@rmdmattingly
Contributor

@Apache9 I tagged you here because I don't know much about this code.
