Reduce default memory allocation to the java process #1407
Merged
Fixes #1406
This pull request improves the handling of JVM heap size and thread calculations in the `spark_rapids_tools` module. The key change updates the JVM heap size calculation so that the default memory allocation no longer risks triggering the OOM-killer.
Enhancements to JVM heap size and thread calculations:
- `user_tools/src/spark_rapids_tools/cmdli/argprocessor.py`: Updated the JVM heap size calculation to use `Utilities.calculate_jvm_max_heap_in_gb()` instead of `Utilities.get_system_memory_in_gb()`, and increased the minimum heap size per thread from 6 GB to 8 GB.
- `user_tools/src/spark_rapids_tools/utils/util.py`: Added the class variables `min_jvm_xmx`, `max_jvm_xmx`, and `max_tools_threads` to set limits on the JVM heap size and the number of threads.

Method renaming for clarity:
- `user_tools/src/spark_rapids_tools/utils/util.py`: Renamed `get_system_memory_in_gb()` to `calculate_jvm_max_heap_in_gb()` and updated the method to calculate the maximum heap size based on available system memory, capping it between 8 GB and 32 GB.
- `user_tools/src/spark_rapids_tools/utils/util.py`: Renamed `get_max_jvm_threads()` to `calculate_max_tools_threads()` and updated the method to calculate the maximum number of threads based on physical cores, capping it at 8 threads.
- `user_tools/src/spark_rapids_tools/utils/util.py`: Updated the `adjust_tools_resources` method to use the new `calculate_max_tools_threads()` method when determining the maximum number of threads.