Increase memory for cat_bins #1317
Conversation
files/galaxy/tpv/tools.yml
@@ -340,6 +340,12 @@ tools:
  toolshed.g2.bx.psu.edu/repos/iuc/fgsea/fgsea/.*:
    # any container should work
    inherits: basic_docker_tool
  toolshed.g2.bx.psu.edu/repos/iuc/cat_bins/cat_bins/.*:
    rules:
      - if: input_size >= 0.03
Can you provide each rule with an ID, please?
Sorry, but what is a rule ID? I made this based on:
- if: input_size >= 0.01
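For context, a TPV rule can carry an id: key next to its if: condition; as far as I understand, the id is what lets a rule be referenced or overridden when TPV configs inherit from one another. A minimal sketch with a hypothetical id and an illustrative memory value (not necessarily what this PR sets):

```yaml
toolshed.g2.bx.psu.edu/repos/iuc/cat_bins/cat_bins/.*:
  rules:
    # "cat_bins_large_input" is a hypothetical id; any unique string works
    - id: cat_bins_large_input
      if: input_size >= 0.03
      mem: 48  # illustrative value, not necessarily what this PR sets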
That is a good question.
This would be an explanation, but would this mean that such a job would be stored in the Galaxy DB with e.g. 24 GB of memory, while in reality it used 48? That would make defining good rules really difficult. On the other hand, it makes me question why there are jobs that got more than 24 GB of memory at all.
Depends on what you are looking for. If you look at the allocated memory, I think the allocation from the first run is reported. But we should look at the cgroup-reported value of the actual consumption. That second value is broken, though; it's a bug that @sanjaysrikakulam is trying to fix soon.
I guess it probably does not matter much anyway. Whether the rule was applied or not, they needed more memory to succeed, which they have now.
Am I free to merge this?
It will be deployed over the weekend.
Some of our cat_bins jobs were failing, most probably due to memory issues. I was looking for a while for a logic that lets me define how to improve the rule for this (and other) tools. This is what I came up with:
A) Query the tool-memory-per-inputs, but include the job state. I will try to add this option to gxadmin!
B) Plot memory vs total input for all states:
10% quantile threshold of input size for the error-state jobs: 31.0 MB
Fraction of error-state jobs below that threshold: 0.125
Fraction of ok-state jobs below that threshold: 0.723
It is clear that many of the error-state jobs have a higher input size.
With the new rule, more than 70% of the ok-state jobs would still have run with the default 24 GB of memory, but for almost 90% of the failed jobs the memory would have been increased, giving them a better chance to succeed. Of course, some of them may still fail for other reasons.
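To make the link between B) and the rule explicit: the ~31 MB quantile threshold appears to correspond to the 0.03 in the rule's input_size condition, assuming input_size is expressed in GB here. A minimal sketch of how the threshold and a raised memory limit would fit together (the 48 is an illustrative value, not necessarily what this PR sets):

```yaml
toolshed.g2.bx.psu.edu/repos/iuc/cat_bins/cat_bins/.*:
  rules:
    # inputs below ~31 MB (0.03 GB) keep the default allocation,
    # larger inputs get the raised one
    - if: input_size >= 0.03
      mem: 48  # illustrative; ideally derived from the peak memory of failed jobs
```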
If that makes sense, I will increase the memory for some more tools based on the same logic.
One question I have: I cannot observe higher memory for all failed jobs, so I guess the rule that increases memory for failed jobs is not in place after all, or am I missing something?