HQ batch job submission fails at Karolina #791
Comments
Hi, that looks a lot like this issue: #452. The |
I see, I picked an old job from a previous version rather than a new one. Other than that, no, the HQ binaries are there.
I see. Then it is probably a networked filesystem issue, where Linux thinks that the file has been removed for some reason. I created #792 to try to work around this.
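For context, on Linux the `/proc/self/exe` symlink of a running process gets ` (deleted)` appended to its target when the kernel considers the binary unlinked. The following is a minimal sketch of the kind of workaround discussed here, assuming the path is read from `/proc/self/exe`. The helper names `strip_deleted_suffix` and `current_executable` are hypothetical and are not taken from the HyperQueue code base or from #792.

```python
import os

# Marker that Linux appends to the /proc/self/exe target of an unlinked binary.
DELETED_SUFFIX = " (deleted)"


def strip_deleted_suffix(path: str) -> str:
    """Return `path` without a trailing ' (deleted)' marker, if present.

    Hypothetical helper for illustration only.
    """
    if path.endswith(DELETED_SUFFIX):
        return path[: -len(DELETED_SUFFIX)]
    return path


def current_executable() -> str:
    """Resolve the running binary's path (Linux only), ignoring the marker."""
    return strip_deleted_suffix(os.readlink("/proc/self/exe"))


if __name__ == "__main__":
    # A path as it might end up in a generated submit script.
    print(strip_deleted_suffix("/home/user/bin/hq (deleted)"))
```

One caveat of this approach: a file whose real name ends in ` (deleted)` would be misread, so it trades a rare false positive for robustness on filesystems that report spurious deletions.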
OK, so can this be fixed within HQ, or should I relocate the file?
The proposed PR should fix this issue, in the sense that we will just ignore the |
OK, I will try this first, see whether it fixes the problem, and decide whether to move the binary based on that.
One more question: |
No, there is no persistent state except the journal and the access file (the access file only holds information about where the server is running and the encryption keys; a new access file should be generated by default when the server is started).
More likely it was the rate limiter. HQ tries to be very conservative with automatic allocations when it sees that they start failing.
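To illustrate what "conservative" can mean for a rate limiter, here is a generic exponential backoff sketch. It is not HyperQueue's actual algorithm: the class `AllocationBackoff`, its parameters, and the doubling policy are all assumptions made for this example.

```python
class AllocationBackoff:
    """Hypothetical limiter: the wait doubles after each consecutive failure."""

    def __init__(self, base_delay: float = 60.0, max_delay: float = 3600.0):
        self.base_delay = base_delay  # seconds to wait after the first failure
        self.max_delay = max_delay    # upper bound on the wait
        self.failures = 0             # consecutive failed submissions

    def record_failure(self) -> None:
        self.failures += 1

    def record_success(self) -> None:
        # A single success resets the limiter.
        self.failures = 0

    def next_delay(self) -> float:
        """Seconds to wait before the next allocation attempt."""
        if self.failures == 0:
            return 0.0
        return min(self.base_delay * 2 ** (self.failures - 1), self.max_delay)


if __name__ == "__main__":
    backoff = AllocationBackoff(base_delay=1.0, max_delay=5.0)
    for _ in range(4):
        backoff.record_failure()
        print(backoff.next_delay())
```

Under a policy like this, a run of failed submissions makes new allocation attempts progressively rarer, which from the outside can look like the server has remembered earlier failures.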
Recently, I have noticed that HyperQueue has a problem submitting jobs to Karolina's batch system. The stderr shows:
So it seems the problem is that HyperQueue adds `(deleted)` to the hq command in hq-submit.sh for some reason, e.g.

Interestingly, I do not see this problem on Barbora, even though all are running the same version (v20).