drmaa errors- resubmit/retry #116

Open
cchng opened this issue Feb 11, 2020 · 0 comments


cchng commented Feb 11, 2020

Hi ruffus team,

I'm using the ruffus drmaa wrapper to submit and run jobs on an SGE cluster, and I keep hitting DRMAA communication exceptions that I've been working to resolve (related issue: aws/aws-parallelcluster#1592). Has the ruffus team encountered this error? If not, is there a resubmit/retry feature that is ready to use? It isn't explicitly documented, but it looks like the run_job function takes a resubmit parameter.

[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] File "/shared/amgenesis/helpers.py", line 126, in run
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] cmdline.run (options, logger=logger_proxy, multithread = options.jobs, exceptions_terminate_immediately = True)
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/ruffus/cmdline.py", line 834, in run
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] **appropriate_options)
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] File "/home/ec2-user/anaconda3/lib/python3.7/site-packages/ruffus/task.py", line 5424, in pipeline_run
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] raise job_errors
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] ruffus.ruffus_exceptions.RethrownJobError:
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] Original exception:
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] Exception #1
[2020-02-11 00:29:15,628: WARNING/ForkPoolWorker-1] 'drmaa.errors.DrmCommunicationException(code 2: failed receiving gdi request response for mid=65535 (can't send response for this message id - protocol error).)' raised in ...
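
For context on the kind of retry being asked about, here is a minimal sketch of a generic retry-with-backoff wrapper that could sit around the job submission call. It is not a ruffus feature: `call_with_retries` and its parameters are hypothetical names chosen for illustration. The commented usage also assumes that `drmaa.errors.DrmCommunicationException` reaches the caller of `run_job` unwrapped, which may not hold in every ruffus version.

```python
import time


def call_with_retries(func, retry_on=(Exception,), max_attempts=3,
                      base_delay=5.0, sleep=time.sleep):
    """Call func() and retry on the given exception types.

    Hypothetical helper, not part of ruffus or drmaa. The wait doubles
    after every failed attempt (base_delay, 2 * base_delay, ...).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            # Out of attempts: let the last exception propagate unchanged.
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Possible usage inside a pipeline task (assumes drmaa and ruffus are
# installed and that cmd_str and session are defined by the caller):
#
#   import drmaa
#   from ruffus.drmaa_wrapper import run_job
#
#   stdout, stderr = call_with_retries(
#       lambda: run_job(cmd_str, drmaa_session=session),
#       retry_on=(drmaa.errors.DrmCommunicationException,),
#   )
```

Restricting `retry_on` to the communication exception keeps genuine job failures from being resubmitted silently. The backoff gives the scheduler time to recover between attempts.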