autotuning failed to run: pdsh@GCRSANDBOX324: localhost: ssh exited with exit code 255 #1792
dunalduck0 started this conversation in General
Replies: 1 comment
Same issue.
I hadn't installed pdsh, and DeepSpeed worked fine for model training.
I then tried the new Autotuning feature, which seems to require pdsh. Unfortunately, it still didn't work after I installed pdsh; I got this error:
pdsh@GCRSANDBOX324: localhost: ssh exited with exit code 255
Why is pdsh required for autotuning but not for model training?
Why does pdsh need to "ssh localhost" for autotuning?
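To narrow this down outside DeepSpeed, it may help to reproduce the failing step by hand. The error line suggests pdsh is invoking ssh against localhost, and ssh reports status 255 when ssh itself fails rather than the remote command. Below is a minimal sketch of such a check, assuming passwordless SSH to localhost is what is missing; the helper names (`build_ssh_probe`, `interpret_ssh_exit`, `probe`) are hypothetical and not part of DeepSpeed or pdsh.

```python
import shutil
import subprocess


def build_ssh_probe(host="localhost", timeout_s=5):
    """Build a non-interactive ssh command that runs `true` on the host."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # fail instead of prompting for a password
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly if sshd is unreachable
        host,
        "true",
    ]


def interpret_ssh_exit(code):
    """Map an ssh exit status to a short hint (hypothetical wording)."""
    if code == 0:
        return "ok: passwordless ssh works"
    if code == 255:
        # ssh uses 255 for its own errors, not for the remote command's status
        return "ssh itself failed (connection refused, auth, or host key)"
    return f"remote command exited with status {code}"


def probe(host="localhost"):
    """Check that pdsh and ssh are on PATH, then try the ssh probe."""
    for tool in ("pdsh", "ssh"):
        if shutil.which(tool) is None:
            return f"{tool} not found on PATH"
    try:
        result = subprocess.run(
            build_ssh_probe(host), capture_output=True, text=True, timeout=30
        )
    except subprocess.TimeoutExpired:
        return "ssh probe timed out"
    return interpret_ssh_exit(result.returncode)


if __name__ == "__main__":
    print(probe())
```

If this also reports an ssh failure, the problem is probably in the local SSH setup (no sshd running, or no key authorized for localhost) rather than in the autotuner, though that is an assumption this check can only support, not prove.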