Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement -A optional flag to return all exit codes on stdout #105

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

sebalos314
Copy link

Hello @grondo, would it be ok to implement #58

you mentioned in the past that -A may already be used, I can switch to any letter you want, just tell me which one you prefer ? -E maybe ?

thanks in advance.

@grondo
Copy link
Member

grondo commented Sep 6, 2017

Can I ask the use case for this change? The reason I ask is that it feels a tiny bit off if we add a new option that will only have an effect when pdsh is using either the ssh or exec rcmd modules, i.e. when the pipecmd code is used. If this change is only needed when execing local processes, the perhaps the option should be added directly to the affected modules. If your use case requires the exit status to stdout no matter the rcmd module used (i.e. you are actually interested in success/failure of the remote command), then this status should be printed after rcmd_destroy in dsh.c (which in the case of pipecmd returns the WEXITSTATUS() of the exec'd command), and you could then move the option in the options structure instead of using a global.

Put another way, do you want the exit status to be printed for all invocations of pdsh with the -A option? Or is this really only meant for ssh and exec modes?

@grondo
Copy link
Member

grondo commented Sep 6, 2017

Also, along with adding the feature, it would be nice to add tests to ensure the new code works as designed (so we don't break it in the future) ;-)

RIght now automated Travis-CI checks are failing, but that doesn't appear to be the fault of this PR. I'll investigate and fix master.

@sebalos314
Copy link
Author

sebalos314 commented Sep 7, 2017

The use case is: we are post processing the output of running pdsh (ssh) on compute clusters and we post process both the output AND the exit codes (information and meta information are important for data-mining). The current implementation of pdsh (I would be happy to be contradicted) will only show exit codes when they are non zero.

A UI comment: we proposed to implement an option to avoid breaking output compatibility, not being sure that everyone would be happy to see... pdsh@node001 exited (0)... I do not think it is a problem to have this information on stderr or stdout, you probably have more knowledge.

Finally, I am not surprised the implementation we propose is sub-optimal because we do not have enough pdsh source code knowledge, in particular I am not familiar with the modules architecture nor the {ssh,exec} modules, I guess we are using the 'ssh' one, we did not mean to restrict this feature to ssh, but we only use ssh to contact other systems.

I also understand the need for a test module too but I am not sure how the testing environment works, in particular if we want to check that pdsh ssh hostname returns 0 when the test system is reachable.

In summary, we 'only' are basic but very longterm users of pdsh used as a sliding window ssh launcher (as probably many people in the HPC community) so this may give you context about us.

Regarding the failed checks, I also believe that master as the exact same problems, the code works very well on our clusters.

Sorry, more questions than answers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants