Adopt token for WMAgent stage-in/stage-out #12144
Comments
I think we want to make the stage-out token-safe now, i.e. if a token is in the environment and the token-based transfer doesn't work, the stage-out doesn't fail. (Right now, if HTCondor makes a token available but token-based transfer doesn't work, stage-out may fail.) print date/time gfal-copy ... if ( X509_USER_PROXY is set ) sleep 15 min (token information may change underneath us, so we should print it just before the debug gfal-copy).
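A minimal sketch of what such a safety mechanism could look like, not WMCore code. It assumes the transfer tool picks up the token from `BEARER_TOKEN` or `BEARER_TOKEN_FILE` (the WLCG Bearer Token Discovery convention), and that `transfer` is a hypothetical callable wrapping the actual copy command and returning `True` on success. The function name `safe_stage_out` is also made up for illustration.

```python
import os
import time

# Assumption: the copy tool discovers tokens through these variables.
TOKEN_VARS = ("BEARER_TOKEN", "BEARER_TOKEN_FILE")


def has_token(env):
    """True if any token variable is set to a non-empty value."""
    return any(env.get(var) for var in TOKEN_VARS)


def safe_stage_out(transfer, env=None, log=print):
    """Try a token-based transfer first, then fall back to X509.

    transfer: hypothetical callable taking an environment dict and
              returning True on success (e.g. a wrapper around the copy tool).
    Returns "token", "x509", or None if every attempt failed.
    """
    env = dict(os.environ if env is None else env)

    if has_token(env):
        # Log the time right before the attempt, since token information
        # may change underneath the job.
        log(time.strftime("%Y-%m-%d %H:%M:%S") + " trying token-based stage-out")
        if transfer(env):
            return "token"
        log("token-based stage-out failed")

    if env.get("X509_USER_PROXY"):
        # Hide the token variables so the tool uses the proxy instead.
        x509_env = {k: v for k, v in env.items() if k not in TOKEN_VARS}
        log("trying X509-based stage-out")
        if transfer(x509_env):
            return "x509"

    return None
```

With this shape, a broken token in the environment costs one failed attempt but does not fail the stage-out as long as a valid proxy is still available.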
Right! We will add the necessary safety mechanism once we integrate tokens in the grid jobs. Thanks Stephan.
Issues I have noticed so far:
I am currently working around that by:
If |
We have not used the python binding in the initial tests. If there is a bug, we'll need to make a test case and submit a bug report.
I am not sure I understand this comment. It is important to notice though that we were using a CC7 node with RPM based deployment. Maybe this is the difference that you are trying to make here? Is the docker solution imposing limitations to this?
I didn't know/realize this, Alan. I was only aware of the condor_submit test. Then this is likely a config issue and not a bug.
@amaltaro Submission through the python bindings works as long as there is a token present already. This can be achieved by creating the token by hand, or by having a client that does this for you (like the host condor_submit). If the tests at FNAL started with a condor_submit first, and a job submission via WMAgent after, then this issue would not have shown up at all. The python bindings alone don't seem to invoke any token generation script, but that is probably not a huge deal because the first job usually prompts you to an oauth website for the authentication, and since we submit the jobs automatically via JobSubmitter, we do need to do this interactively beforehand anyway. I am just noting that as things are now, we need to generate a token at the host, either by hand or with a simple condor_submit test job, in order for WMAgent to submit and use the tokens later on. I don't think we need to dig too much into this at this point though, because this part of the token generation is going to change eventually, so that it is done without an interactive URL that the user needs to click and authenticate on.
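Since submission depends on a token already being present at the host, a pre-submission check is one way to catch a missing or expiring token early. The sketch below is only an illustration under assumptions: it treats the token as a JWT and reads its standard `exp` claim without verifying the signature, and the helper names and the 600 second minimum lifetime are hypothetical, not part of WMAgent or HTCondor.

```python
import base64
import json
import time


def token_expiry(token):
    """Return the 'exp' claim of a JWT in epoch seconds, or None.

    The signature is NOT verified; this only inspects the payload.
    """
    parts = token.strip().split(".")
    if len(parts) != 3:
        return None
    # JWT segments are base64url without padding; restore it first.
    payload = parts[1] + "=" * (-len(parts[1]) % 4)
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        return None
    exp = claims.get("exp") if isinstance(claims, dict) else None
    return exp if isinstance(exp, (int, float)) else None


def token_is_usable(token, min_lifetime=600, now=None):
    """True if the token parses and has at least min_lifetime seconds left."""
    exp = token_expiry(token)
    now = time.time() if now is None else now
    return exp is not None and exp - now >= min_lifetime
```

A submitter could read the token from wherever the host stores it (that location is deployment-specific and not assumed here) and skip or postpone token-based submission when `token_is_usable` returns `False`.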
I thought about Kenyi's message about condor_vault_storer being executed on a submit via the Python API. The initial authorization forwarding requires interactive execution, and the device code URL might be lost on stdin in the case of the Python API.
Impact of the new feature
WMAgent
Is your feature request related to a problem? Please describe.
Similar to ticket #11199, we need to adopt tokens for the WMAgent payload. In other words, instead of using X509-based stage-in and stage-out auth/authz, we should adopt a token solution for this storage communication.
Describe the solution you'd like
Support tokens in WMAgent for stage-in / stage-out.
Tokens in the grid jobs will only be available once we configure:
a) access to token in the agent node;
b) management of the token in the agent node;
c) propagation of the token by htcondor and WMAgent job description;
d) use of token by the grid job (stage-in / stage-out).
Unless we have all of this set up, we shouldn't have production jobs accessing tokens during the job runtime.
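The "all or nothing" condition on items a) to d) can be sketched as a simple gate. This is a hypothetical illustration: the class and field names are invented here and do not correspond to existing WMAgent configuration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TokenPrerequisites:
    """Hypothetical flags mirroring prerequisites a) to d)."""
    agent_access: bool = False       # a) access to token in the agent node
    agent_management: bool = False   # b) management of the token in the agent node
    job_propagation: bool = False    # c) propagation by htcondor and the job description
    job_usage: bool = False          # d) use of token by the grid job

    def missing(self):
        """Names of the prerequisites that are not yet in place."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def tokens_enabled(self):
        """Production jobs may use tokens only if nothing is missing."""
        return not self.missing()
```

For example, `TokenPrerequisites(agent_access=True, agent_management=True).missing()` lists the two job-side items that still block token usage.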
As a result, that requires at least the following developments:
Describe alternatives you've considered
If token-based auth/authz fails, do we want to fall back to X509?
Additional context
None