Define an AXL BB post-stage script #75
Comments
One idea is to have the poststage script call |
The more I think about this, the more I like this setup:
|
It may help to break this into two steps:
For 1) let's test using a pattern that we can expect from a typical HPC application. We'd expect each process in an MPI job to write one or more files, with those files potentially scattered through a directory tree. As one concrete example, the pattern below is pretty common:
Under AXL, each process will have started its own transfer. We'll have one transfer handle and one state file per process. We should test for scaling, since it's at larger scale where BB transfers provide the most benefit. On a full system run on sierra, we'll have We could then look at what additional logic we need to add for SCR. |
With The AXL post-stage script would then scan the directory to get the full list of state files, and then it would invoke |
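The directory scan described above could be sketched roughly as follows. This is only an illustration: the `.state` suffix, the function name `find_state_files`, and the idea that all state files live under one directory tree are assumptions, not AXL's actual naming or layout.

```python
import os

def find_state_files(state_dir, suffix=".state"):
    """Return a sorted list of state file paths found under state_dir.

    The suffix is a hypothetical naming convention used for illustration.
    """
    found = []
    for root, _dirs, files in os.walk(state_dir):
        for name in files:
            if name.endswith(suffix):
                found.append(os.path.join(root, name))
    return sorted(found)
```

With one state file per process, the length of the returned list is also a quick check on how many transfers the script has to account for.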
Side note: the original plan for SCR+BB, before we carved AXL into its own component, was to initiate a collective BB transfer in SCR and list all files under a single BB transfer id. In that case, a post stage script could wait on a single transfer id rather than dealing with one id per process. The scalability problem moves into the BB software under that model. |
Thanks @adammoody all that detail helps. One thing I did want to mention is that I plan to do away with the state file for resumes ( |
I thought we agreed in our meeting last week that neither of those is really a problem. First, the kvtree code needs to be fixed according to ECP-VeloC/KVTree#40, which will fix the corrupt file problem. We'll need this kvtree fix for other components, even if we decide AXL doesn't require it. Second, I think it's fine for us to say that it doesn't make sense to "resume" a transfer that was never started. If a job dies after AXL_Create and before AXL_Dispatch, we can change AXL_Resume to return an error and require the user to start over with a fresh transfer using a new AXL_Create/Add/Dispatch. With the caveat that I haven't thought through everything, we should keep going with the state file until we're sure it doesn't work. Requiring the user to list all files moves the bookkeeping work from AXL back to the user, but I don't think it makes the task of bookkeeping any easier. |
For more background: the IBM BB software defines two types of post-stage scripts, and we eventually want to consider both. In this first go at things, we're looking to define a script that plugs in as a second BB post-stage script. The second script runs on the job launch node after the user has lost access to their compute nodes. The BB software will still complete any transfer the user had started before they gave up their allocation. This second script lets us wait on those transfers and take action when they complete.
Each transfer will either complete successfully or error out. If it completes successfully, we then want to finalize the files (e.g., rename and set metadata). If the transfer errors out, it's not possible for the user to start a new transfer of those files at that point -- they're just out of luck. The best we can do for them then is to delete the temporary files.
One place where things get tricky if we require the user to list all files again is actually the success case. On success, we may want to set the metadata on the destination files to match the source files. However, at the point this second script runs, we can't access the source files anymore (only the BB software can). In particular, we can't Meanwhile, we already store this metadata in our state file. |
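To illustrate why stored metadata helps in the success case, here is a minimal sketch that applies a saved record to a destination file without touching the source. The `meta` dictionary and its keys are hypothetical stand-ins for whatever the state file records. This is not AXL's actual state file format or API.

```python
import os

def apply_metadata(dst_path, meta):
    """Apply stored ownership, timestamps, and mode bits to dst_path.

    meta is a hypothetical dict with keys uid, gid, mode, atime, mtime.
    """
    # Ownership first; this can fail if the caller lacks permission.
    os.chown(dst_path, meta["uid"], meta["gid"])
    # os.utime takes the times in (atime, mtime) order.
    os.utime(dst_path, (meta["atime"], meta["mtime"]))
    # Mode bits last, so a read-only mode is the final change made.
    os.chmod(dst_path, meta["mode"])
```

Nothing here calls `stat` on the source file, which is the point: the script only needs the values captured while the source was still reachable.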
I suppose we could get that working without a state file if we also encode the metadata values as part of the temporary destination name. We currently preserve the uid, gid, mode bits, mtime, and atime values. But that leads us down the path of implementing this bookkeeping in two different places. |
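As a rough sketch of what encoding the metadata in the temporary name might look like, the helpers below pack the five preserved values into dot-separated fields. The field order, the `_AXL` tag, and both function names are invented for illustration. Timestamps are truncated to whole seconds in this sketch.

```python
TAG = "_AXL"  # hypothetical marker for temporary files

def encode_temp_name(final_path, uid, gid, mode, mtime, atime):
    """Build a temp name carrying uid, gid, octal mode, mtime, and atime."""
    return f"{final_path}.{uid}.{gid}.{mode:o}.{int(mtime)}.{int(atime)}.{TAG}"

def decode_temp_name(tmp_path):
    """Recover (final_path, metadata dict) from a name made by encode_temp_name."""
    parts = tmp_path.rsplit(".", 6)
    if len(parts) != 7 or parts[6] != TAG:
        raise ValueError(f"not an encoded temp name: {tmp_path}")
    final_path, uid, gid, mode, mtime, atime, _tag = parts
    meta = {
        "uid": int(uid),
        "gid": int(gid),
        "mode": int(mode, 8),
        "mtime": int(mtime),
        "atime": int(atime),
    }
    return final_path, meta
```

Splitting from the right keeps dots in the original file name intact, but the scheme still duplicates bookkeeping that the state file already does.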
Regarding the no-state_file method: one idea would be to pre-create the temporary files before the transfer with all the metadata values set except the permission bits. The permission bits would have to be encoded in the temp file name extension (and set on finalize). This is to get around the case where you're transferring a read-only file. |
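A minimal sketch of the pre-create idea, assuming a hypothetical `._AXL_<octal mode>` extension and made-up function names. Whether timestamps set before the transfer survive the transfer's own writes is not addressed here and would need testing against the BB software.

```python
import os

MARK = "._AXL_"  # hypothetical extension prefix; octal mode bits follow it

def precreate_temp(src_path, dst_path):
    """Create a writable temp file carrying the source's owner and times.

    The permission bits are not applied; they are stored in the name.
    """
    st = os.stat(src_path)
    tmp_path = f"{dst_path}{MARK}{st.st_mode & 0o7777:04o}"
    with open(tmp_path, "wb"):
        pass  # empty placeholder, left writable for the transfer
    os.chown(tmp_path, st.st_uid, st.st_gid)
    os.utime(tmp_path, ns=(st.st_atime_ns, st.st_mtime_ns))
    return tmp_path

def finalize_temp(tmp_path):
    """Set the mode bits stored in the name, then rename to the final path."""
    final_path, sep, mode_str = tmp_path.rpartition(MARK)
    if not sep:
        raise ValueError(f"missing {MARK} marker: {tmp_path}")
    os.chmod(tmp_path, int(mode_str, 8))
    os.rename(tmp_path, final_path)
    return final_path
```

Deferring `chmod` until finalize is what lets a read-only source file be written at the destination in the first place.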
Somewhat related to this issue: |
The IBM BB will transfer files even after the user job has ended. One can register a post-stage script that will run after the transfer ends. This transfer may either succeed or it may fail.
Can we provide a default AXL BB post-stage script that people can use?
For example, this script can wait for any outstanding transfer to end. For any transfer that succeeds, it might rename the temporary _AXL file to the correct name. For any transfer that fails, it might delete the temporary, incomplete _AXL file.
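The rename-or-delete step might look something like the sketch below. It assumes the temporary name is the final name plus a `._AXL` suffix and that the caller already knows whether the transfer succeeded. Waiting on the transfer and querying its status are left out, since they depend on the BB software. The function name is hypothetical.

```python
import os

def finalize_or_cleanup(tmp_path, succeeded, suffix="._AXL"):
    """Rename a temp file to its final name on success, delete it on failure.

    Returns the final path on success and None on failure.
    """
    if not tmp_path.endswith(suffix):
        raise ValueError(f"not a temporary file: {tmp_path}")
    if succeeded:
        final_path = tmp_path[: -len(suffix)]
        os.rename(tmp_path, final_path)
        return final_path
    try:
        os.remove(tmp_path)
    except FileNotFoundError:
        pass  # nothing was written, so there is nothing to clean up
    return None
```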