Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates to use pyaestro to back a local pool style adapter. #345

Draft
wants to merge 10 commits into
base: develop
Choose a base branch
from

Conversation

FrankD412
Copy link
Member

This PR is a preliminary step towards a more "ensemble"-ish adapter related to the functionality discussed in #330.

This new LocalPoolAdapter utilizes pyaestro's Executor class in order to fake a scheduler. Some potential future additions are timeouts and resource support for the above #330.

@FrankD412 FrankD412 added enhancement Adapters Items that are directly related to Maestro's adapter structure. labels Feb 23, 2021
@FrankD412 FrankD412 added this to the Release 1.1.9 milestone Feb 23, 2021
@FrankD412
Copy link
Member Author

@jwhite242 -- Some considerations based on why we were discussing earlier:

  • I was hoping to eliminate the checks for local execution in the ExecutionGraph._StepRecord. This would bring all execution in line with scheduler-based adapters. You had mentioned some benefits to keeping the distinction around. What are your thoughts?
  • We need to conceptualize how we would handle job packing in an Executor for pyaestro and then do something similar here. What have you found in terms of packing and the like?

@FrankD412
Copy link
Member Author

Current todo: Cancellation works at the Maestro level, but leaves the pool executing the last processes that were running before the cancellation was posted. This bug boils up to a pyaestro bug.

maestrowf/interfaces/script/localpooladapter.py Outdated Show resolved Hide resolved
maestrowf/interfaces/script/localpooladapter.py Outdated Show resolved Hide resolved
"""
return f"#!{self._exec}"

def get_parallelize_command(self, procs, nodes=None, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we generalize this to execute or launch command rather than making parallel special? More just about a hook for a custom launch prefix/suffix/wrapper/etc than parallel in particular right? One scheduler thing i can see used is big memory jobs on hpc might request a whole node just for the memory, but only run serial (e.g. srun -N1 -n1..).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah -- I want to redesign this in pyaestro and then use it here.

identifier.
"""
try:
job_id = self._pool.submit(path, cwd)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there a future being tracked in here somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pool is entirely backed by pyaestro -- which does implement this with a future.

return JobStatusCode.ERROR, {}

status = {jid: State.UNKNOWN for jid in joblist}
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to put this inside the loop so it can actually get through the whole list and catch all the exceptions? Also, what exception did this get added to catch?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this outside the loop since it's a basic operation -- also because the Enum is not a constructor, I couldn't use defaultdict like I wanted. I plan to explore making a meta-enum to handle the default case to make this more Pythonic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In regards to the except -- it's there to log any possible errors that do come up. It's not necessarily meant to take corrective action.


c_return = CancelCode.OK
ret_code = 0
if ExecCancel.FAILED in c_status:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check all of them, i.e. counting the number of cancels that failed and ones that succeeded separately for more informative error messages?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design choice is a decision carried forward from SLURM (when I initially made this); in that perspective, the status is reported if at least one record reports the state. The issue with things like SLURM is that it's more efficient to do a bulk cancel, so there's no way to get a per job report out there (not without polling the scheduler to see if it took).

elif executor_state == ExecTaskState.INITIALIZED:
return State.INITIALIZED
elif executor_state == ExecTaskState.CANCELLED:
return State.CANCELLED
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about a CLOSING or CANCELLING status? I've noticed at least on slurm that if you check it again too quick after cancelling (sometimes up to a minute) you'll get the CG status from the scheduler while it's still cleaning up/shutting it down. Or can pending capture this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current implementation, those don't exist. The pool technically will cancel a job (in this case a process) outright.

batch:
shell: /bin/bash
type: local_pool
num_workers: 3
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll definitely need to spend some effort on whatever nomenclature is settled on for this stuff to avoid confusion with workers/cores/tasks/jobs/...

Copy link
Member Author

@FrankD412 FrankD412 Feb 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I rename this to num_processes instead? I feel like that would be a better description that doesn't alias to other things.

@jwhite242
Copy link
Collaborator

@jwhite242 -- Some considerations based on why we were discussing earlier:

* I was hoping to eliminate the checks for local execution in the `ExecutionGraph._StepRecord`. This would bring all execution in line with scheduler-based adapters. You had mentioned some benefits to keeping the distinction around. What are your thoughts?

* We need to conceptualize how we would handle job packing in an Executor for pyaestro and then do something similar here. What have you found in terms of packing and the like?

So I was thinking this could be a useful distinction based on the two possible run modes: standalone batch jobs (the current scheduler adapters) and running maestro inside of an allocation manually for job packing (or spawned by another tool built on top of maestro). Doesn't necessarily need to use local as the distinction, but do we need some hook to enable the latter job packing mode?

As for the job packing, this is something where i think plugins would be really useful, or some way for users to write these things like with pgen. The number of scheduling behaviors and the optimization algorithms for implementing them is a pretty large space that doesn't necessarily need to be hard wired into maestro. A few simple/interesting ones would be good of course.

@FrankD412
Copy link
Member Author

@jwhite242 -- Some considerations based on why we were discussing earlier:

* I was hoping to eliminate the checks for local execution in the `ExecutionGraph._StepRecord`. This would bring all execution in line with scheduler-based adapters. You had mentioned some benefits to keeping the distinction around. What are your thoughts?

* We need to conceptualize how we would handle job packing in an Executor for pyaestro and then do something similar here. What have you found in terms of packing and the like?

So I was thinking this could be a useful distinction based on the two possible run modes: standalone batch jobs (the current scheduler adapters) and running maestro inside of an allocation manually for job packing (or spawned by another tool built on top of maestro). Doesn't necessarily need to use local as the distinction, but do we need some hook to enable the latter job packing mode?

As for the job packing, this is something where i think plugins would be really useful, or some way for users to write these things like with pgen. The number of scheduling behaviors and the optimization algorithms for implementing them is a pretty large space that doesn't necessarily need to be hard wired into maestro. A few simple/interesting ones would be good of course.

I was thinking about this and the case of running Maestro in an allocation would be where you have an "allocation adapter" class that comes in handy. That class would take in the global set of resources requested and schedule the conductor call. That's actually where I wanted to split out the MPI related functionality from the SchedulerScriptAdapter since you could either monkey patch the method in the LocalPoolAdapter to generate the right MPI or just have it in the batch settings to use a particular MPI (usual factory pattern there to get the class). We could mock that up in a discussion thread.

For this PR, I was thinking if I can get the local pool as the main local adapter (it would still return that it's local, but in the ExecutionGraph it'd be treated just like everything else since it's a process pool. The records would still return local otherwise.

I'm starting to wonder if this is a good time to introduce an orchestrate sub-command to handle the allocation clear so it's more transparent to the user? -- we should make a discussion on this haha

Hopefully this is making sense.

@FrankD412 FrankD412 marked this pull request as draft March 10, 2021 18:23
@FrankD412 FrankD412 removed this from the Release 1.1.9 milestone Aug 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Adapters Items that are directly related to Maestro's adapter structure. enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants