Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finish CLI function implementations #525

Merged
merged 43 commits into from
Mar 15, 2024

Conversation

robertbartel
Copy link
Contributor

Complete support for CLI commands to get a list of job ids, to get job state details, and to stop and restart already-running jobs.

Additions

  • Implementations for request_job_info, request_job_release, request_job_restart, request_job_status, request_job_stop, and request_jobs_list functions to the JobClient class in the client package
  • External request and response message subtypes specifically for the required existing job control and query operations
  • ExistingJobRequestHandler external request handler subtype for receiving aforementioned new request types and directing them to scheduler service
  • JobExecStep values STOPPING (but not stopped) and CANCELED (i.e., first stopped, then resources released) jobs
  • Functions to JobManager and implementations to stop and restart jobs
  • Functions in Launcher to stop/remove a running job (i.e., the underlying Docker services)
  • Function in JobUtil interface to get all job ids (not just running)

Changes

  • Integrating the new external request handler into request service
  • Updating scheduler service to handle new job request types
  • Updating Job should_release_resources function for both correctness and new status step options
  • Updating JobManager release_resources function signature and implementations
  • Update Redis-backed job manager to more easily track all jobs
  • Adjust Redis-backed job manager to store job id of active jobs instead of the derived Redis key for the job id
  • Change Response to inherit from BasicResponseIndicator
  • Minor adjustments to move away from deprecated attribute assignment when setters are available
  • Updates to tests

Testing

  1. All unit tests pass on development system
  2. All "integration" tests pass on primary local development system except for modeldata package; this is expected to be environment-based (related to some current issues with standing up the object store stack) and not due to changes
  3. Tests don't all pass (neither set) on a secondary local development system; unsure currently of whether this is due to a user error or an environmental issue of some kind

@robertbartel robertbartel added enhancement New feature or request maas MaaS Workstream labels Feb 22, 2024
Copy link
Member

@aaraney aaraney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few minor comments, but in general this looks good!

@robertbartel robertbartel linked an issue Feb 27, 2024 that may be closed by this pull request
Copy link
Contributor

@christophertubbs christophertubbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's at least one syntax error, so that should be addressed.

@christophertubbs
Copy link
Contributor

fix the version, brochaho

- Adjusting the documentation of MessageEvenType.SCHEDULER_REQUEST to
  broaden its applicability beyond initiating a job.
- Making Response extend from BasicResultIndicator so that the 'data'
  attribute can be implemented just once (and correctly)
Bumping communication package dependency version required by scheduler
service package.
Adding several private functions to the scheduler service handler for
handling new messages for job information or control requests.
Updating listener function with redesigned logic for processing incoming
messages that includes handling of new message types for handling
requests for job information or to control existing jobs.
Added STOPPING (for when a job has been requested to be stopped, but
before it has) and CANCELED (when a STOPPED job has its resources
released) JobExecStep values.
Allow for releasing resources in CANCELED or COMPLETED steps.
Rewriting to implement via list comprehension rather than loop.
Adding function to get all job ids to interface, and updating redis impl
for this; also modifying tracking of active jobs to track job id, rather
than job key.
Modify release_allocations func in JobManager interface and Redis impl:

- accept either a job object reference or a job id as the param
- return a BasicResultIndicator, rather than nothing
- updating Redis impl specifically to only act on a job in a state that
  makes it eligible to release resources
- updating Redis impl specifically to include logic to saving the
  updated job object
Renaming get_parseable_request_funcs to _get_parseable_request_funcs to
indicate it isn't intended for use outside the class and its instances.
robertbartel and others added 24 commits March 15, 2024 09:46
Add support for handling JobControlRequest, JobInfoRequest, and
JobListRequest in main request service handler.
Adding implementations for request_job_info, request_job_release,
request_job_resume, request_job_status, request_job_stop, and
request_jobs_list functions.
Note that proper fix is out of scope for this; this addresses immediate
failures loading module.
Move "set" of several attribute values in it_RedisBackedJobManager.py to
use of setter functions.
Updating it_RedisBackedJobManager.py to account for changes in the
implementation of the release_allocations function, including some
expansion to test coverage.
Adjusting function name from request_job_resume to request_job_restart
to distinguish (hopefully) this is restarting a previously running job.
Updating newly implemented job query and control functions in JobClient
to account fully for failure scenarios, and refactoring to centralize
mostly duplicate routines that lent themselves well to shared
parameterized functions.
Copy link
Member

@aaraney aaraney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Thanks, @robertbartel. Merge at will

@robertbartel
Copy link
Contributor Author

Despite @aaraney's approval, Github still seems to want me to wait for @christophertubbs, since his review result was to request changes. I'm going to see what happens if I dismiss that stale review.

@robertbartel robertbartel dismissed christophertubbs’s stale review March 15, 2024 19:47

Outdated and superseded by Austin's approval

@robertbartel robertbartel merged commit ebae398 into NOAA-OWP:master Mar 15, 2024
8 checks passed
@robertbartel robertbartel deleted the f/finish_cli_funcs/main branch March 15, 2024 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request maas MaaS Workstream
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Complete implementations of Request Client functions
3 participants