-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cloudapi routes for listing and deleting composes #4154
base: main
Are you sure you want to change the base?
Conversation
This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days. |
This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. This LGTM. Other than a couple of very minor test variable names, which I don't feel too strongly about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still LGTM :)
241d228
to
8d1cea1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Mostly some SQL issues.
And I am confused about one thing, I've always thought as the first job in the chain (depsolve > manifest > osbuild) to be the root node.
@@ -92,6 +92,9 @@ type JobQueue interface { | |||
|
|||
// Deletes the worker | |||
DeleteWorker(workerID uuid.UUID) error | |||
|
|||
// AllRootJobIDs returns a list of top level job UUIDs that the worker knows about | |||
AllRootJobIDs() ([]uuid.UUID, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add / filter on channel here. That way the brew tenant can't look at the job ids of the consoledot tenant etc…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that belongs at this layer. If the caller wants to filter by channel that is a separate issue. The goal here is to list everything so that composer-cli can let the user examine what's happening.
@@ -93,6 +93,9 @@ type JobQueue interface { | |||
// Deletes the worker | |||
DeleteWorker(workerID uuid.UUID) error | |||
|
|||
// AllJobIDs returns a list of all job UUIDs that the worker knows about | |||
AllJobIDs() ([]uuid.UUID, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, I'd add a job channel.
// the dependants list and then re-save the job instead of removing it. | ||
// | ||
// This assumes that the jobs have been created correctly, and that they have | ||
// no dependency loops. Shared Dependants are ok, but a job cannot have a dependancy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick, dependancy > dependency.
@@ -18,6 +18,7 @@ import ( | |||
"github.com/labstack/echo/v4" | |||
"github.com/labstack/echo/v4/middleware" | |||
"github.com/sirupsen/logrus" | |||
"golang.org/x/exp/slices" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use "slices" now, we're using 1.21
@@ -341,6 +342,42 @@ func (s *Server) RemoveJob(id uuid.UUID) error { | |||
return s.jobs.RemoveJob(id) | |||
} | |||
|
|||
// CleanupArtifacts removes worker artifact directories that do not have matching jobs | |||
func (s *Server) CleanupArtifacts() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder how this will run in the service, probably fine since I don't think we create anything in the artifact directory there.
|
||
return nil, nil | ||
var jobs []uuid.UUID | ||
jobs, err = q.AllJobIDs() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be able to just write a SQL query to query all IDs which aren't in the dependency_id (in case the osbuild/koji-finalize jobs are the root nodes) column in the job_dependencies table. In the DB there could be thousands of jobs so it's better to just offload the filtering to the DB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with thousands of jobs I don't think performance should be a problem. I find it far easier to think about this in go than in SQL. I find complex SQL queries are difficult to maintain.
@@ -70,6 +74,9 @@ const ( | |||
SELECT job_id | |||
FROM job_dependencies | |||
WHERE dependency_id = $1` | |||
sqlDeleteDependencies = ` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this isn't necessary, because deleting from the jobs table cascades automatically to the dependencies and heartbeats table.
// This assumes that the jobs have been created correctly, and that they have | ||
// no dependency loops. Shared Dependents are ok, but a job cannot have a dependency | ||
// on any of its parents (this should never happen). | ||
func (q *DBJobQueue) RemoveJob(id uuid.UUID) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just deleting from the jobs table and letting psql do the rest via cascades is advised here. Should be much simpler.
We used to use this query to recursively select dependencies:
sqlQueryDepedenciesRecursively = `
WITH RECURSIVE dependencies(d) AS (
SELECT dependency_id
FROM job_dependencies
WHERE job_id = $1
UNION ALL
SELECT dependency_id
FROM dependencies, job_dependencies
WHERE job_dependencies.job_id = d )
SELECT * FROM dependencies`
Could tweak it to switch to job dependants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I'm not sure if the way the db is setup is actually the same as the fsjobqueue then. The goal is to only delete the jobs that only depend on the jobs 'above' it in the tree. That's why there is a sqlDeleteDependencies
query.
I can't draw this in ascii so I'll try with words :) My understanding is that it is possible that a job could have two dependents (I like to call them parents). So when you delete a top level job you only want to walk down the tree as far as the job with two parents, and remove the parent that was in the path back to the root job you are deleting but leave that job with its other parent intact.
if err != nil { | ||
return jobIDs, err | ||
} | ||
if !exists || len(j.Dependents) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've always considered the root node to be the first job in a chain. So in a depsolve > manifest > osbuild chain it's the depsolve job that's the root I thought, and in a koji job koji-init. The osbuild / koji-finalize would be leaf nodes then.
But this considers the osbuild or koji-finalize job to be the root node, no? As those are the only ones without dependents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't seen any live SQL examples, but with the fsjobqueue backend when you start a compose the first job is the one with no Dependents. TBF I find the terminology here hard to keep straight in my head, but from what I see happening Dependencies are jobs the current job is a parent of, and Dependents are jobs that depend on the current job.
8d1cea1
to
45ac842
Compare
632ea2a
to
c2ce996
Compare
This lists the root job UUIDs (the jobs with no dependants). Currently only implemented by fsjobqueue. The function for dbjobqueue currently returns nil. Related: RHEL-60120
This will be used to list the top level job UUIDs. Related: RHEL-60120
Related: RHEL-60120
This function lists all of the jobs, not just the root jobs. Currently only implemented by fsjobqueue. The function for dbjobqueue currently returns nil. Related: RHEL-60120
This allows database entries to be removed. Related: RHEL-60120
This allows jobs to be removed from the database. Currently only implemented by fsjobqueue. The function for dbjobqueue currently returns nil. This will remove all the job files used by the root job UUID as long as no other job depends on them. ie. It starts at the top, and moves down the dependency tree until it finds a job that is also used by another job. Related: RHEL-60120
This will be used to delete jobs and their artifacts. Related: RHEL-60120
This adds the handler for /composes/delete which will delete a job and all of its dependencies. Related: RHEL-60120
This makes it easier for the delete handler to test for running jobs and reject the delete request. The job must be either a success or failure before it can be deleted. Related: RHEL-60120
This removes all artifact directories, and their contents, if there isn't an associated Job ID (as returned by worker.Server.jobs.AllJobIDs) Related: RHEL-60120
Related: RHEL-60120
Related: RHEL-60120
Related: RHEL-60120
This adds SQL to delete jobs and dependencies, and implements the database version of the RemoveJobs function. Related: RHEL-60120
This adds tests for retrieving all job, all root jobs, and removing jobs and unused dependencies. These tests are run against the fsjobqueue for unit testing, and against dbjobqueue for integration testing. Resolves: RHEL-60120
c2ce996
to
d9d43e7
Compare
This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days. |
This pull request includes:
This PR adds 2 new cloudapi routes, GET
/composes
and DELETE/composes/delete/<uuid>
, see the commits for the details.As part of this it now also cleans up artifacts that are no longer attached to a job.
What it doesn't do is work with the dbjobqueue yet. It works on-premises with the fsjobqueue.