-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Sitelists changes propagation to local workqueue #12245
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -6,6 +6,7 @@ | |||||||||
from WMCore.Lexicon import splitCouchServiceURL | ||||||||||
from WMCore.WMSpec.WMWorkload import WMWorkloadHelper | ||||||||||
from WMCore.WorkQueue.DataStructs.WorkQueueElement import STATES | ||||||||||
from WMCore.WorkQueue.DataStructs.CouchWorkQueueElement import CouchWorkQueueElement | ||||||||||
|
||||||||||
|
||||||||||
def convertWQElementsStatusToWFStatus(elementsStatusSet): | ||||||||||
|
@@ -287,7 +288,8 @@ def updateElementsByWorkflow(self, workload, updateParams, status=None): | |||||||||
# Update all workload parameters based on the full reqArgs dictionary | ||||||||||
workload.updateWorkloadArgs(updateParams) | ||||||||||
# Commit the changes of the current workload object to the database: | ||||||||||
workload.saveCouchUrl(workload.specUrl()) | ||||||||||
metadata = {'name': wfName} | ||||||||||
workload.saveCouch(self.hostWithAuth, self.db.name, metadata=metadata) | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't this method used by global workqueue as well? It might be that making this change for the agent will actually break global workqueue. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes (is used by global) and No (won't break anything) and No (it is not redundant) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right, I see this pattern has already been used in the method that updates the priority: WMCore/src/python/WMCore/Services/WorkQueue/WorkQueue.py Lines 253 to 256 in 180e27d
|
||||||||||
return | ||||||||||
|
||||||||||
def getWorkflowNames(self, inboxFlag=False): | ||||||||||
|
@@ -325,6 +327,32 @@ def deleteWQElementsByWorkflow(self, workflowNames): | |||||||||
deleted += len(ids) | ||||||||||
return deleted | ||||||||||
|
||||||||||
def getWQElementsByWorkflow(self, workflowNames, inboxFlag=False): | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Other than the workflow name, can't we have a method that will be very very similar to what is implemented for
? This implementation seems to be overloaded, also with the additional data structures created and grouped by database. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have changed the the data structure returned so it is now just a list of WQE. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand why you need the full element content.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. additionally, if we have unit tests for this module, please create a new one for this new method. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We are not using this method for updating any WQE, but rather for listing its content like this: [1]
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ahh, you are making this process in two operations:
This operation is very inefficient, instead you can add the following option:
in order to retrieve the actual documents together with the CouchDB view access. The relevant unit test can go in: https://github.com/dmwm/WMCore/blob/master/test/python/WMCore_t/Services_t/WorkQueue_t/WorkQueue_t.py There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok |
||||||||||
""" | ||||||||||
Get workqueue elements which belongs to a given workflow name(s) | ||||||||||
:param workflowNames: The workflow name for which we try to fetch the WQE elemtns for (could be a list of names as well) | ||||||||||
:param inboxFlag: A flag to switch quering the inboxDB as well (default: False) | ||||||||||
:return: A list of WQEs | ||||||||||
""" | ||||||||||
if inboxFlag: | ||||||||||
couchdb = self.inboxDB | ||||||||||
else: | ||||||||||
couchdb = self.db | ||||||||||
|
||||||||||
if not isinstance(workflowNames, list): | ||||||||||
workflowNames = [workflowNames] | ||||||||||
|
||||||||||
options = {} | ||||||||||
options["stale"] = "ok" | ||||||||||
options["reduce"] = False | ||||||||||
options['include_docs'] = True | ||||||||||
|
||||||||||
data = couchdb.loadView("WorkQueue", "elementsByWorkflow", options, workflowNames) | ||||||||||
wqeList=[] | ||||||||||
for wqe in data['rows']: | ||||||||||
wqeList.append(CouchWorkQueueElement.fromDocument(couchdb, wqe['doc'])) | ||||||||||
return wqeList | ||||||||||
|
||||||||||
def getElementsCountAndJobsByWorkflow(self, inboxFlag=False, stale=True): | ||||||||||
"""Get the number of elements and jobs by status and workflow""" | ||||||||||
if inboxFlag: | ||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we seem to have a mix of workload object from different sources:
pklFileName
, which I believe to live in the filesystem (probably under WorkQueueManager/cache)specUrl
from the CouchDB document.We should pick one of those and use it consistently. This will avoid confusion and we can also make sure that we are comparing the current site lists agains the correct workload spec file, hence avoid unnecessary updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, to me, it was not understood why should we switch to loading a
.pkl
file at the first place, since the initial idea and also, if I am not wrong, Valentin's previous implementation of this was by loading the spec from couch. only after the PR for swapping the functions to use mineupdatWorkloadArgs
, it was changed to loading the spec frompkl
without any explanation why. And as of consistency - the same mechanism happens at central services and we do it quite safely from the database. I do not see a reason why to change this approach here at the Agent.Something more - tha actual reason indeed is because we are not agnostic to the the way how the
workload
object is created.workload.specUrl()
returns:.pkl
: then the methodworkload.specUrl()
returns:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW we may need to investigate on that , because on the other hand I could not reproduce the behaviour just a minute ago, which was strange indeed. but in any case we'd better stay consistent acros Central services and WMAgent and stick to equivalent methods for obtaining the wrkflow spec, which would be guarantied to always produce the spec URI related to the couchDB rather then the filesystem.