reduce microservices memory footprint #12200

Open
mapellidario opened this issue Dec 10, 2024 · 1 comment

@mapellidario (Member)
Impact of the new feature

MicroServices

Is your feature request related to a problem? Please describe.

We realized that the microservices' memory footprint depends on their backlog. For example, at every polling cycle ms-rulecleaner runs the function _execute() only once [1], passing it every workflow in a given status [2].

Describe the solution you'd like

Taking ms-rulecleaner as an example, we could change getRequestRecords into a generator that yields only a few workflows every time it is called. We would need to add a for loop in execute() around the call to _execute(). Not a huge effort, achievable without substantial refactoring.
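
A rough sketch of what this could look like (indicative only, not the actual WMCore code: the chunkSize parameter and the way the ReqMgr2 response is flattened are assumptions):

def getRequestRecords(self, reqStatus, chunkSize=100):
    """Yield the requests in the given status in chunks of at most chunkSize."""
    result = self.reqmgr2.getRequestByStatus([reqStatus], detail=True)
    # flatten the response into a list of request dictionaries (assumption)
    records = [req for item in result for req in item.values()]
    for idx in range(0, len(records), chunkSize):
        yield records[idx:idx + chunkSize]

def execute(self, reqStatus):
    ...
    totalNumRequests = 0
    for chunk in self.getRequestRecords(reqStatus):
        # each call to _execute() now only materializes chunkSize workflow objects
        counts = self._execute(chunk)
        totalNumRequests += counts[0]
        # ... accumulate the other counters in the same way
    ...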

Describe alternatives you've considered

The alternative would be to process one workflow at a time, possibly moving our model to a pub/sub one, but this would require some major refactoring.

Additional context

Follow-up to #12042.


[1]

totalNumRequests, cleanNumRequests, normalArchivedNumRequests, forceArchivedNumRequests = self._execute(requestRecords)

[2]

result = self.reqmgr2.getRequestByStatus([reqStatus], detail=True)

@vkuznet (Contributor) commented Dec 10, 2024

@mapellidario, yesterday I posted on the MM chat to Alan and Andrea my observations, which align with this ticket. Here is my posting (for completeness on the issue):

Here is proof of the memory spike in the MSRuleCleanerWflow call, which appears at https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/MicroService/MSRuleCleaner/MSRuleCleaner.py#L263

I took the test/python/WMCore_t/MicroService_t/MSRuleCleaner_t/MSRuleCleanerWflow_t.py code and added memory profiling to one of the unit tests as follows:

import tracemalloc

    def setUp(self):
        ...
        tracemalloc.start()

    def tearDown(self):
        # Stop tracing and print memory usage details
        current, peak = tracemalloc.get_traced_memory()
        print(f"Current memory usage: {current / 1024:.2f} KB")
        print(f"Peak memory usage: {peak / 1024:.2f} KB")
        tracemalloc.stop()
    ...

    def testIncludeParents(self):
        ...
        # mutate the request slightly on each iteration and feed it to
        # MSRuleCleanerWflow 10k times, mimicking what MSRuleCleaner._execute()
        # does over its backlog
        for idx in range(10000):
            req = self.includeParentsReq
            for key, val in req.items():
                if isinstance(val, (str, bytes)):
                    req[key] += "%s" % idx
            MSRuleCleanerWflow(req)

Basically, I run over 10K requests, each modified slightly, and call MSRuleCleanerWflow for each of them in a similar manner to what the MSRuleCleaner code does.

Here is the outcome:

  • without my loop, I observe on average a 10 KB memory footprint:
python test/python/WMCore_t/MicroService_t/MSRuleCleaner_t/MSRuleCleanerWflow_t.py
Current memory usage: 8.56 KB
Peak memory usage: 11.52 KB
.Current memory usage: 7.54 KB
Peak memory usage: 10.52 KB
.Current memory usage: 7.87 KB
Peak memory usage: 10.86 KB
.Current memory usage: 5.82 KB
Peak memory usage: 7.70 KB
.

and when I enable my for loop, I see the following:

Current memory usage: 1232.53 KB
Peak memory usage: 1276.86 KB
.Current memory usage: 7.54 KB
Peak memory usage: 10.30 KB
.Current memory usage: 7.87 KB
Peak memory usage: 10.64 KB
.Current memory usage: 5.82 KB
Peak memory usage: 7.49 KB
.

As you can see, the first reported set of numbers, which corresponds to the test I modified, spiked from 11 KB to 1232 KB.

Therefore, if we take the MSRuleCleaner for loop at https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/MicroService/MSRuleCleaner/MSRuleCleaner.py#L262 and pass it 10K requests, you will see a ~1000x spike in memory due to the allocations in the MSRuleCleanerWflow call (which by itself makes a couple of deepcopy calls over a nested Python dictionary).

Here is the modified version I used: MSRuleCleanerWflow_t.py
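
For illustration, the deepcopy effect can also be reproduced outside WMCore with a small standalone script (a sketch only, not WMCore code): keeping many deep copies of a nested dictionary alive at once drives the peak memory up, while processing one copy at a time keeps it flat.

import copy
import tracemalloc

# a small nested dictionary standing in for a request record
record = {"RequestName": "wf", "Campaigns": ["c1", "c2"],
          "Chain": {"Task1": {"SiteWhitelist": ["T1_X", "T2_Y"]}}}

def measure(label, func):
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"{label}: peak {peak / 1024:.2f} KB")

# all copies alive at once, as when _execute() receives the whole backlog
measure("10k copies held in a list",
        lambda: [copy.deepcopy(record) for _ in range(10000)])

# one copy alive at a time, as in a one-workflow-per-call flow
def one_at_a_time():
    for _ in range(10000):
        copy.deepcopy(record)

measure("10k copies processed one by one", one_at_a_time)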

To fix the problem, a few steps should be performed:

  • _execute() should process a single workflow/request, instead of taking a list of requests and loading a corresponding number of workflow objects.
  • The for loop over reqRecords should be moved out of this method to the caller, which should process only one workflow at a time; this keeps the memory footprint equal to one workflow.
  • wfCounters should be taken outside of this code as well and converted to basic integers, instead of being kept in a nested dict.
  • The execute code should be refactored into something like this:
def execute(self, reqStatus):
    ...
    for status in reqStatus:
        # in this loop we only allocate a single wflow object at a time,
        # process it and collect metrics; therefore the memory allocation
        # stays flat regardless of the number of records
        for rec in self.getRequestRecords(status):
            metrics = self._execute(rec)  # metrics is a tuple of integers
            total_num += metrics[0]       # first metric counter
            ...
            self.updateReportDict(summary, "total_num_requests", total_num)
    ...

def _execute(self, record):
    ...
    wflow = MSRuleCleanerWflow(record)
    ...
    # process pipelines and obtain the necessary metrics
    metrics = (totalNum, cleanNum, normalArchivedNum, forceArchivedNum)
    return metrics
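
Sketched this way, at most one MSRuleCleanerWflow object per item yielded by getRequestRecords is alive at any given time, so the memory footprint stays roughly flat regardless of how many requests are sitting in a given status.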
