Profiling the memory usage of a function
I'm writing these notes here before they get lost and I have to gather all this information again.
There are only a few Python libraries for memory profiling, but memory_profiler seems to have the widest functionality: you can watch memory usage while your executable runs, profile your code line by line via decorators, profile a whole method/function, and so on.
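As a quick illustration of the line-by-line mode (a toy sketch, not WMCore code; the function name is made up), decorating a function with @profile makes memory_profiler print a per-line memory report every time that function is called:
from memory_profiler import profile

@profile
def buildBigList():
    # each line below shows up in the report with its memory increment
    data = [0] * (10 ** 6)
    temp = [1] * (2 * 10 ** 6)
    del temp  # released memory shows up as a negative increment
    return data

if __name__ == '__main__':
    buildBigList()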
Another interesting library for fetching statistics on which object types are taking up memory (and for debugging memory leaks) is objgraph.
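A minimal objgraph sketch (again a toy example, not WMCore code) of the two calls that are most useful for spotting leaks:
import objgraph

class Cache(object):
    pass

leakedObjects = [Cache() for _ in range(1000)]

# print the most common object types currently alive in the interpreter
objgraph.show_most_common_types(limit=10)

# show which object types have grown since the previous call; handy inside
# a polling loop to see which types keep growing between cycles
objgraph.show_growth(limit=10)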
Both are third-party libraries and are not yet part of our CMS software stack, so we have to install them manually. Alternatively, we can use a Docker container as a virtual environment and install these packages inside the container only:
pip install memory_profiler # or ... easy_install memory_profiler
pip install objgraph # or ... easy_install objgraph
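If you go the container route, a minimal sketch of the idea (the base image and mount path here are just placeholders, not our actual CMS image) could be:
docker run -it --rm -v "$PWD":/wmcore -w /wmcore python:2.7 bash   # hypothetical base image
pip install memory_profiler objgraph                               # installed only inside the container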
More documentation can be found at: Memory Profiler and Objgraph.
I took the decorator approach so that we can see at which line the memory usage blows up. Unfortunately, this means updating the source code (adding decorators) to make it work.
I wanted to profile JobSubmitterPoller, so we first need to import memory_profiler in the source code:
+from memory_profiler import profile
and then add the profile decorator to each function we want a memory report for. By default, the report is printed to sys.stdout; however, if we're running unit tests, it's better to define a file stream so we can actually see that output (only failed unit tests write anything to sys.stdout). Thus, we need to apply a few more changes to JobSubmitterPoller.py, as follows:
...
+ refreshFp = open('refreshCache_stats.log', 'w+')
+ @profile(stream=refreshFp)
def refreshCache(self):
...
and to the algorithm method that we want to profile as well:
...
+ algorithmFp = open('algorithm_stats.log', 'w+')
@timeFunction
+ @profile(stream=algorithmFp)
def algorithm(self, parameters=None):
...
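The decorator order matters: decorators apply bottom-up, so keeping @profile closest to the def means memory_profiler sees the real function body line by line, while the outer decorator wraps the already-profiled callable. A toy sketch of the same stacking (the timeFunction stub below is just a placeholder to illustrate the ordering, not WMCore's real decorator):
import functools
from memory_profiler import profile

def timeFunction(func):
    # placeholder for WMCore's timeFunction decorator, used only to show stacking
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

statsFp = open('toy_stats.log', 'w+')

@timeFunction              # applied last, wraps the profiled callable
@profile(stream=statsFp)   # applied first, wraps the real function body
def algorithm(parameters=None):
    payload = [0] * (10 ** 6)
    return len(payload)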
Now we just need to run any unit test that calls one of these functions, and we'll get those memory logs back in the component directory, e.g.: test/python/WMComponent_t/JobSubmitter_t/algorithm_stats.log
The heaviest JobSubmitter unit test is:
nosetests JobSubmitter_t.py:JobSubmitterTest.testMemoryProfile
A full example can be seen in this branch: memory-profile-test
NOTE: if your test crashes, those file descriptors will be left open until they are eventually closed :(
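One way to mitigate that (a sketch only, assuming module-level file handles like the ones shown above; none of this is in the branch) is to register a cleanup hook with atexit so the streams get flushed and closed when the interpreter exits, even after a failing test:
import atexit

refreshFp = open('refreshCache_stats.log', 'w+')
algorithmFp = open('algorithm_stats.log', 'w+')

def _closeProfileStreams():
    # flush and close the memory_profiler output streams at interpreter exit
    for fp in (refreshFp, algorithmFp):
        if not fp.closed:
            fp.close()

atexit.register(_closeProfileStreams)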