Cache and process only specified entries #266

eric-rios · 2024-04-24T19:16:46Z

For a project I'm part of, we wanted to reduce overall RAM usage, so we dug deeper into Goyang's code looking for improvements toward that goal.

We came up with a mechanism that lets the user list the names of the Entries they specifically want in the entryCache and processes only those, greatly reducing Goyang's memory usage. We think this option could be useful to others, so we wanted to bring it up to get your input and see whether it makes sense before creating a Pull Request. I'll go into more detail about what it changes, especially in the Process() function, which was the most affected.

There's a new field in ParseOptions that defines the names of the Entries that must be added to the cache. We use a map[string]struct{} for faster lookup.
- NOTE: If this map is left empty, Goyang works as it does currently, so those who have no use for this will not see any impact.
In the ToEntry function, the deferred call that stores the Entry in the cache now calls a new function, CacheEntry(), instead. If the filtering map is empty, it stores the Entry as usual; otherwise it stores the Entry only if its name is part of the filter.
- There is another aspect to the CacheEntry() function that I'll explain later.
- In our use case, we store only some "root" Entries, since we are able to operate by going through the Dir field of each Entry.
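To make the filtering rule concrete, here is a minimal sketch of it in isolation. It is not the actual change: `Entry`, `Options`, `EntryFilter`, `Cache` and `NewCache` are simplified, hypothetical stand-ins, and the cache is keyed by name only for brevity (the real entryCache may be keyed differently).

```go
package main

// Entry is a reduced, hypothetical stand-in for Goyang's Entry type.
type Entry struct {
	Name string
	Dir  map[string]*Entry
}

// Options mimics the proposed ParseOptions addition.
// EntryFilter is an assumed field name, not one taken from Goyang.
type Options struct {
	EntryFilter map[string]struct{}
}

// Cache is a simplified entry cache keyed by Entry name.
type Cache struct {
	opts    Options
	entries map[string]*Entry
}

// NewCache returns an empty cache that uses the given options.
func NewCache(opts Options) *Cache {
	return &Cache{opts: opts, entries: make(map[string]*Entry)}
}

// CacheEntry stores e unconditionally when no filter is configured,
// and otherwise only when e.Name is listed in the filter.
// It reports whether the Entry was stored.
func (c *Cache) CacheEntry(e *Entry) bool {
	if len(c.opts.EntryFilter) > 0 {
		if _, ok := c.opts.EntryFilter[e.Name]; !ok {
			return false
		}
	}
	c.entries[e.Name] = e
	return true
}
```

With an empty `EntryFilter`, every Entry is stored, which matches the "no impact if unset" behaviour described in the NOTE.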
Now to the changes in the Process() function. After calling the private process(), instead of calling ToEntry on every single module and submodule, only the modules and submodules named in the filtering map are processed (if the map is set). Since ToEntry is recursive, we know every Node of interest will be processed, and by the end, the "roots" in the cache give access to every Node that was processed.
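The idea can be sketched roughly as follows. This is an illustration under simplifying assumptions, not Goyang's code: `Module`, `toEntry` and `processFiltered` are hypothetical names, and the recursive conversion is reduced to copying names into a `Dir` map.

```go
package main

// Module is a hypothetical stand-in for a parsed module or submodule.
type Module struct {
	Name     string
	Children []*Module
}

// Entry is a reduced, hypothetical stand-in for Goyang's Entry type.
type Entry struct {
	Name string
	Dir  map[string]*Entry
}

// toEntry recursively converts m and all of its children.
// visited counts how many nodes were converted.
func toEntry(m *Module, visited *int) *Entry {
	*visited++
	e := &Entry{Name: m.Name, Dir: make(map[string]*Entry)}
	for _, c := range m.Children {
		e.Dir[c.Name] = toEntry(c, visited)
	}
	return e
}

// processFiltered converts only the modules named in filter.
// An empty filter converts every module, as before.
// It returns the root Entries and the number of nodes converted.
func processFiltered(mods map[string]*Module, filter map[string]struct{}) (map[string]*Entry, int) {
	roots := make(map[string]*Entry)
	visited := 0
	for name, m := range mods {
		if len(filter) > 0 {
			if _, ok := filter[name]; !ok {
				continue // not of interest: skip the whole subtree
			}
		}
		roots[name] = toEntry(m, &visited)
	}
	return roots, visited
}
```

Because the conversion is recursive, everything below a selected root is still reachable through `Dir`, while the subtrees of unselected modules are never built.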
When it comes to handling Augments, Choices and Deviations (we refer to these as specializations), we use a separate cache. This is where CacheEntry() comes in again: before checking whether the Entry should be added to the entryCache, if the current Entry contains an Augment, Choice, or Deviation, it is stored in this new cache (let's call it specializationsCache). With all this, when handling these specializations as part of Process(), we go through the specializationsCache instead of iterating over every Module and Submodule, since, again, we want to limit the processing to what is of interest.
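A rough sketch of the two-cache arrangement, again under assumptions: the boolean flags on `Entry`, the `Caches` type and `ApplySpecializations` are hypothetical and only model "this Entry contains an Augment, Choice, or Deviation"; they do not reflect how Goyang represents these.

```go
package main

// Entry is a reduced, hypothetical stand-in for Goyang's Entry type.
// The three flags are assumptions used only to mark specializations.
type Entry struct {
	Name         string
	HasAugment   bool
	HasChoice    bool
	HasDeviation bool
}

// Caches holds the filtered entry cache and the specializations cache.
type Caches struct {
	filter          map[string]struct{}
	entries         map[string]*Entry
	specializations []*Entry
}

// NewCaches returns empty caches using the given name filter.
func NewCaches(filter map[string]struct{}) *Caches {
	return &Caches{filter: filter, entries: make(map[string]*Entry)}
}

// CacheEntry first records e in the specializations cache if it carries
// an Augment, Choice, or Deviation, and only then applies the name filter
// to decide whether e also goes into the entry cache.
func (c *Caches) CacheEntry(e *Entry) {
	if e.HasAugment || e.HasChoice || e.HasDeviation {
		c.specializations = append(c.specializations, e)
	}
	if len(c.filter) > 0 {
		if _, ok := c.filter[e.Name]; !ok {
			return
		}
	}
	c.entries[e.Name] = e
}

// ApplySpecializations visits only the recorded specializations rather
// than every module and submodule. It returns how many were visited.
func (c *Caches) ApplySpecializations(apply func(*Entry)) int {
	for _, e := range c.specializations {
		apply(e)
	}
	return len(c.specializations)
}
```

Note that an Entry can land in the specializations cache even when the name filter keeps it out of the entry cache, which is what allows the later pass to find it.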

For one of our use cases, we saw a reduction of about 35% in RAM usage.

From what we've tested so far, the end results are the same with this alternative, but part of the idea of bringing this up is making sure we are not missing any special scenarios that could break this. Let me know if you want me to share any part of the changed code as support material for this idea.
