Logstash information:

Version: 7.16.3
JVM (`java -version`):
openjdk version "1.8.0_342"
OpenJDK Runtime Environment (build 1.8.0_342-8u342-b07-0ubuntu1~18.04-b07)
OpenJDK 64-Bit Server VM (build 25.342-b07, mixed mode)
OS version (`uname -a` if on a Unix-like system):

Description of the problem including expected versus actual behavior:
TL;DR: if `cache_save_path` is not provided, `Netflow::TemplateRegistry` never calls `do_cleanup`, which is responsible for cleaning up the `Vash` memory caches.

In our testing, Logstash heap memory usage would increase continuously until the process crashed with an out-of-memory exception. In our environment this happened roughly every four hours.
Comparing heap dumps taken within that four-hour window, we noticed the memory usage of one object grow more than 4x. (Right side is the baseline; left is the dump from the OOM crash.)

Opening up the object, we can determine the class name from its metadata and trace it back to the corresponding source code:
`logstash-codec-netflow/lib/logstash/codecs/netflow.rb`, lines 537 to 553 at commit `b7df239`
In the heap dump screenshot, `var2` and `var5` correspond to the two instances of `Vash` used in the `TemplateRegistry`. In our testing, the memory usage of these two objects grew continuously.

Looking at the `Vash` implementation (https://gist.github.com/joshaven/184837), we can see that it requires a manual cleanup call in order to release memory:

> The Vash object will forget any answer that is requested after the specified TTL. It is a good idea to manually clean things up from time to time because it is possible that you'll cache data but never again access it and therefore it will stay in memory after the TTL has expired. To clean up the Vash object, call the method: cleanup!
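The lazy-eviction behavior the gist describes can be sketched as follows. This is a minimal illustration, not the actual `Vash` code; `TtlHash` and its internals are invented for this example:

```ruby
# Minimal sketch of a TTL hash that evicts entries lazily: expiry is only
# checked when a key is READ, so an entry that is written once and never
# read again stays in memory forever unless cleanup! is called explicitly.
class TtlHash
  Entry = Struct.new(:value, :expires_at)

  def initialize(ttl)
    @ttl = ttl
    @store = {}
  end

  def []=(key, value)
    @store[key] = Entry.new(value, Time.now + @ttl)
  end

  # Lazy eviction: an expired entry is only removed on access.
  def [](key)
    entry = @store[key]
    return nil if entry.nil?
    if Time.now > entry.expires_at
      @store.delete(key)
      nil
    else
      entry.value
    end
  end

  # Eager eviction: must be called periodically, or stale entries leak.
  def cleanup!
    now = Time.now
    @store.delete_if { |_, entry| now > entry.expires_at }
  end

  def size
    @store.size
  end
end
```

Without periodic `cleanup!` calls, every NetFlow template that is cached and never read again after its TTL behaves like the never-read entry above: it stays on the heap indefinitely.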
In `TemplateRegistry`, the cleanup calls for both `Vash` objects are made in the `TemplateRegistry::do_cleanup` method:

`logstash-codec-netflow/lib/logstash/codecs/netflow.rb`, lines 661 to 667 at commit `b7df239`
`do_cleanup` is then only ever called in `do_persist`:

`logstash-codec-netflow/lib/logstash/codecs/netflow.rb`, lines 643 to 659 at commit `b7df239`
However, note that on line 644, if `file_path` is not provided, the `do_persist` method exits early, skipping the call to `do_cleanup`. `file_path` can in turn be traced back to the `cache_save_path` setting in the initialization of the `TemplateRegistry`:

`logstash-codec-netflow/lib/logstash/codecs/netflow.rb`, lines 67 to 68 at commit `b7df239`
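The control flow described above can be sketched like this. This is a simplified illustration, not the plugin's actual code: `do_persist`, `do_cleanup`, and `file_path` follow the names used in this issue, while `FakeCache` is an invented stand-in for the two `Vash` caches:

```ruby
# Stand-in for a Vash cache; it only counts how often cleanup! is called.
class FakeCache
  attr_reader :cleanups
  def initialize; @cleanups = 0; end
  def cleanup!; @cleanups += 1; end
end

# Simplified sketch of the TemplateRegistry flow described in this issue.
class TemplateRegistry
  attr_reader :templates, :cache

  def initialize(file_path = nil)
    @file_path = file_path   # nil when cache_save_path is not configured
    @templates = FakeCache.new
    @cache     = FakeCache.new
  end

  def do_persist
    return if @file_path.nil?  # early exit when no save path is set
    # ... write the template caches to @file_path ...
    do_cleanup                 # unreachable when @file_path is nil
  end

  def do_cleanup
    @templates.cleanup!        # evict expired entries from both caches
    @cache.cleanup!
  end
end
```

With `file_path` set to `nil`, every `do_persist` invocation returns before reaching `do_cleanup`, so neither cache is ever cleaned up.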
Thus we can see that this situation happens whenever no value is provided for `cache_save_path`: `file_path` defaults to `nil`, so `do_cleanup` is always skipped.

Steps to reproduce:
Provide logs (if relevant):