Logstash 2.1.1 immediate OOM when running simple lumberjack input (regression) #4333
Comments
@jamesblackburn can you put the heap dump somewhere? I would like to peek into it. |
I need to check what's in that decompressed byte array string. @ph Interestingly, one of the reasons I'm trying to upgrade is logstash-forwarder#293. We have a bunch of servers running lumberjack (including Something that might be related (or not) is that with 1.5.5 we sometimes see this exception in JZlibInflate:
|
I think it might be related. I don't think we have changed how we handle the zlib stuff between 1.5 and 2.1; also, all the retained memory you see is inside the JRuby core. Is it possible that you are sending really huge payloads between LSF and Logstash from time to time? If you are upgrading for that issue I suggest you take a look at |
@jamesblackburn I believe you should see a path in the data; this might help us find which data is making Logstash OOM. I think LSF doesn't check the size of the payload before sending it; it just sends it regardless. Then the input just buffers the content before extracting it, so this might be the source of the OOM. |
I don't think there's an easy way for me to upload the hprof file from here. To me it appears that all the memory has gone to a single JZlibInflate buffer. Looking at the strings, they look highly repetitive and likely compress very well. Do you believe that filebeat has different behaviour in this regard? |
@jamesblackburn I am not sure whether filebeat implements a different behavior, but since this is where the development is focused I would migrate to it to get all the latest features. LSF will eventually be unsupported. In the byte buffer, do you see the path to the original file? It might be worth a look at the content. How much memory are you starting Logstash with? |
It was with the defaults - 1G. It fails with 2G too, and seems to stay up with 8G. I notice on the logstash forwarder, I'm seeing:
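To illustrate why a highly repetitive payload can exhaust the heap once inflated, here is a minimal Python sketch. It is not the Logstash or JRuby code path; `inflate_bounded` and the `max_output` limit are hypothetical names introduced only for this example. It shows how small a repetitive buffer becomes when compressed, and one way a receiver could cap the inflated size instead of buffering without limit.

```python
import zlib


def inflate_bounded(payload: bytes, max_output: int) -> bytes:
    """Inflate a zlib stream, refusing to produce more than max_output bytes.

    Hypothetical helper for illustration; not part of Logstash or LSF.
    """
    decompressor = zlib.decompressobj()
    # Ask for one byte more than the limit so an oversized stream is detectable.
    out = decompressor.decompress(payload, max_output + 1)
    if len(out) > max_output:
        raise ValueError("inflated payload exceeds %d bytes" % max_output)
    if not decompressor.eof:
        raise ValueError("compressed stream is truncated")
    return out


if __name__ == "__main__":
    repetitive = b"A" * 10_000_000
    compressed = zlib.compress(repetitive)
    print("raw bytes:       ", len(repetitive))
    print("compressed bytes:", len(compressed))
    try:
        inflate_bounded(compressed, 1_000_000)
    except ValueError as exc:
        print("rejected:", exc)
```

The `max_length` argument of `decompress` bounds how much output a single call can allocate, so an oversized stream is rejected before the whole thing is held in memory.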
The file being sent isn't too large:
|
@jamesblackburn this is a really small file! How many LSF clients connect to that Logstash instance? |
Quite a few... I'm running 8 instances of Logstash to receive data, and have configured the forwarder like:
My logstash receiver configuration is the above - I'm only writing to a Is there a Go version of a receiver for the lumberjack / filebeats protocol? I don't think the JRuby version scales :( |
The JRuby version should scale to what you want to do; the bottleneck should only be disk IO in this case. What you are encountering is a bug, and we can fix these. I don't have access to the data, so I am asking the questions I can to debug this. From what I see from the screenshot. I think So it's either LSF is sending the wrong data, or lumberjack doesn't know when a frame stops and just keeps buffering; this might explain the duplication of the data in the byte buffer. Filebeat has some improvements over LSF in how we transmit the data between the node and Logstash; I think this might help in this case. So if it's possible for you to try it out, I would look into it. The LSF If it's not possible to upgrade, with the content of the byte buffer you should be able to find the host that generates the log and the problematic file; with this information it might be possible to recreate a simple test case to debug this issue. |
By simple test case, I meant 1 LSF + 1 Logstash. |
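The "doesn't know when a frame stops and just keeps buffering" hypothesis can be pictured with a generic length-prefixed reader. This is a sketch under stated assumptions, not the real lumberjack wire format: the 4-byte big-endian length header, `read_frame`, and `max_len` are all made up for illustration. The point is only that a receiver which checks the declared length against a cap fails fast on a corrupt header instead of trying to buffer gigabytes.

```python
import io
import struct


def read_frame(stream, max_len: int) -> bytes:
    """Read one length-prefixed frame from a file-like object.

    Assumed (hypothetical) layout: 4-byte big-endian length, then the body.
    """
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("stream ended inside the length header")
    (length,) = struct.unpack(">I", header)
    # Reject implausible sizes before allocating or buffering anything.
    if length > max_len:
        raise ValueError("declared frame length %d exceeds cap %d" % (length, max_len))
    body = stream.read(length)
    if len(body) < length:
        raise EOFError("stream ended inside the frame body")
    return body


if __name__ == "__main__":
    good = struct.pack(">I", 5) + b"hello"
    corrupt = struct.pack(">I", 2**31)  # header claims a 2 GiB body
    stream = io.BytesIO(good + corrupt)
    print(read_frame(stream, 1024))
    try:
        read_frame(stream, 1024)
    except ValueError as exc:
        print("rejected:", exc)
```

On a real socket, `read` may return fewer bytes than requested, so a production reader would loop until the frame is complete; the cap check would stay the same.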
It's actually an issue when sending a large file that has been corrupted by logrotate / java. See the repro here:
2.1.1 always OOMs, whereas 1.5.5 doesn't. |
Just to add, @suyograo: I believe repro details have been provided? |
I will take a look as soon as possible, thanks for providing a test case for that. |
I will keep this one open; I am not sure if it's still an issue with FB/input beats. |
closing as defunct, |
Trying to upgrade from 1.5.5 to 2.1.1 for a simple Logstash lumberjack -> file pipeline, I'm observing an immediate OOM on startup. The OOM appears to be in org.jruby.ext.zlib.JZlibInflate
Running
logstash -f "/ahlinfra/monit/conf/logs.dev.man.com/init.d/../etc/fwd-receiver/*.conf" -w1
Configuration
Output
HProf analysis