Stage buffer sometimes sticks around and doesn't ever get queued #4662
I saw this error message again today:
In this example, I suspect:
fluentd/lib/fluent/plugin/buffer.rb Line 602 in 403a28f
I don't know why this
It seems this intermittent
It seems that your own plugin is related to this stack trace.
Since we can't read this code, I'm not sure of the cause yet.
@ashie

    # Originally copied from https://github.com/yosssi/fluent-plugin-cloud-pubsub
    # License: MIT
    require 'google/cloud/pubsub'
    require 'yajl' # Yajl.dump is used in #write below

    module Fluent
      class CloudPubSubOutput < BufferedOutput
        MAX_REQ_SIZE = 10 * 1024 * 1024 # 10 MB
        MAX_MSGS_PER_REQ = 1000

        Plugin.register_output('cloud_pubsub', self)

        config_param :project, :string, :default => nil
        config_param :topic, :string, :default => nil
        config_param :key, :string, :default => nil
        config_param :max_req_size, :integer, :default => MAX_REQ_SIZE
        config_param :max_msgs_per_req, :integer, :default => MAX_MSGS_PER_REQ

        unless method_defined?(:log)
          define_method("log") { $log }
        end

        unless method_defined?(:router)
          define_method("router") { Fluent::Engine }
        end

        def configure(conf)
          super
          raise Fluent::ConfigError, "'project' must be specified." unless @project
          raise Fluent::ConfigError, "'topic' must be specified." unless @topic
        end

        def multi_workers_ready?
          true
        end

        def start
          super
          pubsub = Google::Cloud::PubSub.new(project_id: @project, credentials: @key)
          @client = pubsub.topic @topic
        end

        def format(tag, time, record)
          [tag, time, record].to_msgpack
        end

        def publish(msgs)
          log.debug "publish #{msgs.length} messages"
          @client.publish do |batch|
            msgs.each do |m|
              batch.publish m
            end
          end
        end

        def write(chunk)
          msgs = []
          msgs_size = 0
          chunk.msgpack_each do |tag, time, record|
            size = Yajl.dump(record).bytesize
            # Publish the batch accumulated so far if adding this record would
            # exceed the request size or per-request message count limits.
            if msgs.length > 0 && (msgs_size + size > @max_req_size || msgs.length + 1 > @max_msgs_per_req)
              publish(msgs)
              msgs = []
              msgs_size = 0
            end
            msgs << record.to_json
            msgs_size += size
          end
          if msgs.length > 0
            publish(msgs)
          end
        rescue
          # Note: rescuing all errors here means #write never raises, so the
          # buffered output treats the chunk as flushed even if publishing failed.
          log.error "unexpected error", :error=>$!.to_s
          log.error_backtrace
        end
      end
    end

The error shows that it's happening during
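For reference (not part of the original report), a minimal sketch of how a plugin like this would be mounted in a Fluentd configuration, based on the config_param declarations above; the tag pattern, project, topic, and key path are placeholders:

    <match app.**>
      @type cloud_pubsub
      project my-gcp-project          # placeholder
      topic my-pubsub-topic           # placeholder
      key /etc/fluent/gcp-key.json    # placeholder credentials path
      # max_req_size and max_msgs_per_req fall back to the defaults defined in the plugin
    </match>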
Thanks for sharing the plugin code.
AFAIK
If a chunk can be processed by multiple threads simultaneously, it might be effective. fluentd/lib/fluent/plugin/output.rb Line 1189 in a2b935a
fluentd/lib/fluent/plugin/buffer.rb Lines 561 to 573 in a2b935a
In addition, you are using only a single flush thread (there is no flush_thread_count in your configuration), so I think #4336 wouldn't solve your issue.
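As an illustration only (this is not the reporter's configuration), enabling parallel flushing would look roughly like this in the buffer section; the buffer path is a placeholder and flush_thread_count defaults to 1:

    <buffer>
      @type file
      path /var/lib/fluentd/buffer/pubsub   # placeholder path
      flush_thread_count 4                  # >1 lets several threads flush queued chunks in parallel
    </buffer>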
Describe the bug
I've been trying to track down what looks like a memory leak for the last week, where a stage buffer doesn't get cleared out even though new data arrives. In my latest attempt to isolate the problem, I noticed a jump to 8 MB in the fluentd_output_status_buffer_stage_byte_size Prometheus metric, which measures the total bytes of chunks in the buffer stage. This jump persists indefinitely until I restart fluentd.
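For context, this metric comes from fluent-plugin-prometheus; a minimal sketch of the monitoring sources that expose it, assuming that plugin is installed (the bind address, port, and interval are illustrative, not from the original setup):

    <source>
      @type prometheus                  # serves /metrics for Prometheus to scrape
      bind 0.0.0.0
      port 24231
    </source>

    <source>
      @type prometheus_output_monitor   # exports fluentd_output_status_* metrics,
      interval 10                       # including buffer_stage_byte_size
    </source>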
To Reproduce
I'm still working on this.
Expected behavior
No memory growth over time.
Your Environment
Your Configuration
I don't have a clear reproduction step yet. Our config looks something like this:
Your Error Log
The stuck 8 MB buffer seems to have coincided with an EOF error:
Additional context
Note that previously when log messages were up to 3 MB, I would see more of these "step" jumps in memory usage. I've altered our filters to truncate the log messages to 200K, which seems to have stopped most of these stage buffer leaks. But I'm still wondering if there is a corner case here where the file buffer got cleared but the stage buffer did not.
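Not from the original report, but a minimal sketch of the kind of truncation filter described above, using record_transformer with enable_ruby; the tag pattern, the message field name, and the 200,000-byte limit are assumptions:

    <filter app.**>
      @type record_transformer
      enable_ruby true
      <record>
        # Truncate the assumed "message" field to roughly 200 KB
        message ${record["message"].to_s[0, 200000]}
      </record>
    </filter>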