Broken Pipe and EOFErrors #119
Comments
Yes, AFAIK, sidekiq does set that. Also, I don't have the code to sidekiq-ent handy, but perhaps it's a problem similar to sidekiq/sidekiq#5834. I'll have a look tomorrow to see if I can spot something fishy. FYI @mperham
Question: is the vast majority coming from ...?
Seems to be all over the place.
Ok, so I had a look but couldn't spot anything wrong. I really fail to see what could have made the difference on your setup. Is the Sidekiq upgrade the only thing you shipped?
I'd say the very first step would be to double-check that reconnects do happen on your setup. I'd suggest forking the gem to add some logs here: redis-client/lib/redis_client.rb, lines 642 to 644 in ae29d65.
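(As an alternative to forking the gem, a rough sketch of logging (re)connections through redis-client's middleware interface, per the instrumentation section of the README; the module name and log wording are placeholders.)

```ruby
module ConnectionLoggingMiddleware
  # Called whenever the client (re)establishes a connection, so a burst of
  # these log lines right before the errors is a hint that reconnects happen.
  def connect(redis_config)
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    connection = super
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    warn "redis-client connected to #{redis_config.host}:#{redis_config.port} in #{(elapsed * 1000).round(1)}ms"
    connection
  end
end

RedisClient.register(ConnectionLoggingMiddleware)
```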
I suspect it's fine, but you never know. Then I assume you use some third-party Redis provider? It's possible that they're failing over or things like that. Any pattern in the errors? Is it a constant stream, or are there spikes? If so, you may need to try exponential backoff retries: https://github.com/redis-rb/redis-client#exponential-backoff
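(For reference, a minimal sketch of the exponential backoff configuration described in that README section; the host and sleep durations are placeholders.)

```ruby
require "redis-client"

# Each entry in reconnect_attempts is the sleep (in seconds) before that retry,
# so three growing entries give up to three reconnect attempts with backoff.
redis_config = RedisClient.config(
  host: "redis.example.com",            # placeholder host
  reconnect_attempts: [0.05, 0.25, 1.0]
)
redis = redis_config.new_client
redis.call("PING")
```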
Any updates on this issue? I'm experiencing the same after upgrading sidekiq to 7.1.2 (which relies on redis-client 0.14.1). I'm deploying on Heroku-22.
I wonder if this could have been solved by #126. But that does require a forking setup in which |
I've also been getting a lot of
I'm seeing a similar behaviour too (both error types). More info in https://gitlab.com/gitlab-com/gl-infra/production/-/issues/17158#note_1659816466. We are fairly confident the upgrade caused the incident but unsure of exactly why. We did a rollback of the release containing the sidekiq 7 upgrade and the problem (redis errors + sidekiq apdex issues) went away. We subsequently re-deployed the release, but this time without the sidekiq 7 upgrade commit, and the metrics were all fine.
@slai11 I'd love to get to the bottom of that issue, but without experiencing it myself, it's pretty much impossible. If you figure out a repro, please share it; I'll be happy to squash that bug.
We've been seeing the same issue: Redis 6.x with TLS, redis-client 0.18, connection_pool 2.4.1. It happens to all our Redis connections, including caching and Sidekiq. I've tried messing with a few things, and I've tried to reproduce it, but so far no luck. This comment seems the closest I could find to an explanation, so maybe it could help to link it here: mperham/connection_pool#177 (comment).
I believe we're also getting the same here with redis-client 0.18.0 and connection_pool 2.4.1. We're also using TLS. It's only happening with a job that uses activejob-uniqueness and redlock though:
sidekiq/sidekiq#5929 (comment) provides more info about this case. Basically, it happens with no load on the sidekiq daemon, and some threads in sidekiq (some, not all of them) get a timeout error. redis.rb had no such issues with previous sidekiq <= 6.
I'm not sure yet if I reproduced this specific issue, but today I set up a Google Cloud Memorystore instance with Redis 6.2.13 and TLS enabled. Then I ran:

```ruby
require 'redis-client'

redis_config = RedisClient.config(host: 'redacted-redis-hostname', port: 6378, ssl: true, password: 'redacted-password')
redis = redis_config.new_client
redis.call("PING")
=> "PONG"
```

I waited an hour or so, and sent another PING, but this time I got the EPIPE:

```
irb(main):015:0> redis.call("PING")
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/ruby_connection.rb:76:in `rescue in write': Broken pipe (RedisClient::ConnectionError)
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/ruby_connection.rb:73:in `write'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/connection_mixin.rb:30:in `call'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:256:in `block (2 levels) in call'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/middlewares.rb:16:in `call'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:255:in `block in call'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:675:in `ensure_connected'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:254:in `call'
	from (irb):15:in `<main>'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/irb-1.3.5/exe/irb:11:in `<top (required)>'
	from /opt/gitlab/embedded/bin/irb:23:in `load'
	from /opt/gitlab/embedded/bin/irb:23:in `<main>'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/openssl-3.1.0/lib/openssl/buffering.rb:415:in `syswrite_nonblock': Broken pipe (Errno::EPIPE)
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/openssl-3.1.0/lib/openssl/buffering.rb:415:in `write_nonblock'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/ruby_connection/buffered_io.rb:64:in `block in write'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/ruby_connection/buffered_io.rb:63:in `loop'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/ruby_connection/buffered_io.rb:63:in `write'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/ruby_connection.rb:74:in `write'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/connection_mixin.rb:30:in `call'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:256:in `block (2 levels) in call'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/middlewares.rb:16:in `call'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:255:in `block in call'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:675:in `ensure_connected'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:254:in `call'
	from (irb):15:in `<main>'
	from /opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/irb-1.3.5/exe/irb:11:in `<top (required)>'
	from /opt/gitlab/embedded/bin/irb:23:in `load'
	from /opt/gitlab/embedded/bin/irb:23:in `<main>'
```

Naturally, I was curious if Google's 10-minute VPC timeout was kicking in, but the issue didn't seem to occur if I attempted a PING after 10 minutes. In addition, I verified with Wireshark that TCP keepalives appear to be sent and acked every 15 seconds, so it doesn't appear that the TCP connection is going idle. I also checked that the Redis ... I'm not sure why the socket appears to be closing like this, so I'll continue to investigate.
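(One more hedged check that could complement the Wireshark capture: ask the server for its idle-client timeout, assuming the managed instance permits `CONFIG`; `redis` is the client created in the snippet above.)

```ruby
# "timeout" is the number of seconds after which Redis closes idle client
# connections; 0 means the server never closes idle clients on its own.
redis.call("CONFIG", "GET", "timeout")
```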
Redis 6.2.8 did get it after all: redis/redis#11607
Sidekiq threads send ...
In Sidekiq 7, calls to `Sidekiq.redis` check out a connection from an internal connection pool instead of the connection pool used by the job processors. If that connection is checked out but not used, it's possible that the heartbeat thread won't keep the connection alive before the Redis server client timeout. To avoid this, update the checks to return earlier if the Redis connection is not needed. This might help reduce intermittent `Broken pipe` errors as reported in redis-rb/redis-client#119.
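(To illustrate the "return earlier" idea from that commit message, a hypothetical sketch; `work_pending?` and the key name are invented for the example.)

```ruby
# Avoid touching Sidekiq's internal Redis pool when there is nothing to do,
# so no connection gets checked out only to sit idle past the server timeout.
def report_stats
  return unless work_pending? # cheap, non-Redis predicate evaluated first

  Sidekiq.redis do |conn|
    conn.call("INCR", "stats:reports") # only check out a connection when needed
  end
end
```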
I've done some lengthy investigation on this issue as it relates to Sidekiq, and at the moment I don't believe this is a redis-client problem. I suspect some of the issues may be caused by:
I reproduced broken pipe and EOFErrors by setting the Redis ... Sidekiq 6 used a single connection pool and added 5 additional connections as leeway. However, as described in https://github.com/sidekiq/sidekiq/blob/main/docs/capsule.md#redis-pools, Sidekiq 7 created:
For those of you having trouble:
It's important to determine where this is happening first, because a call to ... For example, I was seeing these broken pipe errors from within the heartbeat thread (line numbers may be slightly off due to debugging), until I realized that ...

```
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-cron-1.9.1/lib/sidekiq/cron/job.rb:20:in `should_enque?'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-cron-1.9.1/lib/sidekiq/cron/job.rb:39:in `test_and_enque_for_time!'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-cron-1.9.1/lib/sidekiq/cron/poller.rb:36:in `enqueue_job'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-cron-1.9.1/lib/sidekiq/cron/poller.rb:23:in `block in enqueue'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-cron-1.9.1/lib/sidekiq/cron/poller.rb:22:in `each'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-cron-1.9.1/lib/sidekiq/cron/poller.rb:22:in `enqueue'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/patch/sidekiq_poller.rb:9:in `block (2 levels) in enqueue'
/opt/gitlab/embedded/service/gitlab-rails/gems/gitlab-safe_request_store/lib/gitlab/safe_request_store.rb:66:in `enabling_request_store'
/opt/gitlab/embedded/service/gitlab-rails/gems/gitlab-safe_request_store/lib/gitlab/safe_request_store.rb:59:in `ensure_request_store'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/patch/sidekiq_poller.rb:8:in `block in enqueue'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/activesupport-7.0.8/lib/active_support/execution_wrapper.rb:92:in `wrap'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/activesupport-7.0.8/lib/active_support/reloader.rb:72:in `block in wrap'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/activesupport-7.0.8/lib/active_support/execution_wrapper.rb:92:in `wrap'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/activesupport-7.0.8/lib/active_support/reloader.rb:71:in `wrap'
/opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/patch/sidekiq_poller.rb:7:in `enqueue'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/scheduled.rb:99:in `block in start'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/component.rb:10:in `watchdog'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/component.rb:19:in `block in safe_thread'

/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:672:in `ensure_connected'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client.rb:408:in `pipelined'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/redis-client-0.18.0/lib/redis_client/decorator.rb:51:in `pipelined'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/launcher.rb:153:in `block in ❤'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/config.rb:166:in `block in redis'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:110:in `block (2 levels) in with'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:109:in `handle_interrupt'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:109:in `block in with'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:106:in `handle_interrupt'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:106:in `with'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/config.rb:163:in `redis'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/component.rb:28:in `redis'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/launcher.rb:149:in `❤'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/launcher.rb:99:in `beat'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/launcher.rb:91:in `block in start_heartbeat'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/launcher.rb:90:in `loop'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/launcher.rb:90:in `start_heartbeat'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/component.rb:10:in `watchdog'
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/sidekiq-7.1.6/lib/sidekiq/component.rb:19:in `block in safe_thread'
```

In Sidekiq 6, the repeated ...
This may also just be due to instrumentation changes. Looking at the stack traces posted above, I see that you are using Datadog.
For the old Redis < 5 client, it just patches ...
Good point. I should have mentioned that, though I noticed many of the stack traces above suggest the problem is happening on the internal pool. That makes me wonder whether the heartbeat thread isn't keeping these internal connections alive properly.

As you mentioned somewhere else, a thread will tend to get the same connection that it used previously. I've validated this by printing out the ... In Sidekiq 6, the connection pool was touched by all job threads, so I suspect the likelihood of idling on any one connection was zero.

@mperham Should there be an option to make the heartbeat thread iterate through all connections? Right now it might be acceptable to let the Redis client reconnect gracefully, but users that have a Redis ...

I think it would also help if ...
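(The truncated "printing out the ..." above presumably refers to some connection identifier; a hypothetical way to observe per-thread connection reuse would be something like the following.)

```ruby
# Illustrative only: log which client object each thread checks out.
# Seeing the same object_id repeatedly for a thread suggests it keeps
# getting the same (possibly long-idle) connection back from the pool.
Sidekiq.redis do |conn|
  Sidekiq.logger.info("thread=#{Thread.current.object_id} redis_client=#{conn.object_id}")
end
```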
I should also mention the default ...
I don't follow.
My mistake. For some reason I thought the retry logic was in Sidekiq, but it's in ... To reiterate, I think we're likely seeing more of these ...
The problem is that it's difficult to determine whether these errors are just noise, or whether they're indicative of a more serious problem.
Right. I'll try to find some time to look at this instrumentation and see if we can hide non-final errors from it. Unless someone beats me to it, of course.
I've seen notes that Redis server has issues with OpenSSL 3. I've seen issues with Datadog's and New Relic's Redis instrumentation with the redis -> redis-client migration. It's difficult to know which bug or component is at fault. If you are seeing mysterious errors, all I can advise at this point is to upgrade everything possibly related, including Redis server itself and any gems visible in the backtrace.
@byroot I was looking into this and noticed that ... Based on my understanding, retrying the command in ... Since the instrumentation middlewares are reporting the errors, could we pass useful information like the attempt count to `middleware.call` (see redis-client/lib/redis_client.rb, line 677 in d1b9284)? The middleware authors could then handle the errors accordingly with this new information, e.g. rescue `ConnectionError` and ignore it if `attempts` is 0.
```diff
   def measure_round_trip_delay
-    ensure_connected do |connection|
-      @middlewares.call(["PING"], config) do
+    ensure_connected do |connection, attempts|
+      @middlewares.call(["PING"], config, attempts) do
         connection.measure_round_trip_delay
       end
     end
```
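(A sketch of how an instrumentation middleware could use that extra argument if the proposal above were adopted; the `attempts` parameter does not exist in released redis-client, and `report_to_apm` is a made-up placeholder.)

```ruby
module QuietTracingMiddleware
  # `attempts` is the hypothetical extra argument from the diff above.
  def call(command, config, attempts = nil)
    super
  rescue RedisClient::ConnectionError => error
    # Skip reporting first-attempt connection errors, since redis-client may
    # transparently retry them; always re-raise so that retry can happen.
    report_to_apm(error) unless attempts == 0
    raise
  end
end

RedisClient.register(QuietTracingMiddleware)
```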
We were seeing the increase, and I was able to go from an error trace to viewing the error in the APM view. From there, I could see a lot of Sidekiq job traces that had been affected with the same issue ID, and see that the status was "OK". This leads me to believe this theory is correct. Thanks for suggesting it. 😄
I opened DataDog/dd-trace-rb#3820; I'll try to work with the Datadog people to make the necessary changes so that these transient errors can be either ignored or marked as retried.
@davidgm0 I am having the same issue with redis-client 0.22 and activejob-uniqueness. Did you get to the bottom of this?
We recently switched from Sidekiq 6.5.8 to 7.1 (where there was a change to use `redis-client`) and have noticed a significant increase in errors, all coming from RedisClient. All errors follow one of three forms: ...

#116 seemed possibly related, but it looks like Sidekiq already defaults to having `reconnect_attempts` set to 1, so maybe a dead end there.

I've struggled to figure out what the actual cause is. Any ideas what direction to head on this?
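(If more aggressive reconnects turn out to help, a hedged sketch of overriding that default from the application side, assuming Sidekiq forwards this option through to RedisClient.config as the `reconnect_attempts` default mentioned above suggests; the URL env var is a placeholder.)

```ruby
# Assumption: Sidekiq passes redis options such as reconnect_attempts through
# to RedisClient.config. An array means one retry per entry, sleeping that
# many seconds before each attempt (i.e. a manual backoff schedule).
Sidekiq.configure_server do |config|
  config.redis = { url: ENV["REDIS_URL"], reconnect_attempts: [0.1, 0.5, 1.0] }
end
```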