All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- FIX: Falling back to primary PG server not reliable on Rails 7.1
- DEV: Update dependencies to officially support Rails 7.1
- FIX: Use `next` instead of `break` to avoid a local jump error
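
As a minimal illustration of this class of error (not the gem's actual code; the helper names below are hypothetical), `break` inside a proc that is called after its defining method has returned raises `LocalJumpError`, whereas `next` simply ends the block and returns its value:

```ruby
# Hypothetical helpers, for illustration only.
def callback_with_break
  # Once this method has returned there is nothing left to break out of.
  proc { break :done }
end

def callback_with_next
  # `next` only ends the block itself and hands back :done.
  proc { next :done }
end

# Calls the callback and returns either its value or the class of the
# LocalJumpError it raised.
def call_safely(callback)
  callback.call
rescue LocalJumpError => e
  e.class
end
```

Under these assumptions, `call_safely(callback_with_break)` reports a `LocalJumpError`, while `call_safely(callback_with_next)` returns `:done`.
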
- DEV: Compatibility with Rails 7.1+ (drop support for Rails 6.0 & Ruby 2.7)
- DEV: Remove support for Ruby < 2.7
- DEV: Compatibility with Rails 7.1+
- FEATURE: Compatibility with Rails 7.0+
- FEATURE: Compatibility with Rails 6.1
No changes.
- FIX: Backward compatibility with Rails 6.0
- FEATURE: Partial compatibility with Rails 6.1
- FIX: Catch exceptions that are not intercepted by `ActionDispatch::DebugExceptions`.
- FIX: Handle the case when the replica is set equal to the primary
- FIX: Handle clients which are connecting during fallback
- FIX: Use concurrent-ruby maps to simplify concurrency logic. Resolves a number of possible concurrency issues.
- FIX: Recover correctly if both the primary and replica go offline

  Previously, a replica failing would cause it to be added to the 'primaries_down' list. The fallback handler would then repeatedly try to fall back the replica to itself, looping forever, so fallback to the primary would never happen.
- FEATURE: Run failover/fallback callbacks once for each backend

  Previously the failover callback would only fire when the first backend failed, and the fallback callback would only fire when the last backend recovered. Now both failover and fallback callbacks are triggered for each backend. The key for each backend is also passed to the callbacks so that applications can use it.
- FEATURE: Add primaries_down_count function to failover handlers

  This is intended for consumption by monitoring systems (e.g. the Discourse prometheus exporter).
- FIX: Ignore errors from the redis socket shutdown call

  This can fail with various I/O errors, but in all cases we want the thread to continue closing the connection with the error, and all the other connections.
- FIX: Handle concurrency issues during redis disconnection (#10)

  This handles concurrency issues which can happen during redis failover/fallback:

  - Previously, 'subscribed' redis clients were skipped during the disconnect process. This is resolved by directly accessing the original_client from the ::Redis instance.
  - Trying to acquire the mutex on a subscribed redis client is impossible, so the close operation would never complete. Now we send the shutdown() signal to the thread, then allow up to 1 second for the mutex to be released before we close the socket.
  - Failover is almost always triggered inside a redis client mutex. Failover then has its own mutex, within which we attempted to acquire mutexes for all redis clients. This logic causes a deadlock when multiple clients fail over simultaneously. Now, all disconnection is performed by the Redis::Handler failover thread, outside of any other mutexes. To make this safe, the primary/replica state is stored in the connection driver, and disconnect_clients is updated to specifically target primary/replica connections.
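
The "allow up to 1 second for the mutex to be released" step can be sketched roughly as below. This is only an illustration of a bounded wait on a `Mutex`, not the gem's implementation; the method name `close_with_bounded_wait` and its parameters are hypothetical.

```ruby
# Hypothetical helper: wait up to `timeout` seconds for `mutex`, then yield
# whether the lock was obtained, so the caller can close the socket either way.
def close_with_bounded_wait(mutex, timeout: 1.0, poll_interval: 0.01)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  locked = mutex.try_lock
  # Poll instead of blocking, because a subscribed client may never release it.
  until locked || Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep poll_interval
    locked = mutex.try_lock
  end
  yield locked
ensure
  # Only release the mutex if this call actually acquired it.
  mutex.unlock if locked
end
```

A caller would pass a block that closes the socket and can use the yielded flag to tell whether the close ran under the lock or after the wait timed out.
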
- FIX: Avoid disconnecting Redis connections abruptly.
- FIX: Iteration and mutation of primaries_down in separate threads (#5)

  Ruby hashes can't be modified whilst they are being iterated over.

  Here, the primaries_down hash is iterated over to check each previously unavailable primary to see if it is now contactable. However, since this hash can be updated in other threads, this iteration isn't safe.

  To prevent this, a copy of the hash is iterated over instead.

  The GIL should not be released during a hash dup, but let's not tie ourselves unnecessarily to current MRI behaviour.
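
The copy-then-iterate approach described in this entry can be sketched as follows. The method name `recover_primaries` and the `contactable` callable are hypothetical stand-ins, not the gem's API.

```ruby
# Hypothetical stand-in for the fallback check: removes every backend that
# `contactable` reports as reachable and returns the keys that recovered.
def recover_primaries(primaries_down, contactable)
  recovered = []
  # Iterate over a snapshot, so writes to the original hash (from this loop
  # or from another thread) cannot invalidate the iteration.
  primaries_down.dup.each_key do |key|
    next unless contactable.call(key)
    primaries_down.delete(key)
    recovered << key
  end
  recovered
end
```

Because the loop walks the snapshot, a key added to the original hash while the check is running is left in place and is picked up on the next pass.
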
- FIX: Rescue from `Redis::TimeoutError` instead of `Timeout::Error`.
- FIX: Undefined method on nil class error in forking servers.
- FIX: Incorrectly rescuing from `PG::ServerError`.
- FIX: Only rescue from connection errors.