Design on the FIFO connection pool #1868
-
Thanks for the detailed message - this is definitely an interesting discussion 👍
This is expected and I don't see how to fix/improve it. That is why we have both FIFO and LIFO. Does your fork address this?
We decided to continue to use a slice to have minimal diff from LIFO. database/sql also uses this so we assumed it is fast enough. I guess we could switch to using linked lists if you have some numbers (benchmark) proving it is worth it...
Would be nice to have this 👍 (in a separate PR if possible).
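For what it's worth, a microbenchmark along these lines could produce such numbers. This is only an illustrative sketch of a slice-backed front-pop versus `container/list`, not the actual go-redis pool code:

```go
package pool_test

import (
	"container/list"
	"testing"
)

type benchConn struct{ id int }

// BenchmarkSliceFIFO pops the oldest connection from the front of a slice
// (shifting the remaining elements down) and pushes it back, which is
// roughly the cost profile of a slice-backed FIFO pool.
func BenchmarkSliceFIFO(b *testing.B) {
	idle := make([]*benchConn, 0, 1024)
	for i := 0; i < 1024; i++ {
		idle = append(idle, &benchConn{id: i})
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		cn := idle[0]
		copy(idle, idle[1:])
		idle = append(idle[:len(idle)-1], cn)
	}
}

// BenchmarkListFIFO does the same front-pop/back-push with container/list,
// which is O(1) per operation but allocates list elements and chases pointers.
func BenchmarkListFIFO(b *testing.B) {
	idle := list.New()
	for i := 0; i < 1024; i++ {
		idle.PushBack(&benchConn{id: i})
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		e := idle.Front()
		idle.Remove(e)
		idle.PushBack(e.Value)
	}
}
```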
-
Hi, folks
Very glad to see that go-redis has started to consider a different type of connection pool (a FIFO pool) in the latest version, and has done it with a small amount of code. Good job!
We have also implemented a FIFO connection pool in our internal fork of go-redis. I have read the recent feature carefully and compared it with ours, and found the following points that are worth more consideration:
Let me explain a bit more why the FIFO pool is important in proxy scenarios. As we all know, the traffic distribution across the connections in the pool is not balanced. That's not a big deal in most cases. But when a traffic peak comes, one of the remote proxies may occasionally be a little slower than the others, even if only briefly, so the connections to it are returned to the pool a bit later than the rest. After running for a long time the pool gets reordered, that is, some of the connections to that "lucky" endpoint end up sitting on top of the pool stack. More traffic is then directed at that single endpoint, which puts a high load on it.
Sure, it's not critical, because the traffic has a chance to spread to the other endpoints once the lucky one becomes "slow" enough under the higher load. But we still observe a significant imbalance in CPU consumption and QPS, and slightly higher latency on the lucky endpoint.
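To make the difference concrete, here is a simplified sketch of the two pop strategies (the names are illustrative; this is not the actual go-redis pool code):

```go
// Illustrative sketch of LIFO vs FIFO idle-connection selection.
package pool

// Conn stands in for a pooled connection.
type Conn struct{}

// popIdleLIFO takes the most recently returned connection (top of the
// stack). A connection that keeps coming back a little later than its
// peers ends up on top and is picked again and again, so one endpoint
// can receive a disproportionate share of the traffic.
func popIdleLIFO(idle []*Conn) (*Conn, []*Conn) {
	n := len(idle) - 1
	return idle[n], idle[:n]
}

// popIdleFIFO takes the oldest idle connection (front of the queue), so
// every connection is rotated through regardless of how quickly it was
// returned, and traffic stays spread across endpoints.
func popIdleFIFO(idle []*Conn) (*Conn, []*Conn) {
	cn := idle[0]
	copy(idle, idle[1:])
	return cn, idle[:len(idle)-1]
}
```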
I have several figures that show the difference between the two types of pool.
Serial Test
First we ran a serial test, using a simple loop to issue requests. With the LIFO connection pool, only the last connection in the idle list ever has a chance to be used.
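The loop was conceptually like the following sketch (the client setup and the Ping command are assumptions, not the original test code; the go-redis v8 API is assumed):

```go
package main

import (
	"context"

	"github.com/go-redis/redis/v8"
)

// serialTest issues requests one after another on a single goroutine, so
// the connection released by request i is immediately available for
// request i+1. With a LIFO pool the same connection is reused for the
// whole loop.
func serialTest(ctx context.Context, rdb *redis.Client, n int) error {
	for i := 0; i < n; i++ {
		if err := rdb.Ping(ctx).Err(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	_ = serialTest(context.Background(), rdb, 100000)
}
```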
We can see the difference in this use case:
Concurrent Test
We created a bunch of goroutines to send requests simultaneously, and used a rate limiter to keep the traffic at a certain load, changing the limit periodically to simulate real, changing traffic.
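Conceptually the harness looked like this sketch (the limiter package `golang.org/x/time/rate`, the limits, and the interval are assumptions, not the original test code):

```go
package loadtest

import (
	"context"
	"time"

	"github.com/go-redis/redis/v8"
	"golang.org/x/time/rate"
)

// concurrentTest runs a fixed set of worker goroutines that share one rate
// limiter. The limit is changed periodically to simulate real, changing
// traffic. The limits, interval, and command here are illustrative.
func concurrentTest(ctx context.Context, rdb *redis.Client, workers int) {
	limiter := rate.NewLimiter(rate.Limit(1000), 100)

	// Change the allowed rate every 30s to emulate peaks and troughs.
	go func() {
		limits := []rate.Limit{1000, 5000, 2000, 8000}
		for i := 0; ; i++ {
			time.Sleep(30 * time.Second)
			limiter.SetLimit(limits[i%len(limits)])
		}
	}()

	for w := 0; w < workers; w++ {
		go func() {
			for {
				if err := limiter.Wait(ctx); err != nil {
					return // context cancelled
				}
				_ = rdb.Ping(ctx).Err()
			}
		}()
	}
}
```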
From the results, we can see:
More with MaxConnAge
Sometimes we use MaxConnAge to give the connections a chance to be refreshed. But in the current design it causes most of the connections to expire together, since they were also created together. So we implemented a tweak that spreads their expiry over the last 20% of the expected lifetime; by doing this we are unlikely to recreate many connections in a short time. So what do you think, is it worth having this for everyone? Btw, we have used it in production for several months, so I believe it's ready.
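Conceptually the tweak looks like this sketch (the names and the helper are illustrative, not the fork's actual code):

```go
package pool

import (
	"math/rand"
	"time"
)

// jitteredMaxAge returns a per-connection lifetime drawn uniformly from the
// last 20% of MaxConnAge, i.e. from [0.8*maxConnAge, maxConnAge]. Giving
// each connection its own deadline at creation time prevents connections
// that were created together from all expiring at the same instant.
func jitteredMaxAge(maxConnAge time.Duration) time.Duration {
	if maxConnAge <= 0 {
		return maxConnAge // non-positive means "never expire", leave it alone
	}
	min := time.Duration(float64(maxConnAge) * 0.8)
	return min + time.Duration(rand.Int63n(int64(maxConnAge-min)+1))
}

// Conn carries the jittered deadline instead of checking the raw option.
type Conn struct {
	createdAt time.Time
	maxAge    time.Duration
}

func newConn(maxConnAge time.Duration) *Conn {
	return &Conn{createdAt: time.Now(), maxAge: jitteredMaxAge(maxConnAge)}
}

func (cn *Conn) expired(now time.Time) bool {
	return cn.maxAge > 0 && now.Sub(cn.createdAt) >= cn.maxAge
}
```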