Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fixed issue with context cancelled error leading to connection spikes on Primary instances #3190

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions error.go
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,15 @@ type Error interface {

var _ Error = proto.RedisError("")

func isContextError(err error) bool {
switch err {
case context.Canceled, context.DeadlineExceeded:
Copy link
Contributor

@fishy fishy Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the code to know how the errors get passed here, but this approach would only match directly/exact context.Canceled and context.DeadlineExceeded errors, it won't match errors wrapping those errors.

In general (https://pkg.go.dev/errors#section-documentation) we should use errors.Is(err, context.Canceled) over err == context.Canceled, unless there's a reason we only want to match direct exact errors.

(not sure whether this is the reason this PR does not fix the issue we saw in #3282 (comment) verified that the change doesn't fix that issue either)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally errors are not wrapped in the go-redis package and this error should occur within the package. I do agree that with the new version there are tons of things that can be improved, error handling included. Currently we are focused on improving the stability, testing infrastructure , bug fixes and redis 8 support. Once those are addressed we will move to other improvements that are nice to have... I will continue looking into your issue and continue the discussion in its thread.

return true
default:
return false
}
}

func shouldRetry(err error, retryTimeout bool) bool {
switch err {
case io.EOF, io.ErrUnexpectedEOF:
Expand Down
4 changes: 3 additions & 1 deletion osscluster.go
Original file line number Diff line number Diff line change
Expand Up @@ -1350,7 +1350,9 @@ func (c *ClusterClient) processPipelineNode(
_ = node.Client.withProcessPipelineHook(ctx, cmds, func(ctx context.Context, cmds []Cmder) error {
cn, err := node.Client.getConn(ctx)
if err != nil {
node.MarkAsFailing()
if !isContextError(err) {
node.MarkAsFailing()
}
_ = c.mapCmdsByNode(ctx, failedCmds, cmds)
setCmdsErr(cmds, err)
return err
Expand Down
33 changes: 33 additions & 0 deletions osscluster_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -539,6 +539,39 @@ var _ = Describe("ClusterClient", func() {
AfterEach(func() {})

assertPipeline()

It("doesn't fail node with context.Canceled error", func() {
ctx, cancel := context.WithCancel(context.Background())
cancel()
pipe.Set(ctx, "A", "A_value", 0)
_, err := pipe.Exec(ctx)

Expect(err).To(HaveOccurred())
Expect(errors.Is(err, context.Canceled)).To(BeTrue())

clientNodes, _ := client.Nodes(ctx, "A")

for _, node := range clientNodes {
Expect(node.Failing()).To(BeFalse())
}
})

It("doesn't fail node with context.DeadlineExceeded error", func() {
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Nanosecond)
defer cancel()

pipe.Set(ctx, "A", "A_value", 0)
_, err := pipe.Exec(ctx)

Expect(err).To(HaveOccurred())
Expect(errors.Is(err, context.DeadlineExceeded)).To(BeTrue())

clientNodes, _ := client.Nodes(ctx, "A")

for _, node := range clientNodes {
Expect(node.Failing()).To(BeFalse())
}
})
})

Describe("with TxPipeline", func() {
Expand Down
Loading