Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed frame double-release #139

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

echistyakov
Copy link
Contributor

Fix frame double-release for RequestResponse frame type.

Motivation:

One of our production services occasionally experiences the following panic:

frame has been released!

goroutine 3854415 [running]:
runtime/debug.Stack()
	third-party/go/1.23.4/linux_amd64/src/runtime/debug/stack.go:26 +0x5e
panic({0x843d00?, 0x13f8940?})
	third-party/go/1.23.4/linux_amd64/src/runtime/panic.go:785 +0x132
github.com/rsocket/rsocket-go/core/framing.(*bufferedFrame).HasFlag(...)
	fbcode/third-party-go/vendor/github.com/rsocket/rsocket-go/core/framing/buffered.go:51
github.com/rsocket/rsocket-go/core/framing.(*bufferedFrame).trySeekMetadataLen(0x18?, 0xc004e188c0?)
	fbcode/third-party-go/vendor/github.com/rsocket/rsocket-go/core/framing/buffered.go:121 +0x178
github.com/rsocket/rsocket-go/core/framing.(*bufferedFrame).trySliceData(0xc002b89680, 0x0)
	fbcode/third-party-go/vendor/github.com/rsocket/rsocket-go/core/framing/buffered.go:142 +0x25
github.com/rsocket/rsocket-go/core/framing.(*PayloadFrame).Data(0x0?)
	fbcode/third-party-go/vendor/github.com/rsocket/rsocket-go/core/framing/payload.go:66 +0x18
github.com/rsocket/rsocket-go/payload.Clone({0x7f7b95db3a18, 0xc004dedc80})
	fbcode/third-party-go/vendor/github.com/rsocket/rsocket-go/payload/payload.go:66 +0x233
github.com/rsocket/rsocket-go/rx/mono.(*oneshotProxy).Block(0xe81f33?, {0x1420168?, 0xc004e187e0?})
	fbcode/third-party-go/vendor/github.com/rsocket/rsocket-go/rx/mono/proxy_oneshot.go:134 +0x5d
thrift/lib/go/thrift.(*rsocketClient).RequestResponse(0xc0049aec00, {0x1420168, 0xc004e187e0}, {0xe81f33, 0x10}, 0x2, 0x1, 0xc0049aec90, {0xc004f86500, 0x1c, ...})
	fbcode/thrift/lib/go/thrift/rocket_rsocket_client.go:120 +0xde
thrift/lib/go/thrift.(*rocketClient).Flush(0xc001051550)
	fbcode/thrift/lib/go/thrift/rocket_client.go:123 +0x268
github.com/facebook/fbthrift/thrift/lib/go/thrift/types.(*SerialChannel).sendMsg(0xc003cbc740, {0x1420050?, 0x8c33c60?}, {0xe81f33, 0x10}, {0x13fe6c0, 0xc00315d4b0}, 0x1)
	fbcode/thrift/lib/go/thrift/types/serial_channel.go:61 +0x12f
github.com/facebook/fbthrift/thrift/lib/go/thrift/types.(*SerialChannel).Call(0xc003cbc740, {0x1420050?, 0x8c33c60?}, {0xe81f33, 0x10}, {0x13fe6c0?, 0xc00315d4b0?}, {0x13fe6e0, 0xc001f8fed0})
	[...irrelevant part of stack removed...]

After a brief investigation - I discovered that a RequestResponse frame gets double-released occasionally (dependent on a race condition):

  1. In the onFinally method here:
    common.TryRelease(handler.cache)
  2. In the onFinally->unregister->deleteFragment call chain - here:
    common.TryRelease(v)

Due to this double-release - it is possible for an IncRef() call on a frame to have no effect, the frame getting released pre-maturely and the above panic to be observed.

For the particular panic observed - the following IncRef() may lose its effect (in certain race-scenarios) due to the erroneous double-release:


When the above IncRef() condition occurs - the following payload.Clone() invocation will panic (as the frame would have already been released pre-maturely):
return payload.Clone(v), nil

Modifications:

  1. Rename: destroyFragment() -> destroyAllFragments() for clarity and to avoid ambiguity with destroyFragment(sid).
  2. Remove unnecessary common.TryRelease(handler.cache) from the onFinally method.
  3. Remove .cache field from the requestResponseCallback - as it seems that there is no need to cache the frame in the callback itself.

Result:

The panic should go away after this change.

@echistyakov
Copy link
Contributor Author

I'll take a look at the failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant