[Bug] Segmentation fault during auction / SIGSEGV #196

Open
amirderakh opened this issue Mar 28, 2024 · 2 comments

amirderakh commented Mar 28, 2024

In Vouch 1.8.0 and 1.8.1 I have observed this error several times in the last few days.
It happens during a block auction. Luckily, the relay still publishes the block while Vouch is crashing.

This log is from vouch-1.8.1-linux-amd64; the connected CLs are Nimbus and Lighthouse. Rebooting the PC had no effect.
Update: The error seems to appear only when unblind-from-all-relays is true.
The log below is in reverse order (the last message is first). After it, Vouch restarts:

/home/runner/work/vouch/vouch/services/beaconblockproposer/standard/propose.go:622 +0xad
created by github.com/attestantio/vouch/services/beaconblockproposer/standard.(*Service).unblindBlock in goroutine 171275
/home/runner/work/vouch/vouch/services/beaconblockproposer/standard/propose.go:633 +0x356
github.com/attestantio/vouch/services/beaconblockproposer/standard.(*Service).unblindBlock.func1({0x1b28690, 0xc001826540}, {0x7fd0ec203c98, 0xc002618b00}, 0xc02327a0c0)
/home/runner/go/pkg/mod/github.com/attestantio/[email protected]/http/unblindproposal.go:76 +0x656
github.com/attestantio/go-builder-client/http.(*Service).UnblindProposal(0xc002618b00, {0x1b28690, 0xc001826540}, 0xc02b7e8060)
/home/runner/go/pkg/mod/github.com/attestantio/[email protected]/http/unblindproposal.go:257 +0x214
github.com/attestantio/go-builder-client/http.(*Service).unblindDenebProposal(0xc002618b00, {0x1b28690, 0xc01d1a9170}, {0x1582b09?, 0xf?, 0x251cde0?}, 0xc003b76000)
/opt/hostedtoolcache/go/1.22.1/x64/src/encoding/json/stream.go:63 +0x75
encoding/json.(*Decoder).Decode(0xc00111e280, {0x1319220, 0xc020e1c0b8})
/opt/hostedtoolcache/go/1.22.1/x64/src/encoding/json/stream.go:140 +0x85
encoding/json.(*Decoder).readValue(0xc00111e280)
/opt/hostedtoolcache/go/1.22.1/x64/src/encoding/json/stream.go:165 +0x188
encoding/json.(*Decoder).refill(0xc00111e280)
/opt/hostedtoolcache/go/1.22.1/x64/src/io/io.go:628 +0x28
io.(*teeReader).Read(0xc0008880a0, {0xc003c54400, 0xc0009df858?, 0x200})
/opt/hostedtoolcache/go/1.22.1/x64/src/runtime/panic.go:770 +0x132
panic({0x13c3560?, 0x247ec30?})
/home/runner/go/pkg/mod/go.opentelemetry.io/otel/[email protected]/trace/span.go:426 +0xa82
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End(0xc004077e00, {0x0, 0x0, 0xc031a89340?})
/home/runner/go/pkg/mod/go.opentelemetry.io/otel/[email protected]/trace/span.go:388 +0x25
go.opentelemetry.io/otel/sdk/trace.(*recordingSpan).End.deferwrap1()
goroutine 172547 [running]:
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4edfc8]
panic: runtime error: invalid memory address or nil pointer dereference
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
{"level":"warn","strategy":"blindedbeaconblockproposal","impl":"first","provider":"xxxxxx","slot":xxxxxx,"error":"failed to request blinded beacon block proposal: failed to call GET endpoint: Get \"http://xxxxx/eth/v1/validator/blinded_blocks/xxxxx\": context canceled","time":"2024-03-28T22:31:49+10:00","message":"Failed to obtain blinded beacon block proposal"}
{"level":"info","service":"blockrelay","impl":"standard","slot":xxxxxxx,"provider":"https://relay.ultrasound.money/","value":"39717939211775936","delta":"0","selected":true,"time":"2024-03-28T22:31:48+10:00","message":"Auction participant"}

Contributor

mcdee commented Mar 28, 2024

Thank you for reporting this. I have run a few different scenarios but cannot find an obvious path that would lead to this failure. My best guess is that a particular relay in your list is returning some sort of non-standard response that causes the crash.
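To illustrate why one misbehaving relay can take down the whole process when several are contacted concurrently, here is a minimal sketch. It is not Vouch's code: `callAll`, `relayResult`, and the `*.example` relay names are hypothetical, and the panic is simulated. It only shows the general Go pattern of recovering inside each goroutine so that a panic is reported as an error for that relay alone.

```go
package main

import (
	"fmt"
	"sync"
)

// relayResult records the outcome of one call to a (hypothetical) relay.
type relayResult struct {
	relay string
	err   error
}

// callAll runs fn once per relay, each in its own goroutine. A panic inside
// fn is recovered and stored as that relay's error instead of crashing the
// process, which is what an unrecovered panic in a goroutine would do.
func callAll(relays []string, fn func(relay string) error) []relayResult {
	results := make([]relayResult, len(relays))
	var wg sync.WaitGroup
	for i, relay := range relays {
		wg.Add(1)
		go func(i int, relay string) {
			defer wg.Done()
			defer func() {
				if r := recover(); r != nil {
					results[i] = relayResult{relay: relay, err: fmt.Errorf("panic: %v", r)}
				}
			}()
			results[i] = relayResult{relay: relay, err: fn(relay)}
		}(i, relay)
	}
	wg.Wait()
	return results
}

func main() {
	// Hypothetical relay names, for illustration only.
	relays := []string{"relay-a.example", "relay-b.example"}
	results := callAll(relays, func(relay string) error {
		if relay == "relay-b.example" {
			panic("simulated malformed response")
		}
		return nil
	})
	for _, r := range results {
		fmt.Printf("%s: err=%v\n", r.relay, r.err)
	}
}
```

Each goroutine writes only to its own slot in `results`, so no extra locking is needed. Whether recovering is the right policy for a real validator client is a separate design question; the sketch only shows the mechanism.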

If possible, could you update your Vouch configuration file with the following:

builderclient:
  log-level: 'trace'

and restart Vouch? This will log the information sent to the builder, and should provide more detail ahead of the failure, telling us which relay is responding with bad data and what data it is returning that causes the problem. If you are uncomfortable sharing the resulting log publicly, I can provide an email address to which you can send it.

Author

amirderakh commented Mar 29, 2024

Hi Jim, thanks for looking into this. However, since disabling "unblind-from-all-relays" fixed the problem for now, I would treat this as a lower priority. I should have enabled trace logging immediately, but I won't risk another crash now that this is running on a production system.

In all instances, the crash occurred at the same point, "addr=0x18 pc=0x4edfc8"/"propose.go:622 +0xad", so you may at least be able to see the problematic instruction.
As you say, it is most likely a relay, although the problem also coincided with the update to Lighthouse 5.1.2. If I find anything helpful I will post it here. Posting anonymised logs is fine with me.
