rados/striper test suite is flaky on octopus #1020

Closed
phlogistonjohn opened this issue Aug 16, 2024 · 2 comments
Comments

@phlogistonjohn
Collaborator

Example: https://github.com/ceph/go-ceph/actions/runs/10413135499/job/28840033245

The new rados/striper package appears to occasionally panic when the test suite is run with the ceph octopus job.
Over the past week it has failed 2 or 3 times.

Given that this only appears to occur on octopus, and octopus is next on the list to lose support in the go-ceph test matrix, I'm considering just disabling rados/striper on octopus preemptively. I haven't made up my mind yet, though, and I'm open to suggestions.

@anoopcs9
Collaborator

Example: https://github.com/ceph/go-ceph/actions/runs/10413135499/job/28840033245

=== RUN   TestStriperTestSuite
=== RUN   TestStriperTestSuite/TestListXattrs
=== RUN   TestStriperTestSuite/TestNewStriper
=== RUN   TestStriperTestSuite/TestNewStriperWithLayout
=== RUN   TestStriperTestSuite/TestReadAppend
coverage: 63.8% of statements in github.com/ceph/go-ceph/rados/striper
panic: test timed out after 10m0s
	running tests:
		TestStriperTestSuite (10m0s)
		TestStriperTestSuite/TestReadAppend (9m59s)
. . .
goroutine 24 [syscall]:
github.com/ceph/go-ceph/rados._Cfunc_rados_shutdown(0x7f32540134c0)
	_cgo_gotypes.go:1606 +0x3f
github.com/ceph/go-ceph/rados.freeConn.func1(0xc000141c90?)
	/go/src/github.com/ceph/go-ceph/rados/rados.go:126 +0x34
github.com/ceph/go-ceph/rados.freeConn(0xc000204b00)
	/go/src/github.com/ceph/go-ceph/rados/rados.go:126 +0x45
github.com/ceph/go-ceph/rados.(*Conn).Shutdown(...)
	/go/src/github.com/ceph/go-ceph/rados/conn.go:77
github.com/ceph/go-ceph/rados/striper.(*StriperTestSuite).TearDownTest(0xc000072240)
	/go/src/github.com/ceph/go-ceph/rados/striper/striper_test.go:70 +0x35
github.com/stretchr/testify/suite.Run.func1.1()
	/go/pkg/mod/github.com/stretchr/[email protected]/suite/suite.go:184 +0x269
github.com/stretchr/testify/suite.Run.func1(0xc0002404e0)
	/go/pkg/mod/github.com/stretchr/[email protected]/suite/suite.go:203 +0x498
testing.tRunner(0xc0002404e0, 0xc0002001b0)
	/opt/go/src/testing/testing.go:1690 +0xf4
created by testing.(*T).Run in goroutine 7
	/opt/go/src/testing/testing.go:1743 +0x390
FAIL	github.com/ceph/go-ceph/rados/striper	600.108s

I see TearDownTest in the backtrace, with the goroutine blocked in rados_shutdown. So at least this run most likely timed out during teardown rather than hitting a test failure.

I couldn't find any other occurrence (post-merge) that points at a real failure.

@anoopcs9
Collaborator

anoopcs9 commented Sep 4, 2024

CI runs are stable enough these days. Please reopen in case it fails more consistently.

anoopcs9 closed this as completed Sep 4, 2024