raft/rafttest: TestBasicProgress failed #121745
Comments
I'm not able to reproduce this locally over 10k runs on 4a623b2. I'll pull a newer 24.1 SHA and try again.
No luck on d2ac826 either. This test was recently added when we embedded the upstream etcd-io/raft lib in cockroach. I'm guessing this test was flaky before embedding. Moving to a GA-blocker. Full test log
@andrewbaptist could you add more detail on why this is a bug? I wasn't able to determine either way based on the test output, but I'm not intimately familiar here.
I didn't dig into the detail on why this failed other than also trying to reproduce it. Based on the test and the set of conditions, it needs deeper investigation to determine the cause. There aren't any network calls to induce flakiness, and the test normally passes within ~100-200ms, so the 5s to fail here doesn't seem likely to be fixed by a timing change. I'm also a little confused about the TeamCity UI, which makes it hard to see trends on this test. The page showing the trend is here, versus the default page you get if you click the link. The test itself is quite simple, but there isn't a lot of debuggability as it's written. So either it is not expected that we would have all 100 entries on all nodes, which seems unlikely, or there is some bug that prevents it from happening. Without a repro, we might have to just observe this longer...
This is just a flaky test that we inherited by importing ...
23:05:24 INFO: 3 became leader at term 2
...
23:05:25 INFO: 2 became leader at term 3
... and that caused some uncommitted entries to be wiped from the log (legitimately):
23:05:25 INFO: replace the unstable entries from index 7
23:05:25 INFO: replace the unstable entries from index 7
So we never committed all 100 entries, and the wait loop got stuck waiting for the commit index to be >= 100.
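To illustrate the mechanism, here is a toy model of a follower log. It is a simplified sketch, not the etcd-io/raft implementation: the `entry` and `toyLog` types, `newLog`, and `appendFrom` are hypothetical, and the numbers (100 entries at term 2, commit index 6, a new leader at term 3 sending index 7) are assumptions chosen to match the log lines above. When the new leader's entry conflicts with an uncommitted entry, everything from that index on is replaced, so the entries proposed under the old leader are gone unless someone proposes them again.

```go
package main

import "fmt"

// entry is a toy log entry; only the fields needed for conflict detection.
type entry struct {
	Index, Term uint64
}

// toyLog is a hypothetical follower log where entries[i] has Index i+1.
type toyLog struct {
	entries []entry
	commit  uint64
}

// newLog builds a log with n entries at the given term and commit index.
func newLog(n int, term, commit uint64) *toyLog {
	l := &toyLog{commit: commit}
	for i := 1; i <= n; i++ {
		l.entries = append(l.entries, entry{Index: uint64(i), Term: term})
	}
	return l
}

// appendFrom accepts contiguous entries from a leader. At the first index
// whose term differs from what the log holds, the tail of the log is
// replaced. It returns the index the replacement started from, or 0 if no
// existing entry was overwritten.
func (l *toyLog) appendFrom(newEntries []entry) (replacedFrom uint64) {
	for i, e := range newEntries {
		pos := int(e.Index) - 1
		if pos > len(l.entries) {
			panic("gap between log and incoming entries")
		}
		if pos < len(l.entries) && l.entries[pos].Term == e.Term {
			continue // already have this entry
		}
		if e.Index <= l.commit {
			panic("conflict with a committed entry")
		}
		if pos < len(l.entries) {
			replacedFrom = e.Index
		}
		l.entries = append(l.entries[:pos], newEntries[i:]...)
		return replacedFrom
	}
	return 0
}

func main() {
	l := newLog(100, 2, 6) // 100 entries appended under term 2, 6 committed
	from := l.appendFrom([]entry{{Index: 7, Term: 3}})
	fmt.Printf("replaced from index %d, log length now %d, commit %d\n",
		from, len(l.entries), l.commit)
}
```

In this model nothing re-proposes the dropped entries, so a loop waiting for the commit index to reach 100 would never finish.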
Hypothesis on the root cause of this flake: etcd-io/raft#195 (comment)
We have marked this test failure issue as stale because it has been
raft/rafttest.TestBasicProgress failed with artifacts on release-24.1 @ d2ac82658f68c96394a811a3eaecafcd808403f3:
Help
See also: How To Investigate a Go Test Failure (internal)
This test on roachdash | Improve this report!
Jira issue: CRDB-37487