Avoid repeat error reports #218

Closed

@zdevito (Contributor) commented Jun 10, 2025

Stack from ghstack (oldest at bottom):

If an op depends on two nodes that are failing at the time it is issued, then history will generate two failure messages for the same sequence number. If a future was attached to that sequence number, the second failure raises a KeyError in the invocation dictionary, because handling the first failure already removed the entry.

Hitting this depends on a race condition, because the failure messages must already be present when the failing node is being added.
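
As a rough illustration of the failure mode described above (not the actual change in this PR), the sketch below uses a hypothetical `InvocationTable` class standing in for the invocation dictionary. The class and the method names `register`, `fail_strict`, and `fail` are assumptions made for the example. A strict `pop` raises `KeyError` on the second failure report for a sequence number, while a `pop` with a default treats the repeat as a no-op.

```python
from concurrent.futures import Future


class InvocationTable:
    """Hypothetical stand-in for a dict mapping sequence numbers to futures."""

    def __init__(self):
        self._pending = {}  # seq (int) -> Future

    def register(self, seq):
        """Attach a future to a sequence number and return it."""
        fut = Future()
        self._pending[seq] = fut
        return fut

    def fail_strict(self, seq, error):
        """Strict removal: a second report for the same seq raises KeyError."""
        fut = self._pending.pop(seq)
        fut.set_exception(error)

    def fail(self, seq, error):
        """Tolerant removal: a repeated report for the same seq is ignored.

        Returns True if this call delivered the error, False if the entry
        was already removed by an earlier report.
        """
        fut = self._pending.pop(seq, None)
        if fut is None:
            return False
        fut.set_exception(error)
        return True
```

With the tolerant variant, the first failure report resolves the future and any later report for the same sequence number is dropped, so the caller sees only the first error.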

Differential Revision: D76372510

[ghstack-poisoned]
@facebook-github-bot added the CLA Signed label Jun 10, 2025
@facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D76372510

mariusae pushed a commit to mariusae/monarch-1 that referenced this pull request Jun 10, 2025
Summary:

If an op depends on two nodes that are failing at the time it is issued, then history will generate two failure messages for the same sequence number. If a future was attached to that sequence number, the second failure raises a KeyError in the invocation dictionary, because handling the first failure already removed the entry.

Hitting this depends on a race condition, because the failure messages must already be present when the failing node is being added.
ghstack-source-id: 289532983
exported-using-ghexport

Reviewed By: suo

Differential Revision: D76372510
@facebook-github-bot (Contributor) commented:

This pull request has been merged in 2432c26.

Labels: CLA Signed, fb-exported, Merged