-
Notifications
You must be signed in to change notification settings - Fork 494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Diagnostics Logs: Fixes ContactReplicas to only include the successful replica contacted per physical partition. #4156
Conversation
…licas-to-only-include-the-successful-replicas-contacted-per-physical-partition
ContactedReplicas() semantically needs to include all replicas contacted part of serving request (ex: retries, quorum etc...). |
Thanks for your feedback. So, for clarification, are we saying that the bug is not that ContactReplicas should only show the last contacted replica only (and I guess that makes sense), but somehow there are replicas in the contacted replica array that has not actually been contacted that should not obviously be there? How did we come to this conclusion at the inception of this bug because if that is the case, not sure how to repo this? Maybe we should just close this for now. |
@@ -237,6 +237,11 @@ public void Visit(ClientSideRequestStatisticsTraceDatum clientSideRequestStatist | |||
this.jsonWriter.WriteObjectEnd(); | |||
} | |||
|
|||
private static List<TransportAddressUri> GetFinalContactedReplica(ClientSideRequestStatisticsTraceDatum clientSideRequestStatisticsTraceDatum) | |||
{ | |||
return new List<TransportAddressUri> { clientSideRequestStatisticsTraceDatum.ContactedReplicas.LastOrDefault() }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A request might have contacted multiple replicas, why are we taking only 1?
A request can go to 1 replica or for some reason (Strong Consistency for example checking quorum, or receiving a 404/1002 and having to retry) might reach to another replica. The problem is not that the list has One way to tackle this is first understand all the steps/places involved in generating and collecting this data to understand which of them is wrong? |
Closing inactive PRs older than 3-months, please re-active it needed. |
Pull request was closed
Pull Request Template
Description
Describe the bug
Fixing a bug for ContactedReplicas. It should only display the one replica per physical partition that was contacted successfully, not all replicas contacted for the partition.
To Reproduce
Steps to reproduce the behavior. If you can include code snippets or links to repositories containing a repro of the issue that can helps us in detecting the scenario it would speed up the resolution.
Print any diagnostic log string where multiple replicas on the same physical partition failed and one was finally successful.
Expected behavior
Actual behavior
Environment summary
SDK Version: All
OS Version (e.g. Windows, Linux, MacOSX)
Additional context
Add any other context about the problem here (for example, complete stack traces or logs).
Solution
Impact of change
Breaking contracts: Tempting as it may be, I do not wish to break the current contract which is keeping ContactedReplicas as part of an array with a count of 1, even though we are always forcing only the last contacted replica in the array to be revealed. Always open to discuss if we want to break this, but for now, the impact of change is at its minimum.
Duplicate contact regions: The method and tests to verify the removing of duplicate contact replicas from the array is unnecessary since we are only revealing the last contacted region.
Type of change
Please delete options that are not relevant.
Closing issues
To automatically close an issue: closes #IssueNumber