Automated backport of #1472: Handle transient ServiceImport delete in the resolver #1473

tpantelis

Backport of #1472 on release-0.16.

#1472: Use zerolog in coredns plugin

For details on the backport process, see the backport requests page.

...so we get timestamps and pretty formatting.

Signed-off-by: Tom Pantelis <[email protected]>
During disaster recovery (DR) testing where the broker cluster is recovered
from a backup, it was observed that the coredns plugin resolver received a
ServiceImport delete event followed by an add event while the EndpointSlice
remained. This resulted in failed DNS queries because the service info was
removed. The resolver assumed that a ServiceImport deletion means the
service was unexported, which is normally the case. However, during DR
scenarios, transient deletions may occur due to reconciliation on Lighthouse
(LH) agent startup if the agent observes that a local copy of a remote
resource doesn't yet exist on the broker due to timing. When this occurs, a
restart of the coredns pod is required to correct it.

To alleviate this issue, make the resolver more resilient by only
removing the service info data when there's no more cluster data, i.e.
when all the cluster EndpointSlices are deleted. Normally, on unexport,
the EndpointSlices are deleted after the aggregated ServiceImport is
deleted.

Signed-off-by: Tom Pantelis <[email protected]>
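
The behavior described in the commit message above can be illustrated with a minimal sketch. This is not the actual Lighthouse resolver code: the `resolver` and `serviceInfo` types, their method names, and the string keys are all hypothetical and chosen only for illustration. The sketch also assumes that an entry is dropped only when both the ServiceImport is gone and no per-cluster EndpointSlice data remains.

```go
package main

import (
	"fmt"
	"sort"
)

// serviceInfo is a hypothetical stand-in for the resolver's per-service record.
type serviceInfo struct {
	imported bool                // true while the aggregated ServiceImport exists
	clusters map[string][]string // cluster ID -> endpoint IPs from its EndpointSlices
}

// resolver is a hypothetical, simplified model of the coredns plugin resolver.
type resolver struct {
	services map[string]*serviceInfo
}

func newResolver() *resolver {
	return &resolver{services: map[string]*serviceInfo{}}
}

// PutServiceImport records (or re-records) the ServiceImport for key.
func (r *resolver) PutServiceImport(key string) {
	si := r.services[key]
	if si == nil {
		si = &serviceInfo{clusters: map[string][]string{}}
		r.services[key] = si
	}
	si.imported = true
}

// RemoveServiceImport marks the import as gone but keeps the service info
// while any cluster data remains, so a transient delete does not break DNS.
func (r *resolver) RemoveServiceImport(key string) {
	si := r.services[key]
	if si == nil {
		return
	}
	si.imported = false
	if len(si.clusters) == 0 {
		delete(r.services, key)
	}
}

// PutEndpointSlice stores endpoint IPs for one cluster. It returns false if
// the service is unknown to the resolver.
func (r *resolver) PutEndpointSlice(key, clusterID string, ips []string) bool {
	si := r.services[key]
	if si == nil {
		return false
	}
	si.clusters[clusterID] = ips
	return true
}

// RemoveEndpointSlice drops one cluster's data and removes the service info
// once no cluster data is left and the ServiceImport is also gone.
func (r *resolver) RemoveEndpointSlice(key, clusterID string) {
	si := r.services[key]
	if si == nil {
		return
	}
	delete(si.clusters, clusterID)
	if len(si.clusters) == 0 && !si.imported {
		delete(r.services, key)
	}
}

// Lookup returns the sorted endpoint IPs for key and whether the service is known.
func (r *resolver) Lookup(key string) ([]string, bool) {
	si := r.services[key]
	if si == nil {
		return nil, false
	}
	var ips []string
	for _, clusterIPs := range si.clusters {
		ips = append(ips, clusterIPs...)
	}
	sort.Strings(ips)
	return ips, true
}

func main() {
	r := newResolver()
	r.PutServiceImport("ns/svc")
	r.PutEndpointSlice("ns/svc", "east", []string{"10.0.0.1"})

	// Transient delete followed by an add: the service stays resolvable.
	r.RemoveServiceImport("ns/svc")
	fmt.Println(r.Lookup("ns/svc"))
	r.PutServiceImport("ns/svc")

	// Real unexport: the ServiceImport goes first, then the EndpointSlices.
	r.RemoveServiceImport("ns/svc")
	r.RemoveEndpointSlice("ns/svc", "east")
	fmt.Println(r.Lookup("ns/svc"))
}
```

In this sketch the ordering noted in the commit message (EndpointSlices deleted after the aggregated ServiceImport on unexport) is what lets the final EndpointSlice removal act as the cleanup trigger.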
@submariner-bot

🤖 Created branch: z_pr1473/tpantelis/automated-backport-of-#1472-upstream-release-0.16
🚀 Full E2E won't run until the "ready-to-test" label is applied. I will add it automatically once the PR has 2 approvals, or you can add it manually.

@tpantelis changed the title from "Automated backport of #1472: Use zerolog in coredns plugin" to "Automated backport of #1472: Handle transient ServiceImport delete in the resolver" on Dec 22, 2023
@submariner-bot added the "ready-to-test" label (When a PR is ready for full E2E testing) on Jan 2, 2024
@tpantelis merged commit db23c0f into submariner-io:release-0.16 on Jan 3, 2024
34 checks passed
@submariner-bot

🤖 Closed branches: [z_pr1473/tpantelis/automated-backport-of-#1472-upstream-release-0.16]

tpantelis added a commit to tpantelis/submariner-website that referenced this pull request Jan 3, 2024
aswinsuryan pushed a commit to submariner-io/submariner-website that referenced this pull request Jan 5, 2024
tpantelis added a commit to tpantelis/submariner-website that referenced this pull request Jan 11, 2024
aswinsuryan pushed a commit to submariner-io/submariner-website that referenced this pull request Jan 11, 2024
@tpantelis deleted the automated-backport-of-#1472-upstream-release-0.16 branch on August 19, 2024 17:35