Resource hints updates are ignored and lead to a broken communication #608

Closed
Necropaw opened this issue Nov 17, 2022 · 6 comments

@Necropaw

Background

In our project we use https://github.com/envoyproxy/go-control-plane with the xDS provider from https://github.com/wongnai/xds, and grpc-java 1.50.0 as the client.

The client is communicating with multiple gRPC services on different hosts.

Problem

go-control-plane ignores resource hints.

A grpc-java client changes the set of requested resources over its lifetime, but go-control-plane remembers every resource the client has ever requested and does not send a resource again when the client re-requests it.

grpc-java client changing the requested resources:

2022-10-18 @ 07:11:28.057, DEBUG, http-nio-8080-exec-6, [xds-client<38>: (xds-resolver:1337)] Sending LDS request for resources: [service-a, service-b, service-c]
2022-10-18 @ 07:44:24.259, DEBUG, grpc-default-worker-ELG-1-2, [xds-client<38>: (xds-resolver:1337)] Sending LDS request for resources: [service-a]

Some time later the client wants to communicate with service-b again and tries to resolve it:

"2022-10-18 @ 08:05:33.361",DEBUG,"http-nio-8080-exec-5","[xds-resolver<92>: (service-b)] Start watching LDS resource service-b"
"2022-10-18 @ 08:05:33.361",TRACE,"http-nio-8080-exec-5","[xds-client<38>: (xds-resolver:1337)] Sent DiscoveryRequest
{
  ""versionInfo"": ""422bd1d65455a992"",
  ""node"": {
    ""id"": ""client-node-id"",
    ""userAgentName"": ""gRPC Java"",
    ""userAgentVersion"": ""1.50.0"",
    ""clientFeatures"": [""envoy.lb.does_not_support_overprovisioning"", ""xds.config.resource-in-sotw""]
  },
  ""resourceNames"": [""service-a"", ""service-b""],
  ""typeUrl"": ""type.googleapis.com/envoy.config.listener.v3.Listener"",
  ""responseNonce"": ""124""
}"
"2022-10-18 @ 08:05:33.361",DEBUG,"http-nio-8080-exec-5","[xds-client<35>] Subscribe io.grpc.xds.XdsListenerResource@74a2f92e resource service-b"
"2022-10-18 @ 08:05:33.361",DEBUG,"http-nio-8080-exec-5","[xds-client<38>: (xds-resolver:1337)] Sending LDS request for resources: [service-a, service-b]"
"2022-10-18 @ 08:05:46.369",DEBUG,"grpc-timer-0","[xds-client<35>] LDS resource service-b initial fetch timeout"
"2022-10-18 @ 08:05:46.369",DEBUG,"grpc-timer-0","[xds-client<35>] Conclude LDS resource service-b not exist"
"2022-10-18 @ 08:05:46.369",DEBUG,"grpc-timer-0","[xds-resolver<92>: (service-b)] LDS resource does not exist: service-b"

This fails because the xDS provider remembers that this client already knows service-b at this version, so it only adds a watcher for possible upcoming changes:

"2022-10-18 @ 08:05:29.397",debug,"-","open watch 1528 for type.googleapis.com/envoy.config.listener.v3.Listener[service-a service-b] from nodeID """", version ""422bd1d65455a992"""
"2022-10-18 @ 08:05:29.397",debug,"-","nodeID """" requested type.googleapis.com/envoy.config.listener.v3.Listener[service-a service-b] and known map[service-c:{} service-d:{} service-a:{} service-b:{}]. Diff []"
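The empty `Diff []` in the log above can be reproduced with a minimal sketch. This is not the go-control-plane source; `missingFromKnown` is a hypothetical helper that only assumes the server compares the requested names against a per-stream set of names it believes the client already has:

```go
package main

import (
	"fmt"
	"sort"
)

// missingFromKnown is a hypothetical helper (not a go-control-plane API).
// It returns the requested names that are absent from the set of names the
// server believes the client already knows, sorted for stable output.
func missingFromKnown(requested []string, known map[string]struct{}) []string {
	diff := []string{}
	for _, name := range requested {
		if _, ok := known[name]; !ok {
			diff = append(diff, name)
		}
	}
	sort.Strings(diff)
	return diff
}

func main() {
	// State taken from the server log: the known set still contains
	// service-b even though the client dropped it at 07:44.
	known := map[string]struct{}{
		"service-a": {}, "service-b": {}, "service-c": {}, "service-d": {},
	}
	// The client re-subscribes to service-b at 08:05.
	fmt.Println(missingFromKnown([]string{"service-a", "service-b"}, known)) // prints []
}
```

Because service-b was never removed from the known set, the diff is empty, nothing is sent, and the client's initial-fetch timer expires.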

At this stage the client can no longer communicate with service-b.

go-control-plane does not update the known resources in https://github.com/envoyproxy/go-control-plane/blob/main/pkg/server/sotw/v3/server.go#L186, because the resource hints are just DiscoveryRequests without a nonce as per specification.
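One possible direction, sketched here under assumptions and not taken from go-control-plane or from the eventual fix, is to prune the known set on every request so that it never contains names the client has stopped asking for. `streamState` and `onRequest` are hypothetical names, and the sketch covers only the bookkeeping; it ignores versions, nonces, and the rest of the SotW protocol:

```go
package main

import "fmt"

// streamState is a hypothetical per-stream record of the resource names the
// server believes the client currently holds.
type streamState struct {
	known map[string]struct{}
}

// onRequest treats the request's resource names as the full subscription.
// It forgets names that are no longer requested, then returns the requested
// names that must be (re)sent and marks them as known.
func (s *streamState) onRequest(requested []string) []string {
	wanted := make(map[string]struct{}, len(requested))
	for _, name := range requested {
		wanted[name] = struct{}{}
	}
	// Prune: deleting from a map while ranging over it is safe in Go.
	for name := range s.known {
		if _, ok := wanted[name]; !ok {
			delete(s.known, name)
		}
	}
	var toSend []string
	for _, name := range requested {
		if _, ok := s.known[name]; !ok {
			toSend = append(toSend, name)
			s.known[name] = struct{}{}
		}
	}
	return toSend
}

func main() {
	s := &streamState{known: map[string]struct{}{}}
	// The same sequence of LDS requests as in the client log.
	fmt.Println(s.onRequest([]string{"service-a", "service-b", "service-c"}))
	fmt.Println(s.onRequest([]string{"service-a"}))
	fmt.Println(s.onRequest([]string{"service-a", "service-b"}))
}
```

With this bookkeeping, the third request reports service-b as needing to be sent again, because narrowing the subscription to service-a removed it from the known set.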

Reference

I initially thought this was a grpc-java issue and you can find more info here: grpc/grpc-java#9632

@github-actions

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Dec 17, 2022
@Necropaw
Author

no stalebot

@alecholmez
Contributor

@Necropaw what version of xDS are you using? Delta or SOTW?

@gowtham-sundara

@alecholmez This is for the SotW version. Unfortunately, on the Java side, once the resolution fails it never works again until the entire thing is restarted.

@atollena

atollena commented Jan 3, 2024

I think this can be marked as a dup of #431.

FYI @alecholmez: all gRPC clients (Go, Java, C-Core and Node) only support SotW ADS (per https://github.com/grpc/proposal/blob/master/A27-xds-global-load-balancing.md#background).

@valerian-roche
Contributor

Closing as a dupe of #431. This issue is also related to re-subscription not being handled properly.
