K8s resolver for the gRPC client #4190

Merged
merged 18 commits on Oct 19, 2023

Member

@pingsutw pingsutw commented Oct 9, 2023

Tracking issue

#3936

Describe your changes

When the Flyte agent runs behind a Horizontal Pod Autoscaler (HPA), the gRPC client needs to learn the IP addresses of newly created pods so that it can load-balance requests across them.

To achieve this, we add a custom name resolver (k8sResolver) that can resolve k8s:///.... endpoints.
The resolver starts a goroutine that keeps watching the Kubernetes Endpoints and updates the addresses of the gRPC client's clientConn.
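
The watch-and-update loop described above can be illustrated with a minimal, standard-library-only sketch. This is not the code from this PR: `addressUpdater`, `recorder`, `watch`, and `runOnce` are hypothetical names, the `events` channel stands in for a Kubernetes Endpoints watch, and `UpdateAddresses` stands in for pushing a new address list to the gRPC client connection.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// addressUpdater is a hypothetical stand-in for the gRPC client connection
// that a real resolver would push address lists to.
type addressUpdater interface {
	UpdateAddresses(addrs []string)
}

// recorder remembers the most recent address list it received.
type recorder struct {
	mu   sync.Mutex
	last []string
}

func (r *recorder) UpdateAddresses(addrs []string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.last = append([]string(nil), addrs...)
}

// Last returns a copy of the most recent address list.
func (r *recorder) Last() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return append([]string(nil), r.last...)
}

// watch forwards every endpoint snapshot from events to the updater until
// the context is cancelled or the event channel is closed.
func watch(ctx context.Context, events <-chan []string, u addressUpdater) {
	for {
		select {
		case <-ctx.Done():
			return
		case addrs, ok := <-events:
			if !ok {
				return
			}
			u.UpdateAddresses(addrs)
		}
	}
}

// runOnce feeds the given snapshots through watch in a background goroutine
// and returns the last address list the updater saw.
func runOnce(snapshots ...[]string) []string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	events := make(chan []string)
	rec := &recorder{}
	done := make(chan struct{})
	go func() {
		defer close(done)
		watch(ctx, events, rec)
	}()

	for _, s := range snapshots {
		events <- s
	}
	close(events)
	<-done
	return rec.Last()
}

func main() {
	last := runOnce(
		[]string{"10.0.0.1:8000"},
		[]string{"10.0.0.1:8000", "10.0.0.2:8000"}, // e.g. after the HPA adds a pod
	)
	fmt.Println(last)
}
```

Passing a context into the goroutine gives the caller a way to stop the watcher when the client connection is closed.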

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Signed-off-by: Kevin Su <[email protected]>
@pingsutw pingsutw marked this pull request as draft October 9, 2023 23:13
codecov bot commented Oct 10, 2023

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (a114769) 59.06% compared to head (87468c9) 59.58%.
Report is 4 commits behind head on master.

❗ Current head 87468c9 differs from pull request most recent head 6c4dc83. Consider uploading reports for the commit 6c4dc83 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4190      +/-   ##
==========================================
+ Coverage   59.06%   59.58%   +0.52%     
==========================================
  Files         621      552      -69     
  Lines       53105    39919   -13186     
==========================================
- Hits        31365    23787    -7578     
+ Misses      19240    13789    -5451     
+ Partials     2500     2343     -157     
Flag Coverage Δ
unittests ?

Files Coverage Δ
...ugins/go/tasks/plugins/webapi/databricks/plugin.go 65.82% <100.00%> (+4.48%) ⬆️
flytepropeller/pkg/controller/controller.go 11.52% <0.00%> (-0.23%) ⬇️

... and 557 files with indirect coverage changes

@pingsutw pingsutw marked this pull request as ready for review October 12, 2023 09:16
@pingsutw pingsutw requested review from honnix and EngHabu October 16, 2023 19:08
}

parts := strings.SplitN(service, ".", 3)
if len(parts) >= 2 {
Contributor

Should we set the namespace to default if len(parts) == 1?

Member Author

@pingsutw pingsutw Oct 17, 2023

It will use the namespace where the agent is running if no namespace is given.
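
The namespace handling discussed in this thread could look roughly like the sketch below. It reuses the `strings.SplitN(service, ".", 3)` call from the snippet under review; everything else is an assumption for illustration: `splitService` and its `fallbackNamespace` parameter are hypothetical, with the fallback standing in for "the namespace where the agent is running", and the service names in `main` are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// splitService splits "name[.namespace[.rest]]" into a service name and a
// namespace. fallbackNamespace is a hypothetical parameter that is used when
// the target carries no namespace part.
func splitService(service, fallbackNamespace string) (name, namespace string) {
	parts := strings.SplitN(service, ".", 3)
	name = parts[0]
	namespace = fallbackNamespace
	if len(parts) >= 2 && parts[1] != "" {
		namespace = parts[1]
	}
	return name, namespace
}

func main() {
	// Illustrative service names only; they are not taken from the PR.
	for _, s := range []string{
		"flyteagent",
		"flyteagent.flyte",
		"flyteagent.flyte.svc.cluster.local",
	} {
		name, ns := splitService(s, "current-ns")
		fmt.Printf("%s -> name=%s namespace=%s\n", s, name, ns)
	}
}
```

Because `SplitN` is called with a limit of 3, any trailing domain suffix stays in the third element and does not affect the name or namespace.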

}

// NewBuilder creates a kubeBuilder which is used by grpc resolver.
func NewBuilder(client kubernetes.Interface, schema string) resolver.Builder {
Contributor

Do you want to pass ctx here and use it in .Build() ?

Member Author

updated it

})
}
}
err := k.cc.UpdateState(resolver.State{Addresses: newAddrs})
Contributor

Is cc.UpdateState thread-safe? Do we need to lock its usage in some way?

Member

@honnix honnix left a comment

Minor comments otherwise lgtm

}

// NewBuilder creates a kubeBuilder which is used by grpc resolver.
func NewBuilder(ctx context.Context, client kubernetes.Interface, schema string) resolver.Builder {
Member

If we don't intend to expose the scheme, maybe we could have another builder only for test.

Member Author

qq: can't we use NewBuilder to create a fake kube Builder?

Member

I meant that we are exposing the schema string to users unnecessarily, because they are not supposed to freely choose any string they like.
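
One possible reading of this suggestion is sketched below: the exported constructor fixes the scheme, and an unexported constructor lets in-package tests pick a different one. This is an assumption about the intent, not the code that was merged. `Builder` here is a hypothetical stand-in type rather than gRPC's resolver builder, `newBuilderWithScheme` is an invented name, and the `ctx` and `client` parameters of the real `NewBuilder` are omitted to keep the sketch self-contained.

```go
package main

import "fmt"

// Scheme is the only URI scheme the exported constructor uses.
const Scheme = "k8s"

// Builder is a hypothetical stand-in for a gRPC resolver builder.
type Builder struct {
	scheme string
}

// Scheme reports the URI scheme this builder handles.
func (b *Builder) Scheme() string { return b.scheme }

// NewBuilder is what callers outside the package would use; the scheme is
// fixed, so users cannot pass an arbitrary string.
func NewBuilder() *Builder { return newBuilderWithScheme(Scheme) }

// newBuilderWithScheme is unexported, so only code in the same package
// (for example its tests) can choose a different scheme.
func newBuilderWithScheme(scheme string) *Builder {
	return &Builder{scheme: scheme}
}

func main() {
	fmt.Println(NewBuilder().Scheme())
	fmt.Println(newBuilderWithScheme("k8s-test").Scheme())
}
```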

)

const (
K8sSchema = "k8s"
Member

Nit: I think this is called scheme.

Member Author

done

}

// Make sure watcher is started before we create the endpoint
time.Sleep(5 * time.Second)
Member

Is it possible to make the waiting time shorter? This increases build time.

Member Author

I set it to 2 seconds now.
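
An alternative to a fixed sleep, which this PR did not adopt, is to poll for a readiness condition with a timeout so the test continues as soon as the watcher is up. The sketch below assumes the test has some way to observe that the watcher has started; `waitFor` and the `started` flag are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// waitFor polls cond every interval until it returns true or the timeout
// elapses. It reports whether cond became true in time.
func waitFor(cond func() bool, timeout, interval time.Duration) bool {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return true
		}
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(interval)
	}
}

func main() {
	// started is a hypothetical flag that a watcher would set once it is running.
	var started atomic.Bool
	go func() {
		time.Sleep(20 * time.Millisecond) // simulated watcher start-up
		started.Store(true)
	}()

	ok := waitFor(started.Load, 2*time.Second, 5*time.Millisecond)
	fmt.Println("watcher started:", ok)
}
```

With this approach the timeout only bounds the worst case; in the common case the wait ends within one polling interval of the condition becoming true.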

)

const (
Schema = "k8s"
Contributor

I think you were going to rename it to Scheme.

Suggested change
- Schema = "k8s"
+ Scheme = "k8s"

Member

Yeah that's what I meant.

@pingsutw pingsutw merged commit e7035a7 into master Oct 19, 2023
36 checks passed
@pingsutw pingsutw deleted the k8s-resolver-v1 branch October 19, 2023 00:21