[BUG] Flytepropeller error and crash when "limit-namespace" is set #5087
Comments
I'm also running into this issue and will help take a look.
Root cause analysis
I haven't found a good solution yet. Flyte mixes the usage of
#take
Resolves: flyteorg#5087 Signed-off-by: Chi-Sheng Liu <[email protected]>
It looks like we upgraded controller-runtime from 0.12.1 to 0.16.2 when we added support for OpenTelemetry. That brought in a whole lot of changes, one of which is the change you mentioned, @ByronHsu (line 287 of cache.go, which is hidden). How did you find this? Tracing through, I couldn't find my way to that cache.go function. I don't think the current proposal quite works, though. We have to do this through the informer; it's way too expensive to go through the client every time (unless the newer client is somehow already caching things?). @hamersaw, could you take a look at this when you get a chance, please? I haven't worked with controller-runtime enough. Thanks again for the investigation, Byron.
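To make the cost concern above concrete, here is a minimal, standard-library-only Go sketch of the difference between reading through an informer-style local cache and calling the API on every lookup. Everything in it (`Workflow`, `countingAPI`, `namespacedCache`) is a hypothetical stand-in invented for illustration; it is not Flyte or controller-runtime code, and a real informer would also watch for updates, which this sketch leaves out.

```go
package main

import (
	"fmt"
	"sync"
)

// Workflow is a hypothetical stand-in for a namespaced resource.
type Workflow struct {
	Namespace string
	Name      string
}

// countingAPI is a hypothetical stand-in for an API server.
// It counts how many round trips are made to it.
type countingAPI struct {
	mu    sync.Mutex
	calls int
	items []Workflow
}

// List returns the items in ns (empty ns means all namespaces).
// Every call counts as one round trip.
func (a *countingAPI) List(ns string) []Workflow {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.calls++
	var out []Workflow
	for _, w := range a.items {
		if ns == "" || w.Namespace == ns {
			out = append(out, w)
		}
	}
	return out
}

// Calls reports the number of round trips made so far.
func (a *countingAPI) Calls() int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.calls
}

// namespacedCache is an informer-like local store: it lists one
// namespace once and then serves reads from memory.
type namespacedCache struct {
	mu     sync.RWMutex
	byName map[string]Workflow
}

// newNamespacedCache fills the cache with a single List call scoped to ns.
func newNamespacedCache(api *countingAPI, ns string) *namespacedCache {
	c := &namespacedCache{byName: map[string]Workflow{}}
	for _, w := range api.List(ns) {
		c.byName[w.Name] = w
	}
	return c
}

// Get reads from the local store and never touches the API.
func (c *namespacedCache) Get(name string) (Workflow, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	w, ok := c.byName[name]
	return w, ok
}

func main() {
	api := &countingAPI{items: []Workflow{
		{Namespace: "team-a", Name: "wf-1"},
		{Namespace: "team-b", Name: "wf-2"},
	}}

	// Cached path: one List up front, then 1000 in-memory reads.
	cache := newNamespacedCache(api, "team-a")
	for i := 0; i < 1000; i++ {
		cache.Get("wf-1")
	}
	fmt.Println("round trips after cached reads:", api.Calls())

	// Direct path: every lookup is a round trip.
	for i := 0; i < 1000; i++ {
		api.List("team-a")
	}
	fmt.Println("round trips after direct reads:", api.Calls())
}
```

In this toy setup the cached path costs a single round trip no matter how many reads follow, while the direct path costs one per read. That is the trade-off the comment is pointing at when it asks for the fix to go through the informer.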
Thanks @wild-endeavor! Bullet 1 is in Flyte code, but bullet 2 (cache.go) is in controller-runtime code.
@wild-endeavor
Update unittests corresponding to the changes from SharedIndexInformer to Informer Resolves: flyteorg#5087 Signed-off-by: Chi-Sheng Liu <[email protected]>
Describe the bug
Flytepropeller logs an error and the pod crashes when "limit-namespace" is set (version 1.11.0; same for versions 1.10.6 and 1.10.7).
Expected behavior
Flytepropeller runs without crashing when "limit-namespace" is set.
Additional context to reproduce
Screenshots
No response
Are you sure this issue hasn't been raised already?
Have you read the Code of Conduct?