Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "Add ray dashboard log link" #4226

Closed
wants to merge 2 commits into from

Conversation

jeevb
Copy link
Contributor

@jeevb jeevb commented Oct 13, 2023

Traced a regression in Ray plugin back to this change: 66e7b64

Error:

Workflow[flytesnacks:development:.flytegen.task.ray_task] failed. RuntimeExecutionError: max number of system retry attempts [11/10] exhausted. Last known status message: failed at Node[taskraytask]. RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [ray]: panic when executing a plugin [ray]. Stack: [goroutine 2818 [running]:
runtime/debug.Stack()
	/opt/homebrew/opt/go/libexec/src/runtime/debug/stack.go:24 +0x64
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/task.Handler.invokePlugin.func1.1()
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/task/handler.go:382 +0xb4
panic({0x103e54b40?, 0x105c06500?})
	/opt/homebrew/opt/go/libexec/src/runtime/panic.go:914 +0x218
github.com/flyteorg/flyte/flyteplugins/go/tasks/plugins/k8s/ray.rayJobResourceHandler.GetTaskPhase({}, {0x140063a83c0?, 0x20?}, {0x1043e5ef0, 0x140063a83c0}, {0x1043fdc88?, 0x14005235180})
	/Users/jeev/Workspace/repos/flyte/flyte/flyteplugins/go/tasks/plugins/k8s/ray/ray.go:405 +0x268
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/task/k8s.(*PluginManager).checkResourcePhase(0x140063b6a68, {0x1043d9df8, 0x140063a6d80}, {0x1043ee780, 0x140063ac0c0}, {0x1043fdc88?, 0x14005235180?}, 0x140063a2588)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/task/k8s/plugin_manager.go:282 +0x5f8
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/task/k8s.PluginManager.Handle({{0x102b18ce4, 0x3}, {0x1043da3a8, 0x105d39fa0}, {0x1043b9dc0, 0x14000778b00}, {0x14b9be8a8, 0x1400106b520}, {{0x1043fa058, 0x1400108acd0}, ...}, ...}, ...)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/task/k8s/plugin_manager.go:344 +0x3a8
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/task.Handler.invokePlugin.func1(0x140063b6f88?, {0x1043d9df8, 0x140063a6a80}, {0x1043de2c0?, 0x14002214160?}, 0x9?)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/task/handler.go:389 +0x110
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/task.Handler.invokePlugin({{0x1043dd4c0, 0x14000134cf0}, {0x1043bd970, 0x140001a8a00}, 0x14000d13140, 0x14000d13170, 0x14000d131a0, {0x1043de2c0, 0x14002718580}, 0x14001224d20, ...}, ...)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/task/handler.go:391 +0x60
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/task.Handler.Handle({{0x1043dd4c0, 0x14000134cf0}, {0x1043bd970, 0x140001a8a00}, 0x14000d13140, 0x14000d13170, 0x14000d131a0, {0x1043de2c0, 0x14002718580}, 0x14001224d20, ...}, ...)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/task/handler.go:572 +0x630
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/dynamic.dynamicNodeTaskNodeHandler.handleParentNode({{0x1043e76b8, 0x14001290680}, {{0x14001289840, {{...}, 0x0}, {0x140008bb130, 0xb, 0xb}}, {0x14001289860, {{...}, ...}, ...}, ...}, ...}, ...)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/dynamic/handler.go:67 +0xa0
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes/dynamic.dynamicNodeTaskNodeHandler.Handle({{0x1043e76b8, 0x14001290680}, {{0x14001289840, {{...}, 0x0}, {0x140008bb130, 0xb, 0xb}}, {0x14001289860, {{...}, ...}, ...}, ...}, ...}, ...)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/dynamic/handler.go:224 +0x888
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*nodeExecutor).execute(0x140012900d0, {0x1043d9df8, 0x14006383d40}, {0x1043dd580, 0x14000790780}, {0x1043f1cf0, 0x140063ac000}, {0x1044002e0?, 0x14006394460?})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/executor.go:822 +0xd8
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*nodeExecutor).handleQueuedOrRunningNode(0x140012900d0, {0x1043d9df8, 0x14006383d40}, {0x1043f1cf0?, 0x140063ac000?}, {0x1043dd580, 0x14000790780})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/executor.go:1116 +0x838
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*nodeExecutor).HandleNode(0x140012900d0, {0x1043d9df8, 0x14006383d40}, {0x1043b69e0, 0x14005713900}, {0x1043f1cf0, 0x140063ac000?}, {0x1043dd580?, 0x14000790780})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/executor.go:1384 +0x228
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*recursiveNodeExecutor).RecursiveNodeHandler(0x14000b37e00, {0x1043d9df8, 0x140063836e0}, {0x1043f6a48, 0x1400624eaf0}, {0x1043b69e0, 0x14005713900}, {0x1043dab18?, 0x14005713900?}, {0x1043f38e0, ...})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/executor.go:229 +0x51c
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*recursiveNodeExecutor).handleDownstream(0x1043d9df8?, {0x1043d9df8, 0x140063836e0}, {0x1043f6a48, 0x1400624eaf0}, {0x1043b69e0, 0x14005713900?}, {0x1043dab18?, 0x14005713900}, {0x1043f38e0, ...})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/executor.go:294 +0x300
github.com/flyteorg/flyte/flytepropeller/pkg/controller/nodes.(*recursiveNodeExecutor).RecursiveNodeHandler(0x14000b37e00, {0x1043d9df8, 0x140063836e0}, {0x1043f6a48, 0x1400624eaf0}, {0x1043b69e0, 0x14005713900}, {0x1043dab18?, 0x14005713900?}, {0x1043f38e0, ...})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/nodes/executor.go:236 +0x698
github.com/flyteorg/flyte/flytepropeller/pkg/controller/workflow.(*workflowExecutor).handleRunningWorkflow(0x140000ddd50, {0x1043d9df8, 0x140063836e0}, 0x14005713900)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/workflow/executor.go:149 +0x148
github.com/flyteorg/flyte/flytepropeller/pkg/controller/workflow.(*workflowExecutor).HandleFlyteWorkflow(0x140000ddd50, {0x1043d9df8, 0x140063836e0}, 0x14005713900)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/workflow/executor.go:395 +0x304
github.com/flyteorg/flyte/flytepropeller/pkg/controller.(*Propeller).TryMutateWorkflow.func2(0x14000a9f580, {0x1043d9df8, 0x140063836e0}, 0x140063bb7c8, 0x103c7f4a0?)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/handler.go:142 +0x104
github.com/flyteorg/flyte/flytepropeller/pkg/controller.(*Propeller).TryMutateWorkflow(0x14000a9f580, {0x1043d9df8, 0x14006383290}, 0x1400632ea00)
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/handler.go:143 +0x36c
github.com/flyteorg/flyte/flytepropeller/pkg/controller.(*Propeller).Handle(0x14000a9f580, {0x1043d9df8, 0x14006383290}, {0x14005707b00, 0x17}, {0x14005707b18, 0x14})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/handler.go:259 +0xb10
github.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).processNextWorkItem.func1(0x14000e64a20, 0x140063bbf10, {0x103c7f4a0?, 0x140063907e0})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/workers.go:88 +0x3b4
github.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).processNextWorkItem(0x14000e64a20, {0x1043d9df8, 0x14006383290})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/workers.go:99 +0xc0
github.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).runWorker(0x1043d9df8?, {0x1043d9df8, 0x14003feec60})
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/workers.go:115 +0x98
github.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).Run.func1()
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/workers.go:150 +0x50
created by github.com/flyteorg/flyte/flytepropeller/pkg/controller.(*WorkerPool).Run in goroutine 220
	/Users/jeev/Workspace/repos/flyte/flyte/flytepropeller/pkg/controller/workers.go:147 +0x214

@codecov
Copy link

codecov bot commented Oct 14, 2023

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (2edc3bd) 59.52% compared to head (adb6d57) 59.32%.
Report is 2 commits behind head on master.

❗ Current head adb6d57 differs from pull request most recent head 032109b. Consider uploading reports for the commit 032109b to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4226      +/-   ##
==========================================
- Coverage   59.52%   59.32%   -0.20%     
==========================================
  Files         639      552      -87     
  Lines       54611    39882   -14729     
==========================================
- Hits        32506    23661    -8845     
+ Misses      19543    13894    -5649     
+ Partials     2562     2327     -235     
Flag Coverage Δ
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...lyteplugins/go/tasks/pluginmachinery/core/phase.go 21.50% <ø> (-1.50%) ⬇️
flyteplugins/go/tasks/plugins/k8s/ray/config.go 50.00% <ø> (+21.42%) ⬆️
...teplugins/go/tasks/plugins/k8s/ray/config_flags.go 38.70% <100.00%> (+2.34%) ⬆️
flyteplugins/go/tasks/plugins/k8s/ray/ray.go 77.56% <40.00%> (-6.05%) ⬇️

... and 572 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

eapolinario
eapolinario previously approved these changes Oct 14, 2023
Signed-off-by: Jeev B <[email protected]>
@jeevb jeevb force-pushed the jeev/revert-ray-dashboard-links branch from 68ff3ef to 032109b Compare October 14, 2023 00:10
Copy link
Contributor

@eapolinario eapolinario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before merging, can you leave a description explaining what we are seeing?

@pingsutw
Copy link
Member

fixed it here. #4239

@jeevb jeevb closed this Oct 16, 2023
@jeevb jeevb deleted the jeev/revert-ray-dashboard-links branch October 16, 2023 22:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants