Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Validate update calls for setting task resource attributes #3399

Closed
2 tasks done
katrogan opened this issue Mar 3, 2023 · 8 comments
Closed
2 tasks done

[BUG] Validate update calls for setting task resource attributes #3399

katrogan opened this issue Mar 3, 2023 · 8 comments
Assignees
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed stale

Comments

@katrogan
Copy link
Contributor

katrogan commented Mar 3, 2023

Describe the bug

Setting the task resource attributes for a project with
{"project":"foo","domain":"development","limits":{"cpu":"4096","memory":"8Ti"}}
causes flyteadmin to panic on resolving task resource attributes at create execution time

We shouldn't accept invalid configs or handle partial task resource attributes overrides gracefully

Expected behavior

Admin should not panic at create execution time for invalid or partial task resource attributes

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@katrogan katrogan added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Mar 3, 2023
@eapolinario eapolinario added good first issue Good for newcomers help wanted Extra attention is needed and removed untriaged This issues has not yet been looked at by the Maintainers labels Mar 10, 2023
@hhcs9527
Copy link

Hi @katrogan,
I am interested in this topic, but I am still new to flyte.
Can I try this issue?

@katrogan
Copy link
Contributor Author

hi @hhcs9527 absolutely!
you should be able to repro this on a flyte sandbox installation by updating task resource attributes for one of the boilerplate projects (like flytesnacks) using the resources in the GH issue. Then try launching an execution and verify that it causes admin to panic

This is the code that handles reading the overriden task resource attributes:
https://github.com/flyteorg/flyteadmin/blob/master/pkg/manager/impl/execution_manager.go#L531-L534 at execution time

Copy link

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable.
Thank you for your contribution and understanding! 🙏

@github-actions github-actions bot added the stale label Apr 13, 2024
@asd0300
Copy link

asd0300 commented Jan 8, 2025

#take

@asd0300
Copy link

asd0300 commented Jan 10, 2025

Hi @katrogan
I try update task resource attributes
with yaml

project: flytesnacks
domain: development
limits:
  cpu: "4096"
  memory: "8Ti"
  gpu: "2"

update success
then execute task

like to

@task(requests=Resources(cpu="4096", mem="8Ti", gpu="2"))
def say_hello() -> str:
    return "Hello, World!"

response
Running Execution on Remote. 0:00:00 Running execution on remote. [✔] Go to http://localhost:30080/console/projects/flytesnacks/domains/development/executions/av7jnqfhmg989hz8672l to see execution in the console.

seems can't repro the panic case

@katrogan
Copy link
Contributor Author

hi @asd0300 thank you for taking a look! what happens if you remove the gpu from the yaml?

@asd0300
Copy link

asd0300 commented Jan 10, 2025

test yaml will be

project: flytesnacks
domain: development
limits:
  cpu: "4096"
  memory: "8Ti"

before update, remove the default task-resource-attribute
then update the test .yaml -> updated success

then run task
1.

@task(requests=Resources(cpu="4096", mem="8Ti", gpu="2"))
def say_hello() -> str:
    return "Hello, World!"

or
2.

@task(requests=Resources(cpu="4096", mem="8Ti"))
def say_hello() -> str:
    return "Hello, World!"

both tasks could be executed with no panic occurred

@katrogan
Copy link
Contributor Author

thank you for verifying! sounds like this is no longer an issue - let's close it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed stale
Projects
None yet
Development

No branches or pull requests

4 participants