Feat: Declarative management of matcheable resources #5599

Draft: wants to merge 1 commit into base: master

fg91 (Member) commented Jul 27, 2024

Tracking issue

RFC #3749

Why are the changes needed?

Currently, config overrides/matchable attributes/matchable resources can only be created imperatively via flytectl update.
In RFC #3749, which aims to improve the UX around these config overrides, we agreed that there must be a declarative way to manage them using infrastructure as code.

This PR adds such a declarative mechanism.


Details on config overrides:

From the Flyte docs:

Customizing project, domain, and workflow resources with flytectl
For critical projects and workflows, you can use the flytectl update command to configure settings for task, cluster, and workflow execution resources, set matching executions to execute on specific clusters, set execution queue attributes, and more that differ from the default values set for your global Flyte installation. These customizable settings are created, updated, and deleted via the API and stored in the FlyteAdmin database.
In code, these settings are sometimes called matchable attributes or matchable resources, because we use a hierarchy for matching the customizations to applicable Flyte inventory and executions.

What changes were proposed in this pull request?

How was this patch tested?

  • Added unit tests

  • Used the following config to ensure that the database entries are exactly the same as those created with flytectl update:

    flyteadmin --config configs/*.yaml migrate seed-resources

    matcheableResources:
      workflowExecutionConfigs:
        declarative: true
        values:
          - domain: development
            project: flytesnacks
            max_parallelism: 5
            security_context:
              run_as:
                k8s_service_account: demo
    
      pluginOverrides:
        declarative: true
        values:
          - domain: development
            project: flytesnacks
            overrides:
              - task_type: python_task # Task type for which to apply plugin implementation overrides
                plugin_id:             # Plugin id(s) to be used in place of the default for the task type.
                  - plugin_override1
                  - plugin_override2
                missing_plugin_behavior: 1 # Behavior when no specified plugin_id has an associated handler. 0: FAIL, 1: DEFAULT

      executionQueueAttributes:
        declarative: true
        values:
          - domain: development
            project: flyteexamples
            tags:
              - foo
              - bar

      clusterResourceAttributes:
        declarative: true
        values:
          - domain: development
            project: flytetest
            attributes:
              foo: "bar"
              buzz: "lightyear"

      taskResourceAttributes:
        declarative: true
        values:
          - project: flytesnacks
            domain: development
            defaults:
              cpu: "1"
              memory: "150Mi"
            limits:
              cpu: "2"
              memory: "450Mi"
              gpu: "1"

      executionClusterLabels:
        declarative: true
        values:
          - project: flytetest
            domain: development
            workflow: my-wf
            label: "my-label"

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

codecov bot commented Jul 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 32.14%. Comparing base (d6da838) to head (20da362).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5599      +/-   ##
==========================================
- Coverage   35.90%   32.14%   -3.76%     
==========================================
  Files        1301     1008     -293     
  Lines      109419    90357   -19062     
==========================================
- Hits        39286    29047   -10239     
+ Misses      66036    58234    -7802     
+ Partials     4097     3076    -1021     
Flag Coverage Δ
unittests-datacatalog 51.37% <ø> (ø)
unittests-flyteadmin ?
unittests-flytecopilot 12.17% <ø> (ø)
unittests-flytectl 62.32% <ø> (+0.04%) ⬆️
unittests-flyteidl 7.09% <ø> (ø)
unittests-flyteplugins 53.31% <ø> (ø)
unittests-flytepropeller 41.75% <ø> (ø)
unittests-flytestdlib 55.27% <ø> (+0.02%) ⬆️


fg91 (Member, Author) commented Jul 27, 2024

@katrogan my initial plan was to invoke this logic together with the seed-projects logic in the init container. That won't work though because the init container is not re-executed when the config map changes. Do you have a suggestion where to best move this? Treat it the same way we treat the "cluster resources sync"?

@fg91 fg91 self-assigned this Jul 27, 2024
Values []WorkflowExecutionConfig `json:"values" pflag:", The list of workflow execution configs to be managed."`
}

type WorkflowExecutionConfig struct {
Member Author

This config override is not fully implemented yet because it has a lot of nested sub configs. Might do so in another PR to not overload this one.

"github.com/flyteorg/flyte/flytestdlib/config"
)

const SectionKey = "matcheableResources"
Member Author

TODO:

Matcheable resources is "internal naming" according to the docs.

What should we call this: configCustomizations or configOverrides?

Contributor

+1 to anything that's not matchable resources :) I like configOverrides although both sound great!

@fg91 fg91 requested a review from katrogan July 27, 2024 13:37
RRap0so (Contributor) commented Jul 31, 2024

@katrogan my initial plan was to invoke this logic together with the seed-projects logic in the init container. That won't work though because the init container is not re-executed when the config map changes. Do you have a suggestion where to best move this? Treat it the same way we treat the "cluster resources sync"?

I'm not entirely sure this is the correct approach. How would it work if a user of a Flyte platform wants to declare their resources? Or are we just considering this feature to be used by the owners/maintainers of the platform?

We're very much pro self-serve and want to take those responsibilities away from the platform maintainers, so we ended up building a controller that has its own Object that creates/updates/deletes Flyte projects and some of their resources. Happy to demo it :)

fg91 (Member, Author) commented Jul 31, 2024

I'm not entirely sure this is the correct approach. How would it work if a user of a Flyte platform wants to declare their resources? Or are we just considering this feature to be used by the owners/maintainers of the platform?

flytectl update continues to work of course in case organizations want to allow users to manage this themselves.

Or are we just considering this feature to be used by the owners/maintainers of the platform?

Yes, exactly.

We're very much pro self-serve and want to take those responsibilities away from the platform maintainers, so we ended up building a controller that has its own Object that creates/updates/deletes Flyte projects and some of their resources. Happy to demo it :)

Would be curious to see it! Do users need kubectl access to the cluster?

I feel that being able to manage these things through the helm values is in line with how everything else can be configured in Flyte today :)

katrogan (Contributor) left a comment

this is awesome, thank you for putting up this PR!

@katrogan my initial plan was to invoke this logic together with the seed-projects logic in the init container. That won't work though because the init container is not re-executed when the config map changes. Do you have a suggestion where to best move this? Treat it the same way we treat the "cluster resources sync"?

I actually think this is okay. As it is, flyteadmin never re-reads config map values at runtime; we depend on the service deployment being rolled for changes to take effect.

One optimization we could add is a lightweight marker database table that stores a checksum of the last applied configmap, so we don't have to re-run the migration script on every deploy rollout.
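The checksum marker idea could be sketched roughly as follows. This is only an illustration, not code from the PR: markerStore, memoryStore, and seedIfChanged are hypothetical names, the in-memory store stands in for the proposed database table, and SHA-256 is an assumed choice of checksum.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// markerStore is a hypothetical stand-in for the proposed marker table.
type markerStore interface {
	LastChecksum() (string, error)
	SaveChecksum(sum string) error
}

// memoryStore keeps the marker in memory; a real version would use the database.
type memoryStore struct{ sum string }

func (m *memoryStore) LastChecksum() (string, error) { return m.sum, nil }
func (m *memoryStore) SaveChecksum(sum string) error { m.sum = sum; return nil }

// checksum returns the hex-encoded SHA-256 digest of the raw config bytes.
func checksum(raw []byte) string {
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:])
}

// seedIfChanged calls apply only when the config differs from the one
// applied last. It reports whether apply ran.
func seedIfChanged(store markerStore, raw []byte, apply func() error) (bool, error) {
	current := checksum(raw)
	last, err := store.LastChecksum()
	if err != nil {
		return false, err
	}
	if last == current {
		return false, nil // nothing changed since the last rollout
	}
	if err := apply(); err != nil {
		return false, err
	}
	// Record the marker only after a successful apply.
	if err := store.SaveChecksum(current); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	store := &memoryStore{}
	cfg := []byte("matcheableResources: {}")
	apply := func() error { return nil }

	first, _ := seedIfChanged(store, cfg, apply)
	second, _ := seedIfChanged(store, cfg, apply)
	fmt.Println(first, second) // true false
}
```

One consequence of this design, under these assumptions: when the checksum matches, the seed step is skipped entirely, so any values changed through the API since the last apply would stay in place until the config itself changes.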

var model models.Resource

if workflow != "" {
Contributor

nit: maybe use a switch statement here?
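For illustration, the switch suggested here might look something like the sketch below. Only the `workflow != ""` check is visible in the quoted fragment; the scope type, its constants, and resolveScope are hypothetical names, and the set of levels shown is an assumption rather than the PR's actual logic.

```go
package main

import (
	"errors"
	"fmt"
)

// scope is a hypothetical label for the level at which an override applies.
type scope string

const (
	scopeWorkflow      scope = "project-domain-workflow"
	scopeProjectDomain scope = "project-domain"
	scopeProject       scope = "project"
)

// resolveScope picks the most specific level for which identifiers are set.
// A tagless switch replaces a chain of if/else-if checks.
func resolveScope(project, domain, workflow string) (scope, error) {
	switch {
	case project == "":
		return "", errors.New("project must be set")
	case workflow != "" && domain == "":
		return "", errors.New("workflow requires a domain")
	case workflow != "":
		return scopeWorkflow, nil
	case domain != "":
		return scopeProjectDomain, nil
	default:
		return scopeProject, nil
	}
}

func main() {
	s, err := resolveScope("flytetest", "development", "my-wf")
	fmt.Println(s, err) // project-domain-workflow <nil>
}
```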

}

func CreateKey(project string, domain string, workflow string, resourceType string) string {
return fmt.Sprintf("%s-%s-%s-%s", project, domain, workflow, resourceType)
Contributor

nit: should we version this string perhaps? (if we want to one day modify the format to support task or launch plan level overrides, for example) e.g. v1-%s-%s...
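A minimal sketch of what a versioned key could look like, assuming a plain "v1" prefix. CreateVersionedKey, IsVersionedKey, and keyVersion are hypothetical names used for illustration; they are not part of the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// keyVersion is a hypothetical format marker for the key layout.
const keyVersion = "v1"

// CreateVersionedKey mirrors the quoted CreateKey helper but prefixes the
// result with the format version.
func CreateVersionedKey(project, domain, workflow, resourceType string) string {
	return fmt.Sprintf("%s-%s-%s-%s-%s", keyVersion, project, domain, workflow, resourceType)
}

// IsVersionedKey reports whether a key starts with the current version prefix.
// Caveat: an unversioned key for a project literally named "v1" would also match.
func IsVersionedKey(key string) bool {
	return strings.HasPrefix(key, keyVersion+"-")
}

func main() {
	key := CreateVersionedKey("flytesnacks", "development", "my-wf", "TASK_RESOURCE")
	fmt.Println(key)                 // v1-flytesnacks-development-my-wf-TASK_RESOURCE
	fmt.Println(IsVersionedKey(key)) // true
}
```

With a prefix like this, a later format that adds task or launch plan fields could use a different version string and be told apart from keys written in the current layout.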

katrogan (Contributor)

re: flytectl update continues to work of course in case organizations want to allow users to manage this themselves.

How does this work in a configmap-managed environment? If a user decides to use the CLI to update a matchable attribute that is also defined in the configmap, then the service restart will wipe out their changes, right? Should we disable API access in deployments where we use the configmap-style approach?
