Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Show version and execute by version #2124

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

novahow
Copy link
Contributor

@novahow novahow commented Jan 21, 2024

Tracking issue

fixes flyteorg/flyte#4707

Why are the changes needed?

Show versions and creation time of remote-task/launchplan/workflow

What changes were proposed in this pull request?

This PR contains 2 parts.
For the show-versions part, I created a new top-level command VersionCommand which inherits the RunCommand class since they function similarly. Also, RemoteEntityVersionGroup inherits RemoteEntityGroup,and DynamicEntityVersionCommand inherits both click.RichGroup and DynamicEntityLaunchCommand. It inherits click.RichGroup because we want to display versions and creation time with click automatically.
We also extract parts of RunLevelParams to create a base param class RunBaseParams, so that VersionLevelParams which inherits RunBaseParams can display only the options that will be used is show-version command.

To fetch the creation time of the task, the created_at attribute is added in closures. This ensures that both versions and created_time can be retrieved from list_tasks_paginated method.

@classmethod
    def from_flyte_idl(cls, p):
        """
        :param flyteidl.admin.workflow_pb2.WorkflowClosure p:
        :rtype: WorkflowClosure
        """
        return cls(compiled_workflow=_compiler_models.CompiledWorkflowClosure.from_flyte_idl(p.compiled_workflow))
        return cls(
            compiled_workflow=_compiler_models.CompiledWorkflowClosure.from_flyte_idl(p.compiled_workflow),
            created_at=p.created_at.ToDatetime().replace(tzinfo=_timezone.utc) if p.HasField("created_at") else None,
        )

For the execute with version part, I simply split the entity name with ":", and iteratively loop through possible combinations of entity_name and version with the _looped_fetch_entity method.

How was this patch tested?

By running pyflyte show-versions remote-[launchplan|workflow|task] {name} , the versions will be displayed
Code snippet:

from math import sqrt
from flytekit import workflow, task, LaunchPlan
from typing import List, Union
import random
import pdb

@task
def standard_deviation(values: List[float], mu: float) -> float:
    variance = sum([(x - mu) ** 2 for x in values])
    return sqrt(variance)

@task
def standard_scale(values: List[float], mu: float, sigma: float) -> List[float]:
    return [(x - mu) / sigma for x in values]


@task
def mean(values: List[float]) -> float:
    return sum(values) / len(values)

@task
def mean2(values: List[float]) -> float:
    return sum(values) / len(values)

@task
def generate_data(num_samples: int, seed: int) -> List[float]:
    random.seed(seed)
    return [random.random() for _ in range(num_samples)]


@workflow
def standard_scale_workflow(values: List[float]) -> List[float]:
    mu = mean(values=values)
    sigma = standard_deviation(values=values, mu=mu)
    return standard_scale(values=values, mu=mu, sigma=sigma)


@workflow
def workflow_with_subworkflow(num_samples: int, seed: int) -> List[float]:
    data = generate_data(num_samples=num_samples, seed=seed)
    return standard_scale_workflow(values=data)


# pdb.set_trace()
print("Compiled")
# LaunchPlan.CACHE.pop("standard_scale_lp", None)
standard_scale_launch_plan = LaunchPlan.get_or_create(
    standard_scale_workflow,
    name="standard_scale_lp",
    default_inputs={"values": [3.0, 4.0, 5.0]}
)

Example:

pyflyte show-versions remote-task flyteTest.example.mean

One can then run
pyflyte run remote-task flyteTest.example.mean:{version} --inputs ...

Setup process

Screenshots

image image image image

Check all the applicable boxes

  • I updated the documentation accordingly.
  • [v] All new and existing tests passed.
  • [v] All commits are signed-off.

Related PRs

Docs link

@novahow novahow changed the title Core/show version Show version and execute by version Jan 21, 2024
@kumare3
Copy link
Contributor

kumare3 commented Jan 26, 2024

There is one problem with this
--show-versions implies you cannot have a task or a workflow with show_versions as an input
Example

@task 
def do(show_versions: bool):
    ...

which might be ok. But to counter this, I suggested add a subcommand of type

pyflyte run remote-task mytask show-versions

Note the missing --

cc @wild-endeavor what should we do?

If you create a subcommand, like how we do for remote-entities. the table also does not need to be created!

else:
raise ValueError(f"Unknown entity type {type(entity)}")
formatter = click.HelpFormatter()
task_table = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

codecov bot commented Jan 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (29662e7) to head (a5203ec).
Report is 20 commits behind head on master.

Current head a5203ec differs from pull request most recent head 18381a3

Please upload reports for the commit 18381a3 to get more accurate results.

Additional details and impacted files
@@             Coverage Diff              @@
##           master     #2124       +/-   ##
============================================
+ Coverage   71.79%   100.00%   +28.20%     
============================================
  Files         182         5      -177     
  Lines       18561       122    -18439     
  Branches     3654         0     -3654     
============================================
- Hits        13326       122    -13204     
+ Misses       4592         0     -4592     
+ Partials      643         0      -643     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wild-endeavor
Copy link
Contributor

can we make show-versions or maybe just versions a top level command on pyflyte? so just pyflyte versions instead of pyflyte run?

@novahow
Copy link
Contributor Author

novahow commented Jan 26, 2024

can we make show-versions or maybe just versions a top level command on pyflyte? so just pyflyte versions instead of pyflyte run?

Thanks for the advice, this sounds more feasible for me, I've modified my PR to implement this behavior.

@wild-endeavor
Copy link
Contributor

I can't get this to run locally actually. running the new command I'm getting

  File "/Users/ytong/go/src/github.com/flyteorg/flytekit/flytekit/clis/sdk_in_container/versions.py", line 19, in __init__
    super(click.RichGroup, self).__init__(name, h, entity_name, launcher, **kwargs)

related, why is DynamicEntityVersionCommand inheriting both click.RichGroup and DynamicEntityLaunchCommand? why is this related to launch?

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Feb 14, 2024
@novahow
Copy link
Contributor Author

novahow commented Feb 14, 2024

@wild-endeavor Hi, sorry for the confusion. DynamicEntityVersionCommand inherits the DynamicEntityLaunchCommand because it needs to call the _fetch_entity method of DynamicEntityLaunchCommand in list_commands method, so that it can find the type of the desired entity.
Regarding the error, could you please provide the command you used that triggers the error you mentioned as well as the error log? Thanks

@novahow novahow force-pushed the core/show_version branch from 5556d00 to a8a6ede Compare March 27, 2024 15:10
@novahow novahow marked this pull request as draft March 27, 2024 15:12
@novahow novahow force-pushed the core/show_version branch from a8a6ede to 68884bd Compare April 1, 2024 18:14
@novahow novahow marked this pull request as ready for review April 1, 2024 18:22
@novahow novahow requested a review from samhita-alla as a code owner April 1, 2024 18:22
novahow added 3 commits June 21, 2024 12:14
Signed-off-by: novahow <[email protected]>
Signed-off-by: novahow <[email protected]>
Signed-off-by: novahow <[email protected]>

	modified:   flytekit/clis/sdk_in_container/run.py
	modified:   flytekit/clis/sdk_in_container/versions.py
	new file:   tests/flytekit/unit/cli/pyflyte/test_versions.py
@novahow novahow force-pushed the core/show_version branch from 68884bd to 9214dbe Compare June 21, 2024 04:24
Signed-off-by: novahow <[email protected]>
pingsutw
pingsutw previously approved these changes Jun 26, 2024
@jpvotta
Copy link

jpvotta commented Jun 26, 2024

Can we please also test with reference task, reference workflow, and reference launchplan?

Copy link
Contributor

@wild-endeavor wild-endeavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the top command currently shows
image
Instead of every line item showing "Display version of a remote-task." can we show the version? If that's not easy to do, can we just hide that column completely?

@@ -1,4 +1,6 @@
import typing
from datetime import datetime as _datetime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we not use the _library pattern anymore. Let's make this just from datetime import datetime

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just hide that column completely, thanks for the suggestion. Since the version is a list and is shown when you add the task as a subcommand

@@ -1,4 +1,6 @@
import typing
from datetime import datetime as _datetime
from datetime import timezone as _timezone
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -124,16 +128,31 @@ def compiled_workflow(self):
"""
return self._compiled_workflow

@property
def created_at(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move type hint to function signature and remove comment

@@ -1,4 +1,5 @@
import typing
from datetime import timezone as _timezone
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from datetime import timezone as _timezone
from datetime import timezone

@@ -349,15 +352,25 @@ def expected_outputs(self):
"""
return self._expected_outputs

@property
def created_at(self):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@novahow
Copy link
Contributor Author

novahow commented Jun 29, 2024

Can we please also test with reference task, reference workflow, and reference launchplan?

@jpvotta Hi, can you please elaborate more on how to cover that? Since I'm not really familiar with reference task, thanks

Signed-off-by: novahow <[email protected]>
@jpvotta
Copy link

jpvotta commented Jun 30, 2024

Can we please also test with reference task, reference workflow, and reference launchplan?

@jpvotta Hi, can you please elaborate more on how to cover that? Since I'm not really familiar with reference task, thanks

Apologies. I am going to defer to @eapolinario on whether or not this is necessary (guessing it shouldn't be relevant here).

@eapolinario
Copy link
Collaborator

eapolinario commented Jul 24, 2024

@novahow , sorry for the delay I was out in the last 2.5 weeks.

Reference entities are a way of referring to pre-existing Flyte entities. Take a look at https://docs.flyte.org/en/latest/user_guide/productionizing/reference_tasks.html#reference-tasks, https://docs.flyte.org/en/latest/user_guide/productionizing/reference_launch_plans.html, and https://docs.flyte.org/en/latest/api/flytekit/generated/flytekit.reference_workflow.html#flytekit-reference-workflow.

As for whether we should cover this in this change, I don't think it's necessary since we're already dealing with remote entities anyway (i.e. you need to know the names + versions before executing them).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Core feature] pyflyte run remote-task xyz.my_task show-versions
6 participants