[flyteadmin] Show diff structure when re-registration two different task with same ids #4924

Merged
merged 13 commits into from
Mar 27, 2024

@austin362667 (Contributor) commented Feb 21, 2024

Tracking issue

#4762

Why are the changes needed?

  • To highlight the diff (the delta) in the error messages produced by NewTaskExistsDifferentStructureError(), NewWorkflowExistsDifferentStructureError(), and NewLaunchPlanExistsDifferentStructureError().

  • These errors are usually raised when a user submits two different tasks, workflows, or launch plans with the same identifier.

What changes were proposed in this pull request?

  • A language-agnostic implementation located in flyteadmin.
  • Adds the "wI2L/jsondiff" Go package for computing the structural diff.
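
To make the idea of a structural diff concrete, here is a minimal Python sketch (Python being the language of the test files in this PR). It is not the wI2L/jsondiff algorithm: json_diff and the sample task dictionaries are hypothetical names invented for illustration. The function walks two JSON-like objects and emits JSON-Patch-style operations keyed by a JSON-pointer-like path.

```python
def json_diff(src, tgt, path=""):
    """Return JSON-Patch-style operations that turn src into tgt.

    Illustrative only: lists are compared as whole values and
    JSON pointer escaping of "/" and "~" in keys is not handled.
    """
    if isinstance(src, dict) and isinstance(tgt, dict):
        ops = []
        for key in sorted(src.keys() | tgt.keys()):
            child = f"{path}/{key}"
            if key not in tgt:
                ops.append({"op": "remove", "path": child})
            elif key not in src:
                ops.append({"op": "add", "path": child, "value": tgt[key]})
            else:
                ops.extend(json_diff(src[key], tgt[key], child))
        return ops
    if src == tgt:
        return []
    return [{"op": "replace", "path": path, "value": tgt}]


# Hypothetical task specs: same identifier, different container image.
old_task = {
    "id": {"name": "workflow.say_hello", "version": "v1"},
    "container": {"image": "image1"},
}
new_task = {
    "id": {"name": "workflow.say_hello", "version": "v1"},
    "container": {"image": "image2"},
}

patch = json_diff(old_task, new_task)
print(patch)
```

Only the changed leaf appears in the patch, which is what makes this kind of output useful in a "different structure" error message.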

How was this patch tested?

Setup process

  • Register two different versions of the code with the same project, domain, name, and version.

Screenshots

  • Please check the comments below.
  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

welcome bot commented Feb 21, 2024

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Feb 21, 2024
@austin362667 (Contributor Author) commented Feb 21, 2024

  • successfully register a task
  • re-register the same task with the same id (passes)
    Screenshot 2024-02-21 at 3 15 37 PM
  • re-register a different (modified) task with the same id (fails)
    Screenshot 2024-02-21 at 3 16 12 PM
  • re-register a different (modified) task with the same id (fails), now with the diff in the error message
    Screenshot 2024-02-21 at 3 18 31 PM

@austin362667 (Contributor Author) commented Feb 21, 2024

What about Workflow, LaunchPlan?

e.g.,
Modify a task name (function name) used by another workflow

@austin362667 (Contributor Author) commented

2. extend the error message to include structured data about what in the spec has changed

What if we just show the diff as a string, to avoid modifying the task protobuf too much?

@Future-Outlier (Member) left a comment

It looks good to me.

first time registration

image

second time registration

image

@Future-Outlier (Member) commented

Overall this looks good to me, but I am not familiar with flyteadmin's codebase.
cc @katrogan, can you help review? Thank you.

codecov bot commented Feb 23, 2024

Codecov Report

Attention: Patch coverage is 69.56522%, with 21 lines in your changes missing coverage. Please review.

Project coverage is 59.03%. Comparing base (ec0bc4c) to head (04c34ba).

Files                                                 Patch %   Lines
flyteadmin/pkg/manager/impl/task_manager.go           11.11%    8 Missing ⚠️
flyteadmin/pkg/manager/impl/launch_plan_manager.go    0.00%     7 Missing ⚠️
flyteadmin/pkg/manager/impl/workflow_manager.go       33.33%    3 Missing and 1 partial ⚠️
flyteadmin/pkg/errors/errors.go                       95.74%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4924      +/-   ##
==========================================
+ Coverage   59.00%   59.03%   +0.02%     
==========================================
  Files         645      645              
  Lines       55672    55726      +54     
==========================================
+ Hits        32850    32896      +46     
- Misses      20226    20233       +7     
- Partials     2596     2597       +1     
Flag Coverage Δ
unittests 59.03% <69.56%> (+0.02%) ⬆️


@austin362667 (Contributor Author) commented Feb 23, 2024

from flytekit import task, workflow
import time

@task(container_image="image1")
def say_hello() -> str:
    time.sleep(1)
    return "Hello, World!"

@workflow
def wf() -> str:
    res = say_hello()
    return res

if __name__ == "__main__":
    print(f"Running hello_world_wf() {wf()}")

Modified version (container_image changed from "image1" to "image2"):

from flytekit import task, workflow
import time

@task(container_image="image2")
def say_hello() -> str:
    time.sleep(1)
    return "Hello, World!"

@workflow
def wf() -> str:
    res = say_hello()
    return res

if __name__ == "__main__":
    print(f"Running hello_world_wf() {wf()}")

pyflyte register workflow.py --image basics:v2 --version v1

Screenshot 2024-02-24 at 4 23 51 AM Screenshot 2024-02-24 at 4 30 45 AM

@katrogan (Contributor) left a comment

this looks so great! thank you for updating workflow validation as part of this task too


func NewTaskExistsDifferentStructureError(ctx context.Context, request *admin.TaskCreateRequest, oldSpec *core.TaskTemplate, newSpec *core.TaskTemplate) FlyteAdminError {
errorMsg := "task with different structure already exists:\n"
// omit source code file object storage path
Contributor

args/2 is still used for the digest computation, right? any reason to omit it here?

@austin362667 (Contributor Author) Feb 25, 2024

Yes, we still hash the whole task for the digest.

The file path is omitted from the error message because the request has already hit the "different structure with same id" error, which implies the code file differs. Showing that the path changed adds little, so it is hidden. cc @pingsutw

@austin362667 (Contributor Author) Feb 25, 2024

Another concern of mine: what if the file path value ("s3://my-s3-bucket/flytesnacks/...") is stored under a different key path in the future, for example args/3?

The code would then break, and the bug would be hard to locate. Any suggestions?
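
The fragility described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the flyteadmin code: leaf_changes, diff_ignoring, the argument lists, and the s3://example-bucket paths are all hypothetical. It ignores a fixed pointer such as /Target/Container/args/2 and shows that the ignore silently stops matching once the value moves to args/3.

```python
def leaf_changes(src, tgt, path=""):
    """Map each differing JSON-pointer-like path to the value found in tgt."""
    if isinstance(src, dict) and isinstance(tgt, dict):
        out = {}
        for key in sorted(src.keys() | tgt.keys()):
            out.update(leaf_changes(src.get(key), tgt.get(key), f"{path}/{key}"))
        return out
    if isinstance(src, list) and isinstance(tgt, list) and len(src) == len(tgt):
        out = {}
        for i, (a, b) in enumerate(zip(src, tgt)):
            out.update(leaf_changes(a, b, f"{path}/{i}"))
        return out
    return {} if src == tgt else {path: tgt}


def diff_ignoring(old, new, ignores):
    """Drop every change whose path is listed in ignores."""
    return {p: v for p, v in leaf_changes(old, new).items() if p not in ignores}


def spec(image, args):
    # Hypothetical, heavily simplified stand-in for a task template.
    return {"Target": {"Container": {"image": image, "args": args}}}


IGNORED = {"/Target/Container/args/2"}

# Source path sits at index 2: the ignore hides it as intended.
old = spec("image1", ["run", "--dist", "s3://example-bucket/old.tar.gz"])
new = spec("image2", ["run", "--dist", "s3://example-bucket/new.tar.gz"])
as_intended = diff_ignoring(old, new, IGNORED)

# Same change, but an extra (hypothetical) flag shifts the path to index 3.
old_shifted = spec("image1", ["run", "--flag", "--dist", "s3://example-bucket/old.tar.gz"])
new_shifted = spec("image2", ["run", "--flag", "--dist", "s3://example-bucket/new.tar.gz"])
leaked = diff_ignoring(old_shifted, new_shifted, IGNORED)

print(as_intended)
print(leaked)
```

In the second case the storage path reappears in the diff without any error, which is why a position-based ignore would be hard to debug if the argument layout ever changed.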

func NewTaskExistsDifferentStructureError(ctx context.Context, request *admin.TaskCreateRequest, oldSpec *core.TaskTemplate, newSpec *core.TaskTemplate) FlyteAdminError {
errorMsg := "task with different structure already exists:\n"
// omit source code file object storage path
diff, _ := jsondiff.Compare(oldSpec, newSpec, jsondiff.Ignores("/Target/Container/args/2"))
Contributor

why do we need two Compare calls here?

it looks like according to https://pkg.go.dev/github.com/wI2L/jsondiff#Differ.Compare this should output the diff between src & tgt

Contributor Author

Because jsondiff.Compare returns a JSON patch, that is, a series of operations to apply to the source JSON.

The patch only shows the modified values from the target JSON, not the original values from the source JSON. Since this is a diff message, it must show both sides of each change from source to target.

diff = jsondiff.Compare(oldTemplate, newTemplate)
rdiff = jsondiff.Compare(newTemplate, oldTemplate)
where the prefix r means reverse.

Finally, compareJsons() zips diff with its complement, rdiff.
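
A rough Python sketch of the forward/reverse idea follows. It is an assumption-laden illustration, not the compareJsons() implementation from this PR: changed_leaves, describe_change, the sample specs, and the output layout are hypothetical. The forward pass yields the new value at each changed path, the reverse pass yields the old value, and the two are paired by path.

```python
def changed_leaves(src, tgt, path=""):
    """Map each differing path to the value found in tgt (None if absent)."""
    if isinstance(src, dict) and isinstance(tgt, dict):
        out = {}
        for key in sorted(src.keys() | tgt.keys()):
            out.update(changed_leaves(src.get(key), tgt.get(key), f"{path}/{key}"))
        return out
    return {} if src == tgt else {path: tgt}


def describe_change(old, new):
    """Pair the forward diff (new values) with the reverse diff (old values)."""
    forward = changed_leaves(old, new)  # path -> value in new
    reverse = changed_leaves(new, old)  # path -> value in old
    lines = []
    for path in sorted(forward):
        lines.append(f"{path}:")
        lines.append(f"\t- {reverse[path]}")
        lines.append(f"\t+ {forward[path]}")
    return "\n".join(lines)


# Hypothetical specs mirroring the "macos to ubuntu" test case in this PR.
old_spec = {"template": {"container": {"image": "macos"}}}
new_spec = {"template": {"container": {"image": "ubuntu"}}}

message = describe_change(old_spec, new_spec)
print(message)
```

Because the traversal is symmetric, both passes visit the same paths, so every entry in the forward diff has a matching entry in the reverse diff.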

Comment on lines 74 to 75
message TaskErrorExistsDifferentStructure {
core.Identifier id = 1;
Member

I think we can remove it

Contributor Author

Sure! thx

@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Feb 27, 2024
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Feb 27, 2024
@austin362667 (Contributor Author) commented

  1. Task: macos to ubuntu
from flytekit import task, workflow
import time
from flytekit.image_spec import ImageSpec

image_spec = "macos"

@task(container_image=image_spec)
def say_hi() -> str:
    time.sleep(1)
    return "Hello, World!"


@workflow
def wf() -> str:
    res = say_hi()
    return res

if __name__ == "__main__":
    print(f"Running hello_world_wf() {wf()}")

Modified version (image_spec changed from "macos" to "ubuntu"):

from flytekit import task, workflow
import time
from flytekit.image_spec import ImageSpec

image_spec = "ubuntu"

@task(container_image=image_spec)
def say_hi() -> str:
    time.sleep(1)
    return "Hello, World!"
    

@workflow
def wf() -> str:
    res = say_hi()
    return res

if __name__ == "__main__":
    print(f"Running hello_world_wf() {wf()}")
Screenshot 2024-02-27 at 7 53 15 PM
  2. Workflow: say_hi() to say_hey()
from flytekit import task, workflow
import time
from flytekit.image_spec import ImageSpec

image_spec = "ubuntu"

@task(container_image=image_spec)
def say_hi() -> str:
    time.sleep(1)
    return "Hello, World!"


@workflow
def wf() -> str:
    res = say_hi()
    return res

if __name__ == "__main__":
    print(f"Running hello_world_wf() {wf()}")

Modified version (task renamed from say_hi() to say_hey()):

from flytekit import task, workflow
import time
from flytekit.image_spec import ImageSpec

image_spec = "ubuntu"

@task(container_image=image_spec)
def say_hey() -> str:
    time.sleep(1)
    return "Hello, World!"
    

@workflow
def wf() -> str:
    res = say_hey()
    return res

if __name__ == "__main__":
    print(f"Running hello_world_wf() {wf()}")
Screenshot 2024-02-27 at 7 53 46 PM

@katrogan (Contributor) left a comment

looks great, just one question about map comparison determinism


func NewTaskExistsDifferentStructureError(ctx context.Context, request *admin.TaskCreateRequest, oldSpec *core.CompiledTask, newSpec *core.CompiledTask) FlyteAdminError {
errorMsg := "task with different structure already exists:\n"
diff, _ := jsondiff.Compare(oldSpec, newSpec)
Contributor

since in golang map ordering is random/never guaranteed, I'm worried that converting the specs to bytes here might lead to non deterministic byte arrays

when we compute the digest, we use pbhash to marshal the protos to json

should we do the same here before calling Compare()?

alternatively, do you mind double checking that jsondiff.Compare() returns no diff for multiple calls comparing 2 identical pb objects that contain maps in the message?
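
The concern can be illustrated with a small Python analogy. Note the assumptions: Python dicts preserve insertion order, whereas Go map iteration order is randomized, so this only mirrors the general point that a byte-level comparison depends on key order while a structural or canonicalized comparison does not. It does not use pbhash or jsondiff, and the labels data is hypothetical.

```python
import json

# Two objects with identical content but different key insertion order.
a = {"labels": {"team": "ml", "env": "dev"}}
b = {"labels": {"env": "dev", "team": "ml"}}

# Naive serialization keeps insertion order, so the byte strings differ.
raw_a, raw_b = json.dumps(a), json.dumps(b)
bytes_differ = raw_a != raw_b

# Canonical serialization (sorted keys) removes the ordering dependence.
canonical_equal = json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)

# Comparing the parsed structures is also order-independent, and stays
# that way across repeated calls.
structural_equal = all(json.loads(raw_a) == json.loads(raw_b) for _ in range(100))

print(bytes_differ, canonical_equal, structural_equal)
```

A check of this shape, run against two identical objects that contain maps, is one way to confirm that a diff routine does not report a false difference.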

@austin362667 (Contributor Author) Mar 5, 2024

Can do!
I've double-checked that jsondiff.Compare() is deterministic: repeated calls give identical results. I also added a test for this function.

Contributor

thanks @austin362667 for the super fast turnaround!

in this case, i was more concerned about identical maps not showing a false diff - would you mind double checking that? otherwise PR looks great!!

Contributor Author

done

katrogan previously approved these changes Mar 8, 2024

@katrogan (Contributor) left a comment

this is so awesome, thank you!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 8, 2024
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>

format

Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>

rollback proto change

Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
@Future-Outlier (Member) left a comment

LGTM. Let's merge it when all tests have passed.

@Future-Outlier Future-Outlier enabled auto-merge (squash) March 27, 2024 07:22
@Future-Outlier Future-Outlier disabled auto-merge March 27, 2024 07:22
@Future-Outlier Future-Outlier enabled auto-merge (squash) March 27, 2024 07:25
@Future-Outlier Future-Outlier merged commit a5e2733 into flyteorg:master Mar 27, 2024
47 of 48 checks passed
welcome bot commented Mar 27, 2024

Congrats on merging your first pull request! 🎉
