Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Cannot read cache task output returning Optional[str] #4021

Closed
2 tasks done
mickjermsurawong-stripe opened this issue Sep 9, 2023 · 4 comments
Closed
2 tasks done
Labels
bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers

Comments

@mickjermsurawong-stripe
Copy link

Describe the bug

Re-running a cached task returning Optional[str] would fail.

  • Running the task below once is successful and cache is written.
  • Re-running (without cache overwrite) would fail because deserialized type from data catalog doesn't match the method signature.

For example:

@task(cache=True, cache_version="1.0")
def return_optional(task_out: str) -> Optional[str]:
    return task_out

Msg on Flyte console:

Workflow[...] failed. RuntimeExecutionError: max number of system retry attempts [51/50] exhausted. Last known status message: 
Failed to check Catalog for previous results: 
unexpected artifactData: [o0] 
type: [union_type:<variants:<simple:STRING structure:<tag:"str" > > > ] does not match any task output 
type: [union_type:<variants:<simple:STRING > variants:<simple:NONE > > ]

Error from flytepropeller log:

DataCatalog failed to get outputs from artifact dc8dc0fc-a61d-4126-85bd-34b163d49257, 
err: unexpected artifactData: [o0] 
type: [union_type:<variants:<simple:STRING structure:<tag:"str" > > > ] does not match any task output 
type: [union_type:<variants:<simple:STRING > variants:<simple:NONE > > ]

Looking at underlying decoded proto, which is consistent with error message of deserialized type.

1 {
  9 {
    1 {
      1 {
        1 {
          3: "foo"
        }
      }
    }
    2 {
      11 {
        1: "str"
      }
      1: 3
    }
  }
}

Expected behavior

Cached task returning optional string (or other types) should be able to get output from the cache.
Optional type should be round-trip serializable. Specifically, it seems that the serialized proto data doesn't encode "optionality" of type here.

Additional context to reproduce

Task return_optional will fail after the first successful run caching its output

from flytekit import task, workflow
from typing import Optional

@task(cache=True, cache_version="2.0")
def return_optional(task_out: str) -> Optional[str]:
    return task_out

@workflow
def example() -> Optional[str]:
    return return_optional(task_out="foo")

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@mickjermsurawong-stripe mickjermsurawong-stripe added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Sep 9, 2023
@kumare3
Copy link
Contributor

kumare3 commented Sep 11, 2023

@mickjermsurawong-stripe what version of flytepropeller and flyteadmin are you running?

@mickjermsurawong-stripe
Copy link
Author

mickjermsurawong-stripe commented Sep 11, 2023

ah thanks for checking. We are on older version:

  • propeller v1.1.90
  • catalog v1.0.46
  • admin v1.1.100

Are you aware of changes regarding optional serialization in the newer versions?

@kumare3
Copy link
Contributor

kumare3 commented Sep 11, 2023

Yes, we found a few issues and were fixed on newer versions. What you are running is close to a year old

@mickjermsurawong-stripe
Copy link
Author

Thanks for confirming! Sorry we did look for issues related to this serialization but couldn't find, and assumed it wasn't due to version. We should do a better job here :) Definitely are prioritizing upgrading.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers
Projects
None yet
Development

No branches or pull requests

2 participants