-
Notifications
You must be signed in to change notification settings - Fork 671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix propeller crash when inferring literal type for an offloaded literal #5771
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5771 +/- ##
==========================================
- Coverage 36.31% 36.31% -0.01%
==========================================
Files 1304 1304
Lines 110048 110072 +24
==========================================
- Hits 39969 39968 -1
- Misses 65923 65942 +19
- Partials 4156 4162 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this is fine, but cc @eapolinario
this doesn't break any flyte releases right? |
@wild-endeavor yes right now it doesn't since the feature is disabled, but this happens in peculiar case of map task accepting an offloaded literal. I am working on more elaborate fix for this as the current fix doesn't completely solve it. |
@wild-endeavor added more fixes for the crash along with array node support to read offloaded literal ,promise support etc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -96,6 +96,10 @@ func ExtractFromLiteral(literal *core.Literal) (interface{}, error) { | |||
} | |||
} | |||
return mapResult, nil | |||
case *core.Literal_OffloadedMetadata: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, thank you for updating
@@ -192,11 +192,20 @@ func (a *arrayNodeHandler) Handle(ctx context.Context, nCtx interfaces.NodeExecu | |||
|
|||
size := -1 | |||
for _, variable := range literalMap.Literals { | |||
if variable.GetOffloadedMetadata() != nil { | |||
// variable will be overwritten with the contents of the offloaded data which contains the actual large literal. | |||
// We need this for the map task to be able to create the subNodeSpec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just so I understand, this is also needed so the map task can index into the inputs as well too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah subNodeSpec covers that as it contains the inputs for each subnode
@@ -122,7 +122,7 @@ func (m *CatalogClient) Get(ctx context.Context, key catalog.Key) (catalog.Entry | |||
logger.Debugf(ctx, "DataCatalog failed to get artifact by tag %+v, err: %+v", tag, err) | |||
return catalog.Entry{}, err | |||
} | |||
logger.Debugf(ctx, "Artifact found %v from tag %v", artifact, tag) | |||
logger.Debugf(ctx, "Artifact found %v from tag %v", artifact.GetId(), tag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious why these changes are necessary? do we ever send too big inputs/outputs to cache service? shouldn't we be using the offloaded literal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even they are not too big breaching the 10 MB limit, but anything less than that threshold will get logged which is also huge and hence removed it. Let me know if you disagree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, I could still see this being useful, maybe we only log a deterministic prefix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was logging an entire object and hence i think logging an identifier should good for long term too and we have api's to fetch the aritifact using the ID if we want to debug further. Logging a prefix of object converted to string format doesn't seem useful IMO
beb1a2b
to
0542dab
Compare
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
0542dab
to
9edd5f9
Compare
Why are the changes needed?
Followup to this ticket #5705
Fixes propeller crash when the type inferred is nil.
What changes were proposed in this pull request?
The LiteralTypeFromLiteral function didn't handle Offloaded type
Along with this adds the following changes.
This does increase the memory profile for propeller whenever it needs to download large literal
How was this patch tested?
Tested using sandbox and the changes working fine for the following examples . Tested for variation of caching aswell
Notice
consume_list_objects
returns a slice since the outputting of offloaded data is yet to be supported in flytekit without which if the data is too large the datacatlog call errors out with grpc error due to max size breachedSetup process
Screenshots
Check all the applicable boxes
Related PRs
Docs link