Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metadata add 1:many custom transformations #1225
Metadata add 1:many custom transformations #1225
Changes from 1 commit
d549041
3be669a
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spirit of this call - slicing the output into separate subsequent transforms is really weird. For the JSON Transformer composite transformer code, we'll do direct composition. In this case, there's an implicit fanout because the type for the per-index metadata transform calls is a scalar Index - not a collection of Indices.
My opinion here, from the limited code that I've seen, would be to abstain from allowing the RFS-MetaData CompositeTransformer to host transforms that can return lists. Keep two types of transforms. Merge the outputs of the list one with a specific class that does fanout across another sequence of 'simple' transforms. That might be a few more lines of code, but it might be a lot clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comment above about the composability problem. It's hard to get one's head around this once you know that we need to chain some transforms together.
What if you keep all of the "simple transforms" that are IndexMetadata->IndexMetadata. For when you need to return a list or take a list in, use an adapter that does the wrapping and any fanout. It feels capricious/fragile in the code below when we're boxing within a list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the name of this class backward? It feels like the IJsonTransformerAdapter is being transformed into a Metadata Transformer interface, not the other way around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: map/collect would work too - this and the next comment could return immutable lists. If you can't do that, you should add why (as I think of it, you might not be able to do that - so it is tempting).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: List.of() would work too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting this calling pattern to be a transform on the whole list would be safer (no collisions), give the user more flexibility, & probably also make it easier for genAI to create complete transforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will address in https://opensearch.atlassian.net/browse/MIGRATIONS-2301
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's some risk here. If a custom transform maps some components to OS2 and leaves others behind. We'd really want/need to let version transforms run for some things and not others. As discussed w/ Andre, adding a version key to each item would help us identify the version in the transform and also let subsequent transforms know that it's already been updated to that schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, this code might be quicker to read if it were tighter (it would be easy to compress w/out the log statement - which could be moved to outside the construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree w/ the SonarQube warning - if this fails, just return a List.of(...).
Then, if it succeeds, go into the for loop w/ the separate try-catch blocks
Check failure on line 70 in RFS/src/main/java/org/opensearch/migrations/bulkload/worker/IndexRunner.java
GitHub Actions / Run SonarQube Analysis
java:S1141
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did sonarqube not flag catching a raw Exception?