-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PathExtraction initinal implementaion #1
Conversation
* Path Extraction implementation that exposes the IonReader to registered callbacks * Package setup, e.g. gradle, checkstyle, copyright
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice. Good tests. Some comments below.
*/ | ||
class PathComponentParser { | ||
|
||
private static final IonSystem ION_SYSTEM = IonSystemBuilder.standard().build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend replacing this with an IonReaderBuilder
and an IonTextWriterBuilder
; a full IonSystem
is no longer required for constructing readers and writers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea, will change it
* language governing permissions and limitations under the License. | ||
*/ | ||
|
||
package software.amazon.com.ionpathextraction; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The com
is superfluous. For example, ion-java uses software.amazon.ion
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for noticing that, I'll submit the change in namespace as a separate PR to avoid polluting this one, #2
|
||
private boolean isWildcard(final IonReader reader) { | ||
if (reader.stringValue().equals(Wildcard.TEXT)) { | ||
for (final Iterator<String> iter = reader.iterateTypeAnnotations(); iter.hasNext(); ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with ion-c, this should only check to see if the first annotation is WILDCARD_ESCAPE_ANNOTATION
. (Processing of local symbol tables serves as precedent for this--only structs whose first annotation is $ion_symbol_table are treated as local symbol tables.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix
private static final boolean DEFAULT_MATCH_RELATIVE_PATHS = false; | ||
private static final boolean DEFAULT_CASE_INSENSITIVE = false; | ||
private final List<SearchPath> searchPaths = new ArrayList<>(); | ||
private final List<Function<IonReader, Integer>> callbacks = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think there is value in giving the user the ability to pass through some context to the callback. The first adopter of the ion-c path extractor is using this feature. That would make the callback function a BiFunction<IonReader, Object, Integer>
, and you'd probably have to add a class to hold the pair of the function and Object context
.
Then again, maybe this can be achieved in Java with no loss of functionality if the author of the callback just refers to external variables containing any desired context. If that's the case, then we can keep it as-is, and we should change the developers' guide to make it clear that presence of user context in the callback is is left to the implementation based on what's easiest in the particular language.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Java the user can use the lambda closure or a concrete Function
implementation as the user context. The callback I use in the tests uses the lambda closure for this so it can accumulate all results in a IonList
Will add that as an example in the documentation
|
||
private static final String WILDCARD_ESCAPE_ANNOTATION = "$ion_extractor_field"; | ||
|
||
List<PathComponent> parse(final String ionPathExpression) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason not to make all these methods static
and not require an actual instance of PathComponentParser
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really.
I was thinking of having interfaces for all these classes to make it easier to swap implementations eventually but since PathComponentParser
is not public this is not necessary. I should do that for PathExtractor
at least though
Will change PathComponentParser
and add an interface to PathExtractor
} | ||
|
||
int getCurrentDepth() { | ||
return stack.size() - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this could cause problems when there are no search paths. Maybe we should do a quick return in match
when that's the case?
Or we can require at least one search path, but currently ion-c and the documentation don't require that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll keep parity with ion-c here as it can be useful to have a no-op PathExtractor
, i.e. with zero search paths
I added some tests and it works with zero paths.
Take a look at PathExtractor#matchRecursive
, even though currentDepth
will be -1
it is not used by anything as there are no search paths, and thus no active paths skipping the for
body where currentDepth
is used
Agree that a short circuit is good here as it makes the code clearer and avoids unnecessary work so will add that.
|
||
assertEquals(ION.singleValue("[1,2,3]"), out) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test case for calling match
with zero registered search paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And one for returning an integer from a matched callback that is greater than the reader's depth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do. I'm assuming that the callback return should not be higher than the reader current relative depth to account for matchRelativePaths
example
data {foo: {bar: 1}}
search path: (bar)
return 2 // max
edit: was supposed to say "should not be higher" instead of "should be higher"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, correct.
|
||
private int invokeCallback(final IonReader reader, final SearchPath searchPath) { | ||
int previousReaderDepth = reader.getDepth(); | ||
int stepOutTimes = callbacks.get(searchPath.getId()).apply(reader); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fail if stepOutTimes
is greater than the reader's depth (as ion-c does)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't, will fix
private static final boolean DEFAULT_MATCH_RELATIVE_PATHS = false; | ||
private static final boolean DEFAULT_CASE_INSENSITIVE = false; | ||
private final List<SearchPath> searchPaths = new ArrayList<>(); | ||
private final List<Function<IonReader, Integer>> callbacks = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of having two parallel ArrayList
s, how about modifying the SearchPath
class to hold a List<PathComponent>
and the callback Function
(instead of an Integer id
, which is only used as an index into the callbacks list).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do, see my answers to your comments on SearchPath
above for more context
"reader must be at depth zero, it was at:" + reader.getDepth()); | ||
|
||
// marks all search paths as active | ||
tracker.reset(searchPaths); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we wanted PathExtractor
to be thread-safe (with the caveat that each thread would need to use a different IonReader
), it looks like it would be as simple as instantiating a new Tracker
on each invocation of match
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Trade off is more instantiation every match call. TBH I'm not sure how big that trade off is so erred on making it faster instead of thread safe as you can pool PathExtractor
s easily.
In any case should be to refactor so we can have both versions and let the user decide. Would wait on a use case before benchmarking both versions and/or providing both versions
Showing how you can use the lambda closure as ion-c user context
making the wildcard escape annotation valid only as the first annotation. Added a test to show that
PathExtractorImpl is now package private to make easier to add new implementations in the future if necessary
Already worked, but short circuiting makes it clearer and avoids unecessary work. Also adds missing tests for this use case
and some other spelling and missing comments
* Moving callback to SearchPath * Simplifying partial match detection logic * Fixing needToStepIn logic * Adding tests for nested search paths
Tried to address each comment in a separate commit to make it easier for the reviewer. I also renamed |
README.md
Outdated
.register("(baz 1)", callback) | ||
.build(); | ||
|
||
IonReader ionReader = ion.newReader("{foo: 1}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please help me teach users to use an IonReaderBuilder
for constructing IonReader
s. I dream of deprecating all of the miscellaneous non-DOM-related utility methods from IonSystem
one day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed. Also added a test for the example to avoid breaking in the future, which already happened...
Also added a test for the example to make sure it works
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before you merge, if you're going to make the first commit part of the history, don't forget to change it from "PathExtraction initinal implementaion" to "PathExtraction initial implementation".
README.md
Outdated
```java | ||
PathExtractorBuilder.standard() | ||
.withMatchCaseInsensitive(true) | ||
.register("(foo)", (reader) -> { ... }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
withSearchPath
README.md
Outdated
|
||
### Notification | ||
Each time the `PathExtractor` encounters a value that matches a registered search path it will invoke the respective | ||
callback passing the reader positioned at the current value. See `PathExtractorBuilder#register` methods for more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
withSearchPath
registered callbacks
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.