URIExtractionNamespace: Avoid problems due to canonicalization of lookup fields. #4307

Merged
merged 1 commit into from
May 25, 2017

Conversation

@gianm gianm commented May 22, 2017

Disables canonicalization for simpleJson, where we expect field names to be unique
anyway. Keeps canonicalization enabled for customJson, but avoids sharing the
table with the global ObjectMapper.

This fixed an issue we saw where canonicalization of simpleJson fields caused serious
performance problems (taking >2 minutes to read a 22MB file).
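To illustrate why canonicalization buys nothing when field names are unique, here is a toy model of a name table. It is only a sketch: `SymbolTable`, `feed`, and the generated key names are hypothetical and are not Jackson's actual implementation. It assumes simpleJson-style input uses each lookup key as a field name, while customJson-style input repeats the same two field names in every record.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a field-name canonicalization table; NOT Jackson's real data structure. */
class CanonicalizationSketch {

    /** Tracks how often a lookup reused an existing entry versus adding a new one. */
    static final class SymbolTable {
        private final Map<String, String> canonical = new HashMap<>();
        int hits = 0;
        int misses = 0;

        String canonicalize(String name) {
            String existing = canonical.get(name);
            if (existing != null) {
                hits++;          // name seen before: the shared instance is reused
                return existing;
            }
            canonical.put(name, name);
            misses++;            // name never seen: the table grows by one entry
            return name;
        }

        int size() {
            return canonical.size();
        }
    }

    /** Feeds field names for the given number of records into a fresh table. */
    static SymbolTable feed(int records, boolean uniqueFieldNames) {
        SymbolTable table = new SymbolTable();
        for (int i = 0; i < records; i++) {
            if (uniqueFieldNames) {
                // simpleJson-like (assumed shape): the lookup key itself is the field name
                table.canonicalize("key-" + i);
            } else {
                // customJson-like (assumed shape): every record has the same two field names
                table.canonicalize("key");
                table.canonicalize("value");
            }
        }
        return table;
    }

    public static void main(String[] args) {
        SymbolTable simple = feed(1000, true);
        SymbolTable custom = feed(1000, false);
        System.out.println("unique names:   size=" + simple.size() + " hits=" + simple.hits);
        System.out.println("repeated names: size=" + custom.size() + " hits=" + custom.hits);
    }
}
```

With unique names every lookup is a miss, so the table only grows and never pays off. With repeated names the table stays at two entries and nearly every lookup is a hit, which is why keeping canonicalization on for customJson still makes sense.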

@gianm gianm added this to the 0.10.1 milestone May 22, 2017
  this.parser = new DelegateParser(
-     new JSONParser(jsonMapper, ImmutableList.of(keyFieldName, valueFieldName)),
+     new JSONParser(jsonMapper.copy(), ImmutableList.of(keyFieldName, valueFieldName)),

Contributor

If I recall correctly, there are bugs in jsonMapper.copy(), so the behavior of the copy might vary with the Jackson version.

Contributor Author

Do you remember what kind of bugs?

If it's things along the lines of "not all configs and modules are copied" then that should be fine -- the jsonMapper shouldn't be sensitive to that, since it's just reading a generic Map<String, String> and not serializing anything.

Contributor

Contributor Author

Thanks for the link. This should be fine then; it doesn't matter whether we have injectable values for these parsers, since they just want to read Map<String, String>.


fjy commented May 23, 2017

👍

@fjy fjy merged commit fe42db9 into apache:master May 25, 2017
gianm added a commit to implydata/druid-public that referenced this pull request May 26, 2017
gianm added a commit to implydata/druid-public that referenced this pull request Jul 14, 2017
@gianm gianm deleted the lcanon branch September 23, 2022 19:26