Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-3987 introduce InternedIRI for use in Vocabularies to improve performance #3988

Merged
merged 2 commits into from
Jun 19, 2022

Conversation

hmottestad
Copy link
Contributor

GitHub issue resolved: #3987

Briefly describe the changes proposed in this PR:

  • New (internalOnly) class InternedIRI which interns the stringValue and pre-calculates the hashCode
  • Use InternedIRI in Vocabularies and in CoreDatatype

PR Author Checklist (see the contributor guidelines for more details):

  • my pull request is self-contained
  • I've added tests for the changes I made
  • I've applied code formatting (you can use mvn process-resources to format from the command line)
  • I've squashed my commits where necessary
  • every commit message starts with the issue number (GH-xxxx) followed by a meaningful description of the change

@hmottestad hmottestad marked this pull request as draft June 15, 2022 11:07
@hmottestad
Copy link
Contributor Author

hmottestad commented Jun 15, 2022

I've changed both #3990 and this PR to draft for the time being. They both definitely improve performance, but that was only really obvious when combined with #3934 .

@hmottestad hmottestad force-pushed the GH-3987-vocabulary-iri branch from a44b487 to f2e086b Compare June 15, 2022 16:32
@hmottestad hmottestad force-pushed the GH-3987-vocabulary-iri branch from f2e086b to 52cfc1b Compare June 15, 2022 20:55
@JervenBolleman
Copy link
Contributor

JervenBolleman commented Jun 16, 2022

Hi just a quick thought. Will this really help at scale? I have had bad experiences with intern on string similar to https://shipilev.net/jvm/anatomy-quarks/10-string-intern/

Never mind I see you notes this in the comments 😹

@hmottestad
Copy link
Contributor Author

hmottestad commented Jun 17, 2022

Before

Benchmark                                                                                        (param)  Mode  Cnt   Score   Error  Units
ReasoningBenchmark.forwardChainingSchemaCachingRDFSInferencer                            moreRdfs::12180  avgt    5  12.385 ± 0.043  ms/op
ReasoningBenchmark.forwardChainingSchemaCachingRDFSInferencerMultipleTransactions        moreRdfs::12180  avgt    5   9.372 ± 0.072  ms/op
ReasoningBenchmark.forwardChainingSchemaCachingRDFSInferencerMultipleTransactionsSchema  moreRdfs::12180  avgt    5   9.447 ± 0.019  ms/op
ReasoningBenchmark.forwardChainingSchemaCachingRDFSInferencerSchema                      moreRdfs::12180  avgt    5   9.726 ± 0.029  ms/op
ReasoningBenchmark.noReasoning                                                           moreRdfs::12180  avgt    5   3.877 ± 0.016  ms/op
ReasoningBenchmark.noReasoningMultipleTransactions                                       moreRdfs::12180  avgt    5   3.858 ± 0.036  ms/op

After

Benchmark                                                                                        (param)  Mode  Cnt   Score   Error  Units
ReasoningBenchmark.forwardChainingSchemaCachingRDFSInferencer                            moreRdfs::12180  avgt    5  11.068 ± 0.074  ms/op
ReasoningBenchmark.forwardChainingSchemaCachingRDFSInferencerMultipleTransactions        moreRdfs::12180  avgt    5   8.388 ± 0.013  ms/op
ReasoningBenchmark.forwardChainingSchemaCachingRDFSInferencerMultipleTransactionsSchema  moreRdfs::12180  avgt    5   8.540 ± 0.140  ms/op
ReasoningBenchmark.forwardChainingSchemaCachingRDFSInferencerSchema                      moreRdfs::12180  avgt    5   8.866 ± 0.029  ms/op
ReasoningBenchmark.noReasoning                                                           moreRdfs::12180  avgt    5   3.901 ± 0.012  ms/op
ReasoningBenchmark.noReasoningMultipleTransactions                                       moreRdfs::12180  avgt    5   3.718 ± 0.021  ms/op

@hmottestad hmottestad marked this pull request as ready for review June 17, 2022 08:04
@hmottestad hmottestad closed this Jun 17, 2022
@hmottestad hmottestad reopened this Jun 17, 2022
@hmottestad
Copy link
Contributor Author

This is showing a decent 10% performance improvement in the more complex RDFS reasoning benchmark.

@hmottestad hmottestad requested review from abrokenjester and removed request for abrokenjester June 17, 2022 10:28
@hmottestad hmottestad merged commit ac63da0 into develop Jun 19, 2022
@hmottestad hmottestad deleted the GH-3987-vocabulary-iri branch June 19, 2022 04:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants