
Slow decoding performance #236

Open · sjakobi opened this issue May 21, 2020 · 6 comments

sjakobi (Contributor) commented May 21, 2020

dhall uses cborg to cache imported Dhall expressions. We've received several reports that decoding these caches is slower than expected: we're seeing roughly 10 MB/s.

1. What kind of decoding speed should we expect from cborg?

Profiles of the dhall code end up pointing mostly at getDecodeAction and deserialiseIncremental. The Core seems fairly impenetrable to us.

2. Do you have any suggestions for how to better figure out what's taking so much time?

If you want to take a look at the code, the main entry point is decodeExpression and most of the decoding logic is in decodeExpressionInternal:
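As a rough way to put a number on raw decode throughput, here is a minimal, hedged sketch using only base and bytestring. `sumBytes` is a hypothetical stand-in for a real decoder such as dhall's `decodeExpression`; the harness itself is generic over any strict-`ByteString` consumer.

```haskell
module Main where

import qualified Data.ByteString as BS
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)
import Text.Printf (printf)

-- Stand-in "decoder" (hypothetical): forces every byte of the input.
sumBytes :: BS.ByteString -> Int
sumBytes = BS.foldl' (\acc b -> acc + fromIntegral b) 0

-- Time a decoder over an input, returning the result and MB/s.
measureThroughput :: (BS.ByteString -> a) -> BS.ByteString -> IO (a, Double)
measureThroughput decode input = do
  start <- getCPUTime                       -- CPU time, in picoseconds
  r     <- evaluate (decode input)          -- force to WHNF inside the timer
  end   <- getCPUTime
  let seconds = fromIntegral (end - start) / 1e12
      mbytes  = fromIntegral (BS.length input) / 1e6
  pure (r, mbytes / max seconds 1e-9)

main :: IO ()
main = do
  let input = BS.replicate (5 * 1000 * 1000) 1   -- 5 MB of 0x01 bytes
  (result, mbps) <- measureThroughput sumBytes input
  printf "result = %d, throughput = %.1f MB/s\n" result mbps
```

Swapping `sumBytes` for the real cborg decoder (and fully forcing the decoded expression, not just WHNF) would give a throughput figure comparable to the 10 MB/s quoted above.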

sjakobi commented May 21, 2020

You can find samples of the CBOR-encoded cache files, accompanied by versions in the CBOR diagnostic format, in Dhall's acceptance test suite, for example:

sjakobi commented May 22, 2020

And here's an interesting heap memory profile that might indicate a space leak:

https://discourse.dhall-lang.org/t/high-memory-use-when-decoding-dhall-expressions/171

For a binary of about 5 MB, getDecodeAction and deserialiseIncremental take a total of about 70 MB maximum heap residency.

bgamari (Member) commented May 22, 2020

Thanks for filing this, @sjakobi!

Gabriella439 commented Jun 13, 2020

Here are larger files that will be useful for benchmarking purposes:

Gabriella439 commented

@bgamari: Here's a version of the larger file that should work with newer Dhall versions (including 1.32.0 and the current master)

12209b260e02c107d04052e323311ab8ed429652018f9d2ed3610ff2b8dfc5f064a0.gz

bgamari (Member) commented Jun 14, 2020

The current state of things:

  • Perform packing of small integers eagerly #240 fixes a couple of pieces of low-hanging fruit, which together add up to a few percent improvement
  • dhall's master branch already has a few optimisations in the Binary module which improve deserialisation performance
  • I have a few additional changes (currently being characterised) which improve things a bit further
  • On the whole, I think decode performance is pretty reasonable: merely deserialising @Gabriella439's 11 MByte output (with dhall decode --quiet) requires around 280 milliseconds of mutator time on my four-year-old laptop. This is approximately 40 MByte/second, which is not terrible given that this is a rather "dense" AST. Of course, this excludes GC time, but I think that is fair given that this program does no useful work.
  • With my branch there are (nearly) no long-lived thunks produced that aren't either maps or sets. Given that these may never be forced in the end, I'm not sure it makes sense to do that work eagerly.
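The long-lived-thunk effect mentioned above can be illustrated with a small, hedged sketch (base only; this is not cborg's code). A lazy left fold suspends every step as an unevaluated thunk, much like unforced fields in a decoded AST, so residency grows with the input; a strict fold forces each step and keeps residency constant while producing the same value.

```haskell
module Main where

import Data.List (foldl')

-- Both compute the same sum, with very different heap behaviour.
lazySum, strictSum :: [Int] -> Int
lazySum   = foldl  (+) 0   -- builds ~n suspended additions before forcing
strictSum = foldl' (+) 0   -- forces each partial sum; constant residency

main :: IO ()
main = do
  let xs = [1 .. 1000000 :: Int]
  print (strictSum xs)     -- same result as lazySum, far smaller heap profile
  print (lazySum xs)
```

Running each variant under `+RTS -s` (or a heap profile) shows the difference; the trade-off bgamari describes is that eagerly forcing decoded maps and sets pays this cost up front even when the caller never demands them.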
