
Allocator<T>::uncompress is a performance nightmare #867

Closed
dirkwhoffmann opened this issue Jan 22, 2025 · 3 comments
Comments

@dirkwhoffmann (Owner)

Uncompressing a snapshot can take multiple seconds if it contains a hard drive. The reason is that the implementation uses std::vector, which is awfully slow, at least in debug builds.

TODO: Rewrite the code using good old C arrays.
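The actual snapshot format and the Allocator<T> internals are not shown in this issue, so the following is only a minimal sketch of the "good old C arrays" approach under an assumed, simplified RLE layout (pairs of value and repeat count). The point it illustrates is manual buffer management with geometric growth, replacing std::vector's per-element bookkeeping; the function name rleDecode and the format are hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical RLE layout: a flat sequence of (value, count) byte pairs.
// Decodes into a malloc'd buffer that is grown geometrically by hand,
// the way the rewritten C-array code would have to manage it.
static uint8_t *rleDecode(const uint8_t *src, size_t srcLen, size_t *outLen)
{
    size_t cap = 1024, len = 0;
    uint8_t *buf = (uint8_t *)malloc(cap);

    for (size_t i = 0; i + 1 < srcLen; i += 2) {

        uint8_t value = src[i];
        size_t count = src[i + 1];

        // Double the capacity until the run fits (amortizes realloc cost)
        while (len + count > cap) {
            cap *= 2;
            buf = (uint8_t *)realloc(buf, cap);
        }
        memset(buf + len, value, count);
        len += count;
    }
    *outLen = len;
    return buf; // caller frees
}
```

In a release build, std::vector compiles down to essentially the same machine code; the win here is in debug builds, where the vector's per-element checks are not optimized away.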

@dirkwhoffmann (Owner, Author)

Comparison between old and new code. Test case is a 138 MB snapshot shrinking down to 9.3 MB after compression:

  • Debug build

    • Old code
      Uncompressing 9790563 bytes (hash: ff80197f)... 2.48 sec
      Compressing 145308450 bytes (hash: f9ea0396)... 0.39 sec
    • New code
      Uncompressing 9790563 bytes (hash: ff80197f)... 0.17 sec
      Compressing 145308450 bytes (hash: f9ea0396)... 0.40 sec
  • Release build

    • Old code
      Uncompressing 9790563 bytes (hash: ff80197f)... 0.17 sec
      Compressing 145308450 bytes (hash: cf0b5b96)... 0.11 sec
    • New code
      Uncompressing 9790563 bytes (hash: ff80197f)... 0.07 sec
      Compressing 145308450 bytes (hash: cf0b5b96)... 0.14 sec

Pros:

  • No more beachballing in debug builds.

Cons:

  • The new code is more complicated because I have to deal with dynamic buffer resizing myself.
  • In release builds, the 2.5x speed boost does not really matter as the 0.17 sec consumed by the old code is still acceptable.
  • The old vector-based compression code outperforms my C-array-style code. This is interesting by itself as I have no idea how this is possible.

Overall, the cons outweigh the pros. Therefore, I'll revert to the old code.

dirkwhoffmann added a commit that referenced this issue Jan 23, 2025
mras0 commented Jan 23, 2025

Might be faster to just use this rather than the decode lambda:

vec.insert(vec.end(), isize(ptr[++i]), prev);

@dirkwhoffmann (Owner, Author)

Might be faster to just use this rather than the decode lambda:

Indeed. This makes a huge difference. Very cool!

Debug build:

  • Old: Uncompressing 9790563 bytes... 2.45 sec
  • New: Uncompressing 9790563 bytes... 1.50 sec

Release build:

  • Old: Uncompressing 9790563 bytes... 0.15 sec
  • New: Uncompressing 9790563 bytes... 0.09 sec
