BMI2 impact / xpack name #2

Open
JoeUX opened this issue Aug 13, 2016 · 1 comment

JoeUX commented Aug 13, 2016

Hello! I admire your work here and in libdeflate. Do you have any sense yet for the performance impact of the BMI2 instructions? Relatedly, have you tried the Trailing Bit Manipulation (TBM) instructions at all? They're AMD-only, which understandably makes them hard to use (and makes the resulting binaries hard to share), but I'm curious if you've tested them. The bit field extract instruction seems helpful here (and there are ten instructions in the set overall). I feel like there's a lot of untapped potential in AMD chips, particularly with respect to instructions only they have implemented, but sadly there's a network effect that works against exploiting that potential.

Separately, are you aware that XPack is already used as a name for a compression utility by these folks? http://faculty.cs.tamu.edu/caverlee/pubs/rocco05xpack.pdf

It probably doesn't matter too much, since the paper is from 2005 and nothing seems to have followed it. It's a fascinating piece of work on XML compression, and reminds me that the web could really use compressors tailored to, well, web formats. I found it when googling xpack compression.

Cheers,

Joe

@ebiggers
Owner

Well, it's easy to see the performance difference in decompression if you have a BMI2-capable processor. Do a benchmark with the current code, then change xpack_decompress.c to always call xpack_decompress_default() instead of xpack_decompress_bmi2() and do another benchmark. On my computer the BMI2 version is about 10% faster. I do not use any assembly language in the decompressor, so the difference comes only from the instructions that gcc chooses to emit. Interestingly, the only BMI2 instructions gcc actually uses are the extended shift instructions; the bit field extract instruction (BEXTR, which belongs to BMI1 and TBM rather than BMI2) is not used.
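To illustrate the general idea of compiling the same C code twice, once for the baseline target and once with BMI2 enabled, here is a minimal sketch. It is not the actual xpack source: `sum_fields_impl`, `sum_fields_default`, `sum_fields_bmi2`, and `sum_fields` are hypothetical names, and the toy loop only stands in for a decoder's variable-count shifts. It assumes gcc or clang on x86 for the BMI2 path and falls back to the plain version elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* The BMI2 variant is only built with gcc/clang on x86 (assumption). */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#  define HAVE_BMI2_TARGET 1
#  define FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define HAVE_BMI2_TARGET 0
#  define FORCE_INLINE static inline
#endif

/*
 * Hypothetical toy workload: read n fields LSB-first from a 64-bit bit
 * buffer and add them up. Each widths[i] is assumed to be in 1..32, and
 * the widths are assumed to total at most 64 bits.
 */
FORCE_INLINE uint64_t
sum_fields_impl(uint64_t bitbuf, const uint8_t *widths, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned w = widths[i];
        sum += bitbuf & ((1ULL << w) - 1); /* variable-count left shift */
        bitbuf >>= w;                      /* variable-count right shift */
    }
    return sum;
}

/* Baseline build of the shared body. */
uint64_t
sum_fields_default(uint64_t bitbuf, const uint8_t *widths, size_t n)
{
    return sum_fields_impl(bitbuf, widths, n);
}

#if HAVE_BMI2_TARGET
/* Same body, but the compiler is allowed to emit BMI2 instructions here. */
__attribute__((target("bmi2")))
uint64_t
sum_fields_bmi2(uint64_t bitbuf, const uint8_t *widths, size_t n)
{
    return sum_fields_impl(bitbuf, widths, n);
}
#endif

/* Runtime dispatch: use the BMI2 build only if the CPU reports support. */
uint64_t
sum_fields(uint64_t bitbuf, const uint8_t *widths, size_t n)
{
#if HAVE_BMI2_TARGET
    if (__builtin_cpu_supports("bmi2"))
        return sum_fields_bmi2(bitbuf, widths, n);
#endif
    return sum_fields_default(bitbuf, widths, n);
}
```

Forcing the dispatcher to always take the `sum_fields_default` path and benchmarking again is the same kind of comparison as the one described above. Whether the compiler actually picks BMI2 shifts for the inner loop depends on the compiler version and optimization level, so it is worth checking the generated assembly.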

If you have an AMD processor supporting TBM you are welcome to do benchmarks with and without TBM instruction generation enabled. I do not have such a processor, so I cannot do such a benchmark.

I was not aware that there is already an XML document compression system named XPack. It can be hard to choose a good name for a compression format, and many good names are already taken. I am not going to come up with a different name right now because I am not planning to update the project much more (for several reasons, the main one being that I believe time is better spent on Zstandard at this point).
