-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write dict_id in zlib header in big-endian style. #207
base: master
Are you sure you want to change the base?
Conversation
Looks like it should be written as bit-endian. However, how about int isal_read_zlib_header(struct inflate_state *state, struct isal_zlib_header *zlib_hdr)
{
...
if (zlib_hdr->dict_flag) {
case ISAL_ZLIB_DICT:
ret = fixed_size_read(state, &next_in, ZLIB_DICT_LEN);
... The |
I don't have a big-endian machine, I can't prove whether it works or not. However, zlib trailer is the same as zlib dict, if zlib trailer worked fine on a big-endian machine, I think zlib dict would work as well. |
Does ISA-L support any big-endian machines? x86_64 is little-endian and aarch64 is mostly little-endian, if I remember correctly. |
Good point but the trailer is read as
@iii-i did a lot of work to ensure s390, big-endian was correct but we don't have one in the test pool. |
I still think it's the same.
|
You can get a big-endian machine for testing here: https://www.ibm.com/community/z/linuxone-cc/login-vm/ |
33f4d02
to
a5266f2
Compare
Signed-off-by: Mo Zhai <[email protected]>
Is anything blocking this? This seems like a major bug fix (current behavior not in line with zlib format specification). |
@rhpvorderman, I don't think this one is correct. I haven't seen any confirmation that this is an actual bug and this fix may introduce a bug if it is currently correct. As I understood it, @MoZhai15 added this from observation in an unrelated issue without any BE testing.
Also this line was changed explicitly by Ilya to store_le_u32() in the port to BE as were all the header writes. There is currently an issue with the igzip_wrapper_hdr_test but if you skip these it reports So I would label this as waiting on test and confirmation. |
I added some tests to python-isal. (https://github.com/pycompression/python-isal/blob/a6c2dcff4e2386d7c9b5a6406ccf399ad3a4d3de/tests/test_compat.py#L170)
The later test succeeds. So it looks like ISA-L writes the dict correctly. The former test fails however. So ISA-L apparently cannot parse the zlib dict correctly? Or something goes wrong in the python wrapper code. That is also a possibility. |
Hi @rhpvorderman. Is this still an issue? Thanks! |
I still believe so, but it's been a long time, I cannot find my demo.
And after this patch, it works fine. Consider of zlib trailer, write_trailer in BE
read_trailer in LE
Then zlib read dict in LE.
zlib write dict in LE.
My point is trailer and dict are similar, so the read dict should be BE. |
The test cases in python-isal work. So not an issue anymore. |
Thanks! Actually, with this patch, the "igzip_wrapper_hdr_test" unit test fails, so unless the read has to be in big endian, this will break the tests. Since python-isal works, it looks like the header produced by isa-l is the same as the one produced by zlib? |
Looks like it. Python-isal tests compatibility against zlib. Maybe the best course of action is to keep the code as is until someone files a bug report about wrong headers? |
Thanks. Looking into those tests, it looks like the header is not actually read, so it won't matter how the dict_id was written, the dictionary is passed directly to both interfaces. Am I right? |
Yes, but the header is also parsed by zlib (the zstream is initiated with wbits=15). I don't know if it ignores the dict id if inflateSetDict or deflateSetDict functions have been used beforehand. I am not that knowledgeable on zlib internals. |
@rhpvorderman It might be ignoring the header, because I checked and isal is not writing the dict id. As Greg Tucker pointed out here #206 (comment), this is expected and isal_write_zlib_header should be called. |
@pablodelara Ah, ok. Well in that case python-isal might do something wrong. It is not a common use case at all for python-isal, so the issue is certainly not a blocker for me. I also do not have time to investigate this further on short notice. Sorry for that. |
Thanks for the input. Let's defer this for the next release. It is not clear there is something wrong, as RFC1950 doesn't state the endianness of this dict_id field. |
To use a preset dictionary, a dict_id should be written in zlib header. That information should be written in big-endian style but not little-endian.
Mo Zhai [email protected]