Blosc 1.7.0 compression uses more than uncompressed_size + BLOSC_MAX_OVERHEAD #159
Well, I think what you describe is completely compatible with the docstrings for blosc_compress():
but I agree that guaranteeing |
Reviewing this, I think I did not understand well the question. In fact, Blosc ensures that |
@FlorianLuetticke any updates? |
I think the behaviour was due to an error in my code. In this case, |
Interesting. This would indicate an API breakage. Blosc is supposed to guarantee that the data doesn't get bigger during compression, so the above inequality should always hold. Is there any chance you could put together a minimal example that illustrates your case? |
I encountered the same issue, except that my
total_size = /* sum_of_each_block */
total_size += (nb_block * BLOSC_MAX_OVERHEAD);
...
for (i = 0; i < nb_block; i++) {
    int r = blosc_compress_ctx(5, BLOSC_SHUFFLE, 1, blocksize[i], blockbuf[i],
                               &dst[off], total_size - off, BLOSC_LZ4_COMPNAME, 0, 1);
    /* eventually r == 0 */
    off += r;
}
But when I give
int r = blosc_compress_ctx(5, BLOSC_SHUFFLE, 1, blocksize[i], blockbuf[i],
                           &dst[off], blocksize[i] + BLOSC_MAX_OVERHEAD, BLOSC_LZ4_COMPNAME, 0, 1);
/* always r != 0 */ |
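The per-block workaround in the comment above can be sketched as follows. This is only an illustration under stated assumptions: `compress_fn`, `pack_blocks`, and `store_only` are hypothetical names that do not exist in Blosc, and `BLOSC_MAX_OVERHEAD` is defined locally as 16 (the value I believe c-blosc 1.x uses) so the snippet compiles without the library. In real code the callback would wrap `blosc_compress_ctx()`; here a stand-in that merely stores the data behind a dummy header takes its place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: mirrors BLOSC_MAX_OVERHEAD from blosc.h (believed to be 16). */
#ifndef BLOSC_MAX_OVERHEAD
#define BLOSC_MAX_OVERHEAD 16
#endif

/* Hypothetical callback: returns bytes written, 0 if the output did not fit
   in destsize, or a negative value on error. */
typedef int (*compress_fn)(const void *src, size_t nbytes,
                           void *dest, size_t destsize);

/* Stand-in "compressor" for illustration: a zeroed header plus a raw copy. */
static int store_only(const void *src, size_t nbytes,
                      void *dest, size_t destsize)
{
    if (nbytes + BLOSC_MAX_OVERHEAD > destsize)
        return 0;
    memset(dest, 0, BLOSC_MAX_OVERHEAD);
    memcpy((uint8_t *)dest + BLOSC_MAX_OVERHEAD, src, nbytes);
    return (int)(nbytes + BLOSC_MAX_OVERHEAD);
}

/* Packs blocks back to back into dst. Each call is offered at most
   blocksize[i] + BLOSC_MAX_OVERHEAD bytes, never the whole remaining buffer.
   Returns the total number of bytes written, or -1 on failure. */
static long pack_blocks(compress_fn compress,
                        const void *const *blockbuf, const size_t *blocksize,
                        size_t nb_block, uint8_t *dst, size_t total_size)
{
    size_t off = 0;
    for (size_t i = 0; i < nb_block; i++) {
        size_t remaining = total_size - off;
        size_t cap = blocksize[i] + BLOSC_MAX_OVERHEAD;
        if (cap > remaining)
            cap = remaining;            /* never promise more than we have */
        int r = compress(blockbuf[i], blocksize[i], dst + off, cap);
        if (r <= 0)
            return -1;                  /* did not fit, or internal error */
        assert((size_t)r <= cap);       /* callback must respect destsize */
        off += (size_t)r;
    }
    return (long)off;
}
```

Capping the per-call capacity keeps `off` from ever running past `total_size`, which is what the uncapped `total_size - off` variant could not guarantee once one block overshot its budget.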
Could you provide a minimal, self-contained example of this so that we can check this out? |
OK, I've managed to reduce to a minimal test case
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include "/usr/local/include/blosc.h"
int main()
{
    char *src = "\xe0\x42\xcb\xfd\x23\x9b\xe1\x06\x50\x8d\x13\x10\x4c\xbf\xdd\xd2\x50\xdb\x87\x92\x42\x3e\xa2\xf1\x53\xd8\x43\x3c\x28\xb7\x78\x09\xfa"
                "\x43\x06\x1d\xde\xe8\x23\x2e\x75\x36\x3e\xc1\xf5\x1c\x93\x46\xf7\x1b\xd8\x39\x59\x7a\x2a\xad\x52\x6d\xe9\x7b\x25\x61\x84\x1f\xa4\x8a"
                "\x3c\x82\x72\x5f\xb0\xe8\x96\xef\xa9\x8b\x0b\x3d\xd1\x02\x58\xaa\x3b\xb1\x24\x65\x5e\x77\xd2\x47\xf2\xf7\xa8\x76\x16\x4c\x00\x52\xce"
                "\x73\xb2\x7f\x5b\x48\x6e\x04\xd3\x79\x41\xa5\x7b\x99\x4f\xb6\x4b\x73\x1b\xa9\xea\xed\xf1\xdc\xe5\x99\x52\xfb\xe6\x53\x4e\xb4";
    int src_size = 130;
    int total_len = src_size + BLOSC_MAX_OVERHEAD + 100;
    int i;
    uint8_t dst[1024];
    int r = blosc_compress_ctx(5,                  // clevel
                               BLOSC_SHUFFLE,      // doshuffle
                               1,                  // typesize
                               src_size,           // nbytes
                               src,                // src
                               dst,                // dest
                               total_len,          // <-- try change to src_size + BLOSC_MAX_OVERHEAD
                               BLOSC_LZ4_COMPNAME, // compressor
                               0,                  // blocksize, 0: automatic blocksize
                               1                   // numinternalthreads
                               );
    printf("%d => %d (%d)\n", src_size, r, r - src_size);
    return 0;
}
It compressed 130 bytes to 154 bytes, which is obviously greater than |
I have observed the following, but I am unsure whether it can be called an issue.
Using a comp_buffer_size much larger than uncompressed_size and executing
int compressedSize = blosc_compress(compress_level, shuffel, block_size,
                                    uncompressed_size, uncompressed_buffer,
                                    comp_buffer, comp_buffer_size);
the compressedSize can be larger than uncompressed_size + BLOSC_MAX_OVERHEAD, but it is always smaller than comp_buffer_size. This was not clear to me from the documentation: I had assumed that compressedSize <= uncompressed_size + BLOSC_MAX_OVERHEAD would always hold.
This can be avoided by passing uncompressed_size + BLOSC_MAX_OVERHEAD as the destination size:
int compressedSize = blosc_compress(compress_level, shuffel, block_size,
                                    uncompressed_size, uncompressed_buffer,
                                    comp_buffer, uncompressed_size + BLOSC_MAX_OVERHEAD);
Is there a performance difference between the two? Is this expected behavior?
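For a single buffer, the workaround above amounts to clamping the destination size before the call and checking the result afterwards. A minimal sketch, assuming `BLOSC_MAX_OVERHEAD` is 16 bytes (defined locally here so the snippet builds without Blosc; include `blosc.h` instead when linking against it). `capped_destsize` and `within_bound` are hypothetical helper names, not part of the Blosc API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors BLOSC_MAX_OVERHEAD from blosc.h (believed to be 16). */
#ifndef BLOSC_MAX_OVERHEAD
#define BLOSC_MAX_OVERHEAD 16
#endif

/* Hypothetical helper: the destsize to pass so that the compressed output
   cannot exceed nbytes + BLOSC_MAX_OVERHEAD, even if the buffer is larger. */
static size_t capped_destsize(size_t nbytes, size_t capacity)
{
    assert(nbytes <= SIZE_MAX - BLOSC_MAX_OVERHEAD);   /* guard overflow */
    size_t bound = nbytes + BLOSC_MAX_OVERHEAD;
    return capacity < bound ? capacity : bound;
}

/* Hypothetical helper: checks the inequality the report expected to hold,
   compressed_size <= nbytes + BLOSC_MAX_OVERHEAD. */
static int within_bound(int compressed_size, size_t nbytes)
{
    return compressed_size >= 0 &&
           (size_t)compressed_size <= nbytes + BLOSC_MAX_OVERHEAD;
}
```

With these, the call would pass `capped_destsize(uncompressed_size, comp_buffer_size)` as the last argument of `blosc_compress()`, and `within_bound(compressedSize, uncompressed_size)` could back an assertion in tests.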