
Java FemtoZip gets into a nasty infinite loop on corrupt input data #6

Open
ehrmann opened this issue Jan 17, 2013 · 4 comments

@ehrmann
Contributor

ehrmann commented Jan 17, 2013

I have a slightly proprietary example I can give you (contact me at ehrmann+1923 <at> gmail).

What happened was that I base64-encoded a byte array after compressing it, converted it to lowercase, decoded it back to a byte array, then tried to decompress it. compressionModel.decompress(data) took much longer than it should have, and then the JVM ran out of memory. There's a chance FemtoZip is correctly decoding what becomes a massive byte array, but it could also be a bug.
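For concreteness, the round trip looks roughly like this. A minimal sketch only: the org.toubassi.femtozip package path and the model setup are assumed, and the compress/decompress signatures are the byte[]-in, byte[]-out ones used above.

```java
import java.util.Base64;
import org.toubassi.femtozip.CompressionModel;  // package path assumed

class Repro {
    // Compress, base64-encode, lowercase, decode, decompress.
    static byte[] corruptAndDecompress(CompressionModel model, byte[] original) {
        byte[] compressed = model.compress(original);

        // Lowercasing folds the uppercase half of the base64 alphabet into
        // the lowercase half, so the decode succeeds but yields corrupt bytes.
        String encoded = Base64.getEncoder().encodeToString(compressed);
        byte[] corrupted = Base64.getDecoder().decode(encoded.toLowerCase());

        // This is the call that spins and eventually exhausts the heap.
        return model.decompress(corrupted);
    }
}
```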

A nice workaround might be to have a maxExpectedSize parameter on decompress to guard against this.

@gtoubassi
Owner

Hi Dave,

I'd like to dig into the specific case but don't expect to have time over the next few weeks. I believe gzip avoids this problem by encoding an Adler-32 checksum to verify the sanity of the data. I chose to avoid doing something like that to keep from growing the output (since fz is designed for small payloads, adding 2 or 4 bytes would be significant). I assume that for your purposes putting your own checksum around the payload would be unacceptable? I like the idea of having a maxExpectedSize as a hint.
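For what it's worth, wrapping the payload yourself can be as small as the sketch below: a 4-byte CRC-32 (from java.util.zip) prefixed to the compressed bytes, verified before the payload ever reaches decompress(). Everything here is illustrative caller-side code, not FemtoZip API.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public final class ChecksummedPayload {
    // Prefix the compressed bytes with a CRC-32 of themselves.
    public static byte[] wrap(byte[] compressed) {
        CRC32 crc = new CRC32();
        crc.update(compressed);
        return ByteBuffer.allocate(4 + compressed.length)
                .putInt((int) crc.getValue())
                .put(compressed)
                .array();
    }

    // Verify the checksum and return the compressed bytes, or throw
    // before the corrupt payload is ever handed to decompress().
    public static byte[] unwrap(byte[] wrapped) {
        ByteBuffer buf = ByteBuffer.wrap(wrapped);
        int expected = buf.getInt();
        byte[] compressed = new byte[buf.remaining()];
        buf.get(compressed);
        CRC32 crc = new CRC32();
        crc.update(compressed);
        if ((int) crc.getValue() != expected) {
            throw new IllegalArgumentException("corrupt payload: CRC mismatch");
        }
        return compressed;
    }
}
```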

@ehrmann
Contributor Author

ehrmann commented Jan 18, 2013

At least for now, I could prefix it with my own checksum. What I really want to make sure of is that there isn't a bug that's leading to this.

The other nice feature would be a decompressInterruptibly() method that can be aborted.
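In the meantime, the closest approximation from the outside is to push decompress() onto a worker thread with a deadline, roughly like the sketch below (again assuming the org.toubassi.femtozip package path). It's only a stopgap: cancel(true) just interrupts the thread, and if the decode loop never checks its interrupt status, the runaway work and its allocations keep going.

```java
import java.util.concurrent.*;
import org.toubassi.femtozip.CompressionModel;  // package path assumed

public final class BoundedDecompress {
    public static byte[] decompress(CompressionModel model, byte[] data,
                                    long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            Future<byte[]> result = exec.submit(() -> model.decompress(data));
            try {
                return result.get(timeout, unit);
            } catch (TimeoutException e) {
                result.cancel(true);  // best effort; the loop may not notice
                throw e;
            }
        } finally {
            exec.shutdownNow();
        }
    }
}
```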

@gtoubassi
Owner

Yes, the first-order issue is diagnosing the case at hand. Beyond that, it would be nice if the decompressor were more of a generator that you invoke repeatedly to pump out the bytes; gzip has somewhat this architecture. It added significant complexity to the code, and I figured for small payloads it wasn't worth it.

I'm open to all approaches here.
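To make the generator idea concrete, the shape might be something like the following. The PullDecompressor interface and its pump() method are purely hypothetical, but they show how a pull API would give callers both a size cap and a natural abort point.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical pull-style API; nothing like this exists in FemtoZip yet.
interface PullDecompressor {
    /** Fills buf with up to buf.length decompressed bytes; returns the
     *  count written, or -1 once the stream is exhausted. */
    int pump(byte[] buf);
}

final class BoundedPull {
    // With a pull API, the caller enforces the budget and can bail out
    // at any iteration instead of being stuck inside one decompress() call.
    static byte[] decompress(PullDecompressor d, int maxExpectedSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        for (int n; (n = d.pump(buf)) != -1; ) {
            if (out.size() + n > maxExpectedSize)
                throw new IllegalStateException("output exceeds maxExpectedSize");
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```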


@ehrmann
Contributor Author

ehrmann commented Mar 7, 2013

When you get a chance, let me know if you'd like an example of the model and byte[] that cause the issue.
