Update Zstd decompression for "unknown decompressed size" when streaming API was used for compression #116
Labels: Filter - ZSTD, Priority - 1. High 🔼, Type - Improvement
Introduction
The Zstandard plugin for HDF5 should be modified to allow for an unknown decompressed size in the frame header.
Currently, the Zstd decompression scheme, following the original implementation, uses ZSTD_getDecompressedSize to obtain the size of the decompressed buffer. The returned value is not validated and is passed directly to malloc.
hdf5_plugins/ZSTD/src/H5Zzstd.c
Lines 59 to 60 in 770d70a
ZSTD_getDecompressedSize returns 0 if the decompressed size is empty, unknown, or an error has occurred. If malloc is asked to allocate 0 bytes, it may return NULL (the behavior is implementation-defined), and the filter then reports an error. This is an incorrect result if the decompressed size is actually empty or unknown and there is no actual error.
ZSTD_getDecompressedSize is obsolete. ZSTD_getFrameContentSize should replace it. ZSTD_getFrameContentSize distinguishes between empty, unknown, and error states: an empty frame returns 0, while the unknown and error states are indicated by return values of ZSTD_CONTENTSIZE_UNKNOWN and ZSTD_CONTENTSIZE_ERROR, respectively.
The unknown decompressed size is common. It occurs when the compression is done via the streaming API, that is, via ZSTD_compressStream or ZSTD_compressStream2. ZSTD_compressStream2 in particular only stores the frame content size when either ZSTD_e_end is provided on the initial call or ZSTD_CCtx_setPledgedSrcSize is used.
Tasks
Use ZSTD_getFrameContentSize instead of the obsolete ZSTD_getDecompressedSize to correctly distinguish between empty, unknown, or error states when determining the decompressed size.
Check the return value of ZSTD_getFrameContentSize against ZSTD_CONTENTSIZE_UNKNOWN.
Use ZSTD_decompressStream when the decompressed size is unknown.
References
[1] https://facebook.github.io/zstd/zstd_manual.html