Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

H5Z_FLAG_OTPTIONAL compliant behavior when lz4 compressoin enabled #111

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

hernot
Copy link

@hernot hernot commented Jan 30, 2022

Makes bitshuffle with lz4 compression enabled abort when output size exceeds input size as recommended by libhdf5 documentation when H5Z_FLAG_OPTIONAL is set on compression/compressing filters as h5py high level interface does for all registered compression filters. Plausibility check whether bithsuffling in combination with lz4 can reduce size of input more than the 16 bytes required to store raw input size, overall output size and the size of the first compressed block plus a minimum of 2bytes payload resulting from lz4 compression following bitshuffling. If it input is too short for this minimum gain bithsuffling with lz4 compression is not attempted at all. In case lz4 compression is not enabled all data is shuffled disregarding whether H5Z_FLAG_OPTIONAL is set or not.

Further noticed that when compiled with omp enabled than in case of error the locks an mutices used to protect ioc_chain are not torn down properly in case of error. Just when bithsullfling with and without compression enabled completes successfully. Details see commit message.

This PR is related to and resolves issue #110

Awaiting your your comments and review.

@jrs65
Copy link
Collaborator

jrs65 commented Mar 21, 2022

Hi @hernot it would be great to get this merged at some point soon, thanks for all the effort.

It kind of arrived right about the time we were merging in the zstd support, and a large number of other changes so it's ended up with a bunch of merge conflicts. Would you be able to resolve those and then I'll take a close look at the changes?

I guess this also won't have the compliant behaviour for the zstd pipeline, but we can look into adding that later on based on your changes.

Thanks!

@hernot
Copy link
Author

hernot commented Mar 22, 2022

Sure. I already expected. Just need to find some time to rebase to master/head and check the differences. I guess that zstd support is not so much different from lz4 so i can have a look and include necessary adjustments or at leas come up with some ideas how to also apply to zstd.

When lz4 compression is enabled and the H5Z_FLAG_OPTIONAL flag is set
compression is abborted and an error returned when the size of output
stream exceeds the size of the input stream in bytes. This is the
behaviour for compression filters recommended by the libhdf5
documentation for the HZ5_FLAG_OPTIONAL.

Added compiletime macros allowing fast abort in single threaded mode and
when compiled with omp support. In the latter case it is ensured that
no futher chunk is shuffled and compressed after abort has been
signalled by any of the threads. The actual abort occurs as soon as all
workers have safely cleared their queue.

Missing calls to  ioc_destroy in case of error by bshuf_blocked_wrap_fun
========================================================================
ioc_destroy function was only called on return when bshuf_bloced_wrap_fun
finished successfully. In any error case just the error coded was
returned but the ioc_pipipine was not released. In single threaded mode
this does not have any effect as ioc_destroy is a noop which is very
likely removed during call site optimization step. But when compiling
with omp enabled than it is responsible that all synchonization barriers
and locks are properly released again. Which was not the case when
bshuf_blocked_wrap_fun aborted returning an error code.

Added Compile time macros for tearing down ioc before returning error
code.
Bitte geben Sie eine Commit-Beschreibung für Ihre Änderungen ein. Zeilen,
die mit '#' beginnen, werden ignoriert, und eine leere Beschreibung
bricht den Commit ab.
not able to run tests cause of some errors need help figuring and fix
@hernot
Copy link
Author

hernot commented Mar 22, 2022

Ok rebase done but have some issues with running tests
test_h5plugin.py compains about filter 32008 not defined/registered
test_h5filter.py complains about missing zstd libraries
and test_regression.py complains about missing lib/plugin

running python 3.8.10 and h5py 3.6.0

compiled with

python3 setup.py build_ext --inplace

and set exprort PYTHONPATH=./

In other words i need some hint what could be missing or wrong. git difftool -t vimdiff master is not really helpful for figuring.

@hernot
Copy link
Author

hernot commented Mar 23, 2022

Ok figured at least for building and text_ext.py and test_h5filter.py

libzstd latest version seems to include some assembly language files but these have postfix .S instead of .s expected by setuptools build_ext_. so to make the whole compile i had to disable assembly support by adding a define in setup.py. Further a small code-mixup introduced during rebasing and fixing conflicts prevented tests from running . The remaining tests you have to check if proper.

Should i squash my commits into one single commit or keep them?

Question

Just see that pullrequest #78 is still open since 3 years. What is the reason it was not considered? If it is just cause sumiter had no time to resolve conflicts? In this case I could offer to reflect it in this pull-request. As a side-mark: Windows can be a bitch with its tons of versions of c runtime environment and their neither compatible nor linked memory allocators.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants