Undefined behaviour crash on NaN/inf/-inf input #242
Comments
In fact in reversible mode at least, with these flags I do find it round-trips NaNs correctly. YMMV and obviously wouldn't work in lossy mode.
It is of course undesirable for zfp to crash if fed NaNs or infinities, which I suppose happens in the conversion from floating point to integer in the block-floating-point transform, which causes UB. On the other hand, what would be an acceptable solution here? zfp cannot do anything meaningful with such data (except in reversible mode, as you correctly point out), and so could only at best return an error without actually representing the data. At this point, the application basically has to terminate as this results in complete loss of data. Of course, doing so without crashing is preferable. However, to detect non-numbers, one would have to insert potentially expensive

When zfp is used for offline storage (as in xarray and zarr), a reasonable solution would be for the application code to perform such checks before calling zfp. I realize that such libraries may have to be modified deep down to perform such checks, but they still need to deal with the fact that zfp would fail (ideally without crashing) in this case. In the case of zarr, there's a layer around zfp (in numcodecs) where, IMO, such testing (e.g., via

I'm unaware of xarray usage of zfp and so can't comment on that particular case.

All this being said, we have thought of several ways of supporting (encoding) non-numbers, primarily for offline storage, where performance is not as critical. Likely the solution would be to set a compile-time flag or a runtime flag (in

Sorry for not being more helpful here, but I think having the application or I/O layer check the data for validity is a reasonable compromise. Another might be to add a compile-time macro (disabled by default) to zfp to request that it check inputs and return zero upon failure. Keep in mind that this would result in
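For illustration, the application-side check suggested above might look roughly like the following. This is only a sketch using the Python standard library; `require_finite` is a hypothetical helper name, not part of zfp, zfpy, or numcodecs, and for NumPy arrays a vectorized test such as `np.isfinite(a).all()` would be the more natural equivalent.

```python
import math


def require_finite(values):
    """Return the values as a list, or raise ValueError on NaN/inf/-inf.

    Hypothetical helper: meant to run in the application or I/O layer
    before the data is handed to the compressor.
    """
    checked = list(values)
    for index, value in enumerate(checked):
        if not math.isfinite(value):
            raise ValueError(f"non-finite value {value!r} at index {index}")
    return checked
```

A wrapper layer could call this immediately before compressing, so that bad input surfaces as an ordinary Python exception rather than reaching the float-to-integer conversion.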
Thanks again for the considered response. A hard crash isn't nice, but I think it would be reasonable to always raise a Python exception from zfpy on NaN/inf/-inf input. Even better would be an option to silently ignore NaN/inf/-inf, if you don't care what these values round-trip as and don't want the overhead of checking for and replacing them before calling zfp -- although not strictly essential. I'm not an expert on this but I believe on CPU you can trap some of these failure conditions at a fairly low level via signal handlers without having to check each value, and I think this is what numpy does with e.g. IMO numpy's approach is quite nice in giving the user a choice: do you want to silently ignore and accept some possibly-implementation-dependent default value, raise an exception, log a warning but proceed, ... You can see this play out by trying e.g.
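To make the "give the user a choice" idea concrete, here is a rough sketch of a caller-selected policy for non-finite input. It is not how numpy or zfpy behave internally: `handle_nonfinite`, its `mode` names, and the `fill` parameter are all hypothetical, and "ignore" here means replacing bad values with a fill value so the compressor never sees them, which is an assumption made for this example.

```python
import math
import warnings


def handle_nonfinite(values, mode="raise", fill=0.0):
    """Apply a caller-chosen policy to NaN/inf/-inf before compression.

    mode="raise":  raise ValueError if any value is non-finite
    mode="warn":   replace non-finite values with `fill` and emit a warning
    mode="ignore": replace non-finite values with `fill` silently
    """
    if mode not in ("raise", "warn", "ignore"):
        raise ValueError(f"unknown mode {mode!r}")
    cleaned = []
    bad_count = 0
    for value in values:
        if math.isfinite(value):
            cleaned.append(value)
        else:
            bad_count += 1
            cleaned.append(fill)
    if bad_count and mode == "raise":
        raise ValueError(f"{bad_count} non-finite value(s) in input")
    if bad_count and mode == "warn":
        warnings.warn(
            f"replaced {bad_count} non-finite value(s) with {fill!r}",
            RuntimeWarning,
        )
    return cleaned
```

The cost is one pass over the data per call, which is the overhead the comment above hopes to avoid; the sketch only shows the shape of the interface, not a cheap way to implement it.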
Thanks for the suggestion. From IEEE 754-2019 Section 5.8 (Details of conversions from floating-point to integer formats), it seems clear enough that an exception would be generated:
Though it's unclear if Additionally, the invalid operation exception would be raised only when the actual conversion takes place, which invokes UB. From C99 6.3.1.4:
Would it not be "too late" at that point to catch the exception, assuming one is thrown? Probably not in practice, but preferably UB would be avoided altogether.
Hey, yeah sorry, I didn't mean that np.seterr would catch this for you, just that it's an example of a Python library with functionality that does avoid crashing on similar errors in its own code. Looks like I was wrong and numpy is not trapping SIGFPE, but testing CPU flags. Again not an expert but I'm assuming these flags persist until cleared, so numpy can check only after the entire operation has completed? I can see this is a tricky one anyway, and not as urgent as the other bug, so no worries if there's not an easy fix.
Sorry another UB bug:
I'm aware that zfp doesn't currently support round-tripping non-finite values, but it would be good if it didn't crash hard on NaN/inf/-inf input, in particular since client libraries like xarray / zarr will often pad data with NaNs, sometimes at a fairly low level in the library e.g. to achieve uniform chunk sizes before writing to zarr.
As in #241, with clang I get SIGILL crashes, and with UBSAN enabled it's attributed to, e.g.:
Assuming we want to allow some default conversion to happen silently, this can be worked around via `-fno-strict-float-cast-overflow -fno-sanitize=float-cast-overflow`, but I'm reluctant to enable these too widely in case it masks some other bug.