-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rest_vol_dataset: (fix) Write hyperslab (memory) #50
Conversation
jwsblokland
commented
Aug 23, 2023
- Improved the function RV_dataset_write() such it makes use of the function H5Dgather() to gather the data needed for writing a hyperslab (memory) to a file. This approach is similar to the one found in the function RV_dataset_read().
- Adapted the hyperslab and point selection test.
The CI failures are due to an edge case in random datatype testing trying to create a zero-length string, not an issue with this PR. Otherwise, LGTM. |
@@ -829,6 +829,13 @@ RV_dataset_write(size_t count, void *dset[], hid_t mem_type_id[], hid_t mem_spac | |||
FUNC_GOTO_ERROR(H5E_DATATYPE, H5E_BADVALUE, FAIL, "memory datatype is invalid"); | |||
|
|||
write_body_len = (size_t)file_select_npoints * dtype_size; | |||
if (NULL == (write_body = (char *)RV_malloc(write_body_len))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably worth only allocating this buffer and calling H5Dgather if the memory space wasn't set to H5S_ALL and has a non-contiguous selection in it. Otherwise, the H5Dgather call could potentially add overhead for no reason. Checking for H5S_ALL should be straightforward, but detecting a non-contiguous selection may take a little effort in the way of H5S calls.
I also noticed that the code below for point selections allocates the same write_body
buffer. If I'm not mistaken, it looks like writing with a point selection could cause the memory allocated here to be leaked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like point selection does cause a memory leak. From test_rest_vol:
Testing small write to dataset w/ point selection
==309263== Invalid read of size 2
==309263== at 0x48529E0: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==309263== by 0x497F43D: H5D__gather_mem (H5Dscatgath.c:417)
==309263== by 0x496BB55: H5Dgather (H5D.c:1756)
==309263== by 0x48F44E2: RV_dataset_write (rest_vol_dataset.c:835)
==309263== by 0x4B1E977: H5VL__dataset_write (H5VLcallback.c:2236)
==309263== by 0x4B24335: H5VL_dataset_write_direct (H5VLcallback.c:2280)
==309263== by 0x496767B: H5D__write_api_common (H5D.c:1331)
==309263== by 0x496A71F: H5Dwrite (H5D.c:1388)
==309263== by 0x132BFE: test_write_dataset_small_point_selection (test_rest_vol.c:7535)
==309263== by 0x10F869: main (test_rest_vol.c:18250)
==309263== Address 0x777fcf0 is 32 bytes before a block of size 48 in arena "client"
==309263==
==309263== Invalid read of size 1
==309263== at 0x48528F1: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==309263== by 0x497F43D: H5D__gather_mem (H5Dscatgath.c:417)
==309263== by 0x496BB55: H5Dgather (H5D.c:1756)
==309263== by 0x48F44E2: RV_dataset_write (rest_vol_dataset.c:835)
==309263== by 0x4B1E977: H5VL__dataset_write (H5VLcallback.c:2236)
==309263== by 0x4B24335: H5VL_dataset_write_direct (H5VLcallback.c:2280)
==309263== by 0x496767B: H5D__write_api_common (H5D.c:1331)
==309263== by 0x496A71F: H5Dwrite (H5D.c:1388)
==309263== by 0x132BFE: test_write_dataset_small_point_selection (test_rest_vol.c:7535)
==309263== by 0x10F869: main (test_rest_vol.c:18250)
==309263== Address 0x777feac is 28 bytes after a block of size 48 in arena "client"
==309263==
==309263== Use of uninitialised value of size 8
==309263== at 0x48C8218: RV_base64_encode (rest_vol.c:3100)
==309263== by 0x48F4A30: RV_dataset_write (rest_vol_dataset.c:900)
==309263== by 0x4B1E977: H5VL__dataset_write (H5VLcallback.c:2236)
==309263== by 0x4B24335: H5VL_dataset_write_direct (H5VLcallback.c:2280)
==309263== by 0x496767B: H5D__write_api_common (H5D.c:1331)
==309263== by 0x496A71F: H5Dwrite (H5D.c:1388)
==309263== by 0x132BFE: test_write_dataset_small_point_selection (test_rest_vol.c:7535)
==309263== by 0x10F869: main (test_rest_vol.c:18250)
==309263==
==309263== Use of uninitialised value of size 8
==309263== at 0x48C8298: RV_base64_encode (rest_vol.c:3105)
==309263== by 0x48F4A30: RV_dataset_write (rest_vol_dataset.c:900)
==309263== by 0x4B1E977: H5VL__dataset_write (H5VLcallback.c:2236)
==309263== by 0x4B24335: H5VL_dataset_write_direct (H5VLcallback.c:2280)
==309263== by 0x496767B: H5D__write_api_common (H5D.c:1331)
==309263== by 0x496A71F: H5Dwrite (H5D.c:1388)
==309263== by 0x132BFE: test_write_dataset_small_point_selection (test_rest_vol.c:7535)
==309263== by 0x10F869: main (test_rest_vol.c:18250)
==309263==
==309263== Use of uninitialised value of size 8
==309263== at 0x48C82DB: RV_base64_encode (rest_vol.c:3111)
==309263== by 0x48F4A30: RV_dataset_write (rest_vol_dataset.c:900)
==309263== by 0x4B1E977: H5VL__dataset_write (H5VLcallback.c:2236)
==309263== by 0x4B24335: H5VL_dataset_write_direct (H5VLcallback.c:2280)
==309263== by 0x496767B: H5D__write_api_common (H5D.c:1331)
==309263== by 0x496A71F: H5Dwrite (H5D.c:1388)
==309263== by 0x132BFE: test_write_dataset_small_point_selection (test_rest_vol.c:7535)
==309263== by 0x10F869: main (test_rest_vol.c:18250)
==309263==
==309263== Use of uninitialised value of size 8
==309263== at 0x48C8204: RV_base64_encode (rest_vol.c:3099)
==309263== by 0x48F4A30: RV_dataset_write (rest_vol_dataset.c:900)
==309263== by 0x4B1E977: H5VL__dataset_write (H5VLcallback.c:2236)
==309263== by 0x4B24335: H5VL_dataset_write_direct (H5VLcallback.c:2280)
==309263== by 0x496767B: H5D__write_api_common (H5D.c:1331)
==309263== by 0x496A71F: H5Dwrite (H5D.c:1388)
==309263== by 0x132BFE: test_write_dataset_small_point_selection (test_rest_vol.c:7535)
==309263== by 0x10F869: main (test_rest_vol.c:18250)
==309263==
And the vol-tests:
==309052== 40 bytes in 1 blocks are definitely lost in loss record 2 of 7
==309052== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==309052== by 0x541B4B2: ???
==309052== by 0x4B2EBA7: H5VL__dataset_write (H5VLcallback.c:2236)
==309052== by 0x4B34565: H5VL_dataset_write_direct (H5VLcallback.c:2280)
==309052== by 0x4906D3B: H5D__write_api_common (H5D.c:1331)
==309052== by 0x4909DDF: H5Dwrite (H5D.c:1388)
==309052== by 0x149AF1: test_write_dataset_small_point_selection (vol_dataset_test.c:5758)
==309052== by 0x161B98: vol_dataset_test (vol_dataset_test.c:11257)
==309052== by 0x10FC1E: vol_test_run (vol_test.c:118)
==309052== by 0x10FC1E: main (vol_test.c:279)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jhendersonHDF and @mattjala, thanks for the review and reminding me that I can use valgrind.
The mentioned memory leak of write_body
has been address and also the strange behavior of test_write_dataset_small_point_selection
test. For the test, it turns out I forgot the modify the H5Dwrite()
call by including a memory dataspace.
The only thing left to implement is checking for H5S_ALL and non-contiguous selection. Currently, if H5S_ALL == mem_space_id
and H5S_ALL != mem_space_id
the selection of the file_space_id
will be copied to mem_space_id
. Is this the intended behavior. I assume it but I ask to make sure. This is not what I would naively expect. With this behavior, I think, you implicitly assume that the memory dataspace is exactly the same as the file dataspace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you made a typo - did you mean to ask about the case where H5S_ALL == mem_space_id
and H5S_ALL != file_space_id
? If so, I believe that copying the selection from file space is the intended behavior of the library - see the second row of the mem_space_id/file_space_id table in the documentation for H5Dwrite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, you are right. Thanks for pointing me to the documentation of H5Dwrite().
69ccdeb
to
a88fd3a
Compare
- Improved the function RV_dataset_write() such it makes use of the function H5Dgather() to gather the data needed for writing a hyperslab (memory) to a file. This approach is similar to the one found in the function RV_dataset_read(). - Adapted the hyperslab and point selection test.
a88fd3a
to
823e090
Compare
Moved to #57 |