Bug fix: Ensure correct data retrieval for global cell order in dense reader, avoiding fill values #5413
@@ -1539,7 +1539,12 @@ tuple<bool, uint64_t, uint64_t> DenseReader::cell_slab_overlaps_range(
    const NDRange& ndrange,
    const std::vector<DimType>& coords,
    const uint64_t length) {
  const unsigned slab_dim = (layout_ == Layout::COL_MAJOR) ? 0 : dim_num - 1;
  const auto cell_order = array_schema_.cell_order();
Checking my understanding:
layout_ is the user's requested data order.
A "slab" is a run of cells within a tile which are contiguous in the user's requested data order, e.g. if the user requests global order then the whole tile is a single slab; if the user requests row-major order then each row within a tile is a slab if the cell order is row-major, etc.
slab_dim is meant to be the dimension along which the cells of the slab are contiguous. If the requested order is row major then the slabs are contiguous columns (i.e. dimension 1), if the cell order is column major then the slabs are contiguous rows (i.e. dimension 0).
In your example, the tile and cell orders are column major. The read query is global order, so the whole tile is a slab, so slab_dim should be chosen using the schema's cell order.
And that is what your fix does: slab_dim is unchanged if the user requests row or column major, but if they request global order then the cell order is used to determine slab_dim.
Without this, slab_dim is incorrect, which results in returning an incorrect intersection, leading to fill values being used for other cells.
Do I have this right?
@rroelke, my understanding is exactly the same.
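To make the rule discussed in this thread concrete, here is a minimal sketch of the selection logic. It is not the actual patch: the Layout enum below is a hypothetical stand-in that lists only the layouts named in this thread, and effective_layout and pick_slab_dim are illustrative names that do not appear in dense_reader.cc.

```cpp
// Hypothetical stand-in for the layout enum; only the values named in this thread.
enum class Layout { ROW_MAJOR, COL_MAJOR, GLOBAL_ORDER };

// For a global order read, fall back to the schema's cell order;
// otherwise keep the layout the user requested.
constexpr Layout effective_layout(Layout requested, Layout cell_order) {
  return requested == Layout::GLOBAL_ORDER ? cell_order : requested;
}

// Dimension along which the cells of a slab are contiguous:
// dimension 0 for column-major, the last dimension otherwise.
constexpr unsigned pick_slab_dim(
    Layout requested, Layout cell_order, unsigned dim_num) {
  return effective_layout(requested, cell_order) == Layout::COL_MAJOR ?
             0 :
             dim_num - 1;
}

// Row-major and column-major requests are unaffected by the cell order.
static_assert(
    pick_slab_dim(Layout::ROW_MAJOR, Layout::COL_MAJOR, 2) == 1,
    "row-major request keeps the last dimension");
static_assert(
    pick_slab_dim(Layout::COL_MAJOR, Layout::ROW_MAJOR, 2) == 0,
    "column-major request keeps dimension 0");

// A global order request follows the schema's cell order.
static_assert(
    pick_slab_dim(Layout::GLOBAL_ORDER, Layout::COL_MAJOR, 2) == 0,
    "global order over column-major cells uses dimension 0");
```

The compile-time checks cover the case from the example in this thread: a global order read over a column-major cell order should resolve to dimension 0, whereas a rule that only inspects the requested layout would pick the last dimension.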
Co-authored-by: Ypatia Tsavliri <[email protected]>
Thanks!
slab_dim is being calculated incorrectly in DenseReader::cell_slab_overlaps_range, causing global cell order reads to return some fill values. After this fix, slab_dim remains unchanged for row-major or column-major requests, but for global order requests, cell order is used to determine slab_dim.
Let's align the calculation with the approach already used in other cases:
TileDB/tiledb/sm/query/readers/dense_reader.cc
Lines 2319 to 2320 in 87c9860
The initial issue was also verified with TileDB-Py, and after applying this fix, the results are as expected.
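As a rough illustration of why a wrong slab_dim produces fill values, the sketch below computes the overlap between a cell slab and a query range in a simplified setting. It is an assumption-laden toy, not the code in dense_reader.cc: Range is a hypothetical inclusive [lo, hi] pair standing in for NDRange, coordinates are fixed to int64_t, and slab_dim is passed in explicitly so both choices can be compared.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical simplification: one inclusive [lo, hi] interval per dimension.
using Range = std::array<int64_t, 2>;

// Returns {overlaps, start, end}, where start and end are offsets
// within the slab (not array coordinates).
std::tuple<bool, uint64_t, uint64_t> slab_overlap(
    const std::vector<Range>& ranges,
    const std::vector<int64_t>& coords,
    uint64_t length,
    unsigned slab_dim) {
  assert(ranges.size() == coords.size());
  assert(slab_dim < coords.size());

  // The slab extends only along slab_dim, starting at coords.
  const int64_t slab_start = coords[slab_dim];
  const int64_t slab_end = slab_start + static_cast<int64_t>(length) - 1;

  // Every dimension must intersect the range; non-slab dimensions
  // contribute a single coordinate.
  for (unsigned d = 0; d < coords.size(); ++d) {
    const int64_t lo = (d == slab_dim) ? slab_start : coords[d];
    const int64_t hi = (d == slab_dim) ? slab_end : coords[d];
    if (hi < ranges[d][0] || lo > ranges[d][1]) {
      return std::make_tuple(false, uint64_t{0}, uint64_t{0});
    }
  }

  // Clamp the slab to the range and express the result as slab offsets.
  const auto start = static_cast<uint64_t>(
      std::max(slab_start, ranges[slab_dim][0]) - slab_start);
  const auto end = static_cast<uint64_t>(
      std::min(slab_end, ranges[slab_dim][1]) - slab_start);
  return std::make_tuple(true, start, end);
}
```

Take a slab of length 4 starting at coordinates (0, 2) in a column-major tile, and a query range of [1, 2] on dimension 0 and [2, 2] on dimension 1. With slab_dim = 0 the function reports an overlap covering slab offsets 1 to 2. With slab_dim = 1 the same slab is treated as running along dimension 1, the check on dimension 0 fails, and no overlap is reported, so in a reader built this way those cells would be left at their fill value.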
[sc-60301]
TYPE: NO_HISTORY | BUG
DESC: Fix incorrect slab_dim calculation in DenseReader::cell_slab_overlaps_range to prevent fill values in global cell order reads.