Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug fix: Ensure correct data retrieval for global cell order in dense reader, avoiding fill values #5413

Merged
merged 4 commits into from
Jan 8, 2025

Conversation

kounelisagis
Copy link
Member

@kounelisagis kounelisagis commented Dec 30, 2024

slab_dim is being calculated incorrectly in DenseReader::cell_slab_overlaps_range, causing global cell order reads to return some fill values. After this fix, slab_dim remains unchanged for row-major or column-major requests, but for global order requests, cell order is used to determine slab_dim.

Let's align the calculation with the approach already used in other cases:

if (layout_ == Layout::ROW_MAJOR ||
(layout_ == Layout::GLOBAL_ORDER && cell_order == Layout::ROW_MAJOR))

The initial issue was also verified with TileDB-Py, and after applying this fix, the results are as expected.

[sc-60301]


TYPE: NO_HISTORY | BUG
DESC: Fix incorrect slab_dim calculation in DenseReader::cell_slab_overlaps_range to prevent fill values in global cell order reads.

@kounelisagis kounelisagis force-pushed the agis/fix-global-cell-order-fill-values branch from 2b2ec34 to 7ff9722 Compare January 3, 2025 13:59
@kounelisagis kounelisagis marked this pull request as ready for review January 3, 2025 14:18
test/regression/targets/sc-60301.cc Outdated Show resolved Hide resolved
@@ -1539,7 +1539,12 @@ tuple<bool, uint64_t, uint64_t> DenseReader::cell_slab_overlaps_range(
const NDRange& ndrange,
const std::vector<DimType>& coords,
const uint64_t length) {
const unsigned slab_dim = (layout_ == Layout::COL_MAJOR) ? 0 : dim_num - 1;
const auto cell_order = array_schema_.cell_order();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking my understanding:

layout_ is the user's requested data order.

A "slab" is a run of cells within a tile which are contiguous in the users' requested data order. e.g. if the user requests global order then the whole tile is a single slab; if the user requests row-major order then each row within a tile is a slab if the cell order is row-major, etc.

slab_dim is meant to be the dimension which the cells of the slab are contiguous. If the requested order is row major then the slabs are contiguous columns (i.e. dimension 1), if the cell order is column major then the slabs are contiguous rows (i.e. dimension 0).

In your example, the tile and cell orders are column major. The read query is global order, so the whole tile is a slab, so the slab_dim should be chosen using the schema's cell order.

And that is what your fix does: slab_dim is unchanged if the user requests row or column major, but if they request global order then the cell order is used to determine slab_dim.

Without this, slab_dim is incorrect which results in returning an incorrect intersection, leading to fill values used for other cells.

Do I have this right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rroelke , my understanding is exactly the same.

test/regression/targets/sc-60301.cc Outdated Show resolved Hide resolved
@kounelisagis kounelisagis force-pushed the agis/fix-global-cell-order-fill-values branch from 7ff9722 to b78ab26 Compare January 7, 2025 14:13
@kounelisagis kounelisagis requested review from rroelke and ypatia January 7, 2025 14:24
Copy link
Contributor

@rroelke rroelke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@kounelisagis kounelisagis merged commit e7021c4 into main Jan 8, 2025
64 checks passed
@kounelisagis kounelisagis deleted the agis/fix-global-cell-order-fill-values branch January 8, 2025 13:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants