IndexOutOfBoundsException when accessing partition where the column was deleted #3731

Open
wants to merge 3 commits into base: trunk

sunil9977
Contributor

  1. Added a null check for foundValue before processing it.
  2. Added a test case covering this scenario.

@bbotella
Contributor

bbotella commented Dec 9, 2024

Please add a Jira ticket to the patch.

@sunil9977
Contributor Author

@@ -2554,6 +2554,24 @@ public void filteringOnStaticColumnTest() throws Throwable
});
}

@Test
public void filteringOnDeletedStaticColumnValue() throws Throwable {

Contributor

Can we have a test for the non-static case as well? If we delete only v0 and not v1, the row will still exist but the column will have been deleted.

@@ -705,7 +705,7 @@ public boolean isSatisfiedBy(TableMetadata metadata, DecoratedKey partitionKey,
if (column.type.isCounter())
{
ByteBuffer foundValue = getValue(metadata, partitionKey, row);
- if (foundValue == null)
+ if (foundValue == null || foundValue.remaining() == 0)

Contributor

This patch is also missing the same fix in org.apache.cassandra.db.filter.RowFilter.MapElementExpression#isSatisfiedBy.

Rather than updating every code path, why not push this logic into getValue?

In getValue we can do:

default:
    Cell<?> cell = row.getCell(column);
    if (cell == null) return null;
    ByteBuffer bb = cell.buffer();
    return bb.hasRemaining() ? bb : null;

With that one change, all 4 impacted code paths see null, which they already handle.
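
The normalization suggested above can be sketched outside Cassandra with plain `java.nio.ByteBuffer`. The `GetValueSketch` class, its nested `ToyCell`, and the `valueOrNull` helper are hypothetical stand-ins for illustration, not the real `RowFilter` code:

```java
import java.nio.ByteBuffer;

final class GetValueSketch
{
    // Toy stand-in for a cell; NOT org.apache.cassandra.db.rows.Cell.
    static final class ToyCell
    {
        private final ByteBuffer value;

        ToyCell(ByteBuffer value) { this.value = value; }

        ByteBuffer buffer() { return value; }
    }

    // Hypothetical helper mirroring the suggestion: a missing cell and a cell
    // holding zero bytes are both reported to the caller as null.
    static ByteBuffer valueOrNull(ToyCell cell)
    {
        if (cell == null)
            return null;
        ByteBuffer bb = cell.buffer();
        return bb.hasRemaining() ? bb : null;
    }

    public static void main(String[] args)
    {
        ToyCell live = new ToyCell(ByteBuffer.wrap(new byte[]{ 0, 0, 0, 7 }));
        ToyCell deleted = new ToyCell(ByteBuffer.allocate(0));
        System.out.println(valueOrNull(live) != null);    // live value is kept
        System.out.println(valueOrNull(deleted) == null); // empty bytes become null
        System.out.println(valueOrNull(null) == null);    // missing cell stays null
    }
}
```

One trade-off visible in this sketch: the caller can no longer tell a deleted column apart from a value that is legitimately zero bytes long, since both come back as null.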

Contributor

I am also curious why we have empty bytes rather than null in this case... this implies that the pattern of

Cell<?> cell = row.getCell(column);
return cell == null ? null : cell.buffer();

is unsafe and has always been wrong... yet this pattern is common...

$ grep -r 'cell == null ? null : cell.buffer()' src/
src//java/org/apache/cassandra/db/marshal/MapType.java:            return cell == null ? null : cell.buffer();
src//java/org/apache/cassandra/db/marshal/ListType.java:            return cell == null ? null : cell.buffer();
src//java/org/apache/cassandra/db/marshal/UserType.java:            return cell == null ? null : cell.buffer();
src//java/org/apache/cassandra/db/filter/RowFilter.java:                    return cell == null ? null : cell.buffer();
src//java/org/apache/cassandra/index/internal/CassandraIndex.java:                               cell == null ? null : cell.buffer()
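
To illustrate why a null-only guard may be insufficient, here is a minimal sketch using only `java.nio.ByteBuffer`. The class and method names are hypothetical, and reading the first byte is just one plausible way an empty buffer could lead to an `IndexOutOfBoundsException`; the actual failing read in Cassandra is not shown in this thread:

```java
import java.nio.ByteBuffer;

final class EmptyBufferPitfall
{
    // Null-only guard, as in the grepped pattern: an empty buffer passes through.
    static Byte firstByteNullCheckOnly(ByteBuffer value)
    {
        if (value == null)
            return null;
        // Absolute read; throws IndexOutOfBoundsException on a zero-length buffer.
        return value.get(value.position());
    }

    // Guard that also treats a zero-length buffer as "no value".
    static Byte firstByteOrNull(ByteBuffer value)
    {
        if (value == null || !value.hasRemaining())
            return null;
        return value.get(value.position());
    }

    public static void main(String[] args)
    {
        ByteBuffer empty = ByteBuffer.allocate(0);
        System.out.println(firstByteOrNull(empty));
        try
        {
            firstByteNullCheckOnly(empty);
        }
        catch (IndexOutOfBoundsException e)
        {
            System.out.println("null-only guard let the empty buffer through");
        }
    }
}
```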

Contributor

Looking at the callers of org.apache.cassandra.db.rows.Cell#buffer, there are 62+ code paths that follow a similar pattern...

Contributor

Looked into it... when we delete a column we go through org.apache.cassandra.cql3.terms.Constants.Deleter. This is defined as

builder.addCell(BufferCell.tombstone(column, timestamp, nowInSec, path));

which is

public static BufferCell tombstone(ColumnMetadata column, long timestamp, long nowInSec, CellPath path)
{
    return new BufferCell(column, timestamp, NO_TTL, nowInSec, ByteBufferUtil.EMPTY_BYTE_BUFFER, path);
}

So deleting a column is defined as writing empty bytes... that at least explains the empty bytes.
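
As a rough illustration of that finding, the toy model below writes a tombstone as a cell with an empty value plus a deletion time. It is not Cassandra's `BufferCell`: the `ToyCell` type, its fields, and the `NO_DELETION_TIME` sentinel are assumptions made for this sketch. It shows that the value bytes alone cannot distinguish a tombstone from a live zero-length value, while deletion metadata can:

```java
import java.nio.ByteBuffer;

final class TombstoneSketch
{
    // Hypothetical sentinel meaning "this cell has not been deleted".
    static final long NO_DELETION_TIME = Long.MAX_VALUE;
    static final ByteBuffer EMPTY = ByteBuffer.allocate(0);

    // Toy cell; field names are assumptions, not Cassandra's API.
    static final class ToyCell
    {
        final long timestamp;
        final long localDeletionTime;
        final ByteBuffer value;

        ToyCell(long timestamp, long localDeletionTime, ByteBuffer value)
        {
            this.timestamp = timestamp;
            this.localDeletionTime = localDeletionTime;
            this.value = value;
        }

        static ToyCell live(long timestamp, ByteBuffer value)
        {
            return new ToyCell(timestamp, NO_DELETION_TIME, value);
        }

        // A delete is recorded as a cell carrying empty bytes and a deletion time.
        static ToyCell tombstone(long timestamp, long nowInSec)
        {
            return new ToyCell(timestamp, nowInSec, EMPTY);
        }

        boolean isTombstone()
        {
            return localDeletionTime != NO_DELETION_TIME;
        }
    }

    public static void main(String[] args)
    {
        ToyCell deleted = ToyCell.tombstone(10L, 1_700_000_000L);
        ToyCell liveEmpty = ToyCell.live(10L, ByteBuffer.allocate(0));
        // Both values are non-null and zero-length; only the metadata differs.
        System.out.println(deleted.value.remaining() == liveEmpty.value.remaining());
        System.out.println(deleted.isTombstone() + " vs " + liveEmpty.isTombstone());
    }
}
```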


3 participants