Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Window operator with NaN as the frame bound #11293

Closed
wants to merge 1 commit into from

Conversation

kagamiori
Copy link
Contributor

Summary:
NaN could appear as the frame bound of k-range frames in window
operations. When NaN appear in either of the frame bounds, the window
operation should return NULL at this row.

This diff fixes #11213.

Differential Revision: D64510519

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 17, 2024
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

Copy link

netlify bot commented Oct 17, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 9f1b807
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/673656af8716d400088477a0

kagamiori added a commit to kagamiori/velox that referenced this pull request Oct 17, 2024
)

Summary:

NaN could appear as the frame bound of k-range frames in window 
operations. When NaN appear in either of the frame bounds, the window 
operation should return NULL at this row.

This diff fixes facebookincubator#11213.

Differential Revision: D64510519
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

kagamiori added a commit to kagamiori/velox that referenced this pull request Oct 17, 2024
)

Summary:

NaN could appear as the frame bound of k-range frames in window 
operations. When NaN appear in either of the frame bounds, the window 
operation should return NULL at this row.

This diff fixes facebookincubator#11213.

Differential Revision: D64510519
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

kagamiori added a commit to kagamiori/velox that referenced this pull request Oct 17, 2024
)

Summary:

NaN could appear as the frame bound of k-range frames in window 
operations. When NaN appear in either of the frame bounds, the window 
operation should return NULL at this row.

This diff fixes facebookincubator#11213.

Differential Revision: D64510519
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

@aditi-pandit
Copy link
Collaborator

@kagamiori : Quick question : This logic should be applicable for k row frames as well. If yes, please modify https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L347 as well.

@kagamiori
Copy link
Contributor Author

@kagamiori : Quick question : This logic should be applicable for k row frames as well. If yes, please modify https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L347 as well.

Hi @aditi-pandit, NaN shouldn't be an issue to k rows frames because rows frames require the frame bounds to be integral while NaN is floating-point. Let me know if I misunderstand your question.

@aditi-pandit
Copy link
Collaborator

@kagamiori : Quick question : This logic should be applicable for k row frames as well. If yes, please modify https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L347 as well.

Hi @aditi-pandit, NaN shouldn't be an issue to k rows frames because rows frames require the frame bounds to be integral while NaN is floating-point. Let me know if I misunderstand your question.

@kagamiori : Makes sense about k row frame.

Though, we should add a validation in WindowNode for rejecting NaN value for k range constant frame as well https://github.com/facebookincubator/velox/blob/main/velox/core/PlanNode.cpp#L1412

@@ -383,6 +384,20 @@ void WindowPartition::updateKRangeFrameBounds(
orderByRowColumn.nullByte(),
orderByRowColumn.nullMask()));

// Mark the frame invalid if the frame bound is NaN.
auto frameType = data_->columnTypes()[mappedFrameColumn];
if (frameType->isReal()) {
Copy link
Collaborator

@aditi-pandit aditi-pandit Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could avoid this check on a per row basis. They can be computed outside the loop or the outer function could pass to the inner function if it should do a float or double NaN check. You can even go as far as templatizing this function with boolean parameters for doing floating/double NaN check, and then you can avoid the impact of the conditionals at runtime.

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kagamiori thanks for the fix % minors!

typename T,
std::enable_if_t<std::is_floating_point_v<T>, int32_t> = 0>
bool isNanAt(const char* row, int32_t columnIndex) {
auto column = columnAt(columnIndex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const auto

@@ -158,14 +158,18 @@ class WindowPartition {
/// @param numRows number of rows to compute buffer for.
/// @param rawPeerStarts buffer of peer row values for each row. If the frame
/// column is null, then its peer row value is the frame boundary.
/// @param rawFrameBounds SelectivityVector to keep track of valid frames.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the comment correct? validFrames?

@@ -383,6 +384,20 @@ void WindowPartition::updateKRangeFrameBounds(
orderByRowColumn.nullByte(),
orderByRowColumn.nullMask()));

// Mark the frame invalid if the frame bound is NaN.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we directly call this method and return false for non floating data type?

@aditi-pandit
Copy link
Collaborator

@kagamiori : Quick question : This logic should be applicable for k row frames as well. If yes, please modify https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L347 as well.

Hi @aditi-pandit, NaN shouldn't be an issue to k rows frames because rows frames require the frame bounds to be integral while NaN is floating-point. Let me know if I misunderstand your question.

@kagamiori : Makes sense about k row frame.

Though, we should add a validation in WindowNode for rejecting NaN value for k range constant frame as well https://github.com/facebookincubator/velox/blob/main/velox/core/PlanNode.cpp#L1412

@kagamiori : I realized later that k range frames (even if constant) always have range frame column. So we don't get the constant k value in WindowNode PlanNode frame. So it won't be possible for us to reject this case in Velox. We should error it in the co-ordinator. The best at runtime is to return NULL values for the entire result.

Ideally we shouldn't have such an inconsistent implementation. But we will need to do more design work if we want to do this validation.

@kagamiori
Copy link
Contributor Author

kagamiori commented Oct 25, 2024

@kagamiori : Quick question : This logic should be applicable for k row frames as well. If yes, please modify https://github.com/facebookincubator/velox/blob/main/velox/exec/Window.cpp#L347 as well.

Hi @aditi-pandit, NaN shouldn't be an issue to k rows frames because rows frames require the frame bounds to be integral while NaN is floating-point. Let me know if I misunderstand your question.

@kagamiori : Makes sense about k row frame.
Though, we should add a validation in WindowNode for rejecting NaN value for k range constant frame as well https://github.com/facebookincubator/velox/blob/main/velox/core/PlanNode.cpp#L1412

@kagamiori : I realized later that k range frames (even if constant) always have range frame column. So we don't get the constant k value in WindowNode PlanNode frame. So it won't be possible for us to reject this case in Velox. We should error it in the co-ordinator. The best at runtime is to return NULL values for the entire result.

Ideally we shouldn't have such an inconsistent implementation. But we will need to do more design work if we want to do this validation.

Hi @aditi-pandit, I see. Just to ensure we're on the same page, so for k range frames with either constant or column bounds, Presto coordinator is going to translate the frame bound into the form acceptable by Velox, so we'll let coordinator error out when the frame bound contains NaN. Is this right?

@aditi-pandit
Copy link
Collaborator

@kagamiori : Yes. Let me elaborate to reduce confusion. We have 2 error situations with range frames and Nan :

  • Constant bound k range frame : (window_func) OVER PARTITION BY ... ORDER BY ... RANGE BETWEEN NaN PRECEDING AND 5 FOLLOWING
    For such frames Presto co-ordinator computes the range frame column, which contains the value that we have to search for in ORDER BY column for the frame bound. This range frame column is computed as (order_by_column_value - NaN) at each row for the above frame.
    Generally we want to reject this entire frame bound itself as the range frame column value for each row is not meaningful.
    Velox never sees the original frame string. Velox API only has a spec for the range frame column, which Presto has populated with NaN at each row.

    So we will have to rely on Presto rejecting this frame.

  • Column bound k range frames : (window_func) OVER PARTITION BY ... ORDER BY ... RANGE BETWEEN column PRECEDING AND 5 FOLLOWING
    The range frame column is computed for this frame also. The range frame value at any row is order by column value (-/+) range frame column value depending on the semantics of the frame.
    In this situation we can expect that some values could be NaN. This PR is modifying the runtime to always return NULL for such rows. Which is reasonable behavior.

Its possible that all column values in the range frame column are NaN and so all window values returned at NULL as well, but that is expected behavior.

To me that is a point of inconsistency between the RANGE frame behavior between the constant and column bound as something that looks similar has different results. We should try to avoid it in general.

But the only option to fix it would be to pass the original frame value to Velox. Definitely something we can do with some design work.

@kagamiori
Copy link
Contributor Author

kagamiori commented Oct 28, 2024

Hi @aditi-pandit, thank you for the reply! Could you help me understand this?

Its possible that all column values in the range frame column are NaN and so all window values returned at NULL as well, but that is expected behavior.

I thought that for (window_func) OVER PARTITION BY ... ORDER BY ... RANGE BETWEEN column PRECEDING AND 5 FOLLOWING, if there are NaNs in column, Presto should reject it and throw an error too before sending the query plan to Velox. (My understanding is that both constant-bound and column-bound k range frame are processed by Presto co-ordinator first and NaN frame bounds can be rejected there, so Velox simply would not see NaNs in k range frame bounds when used in Prestissimo.) Am I missing anything?

==> Update: I've just realized...is it that Presto co-ordinator wouldn't process column bound of k range frames, but would add project node before the window node in the query plan to do that process in workers?

@aditi-pandit
Copy link
Collaborator

Hi @aditi-pandit, thank you for the reply! Could you help me understand this?

Its possible that all column values in the range frame column are NaN and so all window values returned at NULL as well, but that is expected behavior.

I thought that for (window_func) OVER PARTITION BY ... ORDER BY ... RANGE BETWEEN column PRECEDING AND 5 FOLLOWING, if there are NaNs in column, Presto should reject it and throw an error too before sending the query plan to Velox. (My understanding is that both constant-bound and column-bound k range frame are processed by Presto co-ordinator first and NaN frame bounds can be rejected there, so Velox simply would not see NaNs in k range frame bounds when used in Prestissimo.) Am I missing anything?

==> Update: I've just realized...is it that Presto co-ordinator wouldn't process column bound of k range frames, but would add project node before the window node in the query plan to do that process in workers?

@kagamiori :: Yes. Presto co-ordinator adds project node. It doesn't look at the column values.

kagamiori added a commit to kagamiori/velox that referenced this pull request Oct 29, 2024
)

Summary:

NaN could appear as the frame bound of k-range frames in window 
operations. When NaN appear in either of the frame bounds, the window 
operation should return NULL at this row.

This diff fixes facebookincubator#11213.

Differential Revision: D64510519
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

@@ -368,6 +369,23 @@ void WindowPartition::updateKRangeFrameBounds(
// reordered as per inputMapping_.
RowColumn frameRowColumn = columns_[frameColumn];
RowColumn orderByRowColumn = data_->columnAt(orderByColumn);

// Mark the frame invalid if the frame bound is NaN.
Copy link
Collaborator

@aditi-pandit aditi-pandit Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kagamiori : We should avoid 2 loops over all the rows. That is expensive.

The idea was to determine which NaN check (float or double or none) once. We could abstract a function for the loop over the rows and pass the Nan check as a template parameter to it. Using the compile time parameter we can pick up the right NaN check within the original loop.
Something like (this is just for pseudocode, not perfect code)

for (auto i = 0; i < numRows; i++) {
    auto currentRow = startRow + i;
    auto* partitionRow = partition_[currentRow];
    // The user is expected to set the frame column equal to NULL when the
    // ORDER BY value is NULL and not in any other case. Validate this
    // assumption.
    VELOX_DCHECK_EQ(
        RowContainer::isNullAt(
            partitionRow, frameRowColumn.nullByte(), frameRowColumn.nullMask()),
        RowContainer::isNullAt(
            partitionRow,
            orderByRowColumn.nullByte(),
            orderByRowColumn.nullMask()));

    if constexpr (floatNan) {
      if (data_->isNanAt<float>(partition_[i], mappedFrameColumn)) {
            validFrames.setValid(i, false);
            continue;
      }
   } else if constexpr (doubleNan) {
      if (data_->isNanAt<double>(partition_[i], mappedFrameColumn)) {
            validFrames.setValid(i, false);
            continue;
       }
    }
    // If the frame is NULL or 0 preceding or 0 following then the current row
    // has same values for order by and frame column. In that case
    // the bound matches the peer row for this row.
    if (data_->compare(
            partitionRow,
            partitionRow,
            orderByColumn,
            mappedFrameColumn,
            flags) == 0) {
       rawFrameBounds[i] = rawPeerBounds[i];
    } else {
         // If the search is for a preceding bound then rows between
         // [0, currentRow] are examined. For following bounds, rows between
         // [currentRow, numRows()) are checked.
        ....
    }

kagamiori added a commit to kagamiori/velox that referenced this pull request Nov 12, 2024
)

Summary:

NaN could appear as the frame bound of k-range frames in window 
operations. When NaN appear in either of the frame bounds, the window 
operation should return NULL at this row.

This diff fixes facebookincubator#11213.

Differential Revision: D64510519
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update!

rawPeerBuffer,
rawFrameBounds,
validFrames);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use VELOX_DYNAMIC_SCALAR_TYPE_DISPATCH_ALL here instead of passing void?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @xiaoxmeng, I intentionally attempted to avoid using VELOX_DYNAMIC_SCALAR_TYPE_DISPATCH_ALL. This will cause the code instantiation for every type, but we don't really differentiate the code for non-floating-point types.

rawPeerBuffer,
rawFrameBounds,
validFrames);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use VELOX_DYNAMIC_SCALAR_TYPE_DISPATCH_ALL here instead of passing void?

rawPeerBuffer,
rawFrameBounds,
validFrames);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use VELOX_DYNAMIC_SCALAR_TYPE_DISPATCH_ALL here instead of passing void for non-float or double type?

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kagamiori thanks for the offline explanation!

for (auto i = 0; i < numRows; i++) {
auto currentRow = startRow + i;
auto* partitionRow = partition_[currentRow];

// Mark the frame invalid if the frame bound is NaN.
if constexpr (std::is_floating_point_v<T>) {
if (data_->isNanAt<T>(partitionRow, mappedFrameRowColumn) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment to describe this edge case as chat offline? thanks!

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider to add a unit test to cover the new row container API as well as have a test to cover both frame bound and order by columns are nan case? thanks!

kagamiori added a commit to kagamiori/velox that referenced this pull request Nov 12, 2024
)

Summary:

NaN could appear as the frame bound of k-range frames in window 
operations. When NaN appear in either of the frame bounds, the window 
operation should return NULL at this row.

This diff fixes facebookincubator#11213.

Reviewed By: xiaoxmeng

Differential Revision: D64510519
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

kagamiori added a commit to kagamiori/velox that referenced this pull request Nov 12, 2024
)

Summary:

NaN could appear as the frame bound of k-range frames in window 
operations. When NaN appear in either of the frame bounds, the window 
operation should return NULL at this row.

This diff fixes facebookincubator#11213.

Reviewed By: xiaoxmeng

Differential Revision: D64510519
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

@kagamiori
Copy link
Contributor Author

Hi @aditi-pandit, I found that if the order-by value is NaN, it would cause the frame bounds to be NaN too, even if the original frame offsets are not NaN. E.g.,

SELECT SUM(c0) OVER (ORDER BY s0 RANGE BETWEEN 10.0 PRECEDING AND 10.0 FOLLOWING)
FROM (VALUES (1, NAN()), (2, NAN())) t(c0, s0) --- two rows 3, 3

In this case, Presto makes the frame to always contain only the peer rows of the current row, even if the frame is logically empty, e.g., range between 10.0 preceding and 20.0 preceding. However, the Velox window operator cannot tell whether NaN in frame bounds comes from the frame offset or the order-by value and will return NULL after this fix.

Would it make sense to disallow NaN in order-by column for window operations as well (given that logically empty frames are not handled correctly)? If not, the window operator will need to know the original frame offsets specified in the SQL queries.

@aditi-pandit
Copy link
Collaborator

SELECT SUM(c0) OVER (ORDER BY s0 RANGE BETWEEN 10.0 PRECEDING AND 10.0 FOLLOWING)
FROM (VALUES (1, NAN()), (2, NAN())) t(c0, s0)

Hi Wei,

I'm a bit confused by your question. You are saying that generally "range between 10.0 preceding and 20.0 preceding" is an empty frame, but Presto return results like (1, 3).

I see that other systems like DuckDB are also returning similar results

duckdb> SELECT SUM(c0) OVER (ORDER BY s0 RANGE BETWEEN 10.0 PRECEDING AND 10.0 FOLLOWING)
   ...> FROM (VALUES (1, CAST('NAN' AS FLOAT)), (2, CAST('NAN' AS FLOAT))) t(c0, s0) ;
┌────────────────────────────────────────────────────────────────────────────┐
│ sum(c0) OVER (ORDER BY s0 RANGE BETWEEN 10.0 PRECEDING AND 10.0 FOLLOWING) │
╞════════════════════════════════════════════════════════════════════════════╡
│                                                                          3 │
│                                                                          3 │
└────────────────────────────────────────────────────────────────────────────┘

duckdb> SELECT SUM(c0) OVER (ORDER BY s0 RANGE BETWEEN 10.0 PRECEDING AND 20.0 PRECEDING)
┌────────────────────────────────────────────────────────────────────────────┐0, s0) ;
│ sum(c0) OVER (ORDER BY s0 RANGE BETWEEN 10.0 PRECEDING AND 20.0 PRECEDING) │
╞════════════════════════════════════════════════════════════════════════════╡
│                                                                          1 │
│                                                                          3 │
└────────────────────────────────────────────────────────────────────────────┘

duckdb> SELECT SUM(c0) OVER (ORDER BY s0 RANGE BETWEEN 20.0 PRECEDING AND 10.0 PRECEDING)
┌────────────────────────────────────────────────────────────────────────────┐0, s0) ;
│ sum(c0) OVER (ORDER BY s0 RANGE BETWEEN 20.0 PRECEDING AND 10.0 PRECEDING) │
╞════════════════════════════════════════════════════════════════════════════╡
│                                                                          1 │
│                                                                          3 │
└────────────────────────────────────────────────────────────────────────────┘

So likely they are expected results.

Now coming to Velox:

Your fix is also doing:

if constexpr (std::is_floating_point_v<T>) {
      if (data_->isNanAt<T>(partitionRow, mappedFrameRowColumn) &&
          !data_->isNanAt<T>(partitionRow, orderByRowColumn)) {
        validFrames.setValid(currentRow, false);
        continue;
      }
    }

So it catches when the frame is NaN but the orderBy is not NaN then mark the frame as empty.. else it uses searchFrameValue NaN in orderBy which should always align to the ends of the partition. Does that not happen right now ? Does it give a value beyond the partition start or end causing the empty frames check to catch those bounds ?

@kagamiori
Copy link
Contributor Author

Hi @aditi-pandit, yeah, let me clarify. I added the condition && !data_->isNanAt<T>(partitionRow, orderByRowColumn) after you approved to handle the situation of order-by column containing NaN specifically. But the condition if (data_->isNanAt<T>(partitionRow, mappedFrameRowColumn) && !data_->isNanAt<T>(partitionRow, orderByRowColumn)) cannot differentiate whether the NaN in the frame column comes from the order-by column or from the original frame offset in the query. E.g., a query can have NaN both at the order-by column and in frame offsets. In this case, I think the frame is still invalid and we should reject the query. But this is not possible in the current Window operator because the original frame offsets are lost.

Thinking about it twice, I think we can return the same result as Presto for now. And when we reject the queries with NaN frame offsets, we can handle this situation together.

@aditi-pandit
Copy link
Collaborator

Hi @aditi-pandit, yeah, let me clarify. I added the condition && !data_->isNanAt<T>(partitionRow, orderByRowColumn) after you approved to handle the situation of order-by column containing NaN specifically. But the condition if (data_->isNanAt<T>(partitionRow, mappedFrameRowColumn) && !data_->isNanAt<T>(partitionRow, orderByRowColumn)) cannot differentiate whether the NaN in the frame column comes from the order-by column or from the original frame offset in the query. E.g., a query can have NaN both at the order-by column and in frame offsets. In this case, I think the frame is still invalid and we should reject the query. But this is not possible in the current Window operator because the original frame offsets are lost.

Thinking about it twice, I think we can return the same result as Presto for now. And when we reject the queries with NaN frame offsets, we can handle this situation together.

@kagamiori : In addition to this PR, we also need a Presto side change to reject frames with NaN as a frame bound. This Velox PR is only for cases where NaN appears in the frame column at certain individual rows. But yeah, we cannot distinguish if NaN in frame is coming from ORDER BY or the NaN frame bound.

See previous discussion #11293 (comment)

@aditi-pandit
Copy link
Collaborator

Hi @aditi-pandit, yeah, let me clarify. I added the condition && !data_->isNanAt<T>(partitionRow, orderByRowColumn) after you approved to handle the situation of order-by column containing NaN specifically. But the condition if (data_->isNanAt<T>(partitionRow, mappedFrameRowColumn) && !data_->isNanAt<T>(partitionRow, orderByRowColumn)) cannot differentiate whether the NaN in the frame column comes from the order-by column or from the original frame offset in the query. E.g., a query can have NaN both at the order-by column and in frame offsets. In this case, I think the frame is still invalid and we should reject the query. But this is not possible in the current Window operator because the original frame offsets are lost.

Thinking about it twice, I think we can return the same result as Presto for now. And when we reject the queries with NaN frame offsets, we can handle this situation together.

@kagamiori : I see. I think the new condition works well.

As far as being able to distinguish if NaN in frame column is from order by or frame column, Velox is not setup that way. We will have to explicitly reject NaN frames in Presto with a separate fix.

The inconsistency leads to issues like #11293 (comment)

@aditi-pandit
Copy link
Collaborator

aditi-pandit commented Nov 14, 2024

@kagamiori : Needn't be in this same PR (though better for it), but we should add documentation for this NaN behavior in the section on k range frames in the window function developer documentation at https://facebookincubator.github.io/velox/develop/window.html#k-range-frames

)

Summary:

NaN could appear as the frame bound of k-range frames in window 
operations. When NaN appear in either of the frame bounds, the window 
operation should return NULL at this row.

This diff fixes facebookincubator#11213.

Reviewed By: xiaoxmeng

Differential Revision: D64510519
kagamiori added a commit to kagamiori/velox that referenced this pull request Nov 14, 2024
)

Summary:

NaN could appear as the frame bound of k-range frames in window 
operations. When NaN appear in either of the frame bounds, the window 
operation should return NULL at this row.

This diff fixes facebookincubator#11213.

Reviewed By: xiaoxmeng

Differential Revision: D64510519
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

1 similar comment
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D64510519

@kagamiori
Copy link
Contributor Author

kagamiori commented Nov 14, 2024

@kagamiori : Needn't be in this same PR (though better for it), but we should add documentation for this NaN behavior in the section on k range frames in the window function developer documentation at https://facebookincubator.github.io/velox/develop/window.html#k-range-frames

@aditi-pandit, Thanks! I'll update the documentation in a separate PR.

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in cce529e.

Copy link

Conbench analyzed the 1 benchmark run on commit cce529e0.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Window fuzzer test is flaky in nightly runs
4 participants