Skip to content

Refactor replace and sort #102

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 18, 2025
Merged

Refactor replace and sort #102

merged 4 commits into from
Aug 18, 2025

Conversation

RAMitchell
Copy link
Contributor

No description provided.

Signed-off-by: Rory Mitchell <[email protected]>
Signed-off-by: Rory Mitchell <[email protected]>
Signed-off-by: Rory Mitchell <[email protected]>
@RAMitchell RAMitchell requested review from Copilot and seberg August 15, 2025 13:52
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the replace and sort operations by splitting CUDA-specific GPU code into separate files and modernizing the test implementation. The changes migrate from CUDF-based tests to Arrow-based tests while maintaining backward compatibility for CUDA operations.

Key changes:

  • Separates GPU variants into dedicated .cu files for better code organization
  • Modernizes test infrastructure from CUDF to Arrow
  • Exposes task classes and helper functions in public headers

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
cpp/tests/test_replace.cpp Migrates from CUDF to Arrow-based test framework
cpp/src/sort.cu New file containing GPU-specific sort implementation
cpp/src/sort.cpp Refactored to remove GPU code and expose shared functions
cpp/src/replace.cu New file containing GPU-specific replace implementation
cpp/src/replace.cpp Refactored to remove GPU code and update type checking
cpp/include/legate_dataframe/sort.hpp Exposes task class and helper functions
cpp/include/legate_dataframe/replace.hpp Exposes task class definition
cpp/include/legate_dataframe/reduction.hpp Removes unused variant options

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

auto split_indices = cudf::lower_bound(sorted_table_with_rank.select(keys_idxx),
global_split_values.select(value_keysx),
column_order_with_rank,
null_precendence_with_rank,
Copy link
Preview

Copilot AI Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable name 'null_precendence_with_rank' contains a spelling error. It should be 'null_precedence_with_rank' (precedence, not precendence).

Suggested change
null_precendence_with_rank,
auto null_precedence_with_rank = null_precedence;
null_precedence_with_rank.push_back(cudf::null_order::AFTER);
auto split_indices = cudf::lower_bound(sorted_table_with_rank.select(keys_idxx),
global_split_values.select(value_keysx),
column_order_with_rank,
null_precedence_with_rank,

Copilot uses AI. Check for mistakes.

auto split_indices = cudf::lower_bound(sorted_table_with_rank.select(keys_idxx),
global_split_values.select(value_keysx),
column_order_with_rank,
null_precendence_with_rank,
Copy link
Preview

Copilot AI Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable name 'null_precendence_with_rank' contains a spelling error. It should be 'null_precedence_with_rank' (precedence, not precendence).

Suggested change
null_precendence_with_rank,
auto null_precedence_with_rank = null_precedence;
null_precedence_with_rank.push_back(cudf::null_order::AFTER);
auto split_indices = cudf::lower_bound(sorted_table_with_rank.select(keys_idxx),
global_split_values.select(value_keysx),
column_order_with_rank,
null_precedence_with_rank,

Copilot uses AI. Check for mistakes.

auto split_indices = cudf::lower_bound(sorted_table_with_rank.select(keys_idxx),
global_split_values.select(value_keysx),
column_order_with_rank,
null_precendence_with_rank,
Copy link
Preview

Copilot AI Aug 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable name 'null_precendence_with_rank' contains a spelling error. It should be 'null_precedence_with_rank' (precedence, not precendence).

Suggested change
null_precendence_with_rank,
auto null_precedence_with_rank = null_precedence;
null_precedence_with_rank.push_back(cudf::null_order::AFTER);
auto split_indices = cudf::lower_bound(sorted_table_with_rank.select(keys_idxx),
global_split_values.select(value_keysx),
column_order_with_rank,
null_precedence_with_rank,

Copilot uses AI. Check for mistakes.

Signed-off-by: Rory Mitchell <[email protected]>
Copy link
Contributor

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of code that needed to move here, but looks straight forward. Nice, thanks!

@RAMitchell RAMitchell merged commit 632ec4c into rapidsai:main Aug 18, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants