[FEA] Report all unsupported operations for a query in cudf.polars #16960

Open
wants to merge 18 commits into base: branch-24.12

Contributor

@Matt711 Matt711 commented Oct 1, 2024

Description

Closes #16690. The purpose of this PR is to list all of the unique operations that are unsupported by cudf.polars when running a query.

  1. Question: How to traverse the tree to report the error nodes? Should this be done upstream in Polars?
  2. Instead of traversing the query afterwards, we should probably catch each unsupported feature as we translate the IR.
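The second point can be illustrated with a minimal sketch. It is not the cudf_polars implementation: `Node`, `ErrorNode`, `Translator`, `SUPPORTED`, and `unique_unsupported` are hypothetical names for a toy plan tree. The idea is that translation records each failure and substitutes a placeholder node instead of raising at the first unsupported operation, so the complete list can be reported once translation finishes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy query-plan node (hypothetical; not the Polars IR)."""

    op: str
    children: list[Node | ErrorNode] = field(default_factory=list)


@dataclass
class ErrorNode:
    """Placeholder standing in for a node that could not be translated."""

    message: str


# Assumption for illustration only: the operations the backend supports.
SUPPORTED = {"scan", "select", "filter"}


class Translator:
    def __init__(self) -> None:
        # Every failure is recorded here instead of aborting translation.
        self.errors: list[Exception] = []

    def translate(self, node: Node) -> Node | ErrorNode:
        # Translate children first so failures anywhere in the tree are seen.
        children = [self.translate(child) for child in node.children]
        if node.op not in SUPPORTED:
            err = NotImplementedError(f"Unsupported operation: {node.op}")
            self.errors.append(err)
            return ErrorNode(str(err))
        return Node(node.op, children)


def unique_unsupported(translator: Translator) -> list[str]:
    """Deduplicate error messages while preserving first-seen order."""
    return list(dict.fromkeys(str(e) for e in translator.errors))


plan = Node(
    "select",
    [Node("rolling", [Node("scan")]), Node("rolling"), Node("pivot")],
)
translator = Translator()
translator.translate(plan)
report = unique_unsupported(translator)
print(report)
```

Here `rolling` appears twice in the plan but only once in the report, which matches the goal of listing unique unsupported operations.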

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@Matt711 Matt711 added the feature request, 5 - DO NOT MERGE, and non-breaking labels Oct 1, 2024
@Matt711 Matt711 self-assigned this Oct 1, 2024
@github-actions github-actions bot added the Python and cudf.polars labels Oct 1, 2024
python/cudf_polars/cudf_polars/dsl/ir.py (outdated, resolved)
python/cudf_polars/cudf_polars/utils/other.py (outdated, resolved)
python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/callback.py (outdated, resolved)
python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
@Matt711 Matt711 force-pushed the fea/cudf-polars/report-all-unsupported-ops branch from 0310f26 to 3175a7e Compare October 9, 2024 03:46
@Matt711 Matt711 force-pushed the fea/cudf-polars/report-all-unsupported-ops branch from 3175a7e to 054b271 Compare October 9, 2024 15:28
Contributor

@wence- wence- left a comment

Thanks, this is looking really nice. Some smaller suggestions and a few small logic fixes.

python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/callback.py (outdated, resolved)
python/cudf_polars/cudf_polars/callback.py (outdated, resolved)
Contributor

@wence- wence- left a comment

Tiny suggestions, debug_mode is now gone.

python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
Contributor Author

@Matt711 Matt711 left a comment

I changed the logic of assert_ir_translation_raises to integrate with the changes made in this PR. I made the changes because now that we're returning ErrorNodes and ErrorExprs instead of raising exceptions during translation, assert_ir_translation_raises(q, NotImplementedError) fails in a lot of places.

To solve this, I checked that the exception(s) being asserted in q.collect(...) are inside Translation.errors, and treated any other exception raised during translation as a case where assert_ir_translation_raises fails. This required me to hard-code a few cases where translation could fail.

Are there other cases I missed where translation could fail? WDYT of the changes @wence-?
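A rough sketch of the check described above, assuming the collected exceptions are available as a plain list. `assert_translation_records` is a hypothetical helper written for illustration; it is not the `assert_ir_translation_raises` function in `cudf_polars.testing.asserts`, and its `errors` argument only stands in for `Translation.errors`.

```python
def assert_translation_records(errors, *expected):
    """Pass if any collected translation error is an instance of `expected`.

    `errors` stands in for the list of exceptions gathered during
    translation. Hypothetical helper, not the cudf_polars testing API.
    """
    if not errors:
        raise AssertionError("Translation succeeded, but a failure was expected")
    if not any(isinstance(err, expected) for err in errors):
        found = sorted({type(err).__name__ for err in errors})
        raise AssertionError(
            f"Expected one of {expected}, but collected {found}"
        )


collected = [NotImplementedError("rolling"), NotImplementedError("pivot")]
assert_translation_records(collected, NotImplementedError)  # passes silently
```

With this shape, an exception that escapes translation outright (rather than being collected) would propagate to the test as an ordinary error, which is the failure case the comment above mentions.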

python/cudf_polars/tests/test_config.py (outdated, resolved)
python/cudf_polars/cudf_polars/testing/asserts.py (outdated, resolved)
Contributor

wence- commented Oct 30, 2024

@Matt711 This stalled a bit, I think it's a good one to get into 24.12, do you need some help with some bits?

Contributor Author

Matt711 commented Oct 31, 2024

> @Matt711 This stalled a bit, I think it's a good one to get into 24.12, do you need some help with some bits?

Hey @wence-, that should be doable. The last thing I needed to do with this PR is get test coverage to 100%. I'll address merge conflicts tomorrow, and try to do that too. I'll check in offline if I need help.

@Matt711 Matt711 force-pushed the fea/cudf-polars/report-all-unsupported-ops branch from ff7f2e1 to 9551c1f Compare October 31, 2024 15:02

copy-pr-bot bot commented Oct 31, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Contributor Author

Matt711 commented Oct 31, 2024

/ok to test

Contributor Author

Matt711 commented Oct 31, 2024

/ok to test

@Matt711 Matt711 removed the 5 - DO NOT MERGE label Oct 31, 2024
Contributor Author

Matt711 commented Nov 1, 2024

/ok to test

Contributor Author

Matt711 commented Nov 1, 2024

/ok to test

Contributor Author

Matt711 commented Nov 1, 2024

/ok to test

@@ -45,6 +45,7 @@ def pytest_configure(config: pytest.Config) -> None:


EXPECTED_FAILURES: Mapping[str, str] = {
    "tests/unit/dataframe/test_df.py::test_extension": "AssertionError",
Contributor Author

@Matt711 Matt711 Nov 1, 2024

I could use some help understanding why this test is failing. It fails at the first ref count assertion. The test is here https://github.com/pola-rs/polars/blob/40f8f5d63c225cd3dcb4e220db1bd5622274b2ab/py-polars/tests/unit/dataframe/test_df.py#L1820
@wence-

Contributor

Does it fail for you on branch-24.12, I guess not?

Contributor

No idea, sorry

Contributor Author

Matt711 commented Nov 1, 2024

I'm opening this PR up for reviews now that tests are passing. One of the polars tests is failing and I need some help debugging it.

Edit: I'll address merge conflicts once this PR is ready.

@Matt711 Matt711 marked this pull request as ready for review November 1, 2024 17:26
@Matt711 Matt711 requested a review from a team as a code owner November 1, 2024 17:26
@Matt711 Matt711 changed the title from "[WIP] Report all unsupported operations for a query in cudf.polars" to "[FEA] Report all unsupported operations for a query in cudf.polars" Nov 1, 2024
Contributor

@wence- wence- left a comment

Looks good @Matt711, I had one suggestion that may help your segfault issue

python/cudf_polars/cudf_polars/callback.py (outdated, resolved)
    node = self.visitor.view_current_node()
except Exception as e:
    self.errors.append(e)
    return ir.ErrorNode({}, str(e))
Contributor

I think this one wants to use the schema we generate in the lines below. That way we won't get any false positive "schema mismatch" errors later in translation.

Suggested change
- return ir.ErrorNode({}, str(e))
+ return ir.ErrorNode(schema, str(e))

Contributor Author

@Matt711 Matt711 Nov 6, 2024

That schema is defined after this point.

    schema = {k: dtypes.from_polars(v) for k, v in polars_schema.items()}
except Exception as e:
    self.errors.append(e)
    return ir.ErrorNode({}, str(e))
Contributor

Hmm, a tricky one, we want to put schema here, but we didn't manage to make it.
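The ordering question in this thread can be sketched as follows. This is only an illustration under the assumption that the schema can be computed without first viewing the node; whether that holds in `translate.py` is not established here. `ErrorNode`, `convert_dtype`, `translate_node`, and `broken_view` are hypothetical stand-ins, not the cudf_polars API.

```python
from dataclasses import dataclass


@dataclass
class ErrorNode:
    """Hypothetical stand-in for a placeholder node carrying a schema."""

    schema: dict
    message: str


def convert_dtype(name: str) -> str:
    """Toy dtype conversion that rejects one type (an assumption for the demo)."""
    if name == "Object":
        raise NotImplementedError(f"Unsupported dtype: {name}")
    return name.lower()


def translate_node(raw_schema: dict, view_node, errors: list):
    # Step 1: build the schema before touching the node, so that a later
    # failure can still report the real schema.
    try:
        schema = {k: convert_dtype(v) for k, v in raw_schema.items()}
    except Exception as e:
        errors.append(e)
        # No schema could be built, so an empty one is the only option here.
        return ErrorNode({}, str(e))
    # Step 2: view the node; on failure, reuse the schema from step 1.
    try:
        return view_node()
    except Exception as e:
        errors.append(e)
        return ErrorNode(schema, str(e))


def broken_view():
    raise NotImplementedError("Unsupported node")


errors = []
with_schema = translate_node({"a": "Int64"}, broken_view, errors)
without_schema = translate_node({"a": "Object"}, broken_view, errors)
print(with_schema.schema, without_schema.schema)
```

In this arrangement the empty schema only appears when the schema conversion itself fails, which is the "tricky" case noted above.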

python/cudf_polars/cudf_polars/dsl/translate.py (outdated, resolved)
python/cudf_polars/cudf_polars/testing/asserts.py (outdated, resolved)
Contributor Author

Matt711 commented Nov 7, 2024

/ok to test

Labels
cudf.polars, feature request, non-breaking, Python
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

[FEA] Report all unsupported operations for a query in cudf-polars
3 participants