-
Notifications
You must be signed in to change notification settings - Fork 901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Report all unsupported operations for a query in cudf.polars #16960
base: branch-24.12
Are you sure you want to change the base?
[FEA] Report all unsupported operations for a query in cudf.polars #16960
Conversation
0310f26
to
3175a7e
Compare
3175a7e
to
054b271
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is looking really nice. Some smaller suggestions and a few small logic fixes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny suggestions, debug_mode
is now gone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the logic of assert_ir_translation_raises
to integrate with the changes made in this PR. I made the changes because now that we're returning ErrorNode
s and ErrorExpr
s instead of raising exceptions during translation, assert_ir_translation_raises(q, NotImplementedError)
fails in a lot of places.
To solve this, I checked that the exception(s) being asserted in q.collect(...)
are inside Translation.errors
and treated any other exceptions raised during translation as cases where assert_ir_translation_raises
fails. I this required me to hard-code a few cases where translation could fail.
Are there other cases I missed where translation could fail? WDYT of the changes @wence-?
@Matt711 This stalled a bit, I think it's a good one to get into 24.12, do you need some help with some bits? |
Hey @wence-, that should be doable. The last thing I needed to do with this PR is get test coverage to 100%. I'll address merge conflicts tomorrow, and try to do that too. I'll check in offline if I need help. |
ff7f2e1
to
9551c1f
Compare
/ok to test |
/ok to test |
/ok to test |
/ok to test |
/ok to test |
@@ -45,6 +45,7 @@ def pytest_configure(config: pytest.Config) -> None: | |||
|
|||
|
|||
EXPECTED_FAILURES: Mapping[str, str] = { | |||
"tests/unit/dataframe/test_df.py::test_extension": "AssertionError", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could use some help understanding why this test is failing. It fails at the first ref count assertion. The test is here https://github.com/pola-rs/polars/blob/40f8f5d63c225cd3dcb4e220db1bd5622274b2ab/py-polars/tests/unit/dataframe/test_df.py#L1820
@wence-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it fail for you on branch-24.12, I guess not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea, sorry
I'm opening this PR up for reviews now that tests are passing. One of the polars tests is failing and I need some helping debugging it. Edit: I'll address merge conflicts once this PR is ready. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @Matt711, I had one suggestion that may help your segfault issue
node = self.visitor.view_current_node() | ||
except Exception as e: | ||
self.errors.append(e) | ||
return ir.ErrorNode({}, str(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this one wants to use the schema we generate in the lines below. That way we won't get any false positive "schema mismatch" errors later in translation.
return ir.ErrorNode({}, str(e)) | |
return ir.ErrorNode(schema, str(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That schema
is defined after this point.
schema = {k: dtypes.from_polars(v) for k, v in polars_schema.items()} | ||
except Exception as e: | ||
self.errors.append(e) | ||
return ir.ErrorNode({}, str(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, a tricky one, we want to put schema
here, but we didn't manage to make it.
@@ -45,6 +45,7 @@ def pytest_configure(config: pytest.Config) -> None: | |||
|
|||
|
|||
EXPECTED_FAILURES: Mapping[str, str] = { | |||
"tests/unit/dataframe/test_df.py::test_extension": "AssertionError", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea, sorry
/ok to test |
Description
Closes #16690. The purpose of this PR is to list all of the unique operations that are unsupported by
cudf.polars
when running a query.Checklist