feat: support Ibis 6.x UDF annotations #718

Open
omriel1 opened this issue Aug 3, 2023 · 2 comments
Labels
enhancement New feature or request

Comments


omriel1 commented Aug 3, 2023

What happened?

Hi!
I've been trying to create a Substrait plan with a UDF that uses the new annotations (udf.scalar.python, available since Ibis 6.x), but compilation fails. This is what I tried (the UDF is taken from the Ibis blog: https://ibis-project.org/blog/rendered/ibis-version-6.0.0-release/#udfs):

import ibis
from ibis import _

from ibis import udf
from ibis_substrait.compiler.core import SubstraitCompiler

t = ibis.examples.penguins.fetch()

@udf.scalar.python
def num_vowels(s: str) -> int:
    return sum(map(s.lower().count, "aeiou"))

new_t = t.mutate(num_vowels=lambda t: num_vowels(t.species))
query = new_t.filter(_.num_vowels>5)

compiler = SubstraitCompiler(
    udf_uri="urn:arrow:substrait_simple_extension_function"
)
plan = compiler.compile(query)

I'd expect this to compile into a Substrait plan similar to those created in your tests (https://github.com/ibis-project/ibis-substrait/blob/main/ibis_substrait/tests/integration/test_pyarrow.py).

I'll also mention that my end goal is simply to create an arbitrary UDF using Ibis, build a query, and compile it into a Substrait plan, then register the UDF in pyarrow/DuckDB and consume the plan. While this works fine for me with the legacy annotations, I wonder how it should be done now.
Thanks!

What version of ibis-substrait are you using?

ibis_substrait-2.29.1

What substrait consumer(s) are you using, if any?

Currently none; ideally Acero and DuckDB

Relevant log output

args = (<ibis.expr.operations.relations.DatabaseTable object at 0x14ff56b60>,)
kwargs = {'child_rel_field_offsets': {}, 'compiler': <ibis_substrait.compiler.core.SubstraitCompiler object at 0x13fb12e50>}

    @functools.singledispatch
    def translate(*args: Any, **kwargs: Any) -> Any:
>       raise NotImplementedError(*args)
E       NotImplementedError: <ibis.expr.operations.relations.DatabaseTable object at 0x14ff56b60>
@omriel1 omriel1 added the bug Something isn't working label Aug 3, 2023
Member

gforsyth commented Aug 3, 2023

Thanks for raising this, @OmriLevyTau -- there are a few things going on here:

  1. Explicit Ibis 6.x support was just released, so you should upgrade to ibis_substrait 3.0
  2. ibis_substrait expects unbound tables, i.e. tables not tied to an existing backend. The example datasets are loaded in DuckDB, and that's the source of the error you are seeing: the compiler wants an UnboundTable, not a DatabaseTable. There's an easy fix for this, which is to call unbind() on the expression before you try to compile, e.g. plan = compiler.compile(query.unbind())
  3. Once you do this, it still won't work, because I haven't yet implemented support for the new Ibis UDF annotations, so I'll mark this issue as a feature request for that support. In the interim, if you use the old annotation style (now located in ibis.legacy.udf) that should work.
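
A minimal sketch of the workaround from points 2 and 3, assuming Ibis 6.x with ibis_substrait 3.0 installed. The import path ibis.legacy.udf.vectorized.elementwise, the hand-written unbound "penguins" table with a single species column, and the helper names count_vowels and compile_with_legacy_udf are assumptions made for illustration; they are not taken from this thread. The Ibis imports live inside the function so that the plain-Python helper can be used on its own.

```python
def count_vowels(s: str) -> int:
    """Pure-Python vowel counter, same logic as the UDF in the report."""
    return sum(map(s.lower().count, "aeiou"))


def compile_with_legacy_udf():
    """Compile the query from the report using a legacy-style UDF (sketch)."""
    # Local imports: only needed when a plan is actually compiled.
    import ibis
    import ibis.expr.datatypes as dt
    from ibis import _
    from ibis.legacy.udf.vectorized import elementwise  # assumed Ibis 6.x path
    from ibis_substrait.compiler.core import SubstraitCompiler

    @elementwise(input_type=[dt.string], output_type=dt.int64)
    def num_vowels(species):
        # Assumption: the column arrives as a pandas Series.
        return species.map(count_vowels)

    # Hypothetical unbound stand-in for the penguins example table.
    t = ibis.table({"species": "string"}, name="penguins")
    query = t.mutate(num_vowels=num_vowels(t.species)).filter(_.num_vowels > 5)

    compiler = SubstraitCompiler(
        udf_uri="urn:arrow:substrait_simple_extension_function"
    )
    # unbind() is a no-op here, but it is required when the expression
    # comes from a backend table such as ibis.examples.penguins.fetch().
    return compiler.compile(query.unbind())
```

Calling compile_with_legacy_udf() should return the compiled plan object. Registering a function of the same name with the consumer (Acero or DuckDB) is still a separate step and is not covered by this sketch.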

@gforsyth gforsyth added enhancement New feature or request and removed bug Something isn't working labels Aug 3, 2023
@gforsyth gforsyth changed the title Cannot create a Substrait plan with the new udf annotations feat: support Ibis 6.x UDF annotations Aug 3, 2023
Author

omriel1 commented Aug 3, 2023

Thank you! It makes more sense now.

Projects
Status: backlog
Development

No branches or pull requests

2 participants