Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(datafusion): No real support for temporary table creation #10066

Closed
1 task done
ncclementi opened this issue Sep 9, 2024 · 1 comment
Closed
1 task done

bug(datafusion): No real support for temporary table creation #10066

ncclementi opened this issue Sep 9, 2024 · 1 comment
Assignees
Labels
bug Incorrect behavior inside of ibis requires upstream support Feature or bug requires support from the upstream project

Comments

@ncclementi
Copy link
Contributor

What happened?

I was playing around on the cli trying to generate tables and views, and temp versions of them, and I noticed that adding TEMPORARY to the create statement still resulted in a table or view being categorized with table_type under information_schema.tables as BASE_TABLE and VIEW respectively, and not as LOCAL_TEMPORARY as I would expect, that table_type exists see here but it looks like that can only be generated via rust code according to this comment.

Then on our end when we create tables on datafusion and we provide temp=True we are not really creating a temporary table.
Unless I'm missing something, this would mean that we aren't and can't support creation of temporary tables/views.

I opened an issue upstream in datafusion, asking about creation of temporary tables with SQL: apache/datafusion#12363

Until we hear something, I think we should raise when temp=True in create_table saying it's not supported.

CLI findings

> CREATE TABLE my_table (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);

0 row(s) fetched. 
Elapsed 0.004 seconds.

> CREATE VIEW my_view AS
SELECT id, name FROM my_table;
0 row(s) fetched. 
Elapsed 0.002 seconds.

> CREATE TEMPORARY TABLE my_temp_table (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
0 row(s) fetched. 
Elapsed 0.005 seconds.

> CREATE TEMPORARY VIEW my_temp_view AS
SELECT id, name FROM my_table;
0 row(s) fetched. 
Elapsed 0.002 seconds.

> SELECT * FROM information_schema.tables;
+---------------+--------------------+---------------+------------+
| table_catalog | table_schema       | table_name    | table_type |
+---------------+--------------------+---------------+------------+
| datafusion    | public             | my_temp_table | BASE TABLE |
| datafusion    | public             | my_temp_view  | VIEW       |
| datafusion    | public             | my_table      | BASE TABLE |
| datafusion    | public             | my_view       | VIEW       |
| datafusion    | information_schema | tables        | VIEW       |
| datafusion    | information_schema | views         | VIEW       |
| datafusion    | information_schema | columns       | VIEW       |
| datafusion    | information_schema | df_settings   | VIEW       |
| datafusion    | information_schema | schemata      | VIEW       |
+---------------+--------------------+---------------+------------+

On Ibis this is replicated, when we pass temp=True it also gets swallowed

In [1]: import ibis

In [2]: con = ibis.datafusion.connect()

In [3]: penguins = ibis.examples.penguins.fetch()

In [4]: con.create_table('penguins', penguins.to_pandas())
Out[4]: 
DatabaseTable: penguins
  species           string
  island            string
  bill_length_mm    float64
  bill_depth_mm     float64
  flipper_length_mm float64
  body_mass_g       float64
  sex               string
  year              int64

In [5]: con.raw_sql("SELECT * FROM information_schema.tables;")
Out[5]: 
DataFrame()
+---------------+--------------------+-------------+------------+
| table_catalog | table_schema       | table_name  | table_type |
+---------------+--------------------+-------------+------------+
| datafusion    | public             | penguins    | BASE TABLE |
| datafusion    | information_schema | tables      | VIEW       |
| datafusion    | information_schema | views       | VIEW       |
| datafusion    | information_schema | columns     | VIEW       |
| datafusion    | information_schema | df_settings | VIEW       |
| datafusion    | information_schema | schemata    | VIEW       |
+---------------+--------------------+-------------+------------+

In [6]: con.create_table('penguins_temp', penguins.to_pandas(), temp=True)
Out[6]: 
DatabaseTable: penguins_temp
  species           string
  island            string
  bill_length_mm    float64
  bill_depth_mm     float64
  flipper_length_mm float64
  body_mass_g       float64
  sex               string
  year              int64

In [7]: con.raw_sql("SELECT * FROM information_schema.tables;")
Out[7]: 
DataFrame()
+---------------+--------------------+---------------+------------+
| table_catalog | table_schema       | table_name    | table_type |
+---------------+--------------------+---------------+------------+
| datafusion    | public             | penguins      | BASE TABLE |
| datafusion    | public             | penguins_temp | BASE TABLE |
| datafusion    | information_schema | tables        | VIEW       |
| datafusion    | information_schema | views         | VIEW       |
| datafusion    | information_schema | columns       | VIEW       |
| datafusion    | information_schema | df_settings   | VIEW       |
| datafusion    | information_schema | schemata      | VIEW       |
+---------------+--------------------+---------------+------------+

In [8]: con.list_tables()
Out[8]: ['penguins', 'penguins_temp']

What version of ibis are you using?

main

What backend(s) are you using, if any?

datafusion

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@ncclementi ncclementi added the bug Incorrect behavior inside of ibis label Sep 9, 2024
@ncclementi ncclementi added the requires upstream support Feature or bug requires support from the upstream project label Sep 9, 2024
@ncclementi
Copy link
Contributor Author

Closed in favor of #10069

@github-project-automation github-project-automation bot moved this from backlog to done in Ibis planning and roadmap Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect behavior inside of ibis requires upstream support Feature or bug requires support from the upstream project
Projects
Archived in project
Development

No branches or pull requests

1 participant