Is duckdb out-of-core processing properly enabled? #1509

Open
binste opened this issue Apr 24, 2024 · 1 comment · May be fixed by #1510

Comments

binste commented Apr 24, 2024

I just saw the great talk on Dask DataFrames 2.0 at PyData Berlin! I was a bit surprised that duckdb timed out for some of the queries. According to https://duckdb.org/docs/guides/performance/how_to_tune_workloads#larger-than-memory-workloads-out-of-core-processing, if you are not connected to a persistent duckdb database file, you need to set a temporary directory so that duckdb can spill over to disk. Based on the code in https://github.com/coiled/benchmarks/blob/63ca3c20cfd6c8352eebf880211e41a85793be32/tests/tpch/test_duckdb.py, I think the benchmarks do not connect to a persistent database file, so this setting would be needed.

I'm not 100% sure that this isn't already set somewhere else, as I didn't dig through all the testing-related code, but I thought you might want to know.
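
For illustration, here is a minimal sketch of what setting a temporary directory could look like with the duckdb Python package. It is not taken from the benchmark code or from the linked PR. The helper names `spill_statements` and `connect_with_spill` are hypothetical, and the directory and memory limit are placeholders the caller would choose. `temp_directory` and `memory_limit` are DuckDB configuration options.

```python
def spill_statements(temp_dir, memory_limit=None):
    """Build the SET statements that point DuckDB at a spill directory.

    `temp_dir` and `memory_limit` are placeholders chosen by the caller;
    this helper is hypothetical and not part of the benchmark suite.
    """
    # Escape single quotes so the path is a valid SQL string literal.
    escaped = str(temp_dir).replace("'", "''")
    statements = [f"SET temp_directory = '{escaped}'"]
    if memory_limit is not None:
        # Optional cap (e.g. '4GB') on how much memory DuckDB may use.
        statements.append(f"SET memory_limit = '{memory_limit}'")
    return statements


def connect_with_spill(temp_dir, memory_limit=None):
    """Open an in-memory DuckDB connection with a temporary directory set."""
    # Imported here so spill_statements() works without duckdb installed.
    import duckdb

    con = duckdb.connect()  # no database file, i.e. an in-memory database
    for statement in spill_statements(temp_dir, memory_limit):
        con.execute(statement)
    return con
```

Something like `con = connect_with_spill("/path/to/scratch", memory_limit="4GB")` would then return a connection that has a temporary directory configured, assuming that path is writable.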

Related issues are #1488, #1214, and #1194.

hendrikmakait (Member) commented

@binste: Thanks for creating this issue. It looks like we have indeed missed this and there's no directory available to DuckDB for storing its data. I've created a PR that sets the appropriate config value and will investigate the impact this has on the performance/scalability of DuckDB.

hendrikmakait self-assigned this on Apr 24, 2024
hendrikmakait added the bug, benchmarks, and tpch labels on Apr 24, 2024