Add OPFS (Origin Private File System) Support #1856

Merged
carlopi merged 35 commits into duckdb:main from feature/opfs_support on Jan 15, 2025

Conversation

e1arikawa
Contributor

Description:

This PR implements OPFS (Origin Private File System) support in the latest version of duckdb-wasm based on PR #1490.
This allows database files to be read and written to the OPFS.

API:
When opening a database file, you can now use the following API:

await db.open({
    path: 'opfs://test.db',
    accessMode: duckdb.DuckDBAccessMode.READ_WRITE,
});
const conn = await db.connect();
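
Since OPFS is not available in every browser context, a caller may want to fall back to an in-memory database. A minimal sketch of that choice follows; `hasOpfs` and `pickDatabasePath` are hypothetical helpers, not part of duckdb-wasm, and using ':memory:' as the fallback path is an assumption.

```typescript
// Hypothetical helpers, not part of duckdb-wasm.
const OPFS_PREFIX = 'opfs://';

// True when the given scope exposes navigator.storage.getDirectory(),
// the standard entry point to the Origin Private File System.
function hasOpfs(scope: any = globalThis): boolean {
    return typeof scope?.navigator?.storage?.getDirectory === 'function';
}

// Returns an opfs:// path when OPFS is available; otherwise falls back to
// an in-memory database (the ':memory:' fallback is an assumption).
function pickDatabasePath(fileName: string, opfsAvailable: boolean = hasOpfs()): string {
    return opfsAvailable ? `${OPFS_PREFIX}${fileName}` : ':memory:';
}
```

The returned string would be passed as `path` in the `db.open` call above.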

Changes:

  • Added support for OPFS.
  • Bug fixes:
    • Resolved issues related to COPY TO functionality.
    • Fixed support for signed S3 URLs.

@carlopi
Collaborator

carlopi commented Sep 18, 2024

This is very very welcome, thanks.

I am in the process of releasing a new version of duckdb-wasm, connected to duckdb 1.1.0.
Once that is out, I will properly review this PR.

@e1arikawa
Contributor Author

@carlopi
Thank you very much for your update and for taking the time to review this PR.
I look forward to your feedback once the release is out.
Please let me know if there's anything I can assist with in the meantime.

@carlopi
Collaborator

carlopi commented Sep 18, 2024

One request: could you have a look at the failing test?

@e1arikawa e1arikawa force-pushed the feature/opfs_support branch 2 times, most recently from bbb2067 to 383ff1b Compare September 23, 2024 02:57
@e1arikawa
Contributor Author

e1arikawa commented Sep 24, 2024

@carlopi
I have merged the latest main branch and confirmed that the tests using DuckDB 1.1.1 pass successfully.

https://github.com/AKABANAKK/duckdb-wasm/actions/runs/11016245572

@carlopi
Collaborator

carlopi commented Sep 25, 2024

Thanks, I gave a proper look, this looks solid, thanks a lot for the contribution.

This PR does introduce some API changes that might be somewhat unexpected, so I think the proper way forward would be:

  • tag v1.29.0 (later today)
  • merge this PR / iterate
  • tag v1.30.0 in the coming weeks

@e1arikawa
Contributor Author

e1arikawa commented Sep 25, 2024

@carlopi
Thank you for reviewing my contribution. Regarding the API changes, I will carefully test them after tagging v1.29.0.
Please proceed with the work as per your proposed plan. I appreciate your continued support.

@e1arikawa e1arikawa changed the title Add OPFS (Origin Private File System) Support to the Latest Version of duckdb-wasm Add OPFS (Origin Private File System) Support Oct 5, 2024
@e1arikawa e1arikawa force-pushed the feature/opfs_support branch from ad9b192 to dedc2aa Compare November 14, 2024 01:30
@e1arikawa
Contributor Author

e1arikawa commented Nov 16, 2024

I am planning to submit a follow-up PR after this one is merged, which will address the following:

  1. Support for the opfs:// protocol
    e.g. FROM read_parquet('opfs://test.parquet');
  2. Multi-tab and multi-worker support within the same domain with Web Locks
  3. Compatibility with mobile browsers

These features are already implemented and are awaiting the merge of this PR.
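
For item 2, the standard Web Locks API (`navigator.locks.request`) lets tabs and workers of the same origin take turns on a named lock. The sketch below is not the implementation referred to above; `withDatabaseLock`, the `LockManagerLike` type, and the `duckdb:` lock-name prefix are all hypothetical.

```typescript
// Minimal structural type covering the one Web Locks call used here.
// In a browser, navigator.locks offers a request(name, callback) method
// of this form.
interface LockManagerLike {
    request<T>(name: string, callback: () => Promise<T>): Promise<T>;
}

// Runs `work` while holding a lock named after the database path, so that
// only one tab or worker touches the file at a time.
function withDatabaseLock<T>(
    locks: LockManagerLike,
    dbPath: string,
    work: () => Promise<T>,
): Promise<T> {
    return locks.request(`duckdb:${dbPath}`, work);
}
```

In a browser this might be called as `withDatabaseLock(navigator.locks, 'opfs://test.db', async () => { /* open, query, close */ })`.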

@e1arikawa e1arikawa force-pushed the feature/opfs_support branch from 1da9a46 to eb6ec89 Compare November 17, 2024 01:57
@seanbirchall

seanbirchall commented Nov 26, 2024

@e1arikawa, to build your branch, do I just need to clone your repo and build from source?

cd duckdb-wasm
git submodule init
git submodule update
make apply_patches
make serve

Then I imagine there's somewhere I can find a local equivalent of https://cdn.jsdelivr.net/npm/@duckdb/[email protected]/+esm after the build, which I can use in my web app?

@amiller-gh

@e1arikawa love to hear this! I made a fork of your feature branch where I've implemented #1 for our team, but including #2 and #3 is above and beyond.

Are you able to share your working branch on your fork? I'd like to pull it down and test it with our use cases. Obviously no need to PR it yet, but I'd prefer to work off of a single fork while we all wait for maintainer approval.

I am planning to submit a follow-up PR after this one is merged, which will address the following:

  1. Support for the opfs:// protocol
    e.g. FROM read_parquet('opfs://test.parquet');
  2. Multi-tab and multi-worker support within the same domain with Web Locks
  3. Compatibility with mobile browsers

These features are already implemented and are awaiting the merge of this PR.

@bumberboy

Hello @carlopi or team! I’ve been following this PR and wanted to mention how valuable it would be for our work. Just wondering if there’s an update or anything I can assist with to help get it merged?

Thanks!

@justin0mcateer

I don't want to beat a dead horse, but we are extremely interested in this feature as well.

Comment on lines 486 to 488
  /** Register a file object URL */
- public registerFileHandle<HandleType>(
+ public async registerFileHandle<HandleType>(
  name: string,
Collaborator

The biggest, possibly only, problem I see with merging this is that I don't think it's a good idea to move from a synchronous call to an asynchronous call.

Would it be possible to add a registerFileHandleAsync and use it in the relevant places, while keeping the simpler, backward-compatible registerFileHandle (which can't support OPFS) as an option?
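
One possible shape for that split, as a sketch only: `FileRegistry` and its members are hypothetical and do not mirror the actual duckdb-wasm bindings. The synchronous method keeps its signature, and the asynchronous variant awaits whatever preparation an OPFS handle needs before delegating to it.

```typescript
// Hypothetical stand-in for the bindings class; names are illustrative.
class FileRegistry {
    private handles = new Map<string, unknown>();

    // Existing synchronous entry point: signature unchanged for current callers.
    public registerFileHandle<HandleType>(name: string, handle: HandleType): void {
        this.handles.set(name, handle);
    }

    // New asynchronous entry point: awaits handle preparation, then reuses
    // the synchronous path.
    public async registerFileHandleAsync<HandleType>(
        name: string,
        prepare: () => Promise<HandleType>,
    ): Promise<void> {
        this.registerFileHandle(name, await prepare());
    }

    // Reports whether a handle has been registered under this name.
    public hasHandle(name: string): boolean {
        return this.handles.has(name);
    }
}
```

Existing callers keep calling `registerFileHandle` without `await`, and only OPFS-aware code paths need the asynchronous variant.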

Collaborator

@carlopi carlopi left a comment

Another note: with the single-threaded version, running

COPY (SELECT 32) TO 'opfs://myfile.csv';

shows a warning in the console (No OPFS access handle registered with name: opfs://myfile.csv) but also triggers a Maximum call stack size exceeded error.

Having this fail with a clearer failure mode would be enough here.
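
A clearer failure mode could be as simple as rejecting the lookup up front. A minimal sketch, assuming handles are kept in a map keyed by path; `requireAccessHandle` is a hypothetical name, and the message text reuses the console warning quoted above.

```typescript
// Hypothetical lookup helper: throws immediately instead of letting a
// missing handle propagate into later file operations.
function requireAccessHandle<T>(handles: Map<string, T>, name: string): T {
    const handle = handles.get(name);
    if (handle === undefined) {
        throw new Error(`No OPFS access handle registered with name: ${name}`);
    }
    return handle;
}
```

A caller can then surface the error to the user as a regular query failure rather than a stack overflow.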

@carlopi
Collaborator

carlopi commented Jan 15, 2025

Actually, thanks A LOT @e1arikawa and congrats on pushing this in.

I might send a couple of PRs to adapt some minor things, but it's a great step in the right direction.

Thanks a lot

@carlopi carlopi merged commit c1365cf into duckdb:main Jan 15, 2025
15 checks passed
carlopi added a commit to carlopi/duckdb-wasm that referenced this pull request Jan 15, 2025
@tobilg

tobilg commented Jan 15, 2025

Thank you @e1arikawa, @carlopi et al.! That’s a great new feature!

carlopi added a commit that referenced this pull request Jan 15, 2025
carlopi added a commit to carlopi/duckdb-wasm that referenced this pull request Jan 15, 2025
Needs first to avoid throwing in destructor (big no), AND to convert JS exception in C++ exception
More iteration on comments to duckdb#1856
carlopi added a commit that referenced this pull request Jan 15, 2025
Needs first to avoid throwing in destructor (big no), AND to convert JS exception in C++ exception
More iteration on comments to #1856
@ujaval403

@carlopi @e1arikawa any progress on supporting the follow-up items from
#1856 (comment)?

Adding support for, e.g., FROM read_parquet('opfs://test.parquet') would be really helpful.

@ilyabo

ilyabo commented Jan 24, 2025

Thanks for working on the OPFS support, it's going to be super useful!

I'm trying to use it with @duckdb/duckdb-wasm@npm:1.29.1-dev47.0, the release that was published just after this PR landed. I always get this error, no matter which file name I try:

Opening the database failed with error: 
{"exception_type":"IO","exception_message":"The file \"opfs://test123.db\" exists, but it is not a valid DuckDB database file!"} 

when running this code

await db.open({
    path: 'opfs://test123.db',
    accessMode: duckdb.DuckDBAccessMode.READ_WRITE
});

@tartufella

Hello there, I too am very interested in OPFS support. My use case is to download a db file created on the server and work with it in the browser. I simulated this the same way as @ilyabo above and see the same issue. Just to be sure, I exported the file from OPFS back into the file system and opened it in the DuckDB CLI, and the file is OK.

Many thanks, really look forward to getting this working, great effort - love this repo :)
