Error writing to parquet using kerchunk #115

Closed
TomNicholas opened this issue May 15, 2024 · 3 comments
Labels
bug (Something isn't working), references formats (Storing byte range info on disk)

Comments

@TomNicholas
Member

@jsignell it seems the parquet writer implementation in #72 is more fragile than it looked. I tried it in this notebook and got an error from fsspec. I'm not sure if this is fsspec's fault, but it would be nice to make it a little more robust if possible.

@TomNicholas added the bug and references formats labels on May 15, 2024
@jsignell
Contributor

It looks like you have some alternative version of kerchunk. Can you confirm that you have 0.2.5 installed?
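
For anyone landing here with the same symptom, one way to confirm the installed version from inside the notebook is sketched below. It relies only on the standard library's `importlib.metadata`; the helper names `installed_version` and `release_tuple` are hypothetical and exist only for this illustration, and the `0.2.5` threshold is simply the release mentioned in this thread. The parsing is deliberately crude (it reads only the leading numeric components), so a development build such as `0.2.5.dev1` would be treated like `0.2.5`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def release_tuple(version_string: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version, e.g. '0.2.5' -> (0, 2, 5)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first component with no leading digits (e.g. 'dev1').
            break
        parts.append(int(digits))
    return tuple(parts)


found = installed_version("kerchunk")
if found is None:
    print("kerchunk is not installed in this environment")
elif release_tuple(found) < (0, 2, 5):
    print(f"kerchunk {found} is older than 0.2.5; upgrading may help")
else:
    print(f"kerchunk {found} is installed")
```

Running this in the same kernel that produced the traceback matters, since a notebook can pick up a different environment than the shell where the package was installed.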

@jsignell
Contributor

In particular your traceback looks like it is pointing to these lines:

    out = LazyReferenceMapper.create(
        record_size, root=url, fs=fs, categorical_threshold=categorical_threshold
    )

https://github.com/fsspec/kerchunk/blob/3c4e9fc960e159875e8f258ccd20fdbc565513df/kerchunk/df.py#L154-L156

when the current version looks more like:

    out = LazyReferenceMapper.create(
        record_size=record_size,
        root=url,
        fs=fs,
        categorical_threshold=categorical_threshold,
    )

https://github.com/fsspec/kerchunk/blob/0.2.5/kerchunk/df.py#L154-L159

@TomNicholas
Member Author

You're right! Upgrading to kerchunk==0.2.5 fixed it. Sorry for the noise, @jsignell.
