OSError : Unable to walk dir: IO error for operation on folder/_delta_log #1952

Closed
Matthieusalor opened this issue Dec 8, 2023 · 13 comments
Labels: binding/python (Issues for the Python package), bug (Something isn't working)

@Matthieusalor

Environment

Delta-rs version: python 0.14.0

Environment:

  • OS: Linux

Bug

What happened:

When trying to create a new Delta table with the latest Python version, I systematically get this error:

OSError : Generic LocalFilesystem Unable to walk dir: IO error for operation on folder/_delta_log: Success (os error 0)

At this stage, the table folder has been created and the data is there; only the _delta_log folder is missing.

I tried to create a table with the latest Rust version, 0.16.5, and was not able to reproduce the error.

The Python code used to work on version 0.13.0.

How to reproduce it:

import pandas as pd
import deltalake

df = pd.DataFrame({"A": [1, 2]})
deltalake.write_deltalake("test", df)
@Matthieusalor Matthieusalor added the bug Something isn't working label Dec 8, 2023
@ion-elgreco
Collaborator

Does it work if you do write_deltalake(engine="rust")?

@Matthieusalor
Author

No. It fails in both write_deltalake_rust and write_deltalake_pyarrow (depending on the engine parameter) with the same error.

@ion-elgreco
Collaborator

I cannot reproduce this issue in WSL.

@rtyler
Member

rtyler commented Dec 22, 2023

I am also unable to reproduce this on a Linux/amd64 machine:

❯ python
Python 3.11.4 (main, Jun 28 2023, 19:51:46) [GCC] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import deltalake
>>>
>>> df = pd.DataFrame({"A": [1, 2]})
>>> deltalake.write_deltalake("test", df)
>>>
❯ tree test
test
├── 0-8620cf44-3335-4675-8b64-ef286ce7e677-0.parquet
└── _delta_log
    └── 00000000000000000000.json

2 directories, 2 files

@Matthieusalor can you share more details about your filesystem? I am wondering if there's some nuance about the specific filesystem or architecture of your environment that could be causing this issue?

@rtyler rtyler added the binding/python Issues for the Python package label Dec 22, 2023
@rtyler rtyler added this to the Rust v0.18 milestone Feb 6, 2024
@J2OG

J2OG commented Apr 5, 2024

@rtyler I'm facing the exact same issue on Machine Learning Studio.

@strawhl

strawhl commented May 15, 2024

I also have the same issue from Azure Machine Learning

@empowerNate

empowerNate commented May 23, 2024

It's an issue on the shared network drive in Azure ML compute instances, which is mounted using CIFS. If you write to the local HDD of the machine (~/localfiles) or to /mnt, it works. Unfortunately, localfiles has only a few tens of GB of space, and /mnt is temporary and gets deleted every time an instance shuts down.
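
To check whether a target directory sits on such a mount, one option on Linux is to look up its filesystem type in /proc/mounts. The sketch below is only illustrative: the helper names and the set of network filesystem types are assumptions, not part of deltalake, and it does not handle escaped characters (such as spaces) in mount points.

```python
import os
from pathlib import PurePosixPath
from typing import Optional

# Assumed set of filesystem type names that indicate a network mount.
NETWORK_FS_TYPES = {"cifs", "smb3", "smbfs", "nfs", "nfs4"}


def filesystem_type(path: str, mounts_text: str) -> Optional[str]:
    """Return the filesystem type of the deepest mount point containing `path`.

    `mounts_text` is expected in /proc/mounts format:
    device, mount point, filesystem type, options, ...
    """
    target = PurePosixPath(path)
    best = None  # (number of path components in the mount point, fs type)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = PurePosixPath(fields[1]), fields[2]
        if target == mount_point or mount_point in target.parents:
            depth = len(mount_point.parts)
            # On ties, later entries win, since later mounts shadow earlier ones.
            if best is None or depth >= best[0]:
                best = (depth, fs_type)
    return best[1] if best else None


def is_network_filesystem(path: str, mounts_text: str) -> bool:
    """True if `path` is on one of the assumed network filesystem types."""
    return filesystem_type(path, mounts_text) in NETWORK_FS_TYPES


def check_path(path: str) -> bool:
    """Linux-only convenience wrapper that reads the live mount table."""
    with open("/proc/mounts") as f:
        return is_network_filesystem(os.path.realpath(path), f.read())
```

Calling `check_path("test")` before `write_deltalake("test", df)` would indicate whether the table directory is on a CIFS or similar mount.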

@MoonKBRR

I also have the same issue from Azure Machine Learning

Did you solve your issue? I have the same problem with a mounted volume in Azure File Share.

@masc-it

masc-it commented Jul 14, 2024

Hey, I am having a similar issue (the target storage is a SAMBA mount) when calling write_deltalake:

OSError: Generic LocalFileSystem error: Unable to copy file from /Volumes/datasets/.../_delta_log/_commit_6851eb42-d982-49a0-9468-b3d92657948c.json.tmp to /Volumes/datasets/.../_delta_log/00000000000000000000.json: Operation not supported (os error 45)
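
One possible explanation, which is an assumption and not confirmed in this thread, is that the commit step copies the temporary file with a hard link and the mounted share does not support hard links. The hypothetical probe below tries to create a hard link inside a directory and reports whether it worked; the function name is made up for illustration.

```python
import os
import tempfile


def supports_hard_links(directory: str) -> bool:
    """Try to hard-link a scratch file inside `directory`; clean up afterwards."""
    fd, src = tempfile.mkstemp(dir=directory, prefix="probe_", suffix=".tmp")
    os.close(fd)
    dst = src + ".link"
    try:
        os.link(src, dst)  # raises OSError where hard links are unsupported
        return True
    except OSError:
        return False
    finally:
        # Remove the scratch files whether or not the link succeeded.
        for leftover in (dst, src):
            if os.path.exists(leftover):
                os.remove(leftover)
```

Running `supports_hard_links(...)` against the directory that holds the table would show whether this assumption applies to a given mount.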

@moehmeni

moehmeni commented Aug 7, 2024

Same error.
@masc-it, @MoonKBRR did you find a solution for this?

@rtyler
Member

rtyler commented Dec 1, 2024

Hiya folks! This seems to be a Windows-y type issue, and I don't have access to a Windowsy host.

Would one of you kind folks try this out with v0.22.2 and report back?

@PeterThramkrongart

PeterThramkrongart commented Dec 12, 2024

I encounter a similar issue with Azure File Share: #3053 (comment)

It also happens with v0.22.3, @rtyler.

@ion-elgreco
Collaborator

All of you are using mounted storage. To write to mounted storage, you need to enable unsafe renames with MOUNT_ALLOW_UNSAFE_RENAME; see the docs here:
https://delta-io.github.io/delta-rs/integrations/object-storage/special_configuration/
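
A minimal sketch of applying this from Python, assuming the setting is read from the process environment and accepts the value "true" (check the linked docs for the exact form). The helper names below are hypothetical, and the write call mirrors the reproduction at the top of this issue.

```python
import os


def enable_unsafe_rename() -> None:
    # Assumption: delta-rs picks this up from the environment, and "true" enables it.
    os.environ["MOUNT_ALLOW_UNSAFE_RENAME"] = "true"


def write_table_on_mount(path, data) -> None:
    """Set the flag, then write `data` as a Delta table at `path`."""
    enable_unsafe_rename()
    # Imported inside the function so this sketch loads without deltalake installed.
    from deltalake import write_deltalake

    write_deltalake(path, data)
```

As the name suggests, unsafe renames give up the atomicity guarantee of the commit step, so this is best limited to tables with a single writer.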
