Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the bug after introducing ufs full path as page id #53

Merged
merged 7 commits into from
Jul 31, 2024

Conversation

YichuanSun
Copy link
Collaborator

In Alluxio 3.3, we use the ufs full path as the fileId, instead of a hash String. It breaks the fsspec/alluxiofs, this PR is used to fix the bugs.

@YichuanSun
Copy link
Collaborator Author

I have run the fixed alluxiofs with the Alluxio 3.3 locally, it passed the test_read_range.py. @LuQQiu could you please take a look?

@YichuanSun YichuanSun changed the title fix the bug after introducing ufs full path as page id. Fix the bug after introducing ufs full path as page id Jul 25, 2024
@LuQQiu
Copy link
Collaborator

LuQQiu commented Jul 25, 2024

@SibylYang please help review and see whether that will break backward compatibility? e.g. whether it can still work with AOS

@YichuanSun
Copy link
Collaborator Author

I believe that the "fake server" in test_fake_server.py is not compatible with modified code. But I'm not sure how to fix the fake server. By the way, test_fake_server.py is the only failed test.

@SibylYang
Copy link
Collaborator

SibylYang commented Jul 26, 2024

I tested it with alluxio open source cluster running locally. There's no problem with reading files. But there's an error when writing to files. Below code generated the error "io.UnsupportedOperation: not writable".

`
fsspec.register_implementation("alluxiofs", AlluxioFileSystem, clobber=True)
alluxio_fs = fsspec.filesystem("alluxiofs", etcd_hosts="localhost", etcd_port=2379, target_protocol='s3')

with alluxio_fs.open('s3://path/to/file', 'a') as f:
f.write("\n new line by alluxio\n")
`

@YichuanSun
Copy link
Collaborator Author

YichuanSun commented Jul 30, 2024

After using the code in #51 and my change in current PR, I test the write function manually with the latest alluxio os edition, I find the write works. Here is my test script, the same as yours above.

import sys
import fsspec
sys.path.append('.')
from alluxiofs import AlluxioClient, AlluxioFileSystem

if __name__ == "__main__":
    fsspec.register_implementation("alluxiofs", AlluxioFileSystem, clobber=True)
    alluxio_fs = fsspec.filesystem("alluxiofs", etcd_hosts="localhost", etcd_port=2379, target_protocol='s3')
    with alluxio_fs.open("s3://yichuan-test/testBucket/test.txt", 'a') as f:
        f.write("\n444 new line by alluxio\n")

After running the script and open the "s3://yichuan-test/testBucket/test.txt" in the S3 website, the content is as below.

Wow

 new line by alluxio

 new line by alluxio

 new line by alluxio

111 new line by alluxio

222 new line by alluxio

333 new line by alluxio

444 new line by alluxio

Maybe just any wrong configurations that cause the write error? FYI @LuQQiu @SibylYang

@SibylYang
Copy link
Collaborator

After using the code in #51 and my change in current PR, I test the write function manually with the latest alluxio os edition, I find the write works. Here is my test script, the same as yours above.

import sys
import fsspec
sys.path.append('.')
from alluxiofs import AlluxioClient, AlluxioFileSystem

if __name__ == "__main__":
    fsspec.register_implementation("alluxiofs", AlluxioFileSystem, clobber=True)
    alluxio_fs = fsspec.filesystem("alluxiofs", etcd_hosts="localhost", etcd_port=2379, target_protocol='s3')
    with alluxio_fs.open("s3://yichuan-test/testBucket/test.txt", 'a') as f:
        f.write("\n444 new line by alluxio\n")

After running the script and open the "s3://yichuan-test/testBucket/test.txt" in the S3 website, the content is as below.

Wow

 new line by alluxio

 new line by alluxio

 new line by alluxio

111 new line by alluxio

222 new line by alluxio

333 new line by alluxio

444 new line by alluxio

Maybe just any wrong configurations that cause the write error? FYI @LuQQiu @SibylYang

Right. If I used PR 51 --- the updated handler, then there's no problem with writing

@YichuanSun YichuanSun self-assigned this Jul 31, 2024
@YichuanSun YichuanSun added the bug Something isn't working label Jul 31, 2024
@YichuanSun YichuanSun requested a review from LuQQiu July 31, 2024 15:04
@LuQQiu
Copy link
Collaborator

LuQQiu commented Jul 31, 2024

@SibylYang Please help review the PR

@SibylYang
Copy link
Collaborator

This commit LGTM.
It won't break the alluxio open source, because in our alluxio java server, it process URL using something like
httpRequestUri.getParameters().get("offset")
So the added "ufsFullPath" in the URL will just be ignored by alluxio open source.

Copy link
Collaborator

@LuQQiu LuQQiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@LuQQiu LuQQiu merged commit 7f9641f into fsspec:main Jul 31, 2024
3 checks passed
@YichuanSun YichuanSun deleted the compatible-with-alluxio-3.3 branch August 1, 2024 02:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants