Trouble with NFS and ACL #454
-
The posix filesystem backend requires xattr support. This is used to store object/bucket metadata such as S3 ACLs, tags, etag, etc. NFS has only fairly recently added xattr support. See this article: https://lwn.net/Articles/799185/. I haven't been able to try this out yet, but I think this requires Linux kernel 5.9 or newer, and "user_xattr,vers=4.2" client mount options. The ACL references in the errors are related to storing the S3 bucket ACL information on the directory in the xattr "user.acl", and are not related to NFS ACLs.
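A quick way to check this outside the gateway is to try writing a user xattr on the share directly. The following is only a minimal sketch, assuming Linux and Python 3; the function name and the attribute name "user.xattr_probe" are made up for this probe and are not what versitygw itself uses. It creates a scratch directory, sets and reads back one attribute, and treats "operation not supported" as "no xattr support".

```python
import errno
import os
import tempfile


def supports_user_xattr(directory: str) -> bool:
    """Return True if a user.* xattr can be set and read back on a
    scratch directory created under `directory` (Linux only)."""
    name = "user.xattr_probe"  # hypothetical attribute name, probe only
    value = b"1"
    probe_dir = tempfile.mkdtemp(dir=directory)
    try:
        os.setxattr(probe_dir, name, value)
        return os.getxattr(probe_dir, name) == value
    except OSError as exc:
        # "Operation not supported" means the filesystem (or the mount)
        # does not offer user xattrs; any other error is re-raised.
        if exc.errno in (errno.ENOTSUP, errno.EOPNOTSUPP):
            return False
        raise
    finally:
        os.rmdir(probe_dir)  # always clean up the scratch directory
```

Calling supports_user_xattr() with the path where the NFS share is mounted (inside the container, if running under Kubernetes) should tell you whether the mount as seen by the gateway accepts user xattrs.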
-
I tested this on an EL8 system, and this version of NFS appears to work successfully.
host info:
The important part is the xattr support:
-
#459 adds a check that gives better error messages when xattr is not supported by the backend filesystem.
-
Thank you so much for testing this. I was troubleshooting all day, even considering that I might have to upgrade the cluster nodes (currently we are running Rocky 8 on those machines). The good news is that we tweaked some server settings and found a configuration that solved the ACL problem. At least it looks that way: I don't see those errors anymore. The bad news is that file uploads are still unreliable (using rclone). Sometimes it works, and then suddenly it doesn't. I need to do more testing on this one, because I can't rule out I/O issues yet. I'll keep you updated (if you want) ...
-
This is not an rclone issue or a vgw problem. I can confirm low-level I/O problems with NFS 4.1 and NFS 4.2. NFS 3 and 4.0 look good though. I don't know when we can solve this, but for now this limits my options significantly. Without NFS 4.2 I won't be able to provide a filesystem with extended attributes.
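When comparing NFS versions like this, it can help to confirm which version a mount actually ended up with, since the requested option is not always the one in effect. The sketch below is an illustration only, assuming Linux, where /proc/mounts lists each mount as "device mountpoint fstype options ..." and NFS mounts carry a vers= option. The function name, server name and paths are hypothetical.

```python
from typing import Optional


def nfs_version_for(mount_point: str, mounts_text: str) -> Optional[str]:
    """Return the vers= option of the NFS mount at mount_point, or None
    if there is no such NFS mount or it carries no vers= option."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        _device, target, fstype, options = fields[:4]
        if target == mount_point and fstype in ("nfs", "nfs4"):
            for option in options.split(","):
                if option.startswith("vers="):
                    return option[len("vers="):]
    return None


# Hypothetical sample in /proc/mounts format; server and paths are made up.
sample = "nfs.example:/export /mnt/share nfs4 rw,relatime,vers=4.2,rsize=1048576 0 0\n"
print(nfs_version_for("/mnt/share", sample))  # prints 4.2
```

On a real host, the second argument would be the contents of /proc/mounts, for example open("/proc/mounts").read().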
-
Since this is not a vgw problem, I am closing this topic. Thank you so much for the help!
-
I've set up versitygw (v0.16) in a Kubernetes cluster, configured to use an NFS share as the posix backend. At a high level it looks good. However, when I execute simple S3 commands with rclone, such as creating a bucket, the command partly succeeds and partly fails: the folder is created (in the filesystem, on the NFS share), but there are errors in the logs:
As I said, the directory is created nonetheless. But uploading a file goes badly: rclone seems to hang for quite a while, and I can see lots of errors in the logs:
These are the NFS options being used (server default options):
I could also successfully verify that ACLs are working, using nfs4_setfacl and nfs4_getfacl. So I guess the NFS side should be fine and the requirement "The filesystem must have the ability to store extended attributes" from the documentation should be satisfied. Additionally, I played around with different NFS versions (3, 4, 4.2), with no change whatsoever.
Is there anything I can do to make this work? Do you have any ideas of things I could try?
I admit: I don't have a lot of experience with ACLs, so I'm not exactly sure where to look for the root cause. This might as well be a Kubernetes volume mount issue or a root/non-root permission problem (even if the error states "operation not supported"). I'm running out of ideas.
I will try to do another test (soon), using the same NFS share but without Kubernetes. Maybe this helps to find the problem ...