question: can the ProxyFS FUSE be mounted on multiple machines at the same time? #223
Comments
Hi @segator - nice link to S3QL. While I think this discussion would probably be of interest to those following the Slack Workspace, let me see if I can address it here. For how to join that Slack Workspace, see the README.md of this GitHub repo.

I've looked at S3QL. The intention of ProxyFS is to go a bit beyond this use case. Consider, for instance, that adoption of Object/RESTful storage is often a journey for many workflows. As such, there is at least some time during which sharing of data between file access modes (be those a local file system or a remotely mounted file system) and Object/RESTful APIs would be useful. That was really the impetus for the creation of ProxyFS. You are quite right, however, to point out that today the file system is only served by a single node... so the effect is somewhat the same limitation as for S3QL.

The Swift architecture is such that middleware can be inserted in the WSGI processing stack at the Proxy tier. Being otherwise stateless, the Swift Proxy nodes scale out very nicely. Indeed, many Swift clusters have a Swift Proxy instance on every single storage node, with a load balancing solution to distribute traffic among them. In this way, total bandwidth to Swift scales nicely. But for ProxyFS, this would seemingly not be possible. The way we regain scale-out benefits is by use of middleware (see the pfs_middleware subdirectory) that resides in every Swift Proxy instance and contacts the ProxyFS node only to exchange metadata. So while metadata operations may bottleneck through this single ProxyFS node, read and write data goes directly from the Swift Proxy instance to the Object Servers. For typically sized objects, the overhead of consulting the lone ProxyFS instance is not noticeable and does not preclude true scale-out performance gains.

Your question specifically about FUSE exposure is interesting. At this point, NFS presentation of volumes is by means of a FUSE (local) mount on the ProxyFS node that NFSd is told, via /etc/exports, to export. For SMB, things are a bit more abstracted. ProxyFS has a jrpcclient/VFS plug-in to Samba that communicates with the ProxyFS instance. Today, this is one-to-one. Thus, as for NFS, SMB shares are only visible from a single node at a time. But work is well underway to remove this limitation. The goal is to instantiate any number of Samba instances all talking to ProxyFS just as the pfs_middleware enables for Object/RESTful APIs. Indeed, that architecture will one day be used by the very similar FSAL mechanism of nfs-ganesha... bypassing the need to pass through FUSE on the ProxyFS node. The important thing to note is that metadata operations will continue to funnel through the lone ProxyFS instance (imminently HA), while reads and writes of file data will scale out just as for the pfs_middleware.

I hope this answers the core of your question as to how ProxyFS differentiates from S3QL. Please join us in the Slack Workspace... we'd love to have you.
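To illustrate the NFS path described above: the ProxyFS volume is FUSE-mounted locally on the ProxyFS node, and NFSd then re-exports that mount via /etc/exports. Here is a minimal sketch of such an entry; the mount point path and client subnet are hypothetical, not taken from the project.

```
# Hypothetical /etc/exports entry on the ProxyFS node.
# The ProxyFS volume is FUSE-mounted locally at an illustrative path,
# and NFSd re-exports it to clients on an example subnet.
/mnt/CommonVolume 192.168.0.0/24(rw,sync,no_subtree_check)
```

Note that NFS clients mounting this export all reach the volume through that single ProxyFS node, which is exactly the single-node limitation discussed above.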
First of all, thank you for your fast answer!! S3QL is compatible with multiple backends, not only S3: also Swift and other custom implementations like Google Drive, Backblaze, OVH, and so on, and the developer has a very simple interface for adding new backends. I don't have any doubt that your architectural solution is better; you solved the small-file problem with the segment concept and the segment garbage collector, and it sounds pretty awesome to me. But I'm not sure I got the answer I was looking for: is the ProxyFS FUSE mount a single client connected to an HA service that controls all POSIX metadata and locks? Sorry, I'm pretty new to the OpenStack and Swift world; I'm only trying to understand the system from an architectural point of view.
I attempted to answer the above questions in the Slack #general channel. It's probably easiest to continue the discussion there...
I thought the OP might be interested in a new piece of functionality that is in what I'd call "alpha" form in 1.11.0 (the latest stable tag): PFSAgent. This is a tool that actually runs wherever you have both access to a Swift API cluster enabled with ProxyFS and the ability to use Bazil FUSE. It provides a local FUSE mount point to a ProxyFS Volume, complete with local read and write caching.

Warning: it's "alpha" mostly because the yet-to-be-completed "Lease Management" feature is non-functional and, hence, does not ensure coherency among multiple instances of PFSAgent (nor with users of either NFS or SMB mounting directly from the ProxyFS node). That's coming soon.

You can see how I run it by looking at the .conf file in the pfsagentd/ subdirectory. You basically just provide it Swift credentials and set up some tunable cache control values. PFSAgent uses a new HTTP Method called "PROXYFS" that simply exposes the internal JSON RPC mechanism used by the pfs_middleware and Samba protocol layers to talk to ProxyFS. Swift performs the authorization and requires a setting in the [filter:pfs] section of your proxy-server.conf file(s)...
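To make the middleware arrangement above concrete, here is a rough sketch of how pfs_middleware might sit in the Swift Proxy WSGI pipeline. The pipeline placement and the [filter:pfs] section name come from the comments above; the specific option names (egg entry point, ProxyFS host/port) are assumptions for illustration only, so check the pfs_middleware documentation for the actual settings.

```
# Sketch of a Swift proxy-server.conf with pfs_middleware inserted
# into the WSGI pipeline. Option names below are illustrative, not
# verified against the pfs_middleware source.
[pipeline:main]
pipeline = catch_errors proxy-logging cache pfs proxy-server

[filter:pfs]
# The middleware only needs to know how to reach the ProxyFS node
# for metadata RPCs; file data flows between the Swift Proxy and
# the Object Servers directly.
use = egg:pfs_middleware#pfs
proxyfsd_host = 127.0.0.1
proxyfsd_port = 12345
```

The design point this sketch is meant to show: because each Swift Proxy instance carries its own copy of the middleware, only metadata traffic funnels to the single ProxyFS node, while read/write bandwidth scales out with the number of Proxy instances.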
As far as I understand, ProxyFS is very similar to https://github.com/s3ql/s3ql,
but it adds the "segment" concept, which is pretty nice. What worries me is that I'm not sure whether you are forced to have only a single FUSE mount at a time, so that only one machine can have the whole ProxyFS volume mounted (yes, you can use NFS and the API, but all the traffic then passes through this machine).
Could you explain whether this is correct, or whether you are able to mount it on multiple machines at the same time so that no traffic passes through a single machine?
Regards,