Add support for CernVM-FS graph driver #383

Closed
jwflory opened this issue Jul 11, 2019 · 13 comments

@jwflory

jwflory commented Jul 11, 2019

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description

CernVM-FS (cvmfs) is a read-only, distributed filesystem with use cases specific to research / HPC computing. There is a CernVM-FS graph driver for Docker that enables Docker to use cvmfs as a storage back-end.

This is desirable for a number of reasons. Advanced computing environments with parallel file systems (see containers/podman#3478) pose a number of challenges for user namespaces. Using cvmfs as a storage back-end for containers works around these challenges and also allows administrators of these environments to store and distribute container images quickly.

A user story:

Systems Administrator Joe wants to put container image artifacts on a cheaper storage tier than GPFS (containers/podman#3478) and chooses CernVM-FS. Would container storage support for cvmfs make sense? Red Hat Quay supports BitTorrent. Would cvmfs make sense for Red Hat Quay, or should a Quay → cvmfs bridge such as DUCC be used?

Steps to reproduce the issue:

  1. podman --storage-driver=cvmfs pull registry.fedoraproject.org/fedora:latest

Additional environment details (AWS, VirtualBox, physical, etc.):

This use case is, as far as I am aware, exclusive to HPC / grid computing administrators for enabling rootless container support in parallel filesystem environments.

@rhatdan
Member

rhatdan commented Jul 11, 2019

We would need a PR from someone with access to CernVM-FS. I am not sure whether Fedora or Ubuntu has native support for it.

@rhatdan
Member

rhatdan commented Nov 1, 2019

@jwflory Is this still something you want to pursue?

@jwflory
Author

jwflory commented Nov 2, 2019

@rhatdan Unfortunately it isn't something I'm able to work on right now, sorry!

@rhatdan
Member

rhatdan commented Nov 3, 2019

@nalind is this something we could handle?

@nalind
Member

nalind commented Nov 5, 2019

I'd have to do more research into it to be certain.

@SEJeff

SEJeff commented Nov 7, 2019

If it helps, we've got an OK from cvmfs upstream to package it for Fedora and maybe get it into EPEL. I won't be able to work on that for a month, or more likely two, but I do plan on doing it. It is mostly just some C code and libcurl.

@rhatdan
Member

rhatdan commented Nov 7, 2019

Sounds cool. If anyone else wants to jump on this in the meantime, it would be appreciated.

@siscia

siscia commented Dec 25, 2019

Here from the cvmfs team.

I am quite interested in making this happen, and our friends in #498 seem to be interested as well.

The laboratory is closed right now, but in the second week of January I will definitely take a closer look at this.

In the meantime, happy holidays everybody!

@rhatdan
Member

rhatdan commented Aug 3, 2020

@jwflory @siscia Is there any chance of this moving forward?

@siscia

siscia commented Aug 3, 2020

I am actually on vacation!

I am setting a reminder for next week, when I come back to work.

@siscia

siscia commented Aug 13, 2020

I am now having a quick look at the interface for the layers, which I am guessing is this one (

storage/drivers/driver.go

Lines 68 to 156 in 4176c81

// ProtoDriver defines the basic capabilities of a driver.
// This interface exists solely to be a minimum set of methods
// for client code which choose not to implement the entire Driver
// interface and use the NaiveDiffDriver wrapper constructor.
//
// Use of ProtoDriver directly by client code is not recommended.
type ProtoDriver interface {
	// String returns a string representation of this driver.
	String() string
	// CreateReadWrite creates a new, empty filesystem layer that is ready
	// to be used as the storage for a container. Additional options can
	// be passed in opts. parent may be "" and opts may be nil.
	CreateReadWrite(id, parent string, opts *CreateOpts) error
	// Create creates a new, empty, filesystem layer with the
	// specified id and parent and options passed in opts. Parent
	// may be "" and opts may be nil.
	Create(id, parent string, opts *CreateOpts) error
	// CreateFromTemplate creates a new filesystem layer with the specified id
	// and parent, with contents identical to the specified template layer.
	CreateFromTemplate(id, template string, templateIDMappings *idtools.IDMappings, parent string, parentIDMappings *idtools.IDMappings, opts *CreateOpts, readWrite bool) error
	// Remove attempts to remove the filesystem layer with this id.
	Remove(id string) error
	// Get returns the mountpoint for the layered filesystem referred
	// to by this id. You can optionally specify a mountLabel or "".
	// Optionally it gets the mappings used to create the layer.
	// Returns the absolute path to the mounted layered filesystem.
	Get(id string, options MountOpts) (dir string, err error)
	// Put releases the system resources for the specified id,
	// e.g, unmounting layered filesystem.
	Put(id string) error
	// Exists returns whether a filesystem layer with the specified
	// ID exists on this driver.
	Exists(id string) bool
	// Status returns a set of key-value pairs which give low
	// level diagnostic status about this driver.
	Status() [][2]string
	// Returns a set of key-value pairs which give low level information
	// about the image/container driver is managing.
	Metadata(id string) (map[string]string, error)
	// Cleanup performs necessary tasks to release resources
	// held by the driver, e.g., unmounting all layered filesystems
	// known to this driver.
	Cleanup() error
	// AdditionalImageStores returns additional image stores supported by the driver
	AdditionalImageStores() []string
}

// DiffDriver is the interface to use to implement graph diffs
type DiffDriver interface {
	// Diff produces an archive of the changes between the specified
	// layer and its parent layer which may be "".
	Diff(id string, idMappings *idtools.IDMappings, parent string, parentIDMappings *idtools.IDMappings, mountLabel string) (io.ReadCloser, error)
	// Changes produces a list of changes between the specified layer
	// and its parent layer. If parent is "", then all changes will be ADD changes.
	Changes(id string, idMappings *idtools.IDMappings, parent string, parentIDMappings *idtools.IDMappings, mountLabel string) ([]archive.Change, error)
	// ApplyDiff extracts the changeset from the given diff into the
	// layer with the specified id and parent, returning the size of the
	// new layer in bytes.
	// The io.Reader must be an uncompressed stream.
	ApplyDiff(id string, parent string, options ApplyDiffOpts) (size int64, err error)
	// DiffSize calculates the changes between the specified id
	// and its parent and returns the size in bytes of the changes
	// relative to its base filesystem directory.
	DiffSize(id string, idMappings *idtools.IDMappings, parent string, parentIDMappings *idtools.IDMappings, mountLabel string) (size int64, err error)
}

// LayerIDMapUpdater is the interface that implements ID map changes for layers.
type LayerIDMapUpdater interface {
	// UpdateLayerIDMap walks the layer's filesystem tree, changing the ownership
	// information using the toContainer and toHost mappings, using them to replace
	// on-disk owner UIDs and GIDs which are "host" values in the first map with
	// UIDs and GIDs for "host" values from the second map which correspond to the
	// same "container" IDs. This method should only be called after a layer is
	// first created and populated, and before it is mounted, as other changes made
	// relative to a parent layer, but before this method is called, may be discarded
	// by Diff().
	UpdateLayerIDMap(id string, toContainer, toHost *idtools.IDMappings, mountLabel string) error
	// SupportsShifting tells whether the driver support shifting of the UIDs/GIDs in a
	// image and it is not required to Chown the files when running in an user namespace.
	SupportsShifting() bool
}

// Driver is the interface for layered/snapshot file system drivers.
type Driver interface {
	ProtoDriver
	DiffDriver
	LayerIDMapUpdater
}
)

It does not seem like impossible work. I still have some open questions, but with enough effort maybe we can make this happen.

That said, we should understand who is interested in it and how many resources can be put into it.

Maybe it would be better to focus our efforts on the use of additional storage that allows layers to be mounted from cvmfs, without the need for a custom driver. Our GSoC project to enhance DUCC to support podman is in reasonably good shape.

Related work is being done by our friend here: #644

If we can find a way to coordinate, that would be great.

@jwflory feel free to reach out to me directly so that we can better discuss your needs.

@jwflory
Author

jwflory commented Aug 13, 2020

Hi, I worked on this issue at a previous employer and no longer work with the CernVM-FS stack or in an HPC environment. Other points of contact for follow-up might be @SEJeff, @LandonTClipp, or @lpezzaglia.

@rhatdan
Member

rhatdan commented Jun 14, 2021

Since there has been no work on this for a while, I am going to close for lack of interest.

rhatdan closed this as completed Jun 14, 2021