Introduce (global) default config file locking #1610

Closed
wants to merge 1 commit into from

Conversation

n1hility
Member

This PR is a follow on to containers/podman#19557

It adds a shared locking facility for containers.conf, notably to ensure concurrent connection definition updates are safely performed. It is intended to replace the backend-specific approach used in containers/podman#19557, as well as support other future usage patterns.

This PR also introduces a unit test that relies on a helper executable. Unfortunately, this is necessary since Linux/POSIX fcntl record locking behavior forcefully shares locks at the process level, so different file descriptors can not be used to simulate multi-process locking. Technically storage's impl of LockFile could be refactored to utilize Linux-specific open fd locks, but this would only address one platform (although Windows behaved similarly - locks are exclusive to a handle). Ultimately, I concluded consistency is better, plus "if it ain't broke, don't fix it".

Adds LockDefault() and UnlockDefault(), which lock an associated lock file mapping to the same default containers file that Write() selects. This allows multiple read-and-write operations to be executed atomically by serializing access from multiple updaters.

As an example, if two parallel updaters each insert a different entry into a map, one entry will likely be lost, overwritten by data from a stale read. Using these functions instead, callers can coordinate, ensuring writes are never interposed between another caller's read-update-write sequence.

@openshift-ci
Contributor

openshift-ci bot commented Aug 11, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: n1hility
Once this PR has been reviewed and has the lgtm label, please assign rhatdan for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Adds LockDefault() and UnlockDefault(), which lock an associated
lock file mapping to the same default containers file that Write()
selects. This allows multiple read-and-write operations to be
executed atomically by serializing access from multiple updaters.

As an example, if two parallel updaters each insert a different
entry into a map, one entry will likely be lost, overwritten by
data from a stale read. Using these functions instead, callers can
coordinate, ensuring writes are never interposed between another
caller's read-update-write sequence.

Signed-off-by: Jason T. Greene <[email protected]>
@rhatdan
Member

rhatdan commented Aug 11, 2023

LGTM
@vrothberg PTAL

Member

@vrothberg vrothberg left a comment


Can we merge this after #1599? #1599 cleans up the API a bit, and I believe we can integrate the Lockfile a bit more.

I would somewhat prefer it if *Config exposed the Lockfile interface. For instance, readers could then call RLock(). All Configs could share the same file lock.

I think this would also be faster, since we would only need to call getLockFile() once and could avoid the mutex.

Independent of this change: it's another case showing the issues we run into when (ab)using containers.conf as a database. I'd much prefer this to be part of the libpod DB. But that predates this PR by many years, so I am just complaining.

@n1hility
Member Author

Can we merge this after #1599? #1599 cleans up the API a bit and I believe we can integrate the Lockfile a bit more.

@vrothberg thanks for commenting. I wasn't aware of that PR. I need to dive into its design a bit more, but unless it's a read-only model, any multi-partial-file system that supports arbitrary updates is likely to introduce challenges for the atomicity model we have today, and may require locking for all operations. More specifically, today we can rely on inode atomicity, so reads and writes in isolation always refer to consistent state. Anyway, that's just an off-the-top-of-my-head comment; I'll take a look at your PR to see how much of this applies.

I somehow prefer if *Config would expose the Lockfile interface

The API design on reading precludes this (glancing at modules, this looks the same). Since a Config is created after reading the contents, it's too late to lock. There could be a variant that takes a filepath for the lock, but the overall API seems to go to great lengths to hide loading anything but the global default (reading a different file is not exported). Unlock could be moved to Config, much like Write(), but it would be asymmetrical, and it would mean you have to pass the config around even if it is not otherwise needed. That said, I'm happy to move Unlock if preferred.

For instance, readers can then call RLock() . All Configs could share the same file lock.

With our atomicity model today read locks aren't needed because the model is always consistent (inode swapping through renaming). Readers can happily read without locks, and only patterns which need to coordinate read-update-write have to care about locking, and those patterns will always need to acquire a write lock upfront before read anyway (since read-write upgrades, if even supported by the OS, can stall or deadlock). Introducing them would also mean making sure that all read operations have to go through them, and that can also lead to issues like readers starving writers and vice versa. Ideally, we could avoid introducing the complexity and problems they would bring.

Although, as I was getting at above, we might have to incorporate them with the introduction of modules, unless the design partitions them in a read-only way.

I think this would also be faster since we only need to call getLockFile() once and avoid the mutex.

So, based on the need to lock before the file is read, I don't think you can escape this cost for Lock(), although we could have combined operations like ReadCustomWithLock that fold it into one call. I agree, though, that moving Unlock() to Config would avoid it there. That said, I don't think it would be a huge saving, considering a mutex already has to be acquired for the lockfile itself, and the flock operations will be the long pole of the cost.

Independent of this change: it's another case showing the issues we run into when (ab)using containers.conf as a database. I'd much prefer this to be part of the libpod DB. But that predates this PR by many years, so I am just complaining.

Yeah, I hear you. Ideally we would partition read/write state from read-only state, since human flexibility (e.g. things like modules, includes, templates, and so on) usually collides head-on with update mechanisms.

@n1hility
Member Author

n1hility commented Aug 14, 2023

@vrothberg OK, so looking over the modules design, it seems there isn't a real conflict. IIUC (I didn't have a chance to try out the actual podman PR), modules are only used in a read-only fashion and are not brought into the picture when the various serialization operations are triggered (e.g. podman system connection *).

@vrothberg
Member

With our atomicity model today read locks aren't needed because the model is always consistent (inode swapping through renaming).

Very good point. It's not needed for readers.

Since Config is created after a reading the contents, it's too late to lock.

What I had in mind is to have one global lock that all Configs share. This lock can be acquired before loading (i.e., https://github.com/containers/common/pull/1599/files#diff-b29f7c9b7a5773852e12eee34a37a3035ddba0dbf6c043000b223f606428257aR76). This would allow for exposing Lock and Unlock as methods of Config and the lockfile would be created only once.

Apologies for being vague initially. You can totally ignore the new "modules" concept. I mostly prefer to block this PR here for #1599 because of the refactoring it does. I think that the call paths are now more centralized and it facilitates integrating locking.

@n1hility
Member Author

n1hility commented Aug 14, 2023

Since Config is created after a reading the contents, it's too late to lock.

What I had in mind is to have one global lock that all Configs share. This lock can be acquired before loading (i.e., https://github.com/containers/common/pull/1599/files#diff-b29f7c9b7a5773852e12eee34a37a3035ddba0dbf6c043000b223f606428257aR76). This would allow for exposing Lock and Unlock as methods of Config and the lockfile would be created only once.

Just to make sure I am following: right now we use ReadCustomConfig() to read before writing, which skips the Default() etc. processing. This returns a config after the FS operation. ATM the ability to read after a config is constructed is unexported:

func ReadCustomConfig() (*Config, error) {

With the PR as is this looks like

config.LockDefault()
cfg := config.ReadCustomConfig() // Reads from CONTAINERS_CONF location always
cfg.Write()
config.UnlockDefault()

Your thinking is that instead, we do something like

cfg.New() 
cfg.Lock() // locks CONTAINERS_CONF  file always
cfg.ReadCustom() // reads CONTAINERS_CONF  file always
cfg.Write() // writes CONTAINERS_CONF always
cfg.Unlock() 

Following this approach, maybe we could increase flexibility further and tie locking to the Config object's lifespan rather than a global variable's:

cfg.New(containersConf)
cfg.Lock() // locks filename value passed in new 
cfg.Read() //  reads filename value passed in new
cfg.Write()  // writes filename value passed in new
cfg.Unlock()

Complementary to this, or as an alternative to the above, this could be a multi-operation like what I was mentioning earlier:

cfg := config.ReadWithLock(containersConf)  // does New(), Lock() and Read()
cfg.Write()
cfg.Unlock()

An advantage of this chained option, besides less code at call sites, is that someone using a lock cannot accidentally Read() before calling Lock().

Apologies for being vague initially. You can totally ignore the new "modules" concept. I mostly prefer to block this PR here for #1599 because of the refactoring it does. I think that the call paths are now more centralized and it facilitates integrating locking.

Gotcha makes sense to me!

@openshift-merge-robot
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vrothberg
Member

Gotcha makes sense to me!

Thanks! I like the idea you shared on how to use (and extend) it.

@rhatdan
Member

rhatdan commented Oct 5, 2023

@n1hility What is going on with this PR?

@rhatdan
Member

rhatdan commented Dec 16, 2023

Since we have not heard anything on this PR for many months, closing. Reopen if you plan on working on it.

@rhatdan rhatdan closed this Dec 16, 2023