Introduce (global) default config file locking #1610

n1hility · 2023-08-11T07:13:16Z

This PR is a follow on to containers/podman#19557

It adds a shared locking facility for containers.conf, notably to ensure concurrent connection definition updates are safely performed. It is intended to replace the backend-specific approach used in containers/podman#19557, as well as support other future usage patterns.

This PR also introduces a unit test that relies on a helper executable. Unfortunately, this is necessary since Linux/POSIX fcntl record locking behavior forcefully shares locks at the process level, so different file descriptors can not be used to simulate multi-process locking. Technically storage's impl of LockFile could be refactored to utilize Linux-specific open fd locks, but this would only address one platform (although Windows behaved similarly - locks are exclusive to a handle). Ultimately, I concluded consistency is better, plus "if it ain't broke, don't fix it".

Adds LockDefault() and UnlockDefault(), which lock an associate file mapping to the same default containers file that Write() selects. This functionality allows for multiple read-and-write operations to be atomically executed by serializing access from multiple updaters.

As an example, if two parallel updaters are inserting a different entry into a map, one will likely be omitted from being overwritten with data from a stale read. Instead, using these functions, callers can coordinate, ensuring writes are never interposed between another caller's read-update pattern.

openshift-ci · 2023-08-11T07:13:21Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: n1hility
Once this PR has been reviewed and has the lgtm label, please assign rhatdan for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Adds LockDefault() and UnlockDefault() which lock an associate file mapping to the same default containers file that Write() selects. This functionality allows for multiple read-and-write operations to be atomically executed by serializing access from multiple updaters. As an example, if two parallel updaters are inserting a different entry into a map, one will likely be omitted from being overwritten with data from a stale read. Instead, using these functions, callers can coordinate, ensuring writes are never interposed between another caller's read-update pattern. Signed-off-by: Jason T. Greene <[email protected]>

rhatdan · 2023-08-11T08:39:04Z

LGTM
@vrothberg PTAL

vrothberg

Can we merge this after #1599? #1599 cleans up the API a bit and I believe we can integrate the Lockfile a bit more.

I somehow prefer if *Config would expose the Lockfile interface. For instance, readers can then call RLock() . All Configs could share the same file lock.

I think this would also be faster since we only need to call getLockFile() once and avoid the mutex.

Independent of this change: it's another case showing the issues we run into when (ab)using containers.conf as a database. I'd much prefer this to be part of the libpod DB. But that predates this PR by many years, so I am just complaining.

n1hility · 2023-08-11T16:03:27Z

Can we merge this after #1599? #1599 cleans up the API a bit and I believe we can integrate the Lockfile a bit more.

@vrothberg thanks for commenting. I wasn't aware of that PR. I need to dive into the design of it a bit more but, unless its a read-only model, any multi partial-file system that supports arbitrary updates is likely to introduce challenges with our existing model of atomicity we have today, as well as potentially require locking for all operations. More specifically I mean that today we are able to rely on inode atomicity so reads and writes in isolation all refer to state that is consistent. Anyway just an off the top of head comment, I'll take a look at your PR to see how much of this applies.

I somehow prefer if *Config would expose the Lockfile interface

The API design on reading precludes this (glancing at modules this looks the same). Since Config is created after a reading the contents, it's too late to lock. There could be a variant to take a filepath for the lock, but the overall API seems to go to great lengths to hide loading anything but the global default (reading a different file is not exported). Unlock could be moved to Config, much like Write(), but it would be asymmetrical, and it would mean you have to pass the config around, even if it not needed. Although happy to move Unlock if preferred.

For instance, readers can then call RLock() . All Configs could share the same file lock.

With our atomicity model today read locks aren't needed because the model is always consistent (inode swapping through renaming). Readers can happily read without locks, and only patterns which need to coordinate read-update-write have to care about locking, and those patterns will always need to acquire a write lock upfront before read anyway (since read-write upgrades, if even supported by the OS, can stall or deadlock). Introducing them would also mean making sure that all read operations have to go through them, and that can also lead to issues like readers starving writers and vice versa. Ideally, we could avoid introducing the complexity and problems they would bring.

Although as I was getting at above, we might have to incorporate them with the introduction modules, unless the design partitions them in a read-only way.

I think this would also be faster since we only need to call getLockFile() once and avoid the mutex.

So based on the need to lock before the file is read I don't think you can escape this cost for Lock(), although we could have combo operations like ReadCustomWithLock that would fold it into 1. I agree though that moving Unlock() to config would avoid it there. Although I don't think it would be a huge savings considering they already have to be aquired for lockfile itself and the flock operations will be the longer pole of the cost.

Independent of this change: it's another case showing the issues we run into when (ab)using containers.conf as a database. I'd much prefer this to be part of the libpod DB. But that predates this PR by many years, so I am just complaining.

Yeah I hear you. Ideally we would partition read/write state from read-only, since human flexibility (e.g. things like modules, includes,templates and so on) usually collides head-on with update mechanisms

n1hility · 2023-08-14T03:55:11Z

@vrothberg OK so looking over the module's design it seems like there isn't a real conflict. IIUC (didnt have chance to try out the actual podman PR), it looks like modules are only used in a read-only fashion, and not brought into the picture when the various serialization operations are triggered (e.g. podman system connection *).

vrothberg · 2023-08-14T07:08:49Z

With our atomicity model today read locks aren't needed because the model is always consistent (inode swapping through renaming).

Very good point. It's not needed for readers.

Since Config is created after a reading the contents, it's too late to lock.

What I had in mind is to have one global lock that all Configs share. This lock can be acquired before loading (i.e., https://github.com/containers/common/pull/1599/files#diff-b29f7c9b7a5773852e12eee34a37a3035ddba0dbf6c043000b223f606428257aR76). This would allow for exposing Lock and Unlock as methods of Config and the lockfile would be created only once.

Apologies for being vague initially. You can totally ignore the new "modules" concept. I mostly prefer to block this PR here for #1599 because of the refactoring it does. I think that the call paths are now more centralized and it facilitates integrating locking.

n1hility · 2023-08-14T16:44:20Z

Since Config is created after a reading the contents, it's too late to lock.

What I had in mind is to have one global lock that all Configs share. This lock can be acquired before loading (i.e., https://github.com/containers/common/pull/1599/files#diff-b29f7c9b7a5773852e12eee34a37a3035ddba0dbf6c043000b223f606428257aR76). This would allow for exposing Lock and Unlock as methods of Config and the lockfile would be created only once.

Just to make sure I am following. Right now we use ReadCustomData() to read before writing which skips the Default() etc processing. This returns a config after the FS operation. ATM the ability to read after a config is constructed unexported:

common/pkg/config/config.go

Line 1047 in 671ae57

func ReadCustomConfig() (*Config, error) {

With the PR as is this looks like

config.LockDefault()
cfg := config.ReadCustomData() // Reads from CONTAINERS_CONF location always
cfg.Write()
config.UnlockDefault()

Your thinking is that instead, we do something like

cfg.New() 
cfg.Lock() // locks CONTAINERS_CONF  file always
cfg.ReadCustom() // reads CONTAINERS_CONF  file always
cfg.Write() // write COONTAINERS_CONF always
cfg.Unlock()

Following this approach, maybe it could increase flexibility further and better track the Config obj lifespan, instead of a global var lifespan:

cfg.New(containersConf)
cfg.Lock() // locks filename value passed in new 
cfg.Read() //  reads filename value passed in new
cfg.Write()  // writes filename value passed in new
cfg.Unlock()

Complementary to this or as an alternative to the above this could be a multi-operation like what i was mentioning earlier:

cfg := config.ReadWithLock(containersConf)  // does New(), Lock() and Read()
cfg.Write()
cfg.Unlock()

An advantage of this chain option, in addition to less calling code, is that if someone is using a lock, they aren't going to accidentally Read() before calling Lock()

Apologies for being vague initially. You can totally ignore the new "modules" concept. I mostly prefer to block this PR here for #1599 because of the refactoring it does. I think that the call paths are now more centralized and it facilitates integrating locking.

Gotcha makes sense to me!

openshift-merge-robot · 2023-08-15T17:11:12Z

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

vrothberg · 2023-08-16T07:15:27Z

Gotcha makes sense to me!

Thanks! I like the idea you shared on how to use (and extend) it

rhatdan · 2023-10-05T12:42:37Z

@n1hility What is going on with this PR?

rhatdan · 2023-12-16T14:40:30Z

Since we have not heard anything on this PR for many month, closing. Reopen if you plan on working on it.

n1hility mentioned this pull request Aug 11, 2023

Implement automatic port reassignment on Windows containers/podman#19557

Merged

n1hility force-pushed the config-locking branch from 3c8c025 to c176a0b Compare August 11, 2023 07:20

n1hility force-pushed the config-locking branch from c176a0b to 4b0bff6 Compare August 11, 2023 07:42

vrothberg reviewed Aug 11, 2023

View reviewed changes

openshift-merge-robot added the needs-rebase label Aug 15, 2023

rhatdan closed this Dec 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce (global) default config file locking #1610

Introduce (global) default config file locking #1610

n1hility commented Aug 11, 2023

openshift-ci bot commented Aug 11, 2023

rhatdan commented Aug 11, 2023

vrothberg left a comment

n1hility commented Aug 11, 2023

n1hility commented Aug 14, 2023 •

edited

Loading

vrothberg commented Aug 14, 2023

n1hility commented Aug 14, 2023 •

edited

Loading

openshift-merge-robot commented Aug 15, 2023

vrothberg commented Aug 16, 2023

rhatdan commented Oct 5, 2023

rhatdan commented Dec 16, 2023

Introduce (global) default config file locking #1610

Introduce (global) default config file locking #1610

Conversation

n1hility commented Aug 11, 2023

openshift-ci bot commented Aug 11, 2023

rhatdan commented Aug 11, 2023

vrothberg left a comment

Choose a reason for hiding this comment

n1hility commented Aug 11, 2023

n1hility commented Aug 14, 2023 • edited Loading

vrothberg commented Aug 14, 2023

n1hility commented Aug 14, 2023 • edited Loading

openshift-merge-robot commented Aug 15, 2023

vrothberg commented Aug 16, 2023

rhatdan commented Oct 5, 2023

rhatdan commented Dec 16, 2023

n1hility commented Aug 14, 2023 •

edited

Loading

n1hility commented Aug 14, 2023 •

edited

Loading