Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-127921: Add RWF_ATOMIC constant to os module #127922

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

rruuaanng
Copy link
Contributor

@rruuaanng rruuaanng commented Dec 13, 2024

@rruuaanng rruuaanng marked this pull request as ready for review December 13, 2024 13:59
Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to support this yet. There are very few people using RWF_ATOMIC, and as far as I can tell, it's not even documented by Linux yet. I'd like to wait until there's some actual usage before we throw around support for new things, because there is a maintenance burden that comes with it.

Furthermore, I'm not sure this flag would be useful in the first place. The GIL protects atomicity for most things, and the better option would be to use a mutex for the free-threaded build.

@rruuaanng
Copy link
Contributor Author

and as far as I can tell, it's not even documented by Linux yet

Usually the operating system documentation does not list all functions, but only lists the best practices for common operations. It is different from the documentation of the programming language.
(In fact, this is related to the community environment. Most people on the mailing list don't like to write documents, and there are a lot of missing documents in Linux.)

Furthermore, I'm not sure this flag would be useful in the first place. The GIL protects atomicity for most things, and the better option would be to use a mutex for the free-threaded build.

But the GIL cannot protect the pwritev system call, right? Maybe it should be called transactional atomicity.

@ZeroIntensity
Copy link
Member

Looking closer at the emails, RWF_ATOMIC is not for thread safety, it's for torn writes:

Torn write prevention means that for a power or any other HW failure, all
or none of the data will be committed to storage, but never a mix of old
and new.

So the GIL and free-threading are actually irrelevant here, my bad. The function itself is thread-safe.

If Linux decided that they don't have to document things (which seems lazy, but sure), then we should have tests for this, or at least investigate the implications of it. Are we sure that Python will be able to properly take advantage of RWF_ATOMIC?

@rruuaanng
Copy link
Contributor Author

Are we sure that Python will be able to properly take advantage of RWF_ATOMIC?

Some mini databases may use it as the underlying write mechanism because database systems often need to ensure transactional consistency. They do not want data to be written incompletely, where only part of the data is written to disk and the rest is lost.

@ZeroIntensity
Copy link
Member

What databases? The grep I sent earlier showed very little usage of it.

FWIW, I'm not opposed to adding this since it's not related to thread safety, but I would like you to experiment with it before we just support new (and possibly buggy) flags.

@rruuaanng
Copy link
Contributor Author

What databases?

I mean, for example (if it were me, I would choose it)

but I would like you to experiment with it before we just support new (and possibly buggy) flags.

Actually, this is difficult to test. I need to make the system crash during its writing process, but that's not something I can control :( So, I have provide the documentation of the patch it offers, as shown below.

+# The main selling point of atomic writes is that it is guaranteed writes
+# to storage will not be torn for a power failure or kernel crash.

Here, "torn" refers to an interruption caused by some external failure during the writing process. For example, a system crash occurring while a system call is performing I/O. In the case of non-atomic writes, data blocks that were successfully written may remain in the file, but this is not what the user intends. In the case of atomic writes, when the system crashes, the successfully written portions are automatically removed.
That say, when the write operation is successful, the entire data will be written. If the operation fails, no data will be written.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants