From cb325f99af01aa8b10bf060a95490246c3ce649a Mon Sep 17 00:00:00 2001
From: Benjamin Lamowski
Date: Thu, 19 Sep 2024 16:28:38 +0200
Subject: [PATCH] WIP: libc: sync write seek fixup proposal

This tries to solve the problems around multiple queued async writes
to the same file for the non-append case, but still needs confirmation
and testing.

The append case will work fine when async_write() is called from a
synchronous write operation (because that blocks until the write is
completed), but async writes to files opened with O_APPEND will likely
end up reordered or even interleaved.

POSIX.1-2024 2.8.2 Asynchronous I/O states:

"If O_APPEND is set for the file descriptor, or if aio_fildes is
associated with a device that is incapable of seeking, write
operations append to the file in the same order as the calls were
made, with the following exception: under implementation-defined
circumstances, such as operation on a multi-processor or when requests
of differing priorities are submitted at the same time, the ordering
restriction may be relaxed."

We could try to do a full write in the append case (just like with
O_NONBLOCK), but that would also change the normal, synchronous case
for only a very slight improvement: since the job registry reverses
the order of all incomplete jobs after each run under the hood, such a
reordering is even likely, and given POSIX's "relaxed" ordering, doing
async writes to O_APPEND files doesn't seem like a great idea anyway.

To recap: the current code might even end up with interleaved writes
for async writes to a file opened with O_APPEND. This could be
mitigated by attempting to write out each request in full, which would
lead to possibly reordered but at least not interleaved writes to
O_APPEND files. We currently don't do this because it would also
change the write approach for synchronous writes to O_APPEND files,
which is the much saner case, so it does not seem like a good
tradeoff.

Issue #5302
---
 repos/libports/src/lib/libc/vfs_plugin.cc | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/repos/libports/src/lib/libc/vfs_plugin.cc b/repos/libports/src/lib/libc/vfs_plugin.cc
index cd8b0d0ed61..2b045247f50 100644
--- a/repos/libports/src/lib/libc/vfs_plugin.cc
+++ b/repos/libports/src/lib/libc/vfs_plugin.cc
@@ -966,14 +966,12 @@ bool Libc::Vfs_plugin::async_write(File_descriptor *fd,
 
 	Result out_result = Result::WRITE_OK;
 
-	if (offset && write_state.first_run)
-		handle->seek(offset);
-
-	write_state.first_run = false;
-
 	if (fd->flags & O_NONBLOCK) {
 		Const_byte_range_ptr const src { (char const *)buf, count };
 
+		if (offset && !(fd->flags & O_APPEND))
+			handle->seek(offset);
+
 		out_result = handle->fs().write(handle, src, write_state.bytes_written);
 
 		if (out_result == Result::WRITE_OK)
@@ -1007,6 +1005,13 @@ bool Libc::Vfs_plugin::async_write(File_descriptor *fd,
 			Const_byte_range_ptr const src { (char const *)buf + write_state.bytes_written,
 			                                 remaining_count };
 
+			/*
+			 * Always seek to the write position in case async
+			 * writes are intertwined.
+			 */
+			if (!(fd->flags & O_APPEND))
+				handle->seek(offset + write_state.bytes_written);
+
 			out_result = handle->fs().write(handle, src, partial_out_count);
 
 			if (out_result == Result::WRITE_ERR_WOULD_BLOCK)
@@ -1041,7 +1046,6 @@ bool Libc::Vfs_plugin::async_write(File_descriptor *fd,
 	}
 
 	switch (out_result) {
-	/* Not reached */
 	case Result::WRITE_ERR_WOULD_BLOCK:
 		result_errno = EWOULDBLOCK;
 		retval = -1;
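
To make the interleaving hazard concrete, here is a toy, self-contained
C++ sketch. It is not Genode code: 'Job', 'partial_write' and
'partial_append' are made-up names, and the round-robin loop only
roughly mimics how incomplete jobs get retried. The positional variant
re-seeks to offset + bytes_written before every retry, as this patch
does, and stays correct even though the two jobs alternate; the append
variant lands every retry at the current end of file and interleaves.

#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>

/* hypothetical stand-in for a queued write with partial progress */
struct Job {
	std::string data;              /* payload to write            */
	std::size_t offset        = 0; /* intended file position      */
	std::size_t bytes_written = 0; /* progress, like write_state  */
};

int main()
{
	/* positional case: every retry re-seeks to offset + progress */
	{
		std::string file(16, '.');
		Job a { "AAAAAAAA", 0 }, b { "BBBBBBBB", 8 };

		auto partial_write = [&] (Job &j) {
			std::size_t const chunk =
				std::min<std::size_t>(3, j.data.size() - j.bytes_written);
			file.replace(j.offset + j.bytes_written, chunk,
			             j.data, j.bytes_written, chunk);
			j.bytes_written += chunk;
		};

		while (a.bytes_written < a.data.size() || b.bytes_written < b.data.size()) {
			if (a.bytes_written < a.data.size()) partial_write(a);
			if (b.bytes_written < b.data.size()) partial_write(b);
		}
		printf("positional: %s\n", file.c_str()); /* AAAAAAAABBBBBBBB */
	}

	/* append case: every retry lands at the current end of file */
	{
		std::string file;
		Job a { "AAAAAAAA" }, b { "BBBBBBBB" };

		auto partial_append = [&] (Job &j) {
			std::size_t const chunk =
				std::min<std::size_t>(3, j.data.size() - j.bytes_written);
			file += j.data.substr(j.bytes_written, chunk);
			j.bytes_written += chunk;
		};

		while (a.bytes_written < a.data.size() || b.bytes_written < b.data.size()) {
			if (a.bytes_written < a.data.size()) partial_append(a);
			if (b.bytes_written < b.data.size()) partial_append(b);
		}
		printf("append:     %s\n", file.c_str()); /* AAABBBAAABBBAABB */
	}
}

Built with any C++14 compiler, this prints "AAAAAAAABBBBBBBB" for the
positional case and "AAABBBAAABBBAABB" for the append case, matching
the reasoning above: per-retry seeking keeps non-append writes intact,
while O_APPEND writes remain prone to interleaving.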
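
Viewed from user level, the per-retry seek makes each attempt behave
like a positional write: retry k of a request effectively becomes a
pwrite(2) at offset + bytes_written, independent of where other jobs
left the handle's seek position. A rough analogue in plain POSIX
(write_fully_at is a made-up helper, not part of the libc plugin):

#include <cerrno>
#include <cstddef>
#include <unistd.h>

/* retry a positional write until the whole buffer is out or an
 * unrecoverable error occurs; each retry recomputes its position */
ssize_t write_fully_at(int fd, char const *buf, std::size_t count, off_t offset)
{
	std::size_t done = 0;
	while (done < count) {
		ssize_t const n = pwrite(fd, buf + done, count - done,
		                         offset + (off_t)done);
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry at same position */
			return -1;
		}
		done += (std::size_t)n;
	}
	return (ssize_t)done;
}

This analogy is also why the fixup deliberately skips the seek for
O_APPEND: there is no stable position to recompute, since appends land
at the end of file regardless of the handle's seek offset.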