Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use O_APPEND and FILE_APPEND_DATA to allow for atomic file appends #55465

Closed
wants to merge 7 commits into from

Conversation

adamsitnik
Copy link
Member

@adamsitnik adamsitnik commented Jul 10, 2021

So far, our FileMode.Append implementation was always seeking to the end of file, saving it as the current position and later using this position for writes. It was assuming that nobody else can be writing to the same file (which does not need to always be the true).

Moreover, none of the FileStream ctors that accept handle accepts a FileMode. This information was missing, and FileStream created from handle opened with O_APPEND or FILE_APPEND_DATA could not recognize that it's supposed to append to the end of file.

By using O_APPEND and FILE_APPEND_DATA we can force the OS to perform atomic writes to the end of file for us.

This can be combined with disabled buffering (bufferSize == 0) and allow for atomic appends to file, which could be very useful for some loggers like serilog/serilog-sinks-file#221

fixes #53432

adamsitnik and others added 3 commits July 10, 2021 21:26
# Conflicts:
#	src/libraries/System.IO.FileSystem/tests/Net5CompatTests/System.IO.FileSystem.Net5Compat.Tests.csproj
@ghost
Copy link

ghost commented Jul 10, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

So far, our FileMode.Append implementation was always seeking to the end of file, saving it as the current position and later using this position for writes. It was assuming that nobody else can be writing to the same file (which does not need to always be the truth).

Moreover, none of the FileStream ctors that accept handle accepts a FileMode. This information was missing, and FileStream created from handle opened with O_APPEND or FILE_APPEND_DATA could not recognize that it's supposed to append to the end of file.

By using O_APPEND and FILE_APPEND_DATA we can force the OS to perform atomic writes to the end of file for us.

This can be combined with disabled buffering (bufferSize == 0) and allow for atomic appends to file, which could be very useful for some loggers like serilog/serilog-sinks-file#221

fixes #53432

Author: adamsitnik
Assignees: -
Labels:

area-System.IO

Milestone: -


// POSIX requires that pwrite should respect provided offset even for handles opened with O_APPEND.
// But Linux and BSD don't do that, moreover their behaviour is different. So we always use write for O_APPEND.
int result = handle.CanSeek && !handle.IsAppend ?
Copy link
Member

@stephentoub stephentoub Jul 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this add yet another syscall in some cases (in particular for a new SafeFileHandle(fd, ...)-constructed handle? Is that really worth it?

WriteAsyncCore(source, cancellationToken).AsValueTask();
#pragma warning restore CA2012

private ValueTask<int> WriteAsyncCore(ReadOnlyMemory<byte> source, CancellationToken cancellationToken)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does RandomAccess.WriteAtOffsetAsync guarantee to write out everything it was given, or might it write less? If the latter, I'm realizing now that both this and the UnixFileStreamStrategy are potentially broken, in that their WriteAsync methods could then end up doing only a partial write even though the stream contract guarantees everything will be written.


public override long Position
{
get => Length;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this logic. On both .NET Framework 4.8 and .NET 5, this:

using System;
using System.IO;

class Program
{
    static void Main()
    {
        const string Path = @"tmp.txt";
        File.Delete(Path);

        using (var fs = new FileStream(Path, FileMode.Append))
        {
            fs.Write(new byte[] { 1, 2, 3, 4, 5 }, 0, 5);
            fs.Position = 1;
            Console.WriteLine(fs.Position);
            Console.WriteLine(fs.Length);
        }
    }
}

prints:

1
5

With the logic you have here, won't this print:

5
5

?

Comment on lines +113 to +119
// Prevent users from overwriting data in a file that was opened in append mode.
if (newLength < length)
{
throw new IOException(SR.IO_SeekAppendOverwrite);
}

FileStreamHelpers.SetFileLength(_fileHandle, newLength);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't understand this logic. This program:

using System;
using System.IO;
using System.Text;

class Program
{
    static void Main()
    {
        const string Path = @"tmp.txt";
        File.Delete(Path);

        using (var fs = new FileStream(Path, FileMode.Append))
        {
            fs.Write(Encoding.ASCII.GetBytes("hello"));
            fs.Position = 1;
            fs.Write(Encoding.ASCII.GetBytes("abc"));
        }

        Console.WriteLine(File.ReadAllText(Path));
    }
}

today prints:

habco

highlighting that a) you are allowed to seek backwards (just not before the position when the file was opened) and b) it's not losing all data after the seek point.

if (value < Length)
{
throw new IOException(SR.IO_SetLengthAppendTruncate);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, this seems wrong. Today this:

using System;
using System.IO;
using System.Text;

class Program
{
    static void Main()
    {
        const string Path = @"tmp.txt";
        File.Delete(Path);

        using (var fs = new FileStream(Path, FileMode.Append))
        {
            fs.Write(Encoding.ASCII.GetBytes("hello"));
            fs.SetLength(1);
        }

        Console.WriteLine(File.ReadAllText(Path));
    }
}

prints

h

but it seems like this code will end up throwing.

@adamsitnik
Copy link
Member Author

I am going to close the PR right now and re-open it when I am back from vacations.

@adamsitnik adamsitnik closed this Jul 11, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Aug 10, 2021
# Conflicts:
#	src/libraries/System.Private.CoreLib/src/System/IO/RandomAccess.Unix.cs
#	src/libraries/System.Private.CoreLib/src/System/IO/Strategies/FileStreamHelpers.cs
#	src/libraries/System.Private.CoreLib/src/System/IO/Strategies/OSFileStreamStrategy.cs
…Exception exception" - the test should verify the intended behavior, not buggy implementation
@adamsitnik adamsitnik reopened this Aug 19, 2021
…andles/SafeFileHandle.Unix.cs

Co-authored-by: Stephen Toub <[email protected]>
@adamsitnik
Copy link
Member Author

The official doc for FileMode.Append says:

Trying to seek to a position before the end of the file throws an IOException exception

So setting Postion to anything that is less that Length should throw. I believe that existing implementation was incomplete.

The position was captured when file was being opened:

private void Init(FileMode mode, string originalPath, FileOptions options)
{
Debug.Assert(!_useAsyncIO || _fileHandle.ThreadPoolBinding != null);
// For Append mode...
if (mode == FileMode.Append)
{
_appendStart = SeekCore(_fileHandle, 0, SeekOrigin.End);

We were trying to prevent from truncating the file:

if (_appendStart != -1 && value < _appendStart)
throw new IOException(SR.IO_SetLengthAppendTruncate);

// Prevent users from overwriting data in a file that was opened in
// append mode.
if (_appendStart != -1 && pos < _appendStart)
{
SeekCore(_fileHandle, oldPos, SeekOrigin.Begin);
throw new IOException(SR.IO_SeekAppendOverwrite);
}

The problem is that we were never updating the _appendStart value which should be happening after each write.

The way I understand FileMode.Append: always append to the end of file, don't allow for truncation or for overwriting the data.

@JeremyKuhne @stephentoub is my understanding correct?

@stephentoub
Copy link
Member

stephentoub commented Aug 19, 2021

I don't agree with the interpretation. I read the docs as being about seeking to earlier than the position when the file was opened, which is exactly what the implementation supports. Changing it is a breaking change, not only in behavior but also in required permissions.

@stephentoub stephentoub added the breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. label Aug 19, 2021
@ghost ghost added the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Aug 19, 2021
@dotnet dotnet unlocked this conversation Aug 19, 2021
@adamsitnik
Copy link
Member Author

also in required permissions.

Could you please explain what do you mean by that?

@snakefoot
Copy link

snakefoot commented Aug 19, 2021

@adamsitnik I can understand the wish to make the FileStream-contract predictable, when applying FILE_APPEND_DATA / O_APPEND where Operating System always appends to end (and ignore any file-seeks or change of file-position).

When the user has opened a file with FileMode.Append then I think the user will accept that even if the doing custom file-seeks, then all writes will be appended to the end (independent of current file-position).

But it will be confusing for the user if not possible to make random reads. For me FileMode.Append means always write to the end of the file, but still allow random reads.

  • When making a file-write then FileStream-Position should automatically move to the end when FileMode.Append (Perform the write and then update FileStream-Position)
    • For performance reasons then one could make internal filestream-position nullable, and file-write resets the filestream-position to null. When checking FileStream-Position and it is null, then it will assign it to FileStream-Length.
  • When checking current FileStream-Position then it will remain unchanged (even if other processes might have appended to the file).
  • When making a file-read then it will always use the current FileStream-Position

@JeremyKuhne
Copy link
Member

I think there is some unfortunate concept overlap between specific rights and "position at the end of the file". It looks as if we were trying to emulate FILE_APPEND_DATA's behavior over GENERIC_WRITE, and not quite getting it right.

I wouldn't change the existing behavior. My opinion is clarify the docs and add a new option if we think differing functionality is important.

@stephentoub
Copy link
Member

Could you please explain what do you mean by that?

I mean FILE_APPEND_DATA is a specific access right:
https://docs.microsoft.com/en-us/windows/win32/fileio/file-access-rights-constants
representing "the right to append data to the file". I don't know what the ramifications in this regard would be of suddently changing FileMode.Append to specify that for a file object that might not allow it.

@adamsitnik adamsitnik added this to the 7.0.0 milestone Aug 20, 2021
@snakefoot
Copy link

In NetFramework then FILE_APPEND_DATA is applied when using the FileStream-constructor with FileSystemRights.AppendData

@adamsitnik
Copy link
Member Author

I don't know what the ramifications in this regard would be of suddently changing FileMode.Append to specify that for a file object that might not allow it.

We are currently using GENERIC_WRITE:

int fAccess =
((access & FileAccess.Read) == FileAccess.Read ? Interop.Kernel32.GenericOperations.GENERIC_READ : 0) |
((access & FileAccess.Write) == FileAccess.Write ? Interop.Kernel32.GenericOperations.GENERIC_WRITE : 0);

which already contains FILE_APPEND_DATA amongst many other things:

/// <summary>
/// Maps internally to <see cref="FILE_APPEND_DATA"/> | <see cref="FILE_WRITE_ATTRIBUTES"/> | <see cref="FILE_WRITE_DATA"/>
/// | <see cref="FILE_WRITE_EA"/> | <see cref="STANDARD_RIGHTS_READ"/> | <see cref="SYNCHRONIZE"/>.
/// (For directories, <see cref="FILE_ADD_SUBDIRECTORY"/> | <see cref="FILE_WRITE_ATTRIBUTES"/> | <see cref="FILE_ADD_FILE"/> AddFile
/// | <see cref="FILE_WRITE_EA"/> | <see cref="STANDARD_RIGHTS_READ"/> | <see cref="SYNCHRONIZE"/>.)
/// </summary>
FILE_GENERIC_WRITE = 0x40000000, // GENERIC WRITE

I am not sure if contains is the right word, but it's more or less:

FILE_GENERIC_WRITE  =  FILE_APPEND_DATA | FILE_WRITE_ATTRIBUTES | FILE_WRITE_DATA | FILE_WRITE_EA | STANDARD_RIGHTS_READ and so on...

So permissions would not be a problem

@adamsitnik
Copy link
Member Author

I wouldn't change the existing behavior. My opinion is clarify the docs and add a new option if we think differing functionality is important.

You are right. Let me prepare a proposal. I am going to convert this PR into a draft and add NO MERGE label until then.

@adamsitnik adamsitnik added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Aug 20, 2021
@adamsitnik adamsitnik marked this pull request as draft August 20, 2021 05:56
@adamsitnik
Copy link
Member Author

Done: #53432 (comment)

@ghost
Copy link

ghost commented Sep 19, 2021

Draft Pull Request was automatically closed for inactivity. Please let us know if you'd like to reopen it.

@ghost ghost closed this Sep 19, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Nov 3, 2021
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.IO breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow for atomic appends to end of file
4 participants