Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

draft solution for corrupted file while renaming when using Docker volumes on Windows #64

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

kwkr
Copy link

@kwkr kwkr commented Oct 16, 2023

When running the library on a Docker volume with a Windows host, there seems to be some problem when the helper thread creates the Segment and the mmap that belongs to it. The mmap holds a file open, which leads to a file appearing to be corrupted when trying to rename it from another thread (this happens when creating closed-* files). This issue appears only in this specific setup. Even running a bare example with Rust on Windows doesn't allow to reproduce the issue.

Here is the code that allows to reproduce the situation:

use std::fs::{self, OpenOptions};
use std::thread;
use std::time::Duration;
use memmap2::MmapOptions;
use std::io::Write;

fn main() -> std::io::Result<()> {
    // Spawn a new thread that creates and maps a file
    let handle = thread::spawn(|| {
        // Create and write to a file
        fs::write("file.txt", b"Hello, World!").unwrap();
        
        // Open the file with read and write options
        let file = OpenOptions::new()
            .read(true)
            .write(true)
            .create(false)
            .open("file.txt").unwrap();

        // Memory-map the file
        let mmap = unsafe {
            MmapOptions::new()
                .map_mut(&file)
                .expect("Failed to map the file")
        };
        println!("File is mapped.");
        thread::sleep(Duration::from_secs(2));
        
        // Drop (close) the file handle
        drop(file);

        println!("File handle is dropped.");
        // leave a thread running
        thread::sleep(Duration::from_secs(120));
    });

    // Give the other thread a head start to ensure the file is created and mapped
    thread::sleep(Duration::from_secs(10));

    // Rename the file
    fs::rename("file.txt", "renamed_file.txt").unwrap();
    println!("File is renamed.");

    // Keep the main thread alive to observe the behavior
    thread::sleep(Duration::from_secs(120));

    Ok(())
}

When trying to list the file from some other terminal, it will appear as it is "corrupted".

Screenshot from 2023-10-14 17-01-14

(in the example, the file renamed_file.txt would be in this state)

This will happen when running on a Docker volume mounted on Windows.

I implemented a simple solution that involves dropping the mmap before renaming the file and the reloading it again on the main thread. I'm not sure whether it's the most clean way, but I'm not proficient with Rust yet so any suggestions on how to improve it are welcome.

Would adding some tests using Docker for it be required? I can see there is one workflow on Windows machine, but I'm not sure if the overhead for that isn't too big.

src/segment.rs Outdated

match result {
Ok(_) => break,
Err(e) if retries == MAX_RETRIES - 1 => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we expect that file will be eventually? Is there some kind of background process?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When testing, I noticed that after dropping the original mmap using the unload_mmap and renaming the file (+ the rename method swaps the path to the new file in the rename method), the file isn't immediately released/available (aka I got File not found errors, when trying to do it directly) so as a quick and dirty solution I implemented the retry mechanism which solved the issue.

The assumption is that the file exists (it just got copied to some different location), but I'm not actually sure why it isn't available immediately. I may just not be aware of some cleaner way to actually sync the changes.

Copy link
Author

@kwkr kwkr Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One additional thought is that it might be actually reasonable to include the whole reload (unload/reload) operation within the rename function as it doesn't seem like an operation that should be freely available on the public API of the object, because it's there only for this specific case.

@kwkr
Copy link
Author

kwkr commented Oct 18, 2023

@generall
After a few more tests it looks like calling .flush before unloading the mmap allows dropping the file handle immediately after unload, so the retries aren't necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants