Timeouts and unreliable progress reporting for large --set-datapack-growfactor #791

Closed
rmsc opened this issue Aug 3, 2023 · 5 comments

@rmsc
rmsc commented Aug 3, 2023

This may well be a misunderstanding on my part of how the growfactor works, but please bear with me.

I'm backing up to storj, where they charge a small per-segment fee. To optimize for this fee, files should be multiples of 64MiB in size. I've set the datapack size to be 60MiB, and the datapack growfactor to be 64MiB (1024). I believe this ensures that files are a tad smaller than a multiple of 64MiB.
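To make the sizing goal concrete, here is a minimal sketch of how many segments a file would occupy. It assumes a fee is charged per started 64 MiB segment, as described above; the function name `segments_for` is made up for illustration and is not part of rustic or storj.

```python
import math

SEGMENT_MIB = 64  # segment size as stated in this post (assumption)


def segments_for(file_mib: float, segment_mib: float = SEGMENT_MIB) -> int:
    """Return how many segments a file of `file_mib` MiB occupies (at least one)."""
    return max(1, math.ceil(file_mib / segment_mib))


# A pack slightly below a multiple of 64 MiB uses every segment almost fully,
# while a pack slightly above it pays for one nearly empty extra segment.
for size in (60, 64, 65, 124):
    print(f"{size} MiB -> {segments_for(size)} segment(s)")
```

Under that assumption, a 60 MiB pack fits in one segment, while a 65 MiB pack already needs two.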

Unfortunately, after increasing the growfactor I get consistent timeouts:

[02:21:51] backing up...                  █████████████████████████░░░░░░░░░░░░░░░ 184.91 GiB/286.41 GiB 6.53 MiB/s   (ETA 78m)                                               [WARN] Error error sending request for url (http://127.0.0.1:33963/data/<redacted>): operation timed out at 290.004745ms, retrying
[02:22:22] backing up...                  █████████████████████████░░░░░░░░░░░░░░░ 184.91 GiB/286.41 GiB 64.48 KiB/s  (ETA 78m)                                               [WARN] Error error sending request for url (http://127.0.0.1:33963/data/<redacted>): operation timed out at 462.51274ms, retrying
[02:22:25] backing up...                  █████████████████████████░░░░░░░░░░░░░░░ 184.91 GiB/286.41 GiB 40.05 KiB/s  (ETA 78m)                                               [INFO] rclone output: <3>ERROR : FS sj://rustic: cp input ./data/<redacted>/<redacted> []: unexpected EOF

[02:22:25] backing up...                  █████████████████████████░░░░░░░░░░░░░░░ 184.91 GiB/286.41 GiB 39.44 KiB/s  (ETA 78m)                                               [INFO] rclone output: <3>ERROR : data/<redacted>/<redacted>: Post request put error: unexpected EOF

[INFO] rclone output: <3>ERROR : data/<redacted>/<redacted>: Post request rcat error: unexpected EOF

Does it make sense to use the growfactor this way? Are there any unexpected consequences, apart from the timeouts?

Increasing the repository timeout does prevent the errors:

[repository.options]
timeout = "5min"

Unfortunately it makes the progress bar completely unusable. I get short bouts of high data rate followed by long periods where it seems to stall:

[00:02:12] backing up...                  ███████████████████████░░░░░░░░░░░░░░░░░ 170.41 GiB/286.41 GiB 238.99 MiB/s (ETA 89s)                                               
[00:13:18] backing up...                  ████████████████████████░░░░░░░░░░░░░░░░ 176.50 GiB/286.41 GiB 0B/s         (ETA 8m)                                                
[00:14:51] backing up...                  ████████████████████████░░░░░░░░░░░░░░░░ 177.08 GiB/286.41 GiB 107.92 KiB/s (ETA 9m)
[00:16:04] backing up...                  ████████████████████████░░░░░░░░░░░░░░░░ 177.08 GiB/286.41 GiB 1B/s         (ETA 10m)

Measuring with vnstat -l reveals that the real datarate is actually ok at slightly above 100 Mbps (my line limit) with a few short periods at 50 or 60 Mbps.
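For comparing the two readings, a small conversion sketch may help: the progress bar reports MiB/s, while the line rate is in Mbps. This assumes "Mbps" means decimal megabits per second (10^6 bit/s); the helper name is hypothetical.

```python
def mbps_to_mib_per_s(mbps: float) -> float:
    """Convert megabits per second (10**6 bit/s) to MiB/s (2**20 bytes/s)."""
    return mbps * 1_000_000 / 8 / 2**20


print(f"{mbps_to_mib_per_s(100):.2f} MiB/s")
```

Under that assumption, a saturated 100 Mbps line corresponds to roughly 11.9 MiB/s, so neither the 238.99 MiB/s bursts nor the 0B/s stalls in the progress bar reflect the actual upload rate.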

@rmsc rmsc changed the title timeouts and unreliable progress reporting for large --set-datapack-growfactor Timeouts and unreliable progress reporting for large --set-datapack-growfactor Aug 3, 2023
@aawsome
Member

aawsome commented Aug 3, 2023

Thanks for reporting!

Which version of rustic are you using?
Maybe related: #683, #720

@aawsome
Member

aawsome commented Aug 3, 2023

About the packsize-tuning-options:

Does it make sense to use the growfactor this way? Are there any unexpected consequences, apart from the timeouts?

Actually, not really. The target pack size for data packs is computed as: "datapack-size" + "datapack-growfactor" * sqrt(current reposize). This means you'll still get all kinds of pack sizes. In other words, growing target pack sizes that settle at a few sweet spots are currently not supported.

A better option in your case would be to pick a fixed multiple of 64 MiB (e.g. 64, 128 or 192) and subtract a few MiB (as you did) to make sure you really land below it. Then set datapack-growfactor = 0 so the pack sizes stay fixed.
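A minimal sketch of the formula quoted above, to illustrate why a non-zero growfactor yields ever-changing target sizes while a growfactor of 0 keeps them fixed. The unit scaling (MiB of growth per square root of the repository size in GiB) is an assumption based on the numbers used in this thread, not taken from the rustic source, and the function name is hypothetical.

```python
import math


def target_pack_mib(base_mib: float, growth_mib: float, repo_gib: float) -> float:
    """Target pack size = base + growth * sqrt(repo size).

    Units are an assumption for illustration: `growth_mib` is the growth in MiB
    per square root of the repository size in GiB.
    """
    return base_mib + growth_mib * math.sqrt(repo_gib)


for repo_gib in (1, 16, 256):
    growing = target_pack_mib(60, 64, repo_gib)  # settings from the original post
    fixed = target_pack_mib(60, 0, repo_gib)     # growfactor set to 0
    print(f"repo {repo_gib:>3} GiB: growing={growing:.0f} MiB, fixed={fixed:.0f} MiB")
```

With these assumed inputs the target grows from 124 MiB at 1 GiB to 1084 MiB at 256 GiB, passing through sizes that are not just below a multiple of 64 MiB, whereas the growfactor-0 variant stays at 60 MiB throughout.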

@aawsome
Member

aawsome commented Aug 3, 2023

About the progress: I think this is a known issue in indicatif which (IIRC) is solved in newer versions, which now use an exponential estimator. So #790 may already solve this, which would mean the fix will be in the upcoming 0.6.0 release.

Another possibility for older rustic versions is to adjust the progress bar update interval using progress-interval (supported as a CLI option, an env variable and within the config file).
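To illustrate what an exponential estimator does in general, here is a generic exponentially weighted moving average of the transfer rate. This is only a sketch of the idea, not indicatif's actual implementation; the class name and the `alpha` parameter are assumptions for illustration.

```python
from typing import Optional


class EwmaRate:
    """Exponentially weighted moving average of a transfer rate (illustrative only)."""

    def __init__(self, alpha: float = 0.1) -> None:
        if not 0 < alpha <= 1:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.rate: Optional[float] = None

    def update(self, bytes_done: float, seconds: float) -> float:
        """Fold one measurement (bytes transferred over `seconds`) into the estimate."""
        sample = bytes_done / seconds
        if self.rate is None:
            self.rate = sample  # the first sample seeds the estimate
        else:
            self.rate = self.alpha * sample + (1 - self.alpha) * self.rate
        return self.rate


est = EwmaRate(alpha=0.5)
for chunk in (100, 0, 0):  # one burst followed by two stalled intervals
    print(est.update(chunk, 1.0))
```

With such an estimator, a stall after a burst lowers the displayed rate gradually rather than dropping it straight to zero, which is the kind of smoothing that should make the bar more readable when large packs are uploaded in bursts.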

@rmsc
Author

rmsc commented Aug 3, 2023

Thanks for reporting!

Which version of rustic are you using? Maybe related: #683, #720

Thank you! It's 0.5.4.

About the packsize-tuning-options:

Does it make sense to use the growfactor this way? Are there any unexpected consequences, apart from the timeouts?

Actually, not really. The target pack size for data packs is computed as: "datapack-size" + "datapack-growfactor" * sqrt(current reposize). This means you'll still get all kinds of pack sizes. In other words, growing target pack sizes that settle at a few sweet spots are currently not supported.

Ah, I just realized something: my repo is currently at 256 GB, so the targeted packsize is actually about 1 GiB. That's probably a lot to transfer reliably over a 100 Mbps line.
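As a rough check on that estimate, the following sketch computes how long a single pack takes to upload at full line rate. It assumes decimal megabits per second and ignores protocol overhead and retries; the function name is hypothetical.

```python
def transfer_seconds(size_mib: float, line_mbps: float) -> float:
    """Ideal upload time for `size_mib` MiB over a `line_mbps` Mbps (10**6 bit/s) line."""
    bits = size_mib * 2**20 * 8
    return bits / (line_mbps * 1_000_000)


for size_mib in (60, 1024):
    print(f"{size_mib} MiB at 100 Mbps: {transfer_seconds(size_mib, 100):.0f} s")
```

Under these assumptions a 1 GiB pack needs about 86 seconds in the best case, compared with about 5 seconds for a 60 MiB pack, so each upload has to stay healthy for much longer.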

A better option in your case would be to pick a fixed multiple of 64 MiB (e.g. 64, 128 or 192) and subtract a few MiB (as you did) to make sure you really land below it. Then set datapack-growfactor = 0 so the pack sizes stay fixed.

I never realized I could set the growfactor to zero! I'll try this and report back. Thanks!

@rmsc
Author

rmsc commented Aug 4, 2023

A better option in your case would be to pick a fixed multiple of 64 MiB (e.g. 64, 128 or 192) and subtract a few MiB (as you did) to make sure you really land below it. Then set datapack-growfactor = 0 so the pack sizes stay fixed.

I never realized I could set the growfactor to zero! I'll try this and report back. Thanks!

I think this worked, thanks! The maximum datapack size ended up staying at 79MiB, which is less than it was with the default config. Perhaps deleting old snapshots and/or doing a full repack would bring it even closer to 64MiB, but this is good enough for now.

Would it make sense to explicitly say that it's possible to set the growth factor to zero?
EDIT: I mean, in the documentation

Another possibility for older rustic versions is to adjust the progress bar update interval using progress-interval (supported as a CLI option, an env variable and within the config file).

Unfortunately, this didn't help much. That said, I'll gladly wait for 0.6.0.

Closing now, thank you!

@rmsc rmsc closed this as completed Aug 4, 2023