
Cache file names in fill_todo #144

Open · wants to merge 3 commits into master
Conversation


@osiewicz commented Apr 27, 2024

Background:
While working with cargo, I've noticed that `cargo clean -p` takes ~30s with a large enough target directory (~200GB). Under a profiler, it turned out that most of the time was spent retrieving paths for removal via `glob::glob` in `rm_rf_glob` (and not actually removing the files).
[profiler screenshot]

Change description:
In the call to `.sort_by`, we repeatedly parse the paths to obtain file names for comparison. This commit caches file names in the `PathWrapper` object, akin to #135, which did the same for directory info.

For my use case, a cargo built against this branch takes ~14s to clean files instead of the previous 30s (I've measured against the main branch of this repository, to account for changes made since glob 0.3.1). Still not ideal, but hey, we're shaving 50% of the time off for slightly heavier memory use.
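The change described above boils down to something like the following sketch (the struct and field names come from the diff shown further down; the constructor and sort helper are illustrative, not the exact patch):

```rust
use std::ffi::OsString;
use std::path::PathBuf;

// Sketch of the cached-file-name approach from this PR.
struct PathWrapper {
    path: PathBuf,
    is_directory: bool,
    // Cached once at construction so the sort comparator doesn't
    // have to re-parse the path on every comparison.
    file_name: Option<OsString>,
}

impl PathWrapper {
    fn new(path: PathBuf, is_directory: bool) -> Self {
        let file_name = path.file_name().map(|f| f.to_os_string());
        Self { path, is_directory, file_name }
    }
}

fn sort_matches(mut matches: Vec<PathWrapper>) -> Vec<PathWrapper> {
    // The comparator now reads the cached name instead of calling
    // `Path::file_name` O(n log n) times across the sort.
    matches.sort_by(|a, b| a.file_name.cmp(&b.file_name));
    matches
}
```

The trade-off is exactly the one described above: one extra heap allocation per path, in exchange for extracting each file name once instead of on every comparison.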

bors added a commit to rust-lang/cargo that referenced this pull request May 2, 2024

Clean package perf improvements

### What does this PR try to resolve?

I've noticed that `cargo clean -p` execution time scales poorly with the size of the target directory; in my case (a ~250GB target directory on an M1 Mac), running `cargo clean -p` takes circa 35 seconds. Notably, it's the file listing that takes that time, not deleting the package itself: when running `cargo clean -p SOME_PACKAGE` twice in a row, both executions take roughly the same time.

I've tracked it down to the fact that we seem quite happy to use the `glob::glob` function, which iterates over the contents of the target dir. It was also a bit sub-optimal in how it did that, for which I've already filed a PR in rust-lang/glob#144 - that PR alone brings cleaning time down to ~14 seconds. While that's a good improvement for a relatively straightforward change, this PR tries to take it even further. With the glob PR applied plus the changes from this PR, my test case goes down to ~6 seconds. I'm pretty sure we could squeeze this further, but I'd rather do so in a follow-up PR.

Notably, this PR doesn't help with *just* super-large target directories: `cargo clean -p serde` on the cargo repo (with a ~7GB target directory) went down from ~380ms to ~100ms for me. Not too shabby.

### How should we test and review this PR?

I've mostly tested it manually, running `cargo clean` against multiple different repos.

### Additional information
TODO:
- [x] [c770700](c770700) is not quite correct; we need to consider that it changes how progress reporting works. As is, we're going to report all progress relatively quickly and then stall at the end (when we're actually iterating over directories, globbing, removing files, that kind of jazz). I'll address that.
@Kobzol (Contributor) commented Dec 23, 2024

I benchmarked this locally and this improves the speed (on top of #135) on the rust-lang/rust repository by 20%, which is quite nice.

@the8472 Could you please take a look? The patch looks good to me.

@osiewicz (Author)

Huh, the CI failure seems a bit spurious to me.

@Kobzol (Contributor) commented Dec 23, 2024

Uh-oh, this repo still runs CI on Rust 1.23.0, but one of the dev-dependencies has upgraded to a version that uses a too-new Rust feature.

@the8472 self-assigned this Dec 23, 2024
src/lib.rs Outdated Show resolved Hide resolved
@the8472 (Member) left a comment

I wonder why we're even sorting. I can't find any mention in the API that guarantees any ordering of the returned paths.
So for the performance issue it'd be even more efficient to not sort at all.

I guess it's inherited from the libc's glob, but at least that has GLOB_NOSORT to opt-out... oh well.

src/lib.rs (outdated)
@@ -330,10 +331,11 @@ impl fmt::Display for GlobError {
 struct PathWrapper {
     path: PathBuf,
     is_directory: bool,
+    file_name: Option<OsString>,
@the8472 (Member)

It seems odd to me that an extra allocation for each path speeds things up. This is just to avoid the work of finding the filename in a path, right?

Have you tried storing the indexes into path and reconstituting an &OsStr by slicing into path?
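A minimal sketch of that index-based suggestion (hypothetical code; as noted further down in the thread, the `OsStr` byte-slicing methods it relies on postdate the crate's MSRV — `as_encoded_bytes` and `from_encoded_bytes_unchecked` are Rust 1.74+):

```rust
use std::ffi::OsStr;
use std::path::PathBuf;

// Store only the byte offset of the file name within `path` and
// reconstitute an &OsStr on demand, avoiding a second allocation.
struct PathWrapper {
    path: PathBuf,
    file_name_start: Option<usize>,
}

impl PathWrapper {
    fn new(path: PathBuf) -> Self {
        // Assumes `path` has no trailing separator (glob results don't),
        // so the file name's encoded bytes are a suffix of the full path's.
        let file_name_start = path.file_name().map(|name| {
            path.as_os_str().as_encoded_bytes().len() - name.as_encoded_bytes().len()
        });
        Self { path, file_name_start }
    }

    fn file_name(&self) -> Option<&OsStr> {
        self.file_name_start.map(|start| {
            let bytes = &self.path.as_os_str().as_encoded_bytes()[start..];
            // SAFETY: `bytes` is the suffix of a valid OS string that
            // `Path::file_name` identified as the final component.
            unsafe { OsStr::from_encoded_bytes_unchecked(bytes) }
        })
    }
}
```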

@osiewicz (Author) commented Dec 30, 2024

> This is just to avoid the work of finding the filename in a path, right?

Yes, with the caveat that on the current main version we end up performing a file-name search multiple times per path, as it's done in the sort comparator. That's the source of the slowdown: an extra allocation is probably slower than extracting a file name once, but if we extract it hundreds of times, it may end up being worth it.

No, I have not tried to store the subrange of a filename. I'll give it a go, thank you for the suggestion!

@the8472 (Member) commented Dec 30, 2024

Ah, the methods needed to do the slicing don't seem to be MSRV-compatible; a pity.

A possible workaround for that is to do the slicing on &str and fall back to pulling the file_name out of the pathbuf on every access when it's not utf8-compatible (which should be rare).
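A sketch of that workaround, with illustrative names: cache a byte range into the path's UTF-8 form when available, and fall back to `Path::file_name` otherwise. (Note the caveat raised in the reply below: `to_str` re-validates UTF-8 on every access here.)

```rust
use std::ffi::OsStr;
use std::path::PathBuf;

struct PathWrapper {
    path: PathBuf,
    // (start, end) of the file name within `path.to_str()`, if UTF-8.
    utf8_file_name: Option<(usize, usize)>,
}

impl PathWrapper {
    fn new(path: PathBuf) -> Self {
        let utf8_file_name = match (path.to_str(), path.file_name().and_then(OsStr::to_str)) {
            // `rfind` locates the last occurrence, i.e. the final component.
            (Some(full), Some(name)) => {
                full.rfind(name).map(|start| (start, start + name.len()))
            }
            _ => None,
        };
        Self { path, utf8_file_name }
    }

    fn file_name(&self) -> Option<&OsStr> {
        match self.utf8_file_name {
            // Common case: slice the cached range out of the UTF-8 path.
            // Downside: `to_str` runs UTF-8 validation on every access.
            Some((start, end)) => self.path.to_str().map(|s| OsStr::new(&s[start..end])),
            // Rare non-UTF-8 case: re-extract the file name each time.
            None => self.path.file_name(),
        }
    }
}
```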

@osiewicz (Author)

That'd require either running UTF-8 validation on `PathWrapper::path` whenever we use a file-name slice, or introducing `unsafe` into the lib to avoid the repeated validation; I wonder whether the extra complexity is worth it in the first place.
FWIW, b7dbb48 increases peak memory usage of a rust-lang/rust benchmark by about 23%. I've managed to reduce that to 6% by replacing `OsString` with `Box<OsStr>` in 778365d; I agree that allocating here is not ideal, but I'm unsure about our options.
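For intuition on why swapping `OsString` for `Box<OsStr>` shrinks the overhead, a quick size comparison (this only accounts for the inline struct size; `Box<OsStr>` also never carries over-allocated spare heap capacity):

```rust
use std::ffi::{OsStr, OsString};
use std::mem::size_of;

// OsString is (pointer, capacity, length) like a Vec, while Box<OsStr>
// is just a fat pointer (pointer, length): one usize less per cached name.
fn cached_name_sizes() -> (usize, usize) {
    (size_of::<OsString>(), size_of::<Box<OsStr>>())
}
```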

@the8472 (Member)

I think we can get away without `unsafe`. We can go `Path` -> `OsStr` -> `Option<&str>` -> `Option<&[u8]>`, compare the `[u8]`s most of the time, and only fall back to the old code path in non-UTF-8 cases.
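A sketch of that comparator (illustrative; UTF-8 byte order agrees with `str` order, so the fast path sorts UTF-8 names identically to a `str` comparison):

```rust
use std::cmp::Ordering;
use std::path::Path;

// Compare file names as raw bytes when both are valid UTF-8,
// and fall back to OsStr comparison otherwise.
fn compare_file_names(a: &Path, b: &Path) -> Ordering {
    let (a_name, b_name) = (a.file_name(), b.file_name());
    match (
        a_name.and_then(|n| n.to_str()),
        b_name.and_then(|n| n.to_str()),
    ) {
        // Fast path: both names are UTF-8, compare the underlying bytes.
        (Some(a_str), Some(b_str)) => a_str.as_bytes().cmp(b_str.as_bytes()),
        // Rare path: at least one name isn't UTF-8.
        _ => a_name.cmp(&b_name),
    }
}
```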

@osiewicz (Author) commented Dec 30, 2024

We'd have to do that each time we need to inspect the file name, right? We can't store a reference (the `Option<&[u8]>` one) to `PathWrapper::path` within `PathWrapper` itself.
I believe that `OsStr` -> `Option<&str>` might entail a UTF-8 validation. It'd be fine to do it once, but if the above is correct, we'd have to do it on each file-name access.

I might also be missing something and thus be totally off base.

@the8472 (Member)

Ah, right. I guess we can store a pointer instead of a reference. That'd require a bit of `unsafe` after all, but it should be OK, since `PathBuf` contents are heap-allocated and don't move.

@osiewicz (Author)

Could we follow suit and add a sort_paths field to MatchOptions that'd default to true?
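A hypothetical shape for that opt-out (not part of the glob crate today; the existing field names match `MatchOptions` in glob 0.3, and `sort_paths` is the proposed addition, defaulting to true for backwards compatibility):

```rust
pub struct MatchOptions {
    pub case_sensitive: bool,
    pub require_literal_separator: bool,
    pub require_literal_leading_dot: bool,
    /// Proposed: when false, return matches in traversal order
    /// (analogous to libc glob's GLOB_NOSORT).
    pub sort_paths: bool,
}

impl Default for MatchOptions {
    fn default() -> Self {
        MatchOptions {
            case_sensitive: true,
            require_literal_separator: false,
            require_literal_leading_dot: false,
            sort_paths: true,
        }
    }
}
```

As noted in the reply below, adding a field to an all-`pub` struct is a breaking change, which is why this would need a separate PR and a new version.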

@the8472 (Member) commented Dec 30, 2024

Yeah, that'd make sense I think, but it should be done in a separate PR: the fields of `MatchOptions` are all `pub`, so it'd be a breaking change and needs a new version.

What you can try for this PR is to check whether an unstable sort provides another speedup. Since file names within a directory should be unique (barring pathological filesystems), we don't need a stable sort.
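The suggested swap is a one-liner along these lines (a sketch; per the author's follow-up further down, it didn't yield a substantial win for the rust-lang/rust repro):

```rust
use std::path::PathBuf;

// With unique sort keys, stability is unobservable, and sort_unstable_by
// avoids the temporary buffer that the stable merge sort allocates.
fn sort_paths_unstable(paths: &mut [PathBuf]) {
    paths.sort_unstable_by(|a, b| a.file_name().cmp(&b.file_name()));
}
```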

@the8472 (Member) commented Dec 30, 2024

You should also rebase your PR to get the CI fixes.

@tgross35 (Contributor)

> Yeah that'd make sense I think but should be done in a separate PR since the fields of MatchOptions are all pub so it'd be a breaking change and needs a new version.

The next breaking change should probably also mark this `#[non_exhaustive]`, which requires bumping the MSRV to at least 1.40, but I think that's unobjectionable anyway.

@osiewicz (Author)

`sort_unstable_by` doesn't seem to be substantially faster for the rust-lang/rust repro used for #135.
