use intermediate satisfier causes in priority statistics #291

Open
wants to merge 1 commit into dev from Eh2406:conflict_count
Conversation

@Eh2406 Eh2406 (Member) commented Dec 5, 2024

This provides more information to the DP::prioritize call. Inspired by @x-hgg-x's investigation in #274 (comment), it provides the number of times the package has been involved in any backtracking conflict. Inspired by uv's prioritization, it provides the order the package was discovered. When using Priority = (conflict_count, Reverse<matched versions>, Reverse<discovery_order>), resolving all crates (including Solana) on 10 threads went from 162.45min to 88.96min! The slowest individual crate went from [email protected] at 95s to [email protected] at 21s! (Note that the runtime for any individual crate may vary run to run due to system load and scheduling.)

The new fields are hidden behind a PackageResolutionStatistics with getters, to make it easier to add more in the future without additional breaking changes. Getters are used in case some value we want to return is expensive to calculate, so we only compute it if someone actually reads it. I attempted to use that pattern to avoid the hash map lookup for conflict_count, but then PackageResolutionStatistics ends up generic over P, which seemed unpleasant. If people have other suggestions for how to future-proof this API, I'm all ears!

Theoretically the package name and range could also become getters on PackageResolutionStatistics. But I consider those two exponentially more important and more useful than the ones we're adding now, deserving of always being computed and getting dedicated API surface. But opinions may vary.
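
As a rough, self-contained sketch of the ordering described above (this is not the PR's API surface: the priority function below is a stand-in for a DependencyProvider::prioritize implementation, and the statistics are passed as plain integers rather than through PackageResolutionStatistics):

```
use std::cmp::Reverse;

/// The priority tuple from the description above: more conflicts first,
/// then fewer matching versions, then earlier discovery.
type Priority = (u32, Reverse<usize>, Reverse<u32>);

/// `conflict_count` and `discovery_order` stand in for values that would be
/// read from `PackageResolutionStatistics`; `matched_versions` is however many
/// candidate versions the provider finds in the requested range.
fn priority(conflict_count: u32, matched_versions: usize, discovery_order: u32) -> Priority {
    (conflict_count, Reverse(matched_versions), Reverse(discovery_order))
}

fn main() {
    // A package that has been involved in conflicts outranks a quiet one,
    // even if the quiet one has fewer candidate versions.
    let conflicted = priority(3, 10, 5);
    let quiet = priority(0, 2, 1);
    assert!(conflicted > quiet);
}
```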

codspeed-hq bot commented Dec 6, 2024

CodSpeed Performance Report

Merging #291 will not alter performance

Comparing Eh2406:conflict_count (49f3f9d) with dev (3bef331)

Summary

✅ 6 untouched benchmarks

src/provider.rs Outdated (resolved)
src/solver.rs Outdated (resolved)
@Eh2406 Eh2406 (Member Author) commented Dec 10, 2024

@konstin what do you think? Do you want me to reduce a benchmark where this makes a big difference?

@konstin konstin (Member) left a comment

No idea how they'll benchmark on uv (as in, will we see a perf hit or not?), but i like giving users the information and the better default prioritize.

src/solver.rs Outdated
/// it is having the most problems with.
///
/// Note: The exact values depend on implementation details of PubGrub. So should not be relied on and may change.
pub fn conflict_count(&self) -> u32 {
Member:

This may be very useful for something i'm writing right now 👀

I'm tracking two different properties atm: which previously chosen package A is conflicting with the currently chosen package B, because i want to deprioritize A (it's always making the latter choices fail) while prioritizing B (if i pick B first, I have a good chance at getting the right version of A without walking through all versions of B first), which requires some mechanism of knowing which is which.

src/solver.rs Outdated
/// Prioritizing based on this value directly allows the resolver to focus on the packages
/// it is having the most problems with.
///
/// Note: The exact values depend on implementation details of PubGrub. So should not be relied on and may change.
Member:

We should be clearer about what this value means. Something like:

Every time an incompatibility is fulfilled, the packages involved are marked as conflicting. PubGrub builds incompatibilities lazily, and their construction may change over time, so the value may also change across PubGrub releases.

Member Author:

I don't know that a user would know what an incompatibility is. I'm not finding a good balance between too much detail and explaining why things could change.

Member:

I think a consumer of the library probably needs to know what an incompatibility is, right?

Member:

Not every user. But one that starts playing with conflict counts definitely should, IMO.

Member:

We can use a two-tiered explanation: For basic users, this number is a measure of conflicting-ness; the higher it is, the more you want to prioritize the package. For advanced users, it's helpful to know that this is the number of satisfied incompatibilities that this package was involved in, so a user can implement heuristics on top of it (as we want in uv), e.g. weighing bounds-strictness vs. conflicting-ness.
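
As a rough illustration of the "advanced" use (not uv's actual heuristic; the 10x weight and the bounds_strictness input are made up for this sketch), a provider could fold the conflict count into a combined score instead of using it directly:

```
/// Combine the conflict count reported by the resolver with some other
/// provider-specific signal. Both the weighting and `bounds_strictness`
/// are placeholders, not anything pubgrub or uv actually uses.
fn combined_score(conflict_count: u32, bounds_strictness: u32) -> u64 {
    u64::from(conflict_count) * 10 + u64::from(bounds_strictness)
}

fn main() {
    // A heavily conflicting package outranks a merely strict requirement.
    assert!(combined_score(3, 0) > combined_score(0, 9));
}
```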

src/solver.rs Outdated
impl PackageResolutionStatistics {
    fn new<P: Package>(pid: Id<P>, conflict_count: &Map<Id<P>, u32>) -> Self {
        Self {
            discovery_order: pid.into_raw() as u32,
Member:

If you manually call alloc in some order, e.g. for dependencies before your actual package for a pre-fetching related reason, would that break this order?

Member Author:

Technically it's the order in which alloc is called, so the actual values would certainly be affected by prefetching. However, prioritize is only called on packages that are available for decision-making, so prioritization should not be dramatically affected.

@x-hgg-x x-hgg-x (Contributor) commented Dec 13, 2024

Counting the root cause instead of the first incompatibility is around 5% faster in CPU time for my implementation (https://github.com/x-hgg-x/pubgrub-bench) to resolve all crates (2789s -> 2663s), and 12% faster to resolve solana crates (385s -> 341s).

@konstin

This comment was marked as outdated.

@arielb1 arielb1 commented Dec 15, 2024

The latest solver change doesn't 100% fix the problem:

       Bucket:[email protected]/dep:ed25519-dalek SemverPubgrub { norml: 1.18.23 | 1.18.25 | 1.18.26, pre: ∅ }  depends on Bucket:[email protected]/default=true SemverPubgrub { norml: >=1.0.1, <1.0.2, pre: ∅ }
[2024-12-15T19:42:11Z INFO  pubgrub::internal::core] backtrack to DecisionLevel(2)
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(46) = 'Bucket:[email protected]/default=true' @ Some(Version { major: 1, minor: 18, patch: 22 })
[2024-12-15T19:42:11Z INFO  pubgrub::internal::partial_solution] add_decision: Id::<Names>(46) @ 1.18.22
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(46) = 'Bucket:[email protected]/default=true'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(279) = 'Bucket:[email protected]/default' @ Some(Version { major: 1, minor: 0, patch: 0, pre: Prerelease("pre.3") })
[2024-12-15T19:42:11Z INFO  pubgrub::solver] add_decision (not first time): Id::<Names>(279) = 'Bucket:[email protected]/default' @ 1.0.0-pre.3
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(279) = 'Bucket:[email protected]/default'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(280) = 'Bucket:[email protected]/u64_backend' @ Some(Version { major: 1, minor: 0, patch: 0, pre: Prerelease("pre.3") })
[2024-12-15T19:42:11Z INFO  pubgrub::solver] add_decision (not first time): Id::<Names>(280) = 'Bucket:[email protected]/u64_backend' @ 1.0.0-pre.3
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(280) = 'Bucket:[email protected]/u64_backend'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(281) = 'Bucket:[email protected]/std' @ Some(Version { major: 1, minor: 0, patch: 0, pre: Prerelease("pre.3") })
[2024-12-15T19:42:11Z INFO  pubgrub::solver] add_decision (not first time): Id::<Names>(281) = 'Bucket:[email protected]/std' @ 1.0.0-pre.3
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(281) = 'Bucket:[email protected]/std'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(284) = 'Bucket:[email protected]/rand' @ Some(Version { major: 1, minor: 0, patch: 0, pre: Prerelease("pre.3") })
[2024-12-15T19:42:11Z INFO  pubgrub::solver] add_decision (not first time): Id::<Names>(284) = 'Bucket:[email protected]/rand' @ 1.0.0-pre.3
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(284) = 'Bucket:[email protected]/rand'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(286) = 'Bucket:[email protected]/dep:rand' @ Some(Version { major: 1, minor: 0, patch: 0, pre: Prerelease("pre.3") })
[2024-12-15T19:42:11Z INFO  pubgrub::solver] add_decision (not first time): Id::<Names>(286) = 'Bucket:[email protected]/dep:rand' @ 1.0.0-pre.3
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(286) = 'Bucket:[email protected]/dep:rand'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(355) = 'Bucket:[email protected]/default' @ Some(Version { major: 1, minor: 18, patch: 22 })
[2024-12-15T19:42:11Z INFO  pubgrub::internal::partial_solution] add_decision: Id::<Names>(355) @ 1.18.22
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(355) = 'Bucket:[email protected]/default'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(356) = 'Bucket:[email protected]/full' @ Some(Version { major: 1, minor: 18, patch: 22 })
[2024-12-15T19:42:11Z INFO  pubgrub::internal::partial_solution] add_decision: Id::<Names>(356) @ 1.18.22
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(356) = 'Bucket:[email protected]/full'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(357) = 'Bucket:[email protected]/byteorder' @ Some(Version { major: 1, minor: 18, patch: 22 })
[2024-12-15T19:42:11Z INFO  pubgrub::internal::partial_solution] add_decision: Id::<Names>(357) @ 1.18.22
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(357) = 'Bucket:[email protected]/byteorder'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(358) = 'Bucket:[email protected]/ed25519-dalek-bip32' @ Some(Version { major: 1, minor: 18, patch: 22 })
[2024-12-15T19:42:11Z INFO  pubgrub::internal::partial_solution] add_decision: Id::<Names>(358) @ 1.18.22
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(358) = 'Bucket:[email protected]/ed25519-dalek-bip32'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(359) = 'Bucket:[email protected]/ed25519-dalek' @ Some(Version { major: 1, minor: 18, patch: 22 })
[2024-12-15T19:42:11Z INFO  pubgrub::internal::partial_solution] add_decision: Id::<Names>(359) @ 1.18.22
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(359) = 'Bucket:[email protected]/ed25519-dalek'
[2024-12-15T19:42:11Z INFO  pubgrub::solver] DP chose: Id::<Names>(373) = 'Bucket:[email protected]/dep:ed25519-dalek' @ Some(Version { major: 1, minor: 18, patch: 22 })
[2024-12-15T19:42:11Z INFO  pubgrub::internal::partial_solution] not adding Id::<Names>(373) @ 1.18.22 because of its dependencies
[2024-12-15T19:42:11Z INFO  pubgrub::solver] unit_propagation: Id::<Names>(373) = 'Bucket:[email protected]/dep:ed25519-dalek'
[2024-12-15T19:42:11Z INFO  pubgrub::internal::core] Start conflict resolution because incompat satisfied:
       Bucket:[email protected]/dep:ed25519-dalek SemverPubgrub { norml: 1.18.22 | 1.18.23 | 1.18.25 | 1.18.26, pre: ∅ }  depends on Bucket:[email protected]/default=true SemverPubgrub { norml: >=1.0.1, <1.0.2, pre: ∅ }

I would expect it to be able to prioritize this chain without "wandering":

[email protected]/default=true -> Bucket:[email protected]/default -> [email protected]/full -> [email protected]/ed25519-dalek -> [email protected]/dep:ed25519-dalek

The "root cause" it finds is

[2024-12-15T19:44:45Z INFO  pubgrub::solver] DP chose: Id::<Names>(39) = 'Bucket:[email protected]/default=true'
[2024-12-15T19:44:45Z INFO  pubgrub::solver] DP chose: Id::<Names>(46) = 'Bucket:[email protected]/default=true'

Which is insufficient, since the resolver still has to find the rest of the packages in the chain to load them.

konstin added a commit that referenced this pull request Dec 16, 2024
Whenever we either discard a version due to its dependencies or perform conflict resolution, we return the last conflict that led to discarding them.

In cargo, we use this information for prioritization, which speeds up resolution (`cargo run -r -- -m pub --with-solana --filter solana-archiver-lib -t 16` goes from 90s to 20s on my machine).

Configurations that are noticeably slower for the solana test case:
* All incompatibilities from unit propagation
* Only the last root cause in unit propagation
* No incompatibilities from unit propagation
* No incompatibilities from `add_version`
* Only affected counts (without culprit counts)
* Backtracking with the same heuristic as astral-sh/uv#9843 (backtracking once after affected hits 5)

In uv, we use this to re-prioritize and backtrack when a package decision accumulated too many conflicts. Since we have our own solver loop, we add the incompatibility to our own tracking instead.

Built on #291

## Benchmarks

Main:

```
        index commit hash: 82086e46740d7a9303216bfac093e7268a95121f
        index commit time: 2024-11-30T18:18:14Z
               index size: 32
          solana in index: 32
             Pub CPU time:  1215.49s ==  20.26min
           Cargo CPU time: skipped
Cargo check lock CPU time: skipped
  Pub check lock CPU time: skipped
                Wall time:    80.58s ==   1.34min
```

With #291:

```
        index commit hash: 82086e46740d7a9303216bfac093e7268a95121f
        index commit time: 2024-11-30T18:18:14Z
               index size: 32
          solana in index: 32
             Pub CPU time:   467.73s ==   7.80min
           Cargo CPU time: skipped
Cargo check lock CPU time: skipped
  Pub check lock CPU time: skipped
                Wall time:    34.76s ==   0.58min
```

This PR:

```
        index commit hash: 82086e46740d7a9303216bfac093e7268a95121f
        index commit time: 2024-11-30T18:18:14Z
               index size: 32
          solana in index: 32
             Pub CPU time:   271.79s ==   4.53min
           Cargo CPU time: skipped
Cargo check lock CPU time: skipped
  Pub check lock CPU time: skipped
                Wall time:    20.17s ==   0.34min
```
konstin added a commit to astral-sh/pubgrub that referenced this pull request Dec 16, 2024
konstin added a commit to astral-sh/pubgrub that referenced this pull request Dec 16, 2024
This PR is the child of #36 and
pubgrub-rs#291, providing an
implementation that works for both cargo and uv. Upstream PR:
pubgrub-rs#298.

Specifically, we use the returned incompatibility in
astral-sh/uv#9843, but not
`PackageResolutionStatistics`.

@Eh2406 Eh2406 (Member Author) commented Dec 16, 2024

Having spent the morning staring at the logs @arielb1 found, it's (now) clear that one of the problems here is a difference in perspective from before and after things are "loaded". Once a conflict is found, we are in the "after loading" perspective, where the fact that Bucket:[email protected]/default=true is incompatible with the prior decision for Bucket:[email protected]/default=true is an obvious implication of Bucket:[email protected]/default=true depends on ... depends on Bucket:[email protected]/default=true. Because the entire chain is built out of singleton requirements (that can only match one thing), unit_propagation would be able to see that they are the same. Therefore, root_cause is generated as the shorter and clearer version. Similarly, from the "after loading" perspective we can see that we need to do one back jump all the way to Bucket:[email protected]/default=true, instead of back jumping one (inevitable) decision at a time. But after that backjump, we are now looking at a new version of Bucket:[email protected]/default=true and so are in the "before loading" situation. So the simplified version does not have enough information for us to tell which of these not-yet-evaluated packages actually reduce down to the problem.

That would suggest that counting the packages that were involved in each step of the loop in conflict_resolution would be more effective. It didn't work yesterday, but none of my performance numbers are looking the same as yesterday. So I should try again.

@Eh2406 Eh2406 force-pushed the conflict_count branch 2 times, most recently from 61c8aee to 6502125 on December 18, 2024 17:30
src/internal/core.rs Outdated (resolved)
github-merge-queue bot pushed a commit that referenced this pull request Dec 18, 2024
* Return and track affected and culprit on conflicts

* Use smallvec for root causes

* Add more docs

* Review
@Eh2406 Eh2406 changed the title from "provide more data for prioritization" to "use intermediate satisfier causes in priority statistics" on Dec 19, 2024
@Eh2406 Eh2406 (Member Author) commented Dec 19, 2024

Rebased, which removed most of the code. Added an extensively wordy comment. I'm not so sure about the quality of this comment; it may have too many words and say too little. Edits are welcome.

Overall performance, measured with `cargo r -r -- --with-solana -m pub -t 36 --filter solana`, went from Pub CPU time: 29.64min to 21.98min. `--filter solana-archiver-lib` went from 2.78min to 2.46min.

@konstin konstin (Member) commented Dec 19, 2024

I'm seeing a massive speedup with solana-archiver-lib. I'm testing with 8 threads since 16 overflow with dev.

Before (dev, 3bef331)

$ cargo run -r -- -m pub --with-solana --filter solana-archiver-lib -t 8
Running in mode Pub on 8 rayon threads.
        index commit hash: 82086e46740d7a9303216bfac093e7268a95121f
        index commit time: 2024-11-30T18:18:14Z
               index size: 32
          solana in index: 32
             Pub CPU time:   257.76s ==   4.30min
           Cargo CPU time: skipped
Cargo check lock CPU time: skipped
  Pub check lock CPU time: skipped
                Wall time:    37.05s ==   0.62min

After (PR, e08db34)

$ cargo run -r -- -m pub --with-solana --filter solana-archiver-lib -t 8
Running in mode Pub on 8 rayon threads.
        index commit hash: 82086e46740d7a9303216bfac093e7268a95121f
        index commit time: 2024-11-30T18:18:14Z
               index size: 32
          solana in index: 32
             Pub CPU time:   143.82s ==   2.40min
           Cargo CPU time: skipped
Cargo check lock CPU time: skipped
  Pub check lock CPU time: skipped
                Wall time:    20.17s ==   0.34min

It's an improvement in both measurements, though I'm getting a very different speedup.

Counterintuitively, the should cancel count is way worse while the runtime performance improves. Is there a good non-walltime metric?

Patch:

    type Priority = (u32, Reverse<usize>);

    fn prioritize(
        &self,
        package: &Names<'c>,
        range: &RcSemverPubgrub,
        package_conflicts_counts: &PackageResolutionStatistics,
    ) -> Self::Priority {
        let matches_count = match package {
            Names::Links(_name) => {
                // PubGrub automatically handles when any requirement has no overlap. So this is only deciding the importance of picking the version:
                //
                // - If it only matches one thing, then adding the decision with no additional dependencies makes no difference.
                // - If it can match more than one thing, then it is entirely equivalent to picking the packages directly, which would make more sense to the users.
                //
                // So only rubberstamp links attributes when all other decisions are made, by setting the priority as low as it will go.
                usize::MAX
            }

            Names::Wide(_, req, _, _) => self.count_wide_matches(range, &package.crate_(), req),
            Names::WideFeatures(_, req, _, _, _) | Names::WideDefaultFeatures(_, req, _, _) => self
                .count_wide_matches(range, &package.crate_(), req)
                .saturating_add(1),

            Names::Bucket(_, _, _) => self.count_matches(range, &package.crate_()),
            Names::BucketFeatures(_, _, _) | Names::BucketDefaultFeatures(_, _) => self
                .count_matches(range, &package.crate_())
                .saturating_add(1),
        };
        (
            package_conflicts_counts.conflict_count(),
            Reverse(matches_count),
        )
    }

(I'll still have to give the updated PR a proper read)

@Eh2406 Eh2406 (Member Author) commented Dec 20, 2024

I'm seeing a massive speedup with solana-archiver-lib. I'm testing with 8 threads since 16 overflow with dev.

Has memory consumption gotten worse? 16 was working for you before, right? I will need to come back to the oddly high memory usage.

It's an improvement in both measurements, though I'm getting a very different speedup.

I mostly watch the CPU time. But it is odd how different the % improvement is between my computer and yours.

Counterintuitively, the should cancel count is way worse while the runtime performance improves. Is there a good non-walltime metric?

I have not found one yet. Sorry.

Patch

And I forgot to push the code I was running before I logged off the work computer. Logged back on to push this for you. I was using Eh2406/pubgrub-crates-benchmark@80048df

@Eh2406 Eh2406 (Member Author) commented Dec 20, 2024

cargo run -r -- -m pub --with-solana --filter solana-archiver-lib -t 8

with 3bef331 and Eh2406/pubgrub-crates-benchmark@80048df

             Pub CPU time:   143.91s ==   2.40min
                Wall time:    18.97s ==   0.32min

with e08db34 and Eh2406/pubgrub-crates-benchmark@80048df

             Pub CPU time:   126.50s ==   2.11min
                Wall time:    16.67s ==   0.28min

with 3bef331 and your patch

             Pub CPU time:   437.86s ==   7.30min
                Wall time:    63.07s ==   1.05min

with e08db34 and your patch

             Pub CPU time:   251.98s ==   4.20min
                Wall time:    35.26s ==   0.59min

@konstin konstin (Member) commented Dec 20, 2024

With Eh2406/pubgrub-crates-benchmark@80048df i'm getting the same numbers too now :)

konstin pushed a commit to astral-sh/pubgrub that referenced this pull request Dec 20, 2024
@Eh2406 Eh2406 (Member Author) commented Dec 20, 2024

Feel free to merge once you've had a chance to review the code.

/// This will prevent us going down this path again. However when we start looking at version 2 of A,
/// and discover that it depends on version 2 of B, we will want to prioritize the chain of intermediate steps
/// to confirm if it has a problem with the same shape.
/// The `satisfier_causes` argument keeps track of these intermediate steps so that the caller can use.
Member:

Tried some editing:

Return the root cause or the terminal incompatibility. Cf.
<https://github.com/dart-lang/pub/blob/master/doc/solver.md#unit-propagation>

When we found a conflict, we want to learn as much as possible from it, to avoid making (or
keeping) decisions that will be rejected. Say we found that the decision for X and the
decision for Y are incompatible. We may find that the decisions on earlier packages B and C
require us to make incompatible decisions on X and Y, so we backtrack until either B or C
can be revisited. To make it practical, we really only need one of the terms to be a
decision. We may as well leave the other terms general. Something like "the dependency on
the package X is incompatible with the decision on C" tends to work out pretty well. Then if
A turns out to also have a dependency on X the resulting root cause is still useful. Of
course, this is more heuristics than science. If the output is too general, then
`unit_propagation` will handle the confusion by calling us again with the next most specific
conflict it comes across. If the output is too specific, then the outer `solver` loop will
eventually end up calling us again until all possibilities are enumerated.

To end up with a more useful incompatibility, this function combines incompatibilities into
derivations. Fulfilling this derivation implies the later conflict. By banning it, we
prevent the intermediate steps from occurring again, at least in the exact same way.
However, the statistics collected for `prioritize` may want to analyze those intermediate
steps. For example we might start with "there is no version 1 of Z", and
`conflict_resolution` may be able to determine that "that was inevitable when we picked
version 1 of X" which was inevitable when we picked W and so on, until version 1 of B, which
was depended on by version 1 of A. Therefore the root cause may simplify all the way down to
"we cannot pick version 1 of A". This will prevent us going down this path again. However
when we start looking at version 2 of A, and discover that it depends on version 2 of B, we
will want to prioritize the chain of intermediate steps to check if it has a problem with
the same shape. The `satisfier_causes` argument keeps track of these intermediate steps so
that the caller can use them for prioritization.

Then if A turns out to also have a dependency on X the resulting root cause is still
useful.

How will we apply the root cause?

If the output is too general, then unit_propagation will handle the confusion by calling
us again with the next most specific conflict it comes across. If the output is too
specific, then the outer solver loop will eventually end up calling us again until all
possibilities are enumerated.

In which direction are "too general" and "too specific" to be understood here?

Member Author:

Overall I like the edits. I have a nit about

Say we found that the decision for X and the decision for Y are incompatible.

I'm going to quibble about the term decision in this sentence. In my brain, decision implies something that became constrained because of a call to choose_version/add_decision. Because we check for conflicts in add_version, simple cases end up being handled there. We generally do not end up here with a conflict that is already in terms of decisions. Instead, we start with something abstract like X (in some range) depends on Y (in some range). It's the job of this function to peel back the layers, to figure out why the partial solution has an assignment (which is not necessarily a decision) that is disjoint with the mentioned range for Y, and why the partial solution has an assignment that is a subset of the mentioned range for X.

How will we apply the root cause?

When we discover that A has a dependency on X, unit_propagation will rescan the root cause
"the dependency on the package X is incompatible with the decision on C" and remove that version of C from the available range stored in the partial solution.

In which direction are "too general" and "too specific" to be understood here?

I intended "too specific" to mean referring to exact decided versions. "B == 123 is incompatible with C == 456" would be very specific: extremely actionable, since unit_propagation needs to do very little work to figure out if it's relevant, but also not likely to be reusable. We are likely to end up with O(# versions of B * # versions of C) of these incompatibilities lying around. "Any version of X is incompatible with any version of Y" would be very general. There will only ever need to be one, and it will be relevant and useful any time X or Y come up again. But it takes a lot of work to figure out how it relates to decisions or where we need to backtrack to.

Obviously the wording of the paragraphs needs work, thank you for helping me with it!

#[allow(clippy::type_complexity)]
#[cold]
fn conflict_resolution(
    &mut self,
    incompatibility: IncompDpId<DP>,
    satisfier_causes: &mut SmallVec<(Id<DP::P>, IncompDpId<DP>)>,
Member:

If this turns out to show up in anyone's profile while they don't use the output, we can replace it by a callback, but for now :shipit:

@konstin konstin (Member) commented Dec 20, 2024

Code looks good; we can iterate on the comment now or later, it's not blocking merging.
